[Seaside] Squeak async i/o performance, was Re: HTTP Performance

Andreas Raab andreas.raab at gmx.de
Thu Nov 20 14:32:36 UTC 2003


Hi Bruce,

> I poked at this today and I don't quite get the math to work.

Interesting you should say this. My impression after running a bunch of
tests with Avi and Stephen yesterday was very similar. I can say for sure
that there's something wrong here but it's not quite clear what. I've been
running an interesting little benchmark which might be good to run on other
machines as well. Here it is:

"--- the server (run in one image) -----"
server := Socket newTCP.
server listenOn: 12345 backlogSize: 4.

"wait for the first connection (this allows us to profile both ends)"
[client := server waitForAcceptFor: 10 ifTimedOut:[nil].
client == nil] whileTrue.
client destroy.

"Now do 1k connections"
MessageTally spyOn:[
	1 to: 1000 do:[:i|
		client := server waitForAcceptFor: 10 ifTimedOut:[nil].
		client destroy.
	].
].
"clean up"
server destroy.


"--- the client (run in another one) ---"
addr := NetNameResolver addressForName: '127.0.0.1'.

"wait for the first connection (to sync with server)"
[client := Socket newTCP.
[client connectTo: addr port: 12345] on: ConnectionTimedOut do:[:ex| ex return].
client isConnected] whileFalse:[client destroy].

"Now do 1k connections"
MessageTally spyOn:[
	client := Socket newTCP.
	1 to: 1000 do:[:i|
		client := Socket newTCP.
		client connectTo: addr port: 12345.
		client destroy.
	].
].

Running this shows profiles on both the server and client side, as well as
(indirectly) the number of connections established. It turns out that this
runs in ~4secs on my machine, which makes for about 250 connections per
second. Looking closer, it appeared that much of the time was spent in socket
creation and destroy (which admittedly is a heavy-weight operation on
Windows), and measuring it shows this clearly:

[1 to: 1000 do:[:i| Socket newTCP destroy]] timeToRun

This takes about a second (after some warmup time ;) which means that we
need about 1ms for creating and destroying a single socket. Since the above
was run on a single machine, it means that 50% of the time in the benchmark
is spent in socket creation/destroy (2k sockets used, which means 2secs of
overhead out of the 4secs the benchmark took).

Definitely something to improve but it still doesn't say why Comanche
doesn't get more than 50rps.

Cheers,
  - Andreas

> -----Original Message-----
> From: seaside-bounces at lists.squeakfoundation.org 
> [mailto:seaside-bounces at lists.squeakfoundation.org] On Behalf 
> Of Bruce ONeel
> Sent: Thursday, November 20, 2003 1:58 PM
> To: The Squeak Enterprise Aubergines Server - general discussion.
> Cc: squeak-dev at lists.squeakfoundation.org
> Subject: [Seaside] Squeak async i/o performance, was Re: 
> HTTP Performance
> 
> 
> Avi Bryant <avi at beta4.com> wrote:
> 
> > Like Stephen, I'm intrigued by your 50rps figure for 
> Comanche - that's 
> > the number that I always seem to come up with as well.  
> There *must* be 
> > something throttling that, and we need to get to the bottom of it.
> > 
> 
> Hi,
> 
> I poked at this today and I don't quite get the math to work.
> I looked at unix Squeak 3.4-1 so that might distort things.
> 
> The basis of the clock in interp.c seems to be in milliseconds.  I'm
> getting this idea from comments and the definition of ioMSecs in 
> sqXWindow.c.  So, given that:
> 
> checkForInterrupts (interp.c) is called approx every 1000 bytecodes
> executed.  If it hasn't been 3 milliseconds since the last call, this
> 1000 is scaled up.  If it's been more, it is scaled down.  We seem
> to be trying to check for interrupts every 3 milliseconds.  This is
> controlled by interruptChecksEveryNms.
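The rescaling described above can be sketched like this (an illustrative simplification, not the actual interp.c code; the function name, the halving/doubling policy, and the clamp bounds are my assumptions):

```c
/* Illustrative sketch of the adaptive interrupt-check interval described
   above; NOT the actual interp.c logic. Policy and bounds are assumed. */
#define interruptChecksEveryNms 3   /* target ms between checks */

/* budget = bytecodes to execute before the next check; elapsedMs = time
   since the last check. Checked too late -> shrink the budget; checked
   too early -> grow it, within arbitrary clamp bounds. */
static int nextCheckBudget(int budget, int elapsedMs)
{
    if (elapsedMs > interruptChecksEveryNms)
        return budget / 2 > 100 ? budget / 2 : 100;       /* scale down */
    return budget * 2 < 100000 ? budget * 2 : 100000;     /* scale up */
}
```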
> 
> Once this happens, if it has been 500 milliseconds (this seems high)
> since the last poll, we call
> ioProcessEvents().
> This comes from these bits in checkForInterrupts():
> if (now >= nextPollTick) {
> 	ioProcessEvents();
> 	nextPollTick = now + 500;
> }
> 
> ioProcessEvents (sqXWindow.c)  calls aioPoll.
> 
> aioPoll (aio.c) calls select() and then calls the aio handler for 
> each descriptor that has pending i/o waiting.
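That poll step can be sketched as a plain select() dispatch loop (names and structure are hypothetical; the real aio.c keeps handler tables and interest masks that are omitted here):

```c
/* Hypothetical sketch of an aioPoll-style dispatch; not the real aio.c. */
#include <sys/select.h>
#include <unistd.h>

typedef void (*aio_handler)(int fd);

static int demo_calls = 0;
static void demo_handler(int fd)     /* toy handler: drain one byte */
{
    char c;
    (void)read(fd, &c, 1);
    demo_calls++;
}

/* Poll the watched descriptors once with a zero timeout and invoke the
   handler for each one that has input pending; returns how many fired. */
static int poll_once(int maxFd, fd_set *watched, aio_handler handle)
{
    fd_set ready = *watched;          /* select() mutates its fd_set */
    struct timeval zero = {0, 0};     /* non-blocking, like a poll tick */
    int pending = select(maxFd + 1, &ready, NULL, NULL, &zero);
    int handled = 0;
    for (int fd = 0; handled < pending && fd <= maxFd; fd++)
        if (FD_ISSET(fd, &ready)) {
            handle(fd);
            handled++;
        }
    return handled;
}
```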
> 
> sqSocketListenOnPortBacklogSize has set up the socket with a listen
> and then enabled an aio handler.
> 
> TcpListener has created a socket with a backlog default of 10.
> 
> This implies that every 500 milliseconds (i.e., 2 times per second) we can 
> get 10 accepts, for 20 connections per second, while processing
> is happening.
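That ceiling is just the poll rate times the backlog; as arithmetic (the framing as a function is mine, the numbers are from the reasoning above):

```c
/* The 20 connections/sec ceiling above, as arithmetic: at most one
   backlog's worth of accepts per poll tick. */
static int maxAcceptsPerSecond(int pollIntervalMs, int backlog)
{
    return (1000 / pollIntervalMs) * backlog;
}
```

With the 500ms poll interval and the default backlog of 10 this gives 2 * 10 = 20 per second; a backlog of 1000, as suggested further down, would lift this particular ceiling to 2000.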
> 
> On top of that, events are polled when we call 
> relinquishProcessorForMicroseconds, which is called by the idle
> process.  This would be called with a wait of 1 millisecond 
> (1000 microseconds).  I'm not sure how often this is called, and it
> depends on the load in Squeak as well.
> 
> I guess then that the additional 30 connections per sec
> would be accepted while the idle process is running.
> 
> It would seem then that a higher backlog in TcpListener might
> give higher connections per second.  If someone has a test bed
> set up which shows the 50 connections per sec, what happens if
> you doIt
> 
> TcpListener backlogSize: 1000
> 
> do things work?  Do you get more connections per sec?  Do things
> hang?  
> 
> I'm obviously not sure the above is correct.  With luck someone
> with a chance of knowing will speak up :-)
> 
> cheers
> 
> bruce
> _______________________________________________
> Seaside mailing list
> Seaside at lists.squeakfoundation.org
> http://lists.squeakfoundation.org/listinfo/seaside
> 



