[Seaside] Squeak async i/o performance,
was Re: HTTP Performance
Bruce ONeel
edoneel at sdf.lonestar.org
Thu Nov 20 16:12:25 UTC 2003
Hmm, I poked for a while but I didn't find enlightenment yet.
I'll poke some more.
Thanks!
cheers
bruce
Stephen Pair <stephen at pairhome.net> wrote:
> Date: Thu, 20 Nov 2003 09:53:23 -0500
> From: Stephen Pair <stephen at pairhome.net>
> Subject: Re: [Seaside] Squeak async i/o performance, was Re: HTTP Performance
> To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
> Cc: "'The Squeak Enterprise Aubergines Server - general discussion.'" <seaside at lists.squeakfoundation.org>
> reply-to: "The Squeak Enterprise Aubergines Server - general discussion." <seaside at lists.squeakfoundation.org>
>
> This test really only measures socket connection performance. RPS as
> measured by ab (the Apache benchmark tool) covers a full HttpRequest and
> HttpResponse, so it measures something quite different from this test.
> The ab tests we were running open relatively few socket
> connections (I was using 4, 8, 16, 32, etc.). The number of
> simultaneous socket connections seems to have little effect on HTTP
> RPS throughput. Measuring HTTP RPS on both inbound connections (to
> Comanche) and outbound (to an Apache server) yields similar results in
> terms of RPS: even though ab can hit Apache with over 1000 RPS, Squeak
> can only hit Apache with ~50 RPS. And we seem to spend a lot of time
> (~70%) during these benchmarks waiting on socket semaphores. My guess
> is that something is causing excessive latency between the time a
> socket semaphore is marked for signaling in the VM and when the waiting
> Squeak process actually wakes up.
>
> - Stephen
>
> Andreas Raab wrote:
>
> >Hi Bruce,
> >
> >
> >
> >>I poked at this today and I don't quite get the math to work.
> >>
> >>
> >
> >Interesting you should say this. My impression after running a bunch of
> >tests with Avi and Stephen yesterday was very similar. I can say for sure
> >that there's something wrong here but it's not quite clear what. I've been
> >running an interesting little benchmark which might be good to run on other
> >machines as well. Here it is:
> >
> >"--- the server (run in one image) -----"
> >server := Socket newTCP.
> >server listenOn: 12345 backlogSize: 4.
> >
> >"wait for the first connection (this allows us to profile both ends)"
> >[client := server waitForAcceptFor: 10 ifTimedOut:[nil].
> >client == nil] whileTrue.
> >client destroy.
> >
> >"Now do 1k connections"
> >MessageTally spyOn:[
> > 1 to: 1000 do:[:i|
> > client := server waitForAcceptFor: 10 ifTimedOut:[nil].
> > client destroy.
> > ].
> >].
> >"clean up"
> >server destroy.
> >
> >
> >"--- the client (run in another one) ---"
> >addr := NetNameResolver addressForName: '127.0.0.1'.
> >
> >"wait for the first connection (to sync with server)"
> >[client := Socket newTCP.
> >[client connectTo: addr port: 12345] on: ConnectionTimedOut do:[:ex| ex
> >return].
> >client isConnected] whileFalse:[client destroy].
> >
> >"Now do 1k connections"
> >MessageTally spyOn:[
> > 1 to: 1000 do:[:i|
> > client := Socket newTCP.
> > client connectTo: addr port: 12345.
> > client destroy.
> > ].
> >].
> >
> >Running this shows profiles on both server and client side as well as
> >(indirectly) the number of connections established. It turns out that this
> >runs in ~4secs on my machine which makes for about 250 connections per
> >second. Looking closer, it looked as if much of the time is spent in socket
> >creation and destroy (which admittedly is a heavy-weight operation on
> >Windows) and measuring it shows this clearly:
> >
> >[1 to: 1000 do:[:i| Socket newTCP destroy]] timeToRun
> >
> >This takes about a second (after some warmup time ;) which means that we
> >need about 1ms for creating and destroying a single socket. Since the above
> >was run on a single machine it means that 50% of the time in the benchmark
> >is spent in socket creation/destroy (2k sockets used which means 2secs
> >overhead out of the 4secs the benchmark took).
> >
> >Definitely something to improve but it still doesn't say why Comanche
> >doesn't get more than 50rps.
> >
> >Cheers,
> > - Andreas
> >
> >
> >
> >>-----Original Message-----
> >>From: seaside-bounces at lists.squeakfoundation.org
> >>[mailto:seaside-bounces at lists.squeakfoundation.org] On Behalf
> >>Of Bruce ONeel
> >>Sent: Thursday, November 20, 2003 1:58 PM
> >>To: The Squeak Enterprise Aubergines Server - general discussion.
> >>Cc: squeak-dev at lists.squeakfoundation.org
> >>Subject: [Seaside] Squeak async i/o performance, was Re:
> >>HTTP Performance
> >>
> >>
> >>Avi Bryant <avi at beta4.com> wrote:
> >>
> >>
> >>
> >>>Like Stephen, I'm intrigued by your 50rps figure for Comanche - that's
> >>>the number that I always seem to come up with as well. There *must* be
> >>>something throttling that, and we need to get to the bottom of it.
> >>>
> >>Hi,
> >>
> >>I poked at this today and I don't quite get the math to work.
> >>I looked at unix Squeak 3.4-1 so that might distort things.
> >>
> >>The basis of the clock in interp.c seems to be in millisecs. I'm
> >>getting that idea from comments and the definition of ioMSecs in
> >>sqXWindow.c. So, given that:
> >>
> >>checkForInterrupts (interp.c) is called approximately every 1000 byte
> >>codes executed. If it hasn't been 3 millisecs since the last call, this
> >>1000 is scaled up; if it's been more, it is scaled down. We seem
> >>to be trying to check for interrupts every 3 millisecs. This is
> >>controlled by interruptChecksEveryNms.
> >>
> >>Once this happens, if it has been 500 millisecs (this seems high)
> >>since the last time we called ioProcessEvents, we call it again.
> >>This comes from these bits in checkForInterrupts():
> >>if (now >= nextPollTick) {
> >> ioProcessEvents();
> >> nextPollTick = now + 500;
> >>}
> >>
> >>ioProcessEvents (sqXWindow.c) calls aioPoll.
> >>
> >>aioPoll (aio.c) calls select and then calls the aio handler for
> >>each descriptor which has pending i/o waiting.
> >>
> >>sqSocketListenOnPortBacklogSize has set up the socket with a listen
> >>and then enabled an aio handler.
> >>
> >>TcpListener has created a socket with a backlog default of 10.
> >>
> >>This implies that every 500 millisecs (i.e., 2 times per sec) we can
> >>get 10 accepts, or 20 connections per sec, while other processing
> >>is happening.
> >>
> >>On top of that, events are polled when we call
> >>relinquishProcessorForMicroseconds, which is called by the idle
> >>process. This would be called with a wait of 1 millisec
> >>(1000 microsecs). I'm not sure how often this is called, and it
> >>depends on the load in Squeak as well.
> >>
> >>I guess then that the additional 30 connections per sec
> >>would be accepted while the idle process is running.
> >>
> >>It would seem then that a higher backlog in TcpListener might
> >>give higher connections per second. If someone has a test bed
> >>set up which shows the 50 connections per sec, what happens if
> >>you doIt
> >>
> >>TcpListener backLogSize: 1000
> >>
> >>do things work? Do you get more connections per sec? Do things
> >>hang?
> >>
> >>I'm obviously not sure the above is correct. With luck someone
> >>with a chance of knowing will speak up :-)
> >>
> >>cheers
> >>
> >>bruce
> >>_______________________________________________
> >>Seaside mailing list
> >>Seaside at lists.squeakfoundation.org
> >>http://lists.squeakfoundation.org/listinfo/seaside
> >>
> >>
> >>
> >
> >
> >
> >
>
>