HTTP server choices?

Wed Jan 4 18:45:10 UTC 2006

Andreas Raab wrote:

> David Shaffer wrote:
>
>> Each of the comparison server technologies was capable of responding
>> sub-millisecond even under loads which correspond to Kom's maximum.  Each of
>> the comparison server technologies could handle over 300 requests per
>> second and in most cases well beyond that.  The net effect (IN MY
>> OPINION) is that Squeak web servers feel noticeably sluggish even under
>> no load.  Most people I've talked to say the same thing.  I've heard
>> claims about "saturating a 10Mb/s network"...let me just say that under
>> load with the current VM, latency has a much larger effect on
>> responsiveness and robustness than throughput.
>
>
> Has anyone ever thoroughly benchmarked this? There are certainly
> platform differences but it seems like having an acceptable benchmark
> suite would make finding hot spots much easier.

Yes.  There are several issues depending on what you're trying to
improve.  The typical >1 ms response time on Linux has to do with the
frequency at which aio polls for activity.  I have details and all the
tools needed to do further benchmarking.  What I lack is confidence that
it is worth the time.  As recent activity on the VM list has indicated,
the I/O-related plugins and classes are a mess.  I am more likely to
build an orthogonal I/O system rather than trying to fix the existing
design.  Also I'm uncertain what would happen during image saves or
changes file writes if Squeak file I/O were asynchronous.

>
>
> It's actually not quite so obvious. One issue (that I learned over
> time) is that you really don't want to use native threads (at least
> not without thread pooling) since the creation of those threads is too
> heavy-weight. Like, right now creating a socket (Socket new) takes
> almost a millisecond and most of this is in setting up two new threads
> (never mind that this also takes significant resources which further
> limits the max throughput). Try this for example:
>
>     [1 to: 100 do:[:i| Socket new destroy]] timeToRun
>
> This gives you a feel for what basically "unavoidable overhead" is
> (creating and destroying a socket) and unless I'm mistaken it will
> show quite clearly why you can't possibly get sub-millisecond response
> on Windows (dunno about other platforms but on Win32 this takes about
> 2ms per socket - which I believe is exclusively due to dealing with
> threads).

I'm not talking about a thread per socket (or file descriptor).  That
has been (long ago) proven to not perform well.  Most of the web server
literature points to select (and poll/epoll) as better performing than
thread-per-socket.  The problem with aio's use of select is that it is
polling in the VM thread.  What we need, IMHO, is an I/O pump: a single
thread which sits in a call to select or epoll.  A pool of such threads
might be OK too.  Such a pool might improve overall throughput....but
again, I think the first thing to fix is latency.
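
To make the "I/O pump" idea concrete, here is a minimal sketch in Python (not Squeak or VM code).  Everything named in it (IOPump, watch, ready, the "client-1" tag) is hypothetical and only for illustration.  It assumes one dedicated thread that blocks in select/epoll through Python's selectors module and hands readable data to the main thread via a queue, so the main thread never has to poll:

```python
import queue
import selectors
import socket
import threading

_STOP = object()  # sentinel tag marking the internal wake-up socket


class IOPump:
    """Hypothetical I/O pump: one thread blocked in select()/epoll."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()  # epoll/kqueue/select
        self.ready = queue.Queue()                   # (tag, bytes) results
        # A socket pair lets stop() wake the pump without a poll timeout.
        self._wake_r, self._wake_w = socket.socketpair()
        self.selector.register(self._wake_r, selectors.EVENT_READ, _STOP)
        self._thread = threading.Thread(target=self._run, daemon=True)

    def watch(self, sock, tag):
        """Register a socket; in this sketch, call this before start()."""
        sock.setblocking(False)
        self.selector.register(sock, selectors.EVENT_READ, tag)

    def start(self):
        self._thread.start()

    def stop(self):
        self._wake_w.send(b"x")   # makes the blocked select() return
        self._thread.join()
        self.selector.close()
        self._wake_r.close()
        self._wake_w.close()

    def _run(self):
        while True:
            # Blocks until something is readable; no periodic polling.
            for key, _events in self.selector.select():
                if key.data is _STOP:
                    return
                chunk = key.fileobj.recv(4096)
                if not chunk:                  # peer closed the connection
                    self.selector.unregister(key.fileobj)
                    continue
                self.ready.put((key.data, chunk))


if __name__ == "__main__":
    pump = IOPump()
    client, server_side = socket.socketpair()
    pump.watch(server_side, "client-1")
    pump.start()
    client.send(b"GET / HTTP/1.0\r\n\r\n")
    print(pump.ready.get(timeout=2))
    pump.stop()
    client.close()
    server_side.close()
```

The point of the design is that wake-up latency is bounded by the OS scheduler rather than by a polling interval.  A pool would simply run several such pumps, each with its own selector.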

Web server performance literature is abundant...start with this rather
old but helpful page if you're interested:
http://www.kegel.com/c10k.html  It has great links.

>
> For #2 are you aware that there's AsyncFile for asynchronous I/O
> access? This may actually give you a cross-platform solution.

That's what I'm using...I created a FileStream-like wrapper around
AsyncFile.  I think I have source on the wiki page:
http://minnow.cc.gatech.edu/squeak/539.  There are problems getting this
to work with ModFile though:

    1) ModFile checks for file existence by (FileStream isAFileNamed:
fullFilePath).  #isAFileNamed: opens the file, which is a VM blocking
call...benchmarks showed that to be a problem.
    2) Someone along the response chain sends #size to the stream, which
means that I have to compute the file size...again a VM blocking call.
I added a #size method to AsyncFile since Linux is fairly efficient at
fstat() (versus the now-forgotten method the FilePlugin uses to compute
the size).
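
As a rough illustration of the two points above, here is a small Python sketch (not Squeak code; the function names are hypothetical).  It assumes a POSIX-style stat()/fstat(), and shows that both "does this file exist?" and "how big is it?" can be answered from file metadata, without opening the file or seeking through it.  Note that stat() is still a blocking system call; it is just a cheap one:

```python
import os
import stat


def regular_file_size(path):
    """Return the size in bytes if path is a regular file, else None.

    Uses a single stat() call, so the file is never opened.
    """
    try:
        info = os.stat(path)
    except OSError:          # missing file, permission problem, etc.
        return None
    if not stat.S_ISREG(info.st_mode):
        return None          # directories, sockets, devices...
    return info.st_size


def open_file_size(fd):
    """Return the size of an already-open file descriptor via fstat()."""
    return os.fstat(fd).st_size
```

A caller that needs both answers gets them from the one stat() result, which avoids a separate existence check followed by a size query.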

David