Lots of concurrent IO.

Scott A Crosby crosby at qwes.math.cmu.edu
Wed Aug 15 20:53:32 UTC 2001


On Wed, 15 Aug 2001, Lex Spoon wrote:

> >
> > > Those arguments aside, here are the numbers.  I just checked, and my
> > > modest 366 MHz computer can do 95,000 #dataAvailable's per second on a
> > > socket that is open but idle.  That's 95 checks per millisecond, and 950
> > > times as fast as the 10 millisecond threshold mentioned above.  Thus, on
> > > a completely idle system, you check 950 sockets 100 times per second.
> >
> > These numbers are superb; I hadn't realized they were quite this good.
> > But at, say, 1-3 commands/sec/socket and, say, 500 sockets polled at
> > 10ms, that's 50,000 polls/sec, of which only about 500-1500 will have
> > any data. So, I'm spending about half of the CPU polling, and half
> > constructing responses. Then add in the time and polling for output.. :/
>
>
> Hmm.  Yes, 50% is being used to poll for input in this extreme case.
> But it *is* the extreme case!  It's a large number of connections, with
> a not-too-fast computer, where you are expecting very few active sockets
> but for those sockets to consume a lot of CPU.
>

Which is fair, for some sort of centralized chat-server type process. You
need one socket per user, and each socket will receive maybe 100 bytes
every 10-20 seconds and send about 100 bytes once a second or so.
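The "about half the CPU" figure quoted above follows directly from Lex's measured numbers; a back-of-the-envelope check (the 95,000 calls/sec capacity is the figure from the quoted message):

```python
# Back-of-the-envelope check of the polling-cost figures quoted above.
sockets = 500                # concurrent connections
poll_hz = 100                # each socket polled every 10 ms
polls_per_sec = sockets * poll_hz           # #dataAvailable calls per second
capacity = 95_000            # measured #dataAvailable calls/sec on a 366 MHz box

cpu_fraction = polls_per_sec / capacity     # fraction of CPU spent just polling
print(f"{polls_per_sec} polls/sec -> {cpu_fraction:.0%} of CPU spent polling")
# → 50000 polls/sec -> 53% of CPU spent polling
```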


> Also, #dataAvailable hasn't been optimized at all.  Currently it is
> making a system call into the kernel (this was on Linux, by the way),
> but probably it could be improved to cache the results if nothing has
> changed about the socket.  In fact, it might be worth revisiting the
> whole VM<->image network interface with a mind to servers -- the current
> setup is really optimized, it seems, for clients that have a single
> socket open and which have nothing to do while waiting for the socket to
> become active.
>

Someone mentioned  http://netjam.org/flow/  to me. Is it any good?

>
>
> You mentioned also that you are worried about output as well as input.
> This only matters for backlogged sockets, though.  Again, this depends
> on your server, but in a lot of cases output will usually be consumed
> immediately into the kernel's buffer; typically a kernel will hold a few
> tens of kilobytes for you before write()  (and thus #sendSomeData:)
> starts to fail.  (Well, I say "typically".  This is what I've observed
> on Linux....)
>

That's what should happen in the typical case, but if IRC teaches anyone
anything, it's that there are a lot of anal F***'s out there who will try
to cause things to break. So someone may disable their incoming queue so
that the Linux socket buffer does overflow.

I was trying to build something that would be relatively immune to such
attacks.
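That failure mode, a peer that simply stops reading until the kernel's send buffer fills, is easy to reproduce. A minimal sketch in Python rather than Squeak (the socketpair and buffer size are my own demo choices):

```python
import socket

# A peer that never reads: writes eventually fill the kernel buffers, and a
# non-blocking send starts failing -- the situation #sendSomeData: would hit.
a, b = socket.socketpair()
a.setblocking(False)
a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)  # keep the demo small

sent = 0
try:
    while True:
        sent += a.send(b"x" * 1024)
except BlockingIOError:
    # Kernel buffers (our send side plus the unread receive side) are full.
    pass

print(f"kernel absorbed {sent} bytes before send() started failing")
a.close(); b.close()
```

A server that assumes writes always succeed immediately will block (or spin) here; handling the failure is what the output-polling discussion above is about.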


> > But it seems like squeak implements this waitForDataUntil internally as
> > a busy-wait loop anyway.. Oh well.
>
> Actually, there's a semaphore wait buried in the loop somewhere.  It's
> just that the semaphore getting signalled isn't a guarantee that any
> particular event has happened, and so the code must query the socket
> after the semaphore is signalled.

It's as if that semaphore gets signalled only when there's no data there,
and gets unsignalled right after data arrives.

>
> In fact, if you don't mind devoting a thread to each socket, then you
> can do a "wait" in each thread and achive your 0 polling goal.  You
> could even single-thread things again after the waits by having a
> message queue (SharedQueue in Squeak) with active sockets, which is
> queried from a main thread.
>

This is exactly my design. Two threads per socket, with a central thread
handling the combined parsed input stream. Everything in the IO subsystem
is either blocked on shared queues or is supposed to be blocked on the
low-level socket primitives.
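A minimal sketch of that architecture in Python rather than Squeak (queue.Queue standing in for SharedQueue; the names and framing are mine):

```python
import queue
import socket
import threading

events = queue.Queue()   # plays the role of Squeak's SharedQueue

def reader(sock, conn_id):
    """One blocking reader thread per socket: no polling, just recv()."""
    while True:
        data = sock.recv(4096)
        if not data:                      # peer closed the connection
            events.put((conn_id, None))
            return
        events.put((conn_id, data))       # hand input to the central thread

# A socketpair stands in for one accepted connection.
client, server = socket.socketpair()
threading.Thread(target=reader, args=(server, 1), daemon=True).start()

client.sendall(b"NICK scott\r\n")
conn_id, data = events.get()              # central thread blocks here, 0% CPU
print(conn_id, data)
client.close()
```

When every thread is blocked in recv() or on the queue, an idle server really does use no CPU, which is the property the polling scheme lacks.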

Also, this problem appears to hit my listener process:
  Try running:

  [socket := listener waitForAcceptUntil: (Socket deadlineSecs: 5).
   socket == nil]
  whileTrue.

Which is basically supposed to wait until someone connects. If I leave
just this running, Squeak consumes all CPU.

If I connect, then kill the listener process, but keep the socket running
with no input and invoke:
   socket waitForDataUntil: (Socket deadlineSecs: 10)
It also sucks up all CPU.

But if I feed it a line of data, then during the 10-second pause before
it accepts any more data (the one I mentioned above), the CPU usage does
go down to zero.
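For comparison, a readiness-based wait sleeps in the kernel instead of spinning; a sketch of the equivalent accept-with-deadline loop using select, in Python rather than the Squeak primitive:

```python
import select
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))
listener.listen(5)

# Simulate a peer connecting; in the CPU-burning case above, no one does.
client = socket.create_connection(listener.getsockname())

# Equivalent of waitForAcceptUntil: with a 5-second deadline, except the
# wait happens inside the kernel -- 0% CPU until a connection (or timeout).
ready, _, _ = select.select([listener], [], [], 5.0)
if ready:
    conn, addr = listener.accept()
    print("accepted connection from", addr)
    conn.close()
client.close(); listener.close()
```

With no pending connection, select() simply sleeps for the full deadline, which is the behavior the listener loop above should have but doesn't.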



Thanks.


Scott






More information about the Squeak-dev mailing list