[Vm-dev] unix SqueakSource socket 'too many open files' problem

David T. Lewis lewis at mail.msen.com
Fri Feb 26 13:00:05 UTC 2021


On Fri, Feb 26, 2021 at 08:40:17AM +0100, Tobias Pape wrote:
> 
> > On 26. Feb 2021, at 05:04, tim Rowledge <tim at rowledge.org> wrote:
> > 
> >> On 2021-02-25, at 3:17 PM, David T. Lewis <lewis at mail.msen.com> wrote:
> >> 
> >> Assuming you have access to the server, the best place to start
> >> is by looking in the directory /proc/<pid>/fd/*, where <pid> is
> >> the process ID of the Squeak VM (you can find it with something
> >> like "$ ps -aef | grep squeak"). If you look at that once a day
> >> for a week, you will be able to see a growing number of file
> >> descriptors hanging around. Eventually it gets to be too many,
> >> and bad things happen.
> > 
> > After an hour or so there are 350 entries in that directory, virtually all sockets. I imagine somehow something has caused an issue that ... well, no idea. Pretty sure it isn't going to last for a week at this rate!
> 
> Maybe these are sockets that the other end closed but we did not notice,
> so we hang on to them. I'm quite sure I saw that more than a few times when
> caring for Seaside on Squeak. There was often something strange going on
> with the sockets: the conversation between browser and Seaside was already
> done, but somehow Squeak hung on to the socket???
>

It may be that the socket leak problem on squeaksource.com is "fixed"
now simply because it is so much faster now that it is running on
64-bit Spur, so timeouts and other errors simply do not happen any
more. In other words, an underlying problem may still be there, but
it rarely shows up for squeaksource on the faster VM.

The responsibility for closing the sockets is entirely on the Squeak
side. Regardless of what the client does or does not do, the Squeak
image must always close its socket in order to free the file descriptor.
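
One way to check whether descriptors really are being freed is to
count the entries under /proc/<pid>/fd and see how many of them are
sockets, as in the check quoted above. What follows is only a rough
sketch in Python, not anything taken from the VM or the image. The
function name count_fds is made up for illustration, and it assumes
a Linux-style /proc where each entry is a symlink whose target
begins with "socket:" for a socket.

```python
import os

def count_fds(fd_dir):
    """Return (total, sockets) for a /proc/<pid>/fd style directory."""
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            # On Linux each entry is a symlink such as "socket:[12345]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor went away between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets
```

Calling count_fds("/proc/1234/fd") (1234 standing in for the VM's
real process ID) every so often and logging the result should make a
steady climb in the socket count easy to spot.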

Tim, one other thing to look out for is the event handling in the VM.
Since last October it has used the Linux-specific epoll mechanism.
If the problem you see now started happening after updating to a
newer VM, then consider this a possible suspect.
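
For anyone unfamiliar with epoll, here is a small sketch of how a
peer's close shows up through it. This is not the VM's code and says
nothing about how the VM registers its sockets. It just uses Python's
select.epoll (Linux only) on a local socket pair, and the helper name
peer_has_closed is invented for the example. Note that noticing the
hangup does not free anything: the local side still has to call
close() on its own socket.

```python
import select
import socket

def peer_has_closed(sock, timeout=0.0):
    """Return True if epoll reports that the other end of sock hung up."""
    ep = select.epoll()
    try:
        # EPOLLRDHUP asks to be told when the peer closes or shuts down writing.
        ep.register(sock.fileno(), select.EPOLLIN | select.EPOLLRDHUP)
        for _fd, events in ep.poll(timeout):
            if events & (select.EPOLLRDHUP | select.EPOLLHUP):
                return True
        return False
    finally:
        ep.close()  # the epoll object holds a descriptor of its own

a, b = socket.socketpair()
before = peer_has_closed(a)   # peer still open
b.close()                     # the "client" goes away
after = peer_has_closed(a)    # hangup is now visible
a.close()                     # only this frees our descriptor
```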

Dave


