On Fri, Feb 26, 2021 at 08:40:17AM +0100, Tobias Pape wrote:
On 26. Feb 2021, at 05:04, tim Rowledge tim@rowledge.org wrote:
On 2021-02-25, at 3:17 PM, David T. Lewis lewis@mail.msen.com wrote:
Assuming you have access to the server, the best place to start is by looking in the directory /proc/<pid>/fd/*, where <pid> is the process ID of the Squeak VM (you can find it with something like "$ ps -aef | grep squeak"). If you look at that once a day for a week, you will be able to see a growing number of file descriptors hanging around. Eventually there are too many, and bad things happen.
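To make the check easy to repeat (e.g. from cron), the /proc inspection above can be scripted. A minimal sketch, assuming a Linux system; the pid here is the current process purely for demonstration, and you would substitute the Squeak VM's pid:

```python
import os

def open_fds(pid):
    """Return {fd: target} for a process's open descriptors via /proc (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    fds = {}
    for name in os.listdir(fd_dir):
        try:
            fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd may have been closed between listdir() and readlink().
            pass
    return fds

# Inspect our own process as a demonstration; use the Squeak VM's pid instead.
fds = open_fds(os.getpid())
sockets = [t for t in fds.values() if t.startswith("socket:")]
print(f"{len(fds)} open fds, {len(sockets)} of them sockets")
```

Logging that count once a day gives you the week-long trend Dave describes without staring at the directory by hand.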
After an hour or so there are 350 entries in that directory, virtually all sockets. I imagine somehow something has caused an issue that ... well, no idea. Pretty sure it isn't going to last for a week at this rate!
Maybe sockets the other end closed but we did not notice, so we hang on to them. I'm quite sure I saw that more than a few times when caring for Seaside on Squeak. There was often something strange going on with the sockets when the conversation between browser and Seaside was already done, but somehow Squeak hung on to the socket.
It may be that the socket leak problem on squeaksource.com is "fixed" now simply because it is so much faster now that it is running on 64-bit Spur, so timeouts and other errors simply do not happen any more. In other words, an underlying problem may still be there, but it rarely manifests for squeaksource on the faster VM.
The responsibility for closing the sockets is entirely on the Squeak side. Regardless of what the client does or does not do, the Squeak image must always close its socket in order to free the file descriptor.
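The point above is the classic "close in an ensure: block" discipline. A sketch of the same pattern in Python (illustrative only, not the image's actual code): the try/finally plays the role of Smalltalk's ensure:, so the descriptor is released even if handling the request raises, and regardless of whether the client already went away.

```python
import socket

def handle_request(conn):
    """Serve one connection, guaranteeing the descriptor is released.

    The finally clause is the analogue of Smalltalk's ensure: block --
    close runs even if processing the request raises an exception.
    """
    try:
        data = conn.recv(4096)
        # ... process data, send a response ...
    finally:
        conn.close()  # frees the fd regardless of what the client did
```

If the close only happens on the normal exit path, every timeout or error leaks one descriptor, which matches the slow growth seen in /proc/<pid>/fd.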
Tim, one other thing to look out for is the event handling in the VM. Since last October it has used the Linux-specific epoll mechanism. If the problem you are seeing started after updating to a newer VM, consider that a possible suspect.
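For reference, epoll is a readiness-notification facility: sockets are registered with an epoll instance, and the process is told which ones have events pending. A minimal illustrative sketch using Python's select.epoll wrapper (Linux only; this is not the VM's code, just the mechanism it now relies on):

```python
import select
import socket

# A connected pair of sockets stands in for a server socket and its client.
a, b = socket.socketpair()

# Register one end with an epoll instance, asking for readability events.
ep = select.epoll()
ep.register(b.fileno(), select.EPOLLIN)

a.sendall(b"ping")              # make the registered end readable
events = ep.poll(timeout=1.0)   # list of (fd, event-mask) pairs
ready_fd = events[0][0]

ep.close()
a.close()
b.close()
```

Note that a peer-closed socket also reports as readable (recv then returns zero bytes), so code driven by epoll still has to notice the hangup and close its own end; otherwise you get exactly the lingering descriptors discussed above.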
Dave