I loaded all your semaphore-related patches a couple of months ago, and squeaksource.com ran quietly and happily until a few weeks ago. Then suddenly we got many processes hanging in Semaphore>>#critical:.
If you could send a couple of complete stack dumps from the affected image, that might be interesting. It's possible you were affected by the primitiveSuspend problem (which we discussed earlier), but that's difficult to tell from a stack dump. It would be much, much easier if you could go into the image and check whether the doIt I sent comes up empty or not.
The doIt you sent comes out empty; I've never seen a case where it actually returned a process. As for the stack dumps, I've only got the attached screenshot from the process browser, which I took on December 5, roughly a month after loading your patches.
What we experienced was basically that after the first commit, while our image was saving the data model to a reference stream (via SSFileSystem; this takes about two minutes or so), a second commit would wreak havoc on the system. You can probably simulate this by generating enough load from different clients on the network, with or without SSFileSystem. And I don't much like the idea of saving the image, because it's probably not feasible to keep multiple versions of that image, which ultimately means that any data corruption kills the whole data model.
We save the image every hour, which only takes a couple of seconds. We also recently fixed some bugs that caused it to block for minutes afterwards.
Interesting thought. It may be possible for some strange things to happen if Seaside doesn't take the precaution of not accepting connections while in the midst of a save. The problem is that the image save/startup runs at whatever priority it was issued at, so if another process is running at the same time, there is a chance that it interrupts the image save, with the potential for strange things to happen. Here is one way I could see this happening: a critical lock is held by a process waiting for network traffic when the image is saved. When the image is restored later on, that socket is no longer valid, but the process may still be waiting on the semaphore, blocking the critical section for all other uses.
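To make the scenario above concrete, here is a rough sketch of the same pattern, written in Python threads rather than Squeak processes purely for illustration. All names (`mutex`, `socket_ready`, `holder`, `try_enter`) are hypothetical and not taken from Seaside, Kom, or the squeaksource.com image. `mutex` stands in for the critical-section semaphore, and `socket_ready` for the semaphore of a socket that is never signaled again after the image is restored. The timeout only exists so the demonstration terminates; a process stuck in Semaphore>>#critical: would simply wait forever.

```python
import threading

# Hypothetical stand-ins, not real Squeak or Seaside objects.
mutex = threading.Lock()             # plays the critical-section semaphore
socket_ready = threading.Event()     # plays the dead socket's semaphore
holder_inside = threading.Event()    # lets the main thread know the lock is held

def holder():
    """Enter the critical section, then wait for network traffic inside it."""
    with mutex:
        holder_inside.set()
        socket_ready.wait()          # never returns unless someone signals

def try_enter(timeout):
    """Return True if the critical section can be entered within `timeout` seconds."""
    if mutex.acquire(timeout=timeout):
        mutex.release()
        return True
    return False

worker = threading.Thread(target=holder, daemon=True)
worker.start()
holder_inside.wait()                 # make sure the holder owns the lock

# While the holder waits on the dead "socket", nobody else gets in.
blocked = not try_enter(0.2)

# Signaling the stale semaphore lets the holder leave the critical section.
socket_ready.set()
worker.join()
recovered = try_enter(0.2)

print("other callers blocked:", blocked)
print("recovered after signal:", recovered)
```

The point of the sketch is only that the lock itself is fine; it is the wait inside the critical section, on something that can no longer be signaled, that starves every other caller.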
Current versions of the Kom server adapter for Seaside stop listening while saving the image, but I have to check if this is also the case with the version of Seaside used in squeaksource.com.
Cheers, Lukas