[squeak-dev] OSProcess endless loop

Marcel Taeumel marcel.taeumel at hpi.de
Thu Apr 14 05:50:38 UTC 2022


Hi Chris --

Didn't we fix some AIO-related stuff in the OSVM?
https://github.com/OpenSmalltalk/opensmalltalk-vm/releases/download/latest-build/squeak.cog.spur_linux64x64.tar.gz


Best,
Marcel
On 14.04.2022 04:37:45, Chris Muller <ma.chris.m at gmail.com> wrote:
Hi Dave,

I've been having occasional issues with OSProcess somehow getting
hosed up and becoming unusable in my image. Tonight it started again,
and I'm rather stuck in the water at the moment, unsure how to break
out of it other than building a new image.

Magma now runs "OSProcess outputOf: 'free -wb'" every few seconds to
get ahead of any potential OutOfMemory signals. But somehow my image
got into a state where the simplest uses of OSProcess lock up.
Whenever it happens, I see these messages in the console:

364147968:982663168:[] in AioEventHandler>>initializeForExceptions:readEvents:writeEvents::aio event forwarding not supported
364147968:37728768:[] in AioEventHandler>>initializeForExceptions:readEvents:writeEvents::aio event forwarding not supported

From my feeble debugging, it seems #primRead:into:startingAt:count: is
returning a count of 0 while OSProcessAccessor>>#isAtEndOfFile: keeps
answering false, so the while loop in
BufferedAsyncFileReadStream>>#upToEndOfFile can never terminate. Here's
a rough stack trace of that loop:

BufferedAsyncFileReadStream>>#upToEndOfFile
BufferedAsyncFileReadStream>>#atEndOfFile
(readBuffer atEnd = true, OSProcess accessor isAtEndOfFile: fileID returns false)
BufferedAsyncFileReadStream>>#readAvailableDataFrom:into:
primRead:into:startingAt:count: (---> answers 0)
OSProcessAccessor>>#isAtEndOfFile: (---> answers false)
(restart loop in BufferedAsyncFileReadStream>>#upToEndOfFile)

It would be nice if OSProcess could detect this situation and signal
some kind of error. With the endless loop, it sometimes takes a while
to get to the bottom of why something isn't responsive.
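
A minimal sketch of that kind of detection, written in Python rather
than Smalltalk so it runs standalone. This is not OSProcess code:
read_chunk, at_eof, max_stalls, pause and StalledReadError are all
hypothetical names standing in for the primitive read and the
end-of-file check in the trace above. The idea is only to count
consecutive zero-byte reads that come back without end-of-file being
reported, and to raise once a limit is reached instead of spinning
forever.

```python
import time


class StalledReadError(Exception):
    """Raised when reads keep returning nothing but EOF is never reported."""


def read_up_to_eof(read_chunk, at_eof, max_stalls=100, pause=0.0):
    """Collect data until at_eof() answers True.

    read_chunk -- hypothetical callable returning bytes (b"" when nothing was read)
    at_eof     -- hypothetical callable returning True once the source is exhausted
    max_stalls -- consecutive empty reads tolerated before giving up (assumed limit)
    pause      -- optional sleep in seconds between empty reads
    """
    chunks = []
    stalls = 0
    while not at_eof():
        data = read_chunk()
        if data:
            chunks.append(data)
            stalls = 0  # progress was made, so reset the stall counter
            continue
        stalls += 1
        if stalls >= max_stalls:
            # Zero-byte reads with no EOF: report it rather than loop endlessly.
            raise StalledReadError(
                "%d consecutive empty reads without end-of-file" % stalls
            )
        if pause:
            time.sleep(pause)
    return b"".join(chunks)
```

A real reader on a pipe can legitimately see empty reads while the
child is still producing output, so the limit and pause would have to
be generous; the point is only that the loop ends with an error
instead of hanging the image.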

I'm running production 5.3 with the latest OSProcess and CommandShell.
I thought it might be a resource issue on my laptop, but rebooting
didn't help. Rebuilding from a fresh 5.3 image always works. However,
the weirdest thing is that the problem seems to clear ITSELF up. Like,
OMG, right now, it's working again! I had just run a test in a fresh
image to test multiple processes hitting OSProcess outputOf:. It
worked fine and when I came back to my problem image, it's suddenly
working again!

Can you think of anything I might be doing to get into this situation
and/or how to break out of it? Something to avoid or initialize?

The other thing I noticed: when I broke into the locked-up OSProcess
with Cmd+. (dot), there were TWO processes stuck in the loop, one from
my DoIt, the other originating from the line:

"self changed: #childProcessStatus"

of #grimReaperProcess. Sigh.. I apparently already closed those
debuggers and now can't reproduce the issue to paste their bug report
stack traces! Sorry.

I hope I'm not the only one experiencing this issue so we can
hopefully track this down. It's insidious when it happens.

Thanks,
Chris
