On Apr 30, 2005, at 8:00 PM, Andreas Raab wrote:
Hi Tim -
After having problems trying to debug some TK4 code that blew up with lowspace problems but never let me catch and debug, I spent some time adding the lowspace-process stuff we recently discussed. I had to make a few alterations to match it up with the latest 64-bit-clean code, but no problems with that part.
What am I missing? I don't remember low-space stuff - I only remember interrupt-related stuff.
There was a Mantis bug about low-space issues and some patches to record which process caused the lowspace signal. Mind you, in my opinion this is wrong.
Depending upon the exact size of object memory in use, the 200kb used as the lowSpaceThreshold can be gobbled up in one swallow by the initializeMemoryFirstFree: method making sure there is a byte per object that survived the markPhase. Using useUpMemory, we can get to having 4 bytes of free space when the next allocate is attempted.... Ka-Boom.
Well, so don't eat up the memory. There is no reason why initializeMemoryFirstFree: would have to reserve that much memory - like the comment says the reserve "should" be chosen so that compactions can be done in one pass but there is absolutely no such requirement. Multi-pass compactions have happened in the past and there is nothing wrong with them (in a low-space situation).
This assumes that we really need to have one byte per object, of course. The original rationale was to keep the number of compact loops down to eight (see Dan's comment in initializeMemoryFirstFree:) for Alan's large demo image. The nicest solution would be to come up with a way to do our GC & compacting without needing any extra space. Commence headscratching now... John suggested making sure the fwd blocks get less than the byte-per-object if things are tight, and accepting the extra compaction loops.
Yes. That's the only reasonable way of dealing with it.
What happens is that the fwdblocks calculation grabs all the available free memory when it's recalculated after the full GC. The check for this condition actually backs it off to leave one object header free (4 or 6 bytes, I believe). Usually you die right away because someone attempts to allocate a new context record and we don't have 98ish bytes free. I gave Tim a change set that attempts to maximise freespace to 100K by reducing fwdblocks down to 32k; once you hit the 32k limit, freespace then heads towards zero, of course.
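To make the arithmetic concrete, here is a rough sketch of the two reserve policies in Python rather than Slang. It is not the code from the change set: the function names, the constants, and the 4-byte header size are assumptions for illustration, and the "keep 100K free, floor at 32K" rule is only my reading of the description above.

```python
BASE_HEADER_SIZE = 4          # assumed: one 32-bit object header left free
MIN_FWD_RESERVE = 32 * 1024   # assumed floor for the fwd block reserve
TARGET_FREE = 100 * 1024      # assumed free space we try to preserve


def original_reserve(free_bytes, surviving_objects):
    """One byte per surviving object, backed off to leave one header free."""
    wanted = surviving_objects
    return min(wanted, free_bytes - BASE_HEADER_SIZE)


def capped_reserve(free_bytes, surviving_objects):
    """Shrink the reserve so TARGET_FREE stays free, but not below the floor."""
    wanted = surviving_objects
    # Give up reserve (and accept more compaction passes) to keep space free.
    reserve = min(wanted, max(MIN_FWD_RESERVE, free_bytes - TARGET_FREE))
    # Never reserve more than is actually available.
    return max(0, min(reserve, free_bytes - BASE_HEADER_SIZE))


if __name__ == "__main__":
    free, survivors = 200_000, 500_000
    for policy in (original_reserve, capped_reserve):
        reserve = policy(free, survivors)
        print(policy.__name__, "reserve:", reserve, "left free:", free - reserve)
```

With 200,000 bytes free and 500,000 survivors, the first policy leaves 4 bytes free, which is the Ka-Boom case; the second leaves 100K until free memory shrinks far enough that the 32K floor takes over.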
Note that once freespace goes under 200,000 we do signal the lowspace semaphore btw.
These changes do require a VM change. But as Tim points out, we did notice that if you increase the lowspace threshold (say to 1MB, as in my testing the other night), we'll get the semaphore signaled with a current VM; this would not occur before in an unaltered VM.
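A toy model of the threshold check may help show why the larger threshold matters. This is a Python sketch, not the VM code; the class and method names are made up, and the "signal once until re-armed" behaviour is an assumption of the sketch.

```python
class LowSpaceMonitor:
    """Toy model: signal once when free space drops below the threshold."""

    def __init__(self, threshold=200_000):
        self.threshold = threshold
        self.signal_pending = False

    def check(self, free_bytes):
        """Return True if this check would signal the lowspace semaphore."""
        if free_bytes < self.threshold and not self.signal_pending:
            self.signal_pending = True
            return True
        return False

    def rearm(self):
        """Assumed: the image re-arms the check after handling the signal."""
        self.signal_pending = False


if __name__ == "__main__":
    default = LowSpaceMonitor()
    raised = LowSpaceMonitor(threshold=1_000_000)
    # Free space as seen at successive checks, in bytes.
    for free in (1_500_000, 600_000, 4):
        print(free, default.check(free), raised.check(free))
```

In this run the 1MB monitor signals at 600,000 bytes free, while the 200,000 monitor only signals once free space is already down to 4 bytes, which is too late to do anything useful.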
Bad news: consider Tweak. With lots of processes whizzing away, merely stopping the one that did the allocation and triggered the lowspace is not going to be much good. Stopping everything except the utterly essential stuff to debug the lowspace will be needed. Probably.
Uh, oh. Are you telling me that the "low space stuff" you are referring to above actually suspends the process that triggers the low-space condition? Bad, bad, bad idea. Ever considered that this might be the timer process? The finalization process? Low-space is *not* a per-process condition; suspending the currently running process is something that should be done with great care (if at all).
Please, don't suspend that process - put it away for the image to examine, but by all means do NOT suspend it. If you give me a nice clean semaphore signal for Tweak to handle a low-space condition, I know perfectly well what to do. But if you just suspend a random process which may have absolutely nothing to do with the low-space condition, then, yes, we are in trouble (if this were a Tweak scheduler process you'd be totally hosed).
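The "record it, signal, but don't suspend" idea can be sketched as follows. This is Python with threads standing in for Smalltalk processes; LowSpaceNotifier and its methods are hypothetical names, not anything in the VM or in Tweak.

```python
import threading


class LowSpaceNotifier:
    """Record which process was active, then signal; suspend nothing."""

    def __init__(self):
        self.semaphore = threading.Semaphore(0)
        self.process_at_signal = None

    def low_space_detected(self, active_process):
        # Put the process away for the image to examine later.
        self.process_at_signal = active_process
        # A plain semaphore signal; the active process keeps running.
        self.semaphore.release()

    def take_recorded_process(self):
        process, self.process_at_signal = self.process_at_signal, None
        return process


def watcher(notifier, log):
    """Image-side handler: wait for the signal, then decide what to do."""
    notifier.semaphore.acquire()
    log.append(notifier.take_recorded_process())


if __name__ == "__main__":
    notifier, log = LowSpaceNotifier(), []
    handler = threading.Thread(target=watcher, args=(notifier, log))
    handler.start()
    notifier.low_space_detected("timer process")
    handler.join()
    print(log)
```

The point of the split is that policy stays in the image: whoever waits on the semaphore can look at the recorded process and decide whether it is a culprit or something like the timer or finalization process that must be left alone.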
Tim and I were considering suspending all user processes, and any others we don't know to be untouchable. Then I pointed out that Tweak spawns all these processes, so what do we do about them? Certainly we can call something to say "lowspace, Mr Tweak, beware"...
The Process Browser logic has a table identifying processes of the VM; we assume a process the user created is causing the problem. The earlier fix suggested stopping the process that was running when the lowspace condition occurred, but I doubt you can say with 100% certainty that it is the process in question; as you know, it could be the finalization process or another critical task. Still, this is not harmful because the evil process in question is still running and will terminate your image in short order.
Cheers,
- Andreas
--
===========================================================================
John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================