A problem with standardTime:

Tue Feb 19 09:50:20 UTC 2002

On Tuesday, February 19, 2002, at 02:40 AM, John M McIntosh wrote:

[standardTime checking for available memory and grabbing most of it]

> But on a machine that only *has* 512MB of memory, lots of things are 
> paged out and a historically significant event called *page thrashing* 
> occurs.

Yes ;-)

Incidentally, this will also happen on a system with more RAM if that 
RAM is actually used by other applications.  With an OS like Mach, which 
always tries to utilize all available RAM (minus a safety margin), 
grabbing memory like that is always a significant detriment to overall 
performance, because that memory won't be available for disk caching etc.

> Now I think Dan wrote this, and perhaps it needs to be rethinked.

Maybe.  OTOH, I think it points out a flaw in the memory-grow logic, 
namely...

> I'm not sure on unix machines you can really ask what the 'safe' limit 
> it,

...exactly!  That was one of the points I tried to get across in my 
discussion with Andreas, although I think I didn't do a very good job of 
it.  What is safe is an extremely fluid concept.

At the very least, memory you use is not available to the system for 
other things such as file mapping.  Mach memory-maps all files, even the 
ones accessed via read()/write(), and will use all of memory as a disk 
cache.  It will keep files in memory even if they aren't currently being 
accessed.  So there is no such thing as zero cost.

Somewhat more costly (overall) is when you start displacing other 
programs' data or code.  At first the inactive parts, and as long as 
you're only displacing 'clean' pages such as code or read-only data, 
things are still fairly cheap.  If you displace 'dirty' data, things 
start getting expensive because you have to move stuff to disk.  It is 
still somewhat OK if that data is not part of the active set, because 
then you page it out to the swapfile and forget about it.

Displacing active memory starts getting nasty, because then you have to 
page it back in soon, with the worst being active read/write memory, 
because you have to both page it out and read it back in.
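
To make that ordering concrete, here is a rough sketch in Python.  It 
only counts the disk transfers implied by the description above; the 
function name and the one-transfer-per-page counting are my own 
simplification, not anything the kernel actually reports, and it ignores 
the lost disk-cache benefit entirely.

```python
def displacement_io(dirty: bool, active: bool) -> int:
    """Disk transfers caused by evicting one page (illustrative count only)."""
    writes = 1 if dirty else 0   # dirty pages must go to the swapfile first
    reads = 1 if active else 0   # active pages will be faulted back in soon
    return writes + reads


if __name__ == "__main__":
    # Print the four cases from cheapest to most expensive.
    cases = [(d, a) for d in (False, True) for a in (False, True)]
    for dirty, active in sorted(cases, key=lambda c: displacement_io(*c)):
        kind = ("dirty" if dirty else "clean") + "/" + ("active" if active else "inactive")
        print(kind, displacement_io(dirty, active))
```

Clean inactive pages come out free in this count, while active 
read/write pages cost both a write and a read.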

So I think John is right in saying that the term 'limit' is largely 
meaningless on such systems, at least for those limits that are readily 
available.  The lower limit of currently available free memory is 
meaningless because a good VM subsystem will keep this close to zero.  If 
Squeak is the only significant process running, (installed real 
memory - headroom) may have some meaning, though you wouldn't really 
want to grow that much gratuitously.  Available swap-space is also 
largely meaningless (20+ gig on my system) because a system like Squeak 
that actually walks through all of its memory every once in a while 
(full GC) will page-thrash way before that.

Meaningful parameters are probably best expressed in some kind of 
weighted pressure model.  A small pressure against Squeak keeps it from 
gratuitously filling memory.  If fullGCs become more frequent, the 
pressure exerted by Squeak to grow its memory space increases.  
Counter-pressure increases with the paging rate and as (real memory - 
headroom) is approached.  One potentially very elegant way of handling 
this sort of thing would be to have Squeak involved in managing its own 
memory via an external pager.  Let's hope this facility gets re-enabled 
at some point.
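
As a very rough sketch of what such a pressure model might look like 
(Python, just to show the shape of it): all the names, weights and units 
below are made up for illustration, and nothing like this exists in the 
Squeak VM as far as I know.  The fill/(1 - fill) term is simply one 
convenient way to make the counter-pressure climb steeply as the heap 
approaches (real memory - headroom).

```python
def should_grow(heap_mb, real_memory_mb, headroom_mb,
                full_gcs_per_min, pageins_per_sec,
                base=1.0, gc_weight=1.0, paging_weight=0.1, limit_weight=1.0):
    """Return True if the pressure to grow exceeds the counter-pressure.

    All weights are hypothetical tuning knobs, not measured values.
    """
    limit_mb = real_memory_mb - headroom_mb
    if limit_mb <= 0 or heap_mb >= limit_mb:
        return False  # already at or beyond (real memory - headroom)

    fill = heap_mb / limit_mb

    # Pressure to grow: rises as full GCs become more frequent.
    grow = gc_weight * full_gcs_per_min

    # Counter-pressure: a small constant, plus the paging rate,
    # plus a term that blows up as the heap nears the limit.
    counter = (base
               + paging_weight * pageins_per_sec
               + limit_weight * fill / (1.0 - fill))
    return grow > counter
```

With these made-up weights, a 64MB heap on a 512MB machine with 128MB 
headroom would grow under frequent full GCs and no paging, but not once 
the system starts paging heavily or the heap gets close to 384MB.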

>  after all the objective of VM operating systems is to allow you to run 
> applications that don't fit into real memory boundaries.

...while keeping the working set in real memory.  One problem with 
Squeak (and other GCed systems) is the full-scan that is performed by 
the copying/compacting fullGCs.  This means that when significant memory 
activity is happening, which is also the point at which the VM would be 
most useful, you defeat the VM by making all of your memory the working 
set, and what's worse, a read/write working set.  (OTOH, copying GC is 
probably good for incremental GC because it keeps the hot/active area 
constant and therefore inside the CPU caches).
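
A back-of-the-envelope sketch of why the full scan hurts (Python; the 
4K page size and the function name are assumptions for illustration): 
if a pass touches every heap page and only part of the heap can be 
resident, at least the difference has to be paged in on every full GC, 
even in the best case where all resident pages belong to the heap.

```python
PAGE_SIZE = 4096  # bytes; assumed, actual page size depends on the platform


def full_gc_min_pageins(heap_bytes, resident_bytes, page_size=PAGE_SIZE):
    """Lower bound on page-ins for one pass that touches every heap page."""
    heap_pages = -(-heap_bytes // page_size)      # round up to whole pages
    resident_pages = resident_bytes // page_size  # pages that can stay in RAM
    return max(0, heap_pages - resident_pages)


if __name__ == "__main__":
    MB = 1024 * 1024
    # A 400MB heap with room for only 256MB of it in real memory.
    print(full_gc_min_pageins(400 * MB, 256 * MB))
```

Since a copying/compacting pass also writes, many of those pages have 
to be paged out again as well, so the real cost is higher than this 
bound.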

I am pretty sure that a non-copying GC would significantly reduce the 
(real) memory requirements of Squeak on machines with decent VM 
systems.  Better yet would be one that can avoid scanning as well.  I 
think the Boehm collector has an option for using MMU hardware to avoid 
unnecessary scans.  With that, not only would stuff that's not currently 
used not occupy real memory, it would also just sit on disk without 
causing any paging or other activity.

> So I'll welcome a change set

;-)

Marcel