do I have to garbageCollect every time I create a large object?

Fri Aug 10 00:41:13 UTC 2001

Another trick, described in the OOPSLA '86 Tektronix Smalltalk 
paper, is to allow the indexable part of an object to be physically remote 
from the main part.  Then a large binary object can consist of an object 
header that is allocated in a normal generation space and a data portion 
that is allocated in a separate area.  The header gets flipped and copied 
up the generation hierarchy while the big hunk just sits where it was 
allocated. It turns out that remote indexable parts like this are fairly 
easy to implement in a Smalltalk VM because the only way to access the 
indexable part is via the at: and at:put: primitives (and a few other 
primitives), so these are the only places that have to know about the 
remote part.
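
The header/data split described above can be sketched roughly as follows. This is only an illustration in Python, not the Tektronix implementation: the names LargeObjectArea, RemoteByteArray, at, at_put and copy_to_older_generation are all hypothetical, and the "generational copy" is simulated by duplicating the small header while the data block stays put.

```python
class LargeObjectArea:
    """Hypothetical separate area for the data portions; blocks never move."""

    def __init__(self):
        self._blocks = {}
        self._next_id = 0

    def allocate(self, size):
        block_id = self._next_id
        self._next_id += 1
        self._blocks[block_id] = bytearray(size)
        return block_id

    def block(self, block_id):
        return self._blocks[block_id]


class RemoteByteArray:
    """Small header living in a normal generation; the bytes live remotely."""

    def __init__(self, area, size):
        self.area = area
        self.size = size
        self.block_id = area.allocate(size)

    # Only the accessors below know that the indexable part is remote.
    def at(self, index):
        self._check(index)
        return self.area.block(self.block_id)[index - 1]  # 1-based, as in Smalltalk

    def at_put(self, index, value):
        self._check(index)
        self.area.block(self.block_id)[index - 1] = value
        return value

    def _check(self, index):
        if not 1 <= index <= self.size:
            raise IndexError(index)


def copy_to_older_generation(header):
    """Simulated generational copy: duplicate the header, leave the data alone."""
    moved = RemoteByteArray.__new__(RemoteByteArray)
    moved.area = header.area
    moved.size = header.size
    moved.block_id = header.block_id
    return moved


area = LargeObjectArea()
frame = RemoteByteArray(area, 1_000_000)
frame.at_put(1, 255)
promoted = copy_to_older_generation(frame)  # copies a few fields, not 1 MB
```

After the simulated copy, promoted is a different header object but still reads the same underlying block, which is the point of the trick.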

As an aside, the design heuristics of modern garbage collectors just don't 
work very well with highly dynamic, very large objects. I spent considerable 
time this week working on this very problem in the context of the garbage 
collector that is part of our JOVE Java Compiler. A customer had a video 
capture application that (from the perspective of the garbage collector) 
was nothing but a tight loop that allocated and discarded very large 
objects.  When should it collect?  For each large object allocation?  That 
has a very high computational overhead.  If not, how much space is the user 
willing to use to buffer large garbage: 100MB? 200MB?
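
To make the trade-off concrete, here is a small simulation of an allocate-and-discard loop under a simple "collect when the buffer budget would be exceeded" policy. It is only a sketch under assumed numbers: the simulate function, the 10MB frame size and the 1000-frame run are hypothetical and are not taken from the JOVE collector or the customer's application.

```python
MB = 1024 * 1024


def simulate(frame_bytes, frames, budget_bytes):
    """Return (collections, peak_uncollected_bytes).

    Models a loop that allocates and immediately discards `frames` objects of
    `frame_bytes` each. A collection runs whenever the next allocation would
    push the uncollected large garbage past `budget_bytes`.
    """
    collections = 0
    pending = 0  # bytes of large objects allocated since the last collection
    peak = 0
    for _ in range(frames):
        if pending > 0 and pending + frame_bytes > budget_bytes:
            collections += 1
            pending = 0
        pending += frame_bytes
        peak = max(peak, pending)
    return collections, peak


# Collect for (almost) every allocation: minimal space, many collections.
every_time = simulate(10 * MB, 1000, 10 * MB)

# Buffer up to 200MB of garbage: far fewer collections, far more space.
buffered = simulate(10 * MB, 1000, 200 * MB)
```

With these assumed numbers the first policy collects 999 times while never holding more than 10MB, and the second collects 49 times while holding up to 200MB.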

The "theoretical" conclusion I came to is that large binary objects like 
this should be managed with reference counts (binary object = no 
circularities).  The only tricky part is figuring out which slots will 
reference such objects and hence need reference counting logic.
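
A minimal sketch of that idea, assuming we already know which slots can hold large binary objects: the CountedBlob and Slot classes and the on_free callback are hypothetical names for illustration, and Python's own memory management is ignored so that the counting logic is visible.

```python
class CountedBlob:
    """Large binary object with an explicit reference count."""

    def __init__(self, size, on_free):
        self.data = bytearray(size)
        self.refcount = 0
        self._on_free = on_free

    def retain(self):
        self.refcount += 1

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.data = None      # storage can be reclaimed immediately
            self._on_free(self)


class Slot:
    """A slot known to reference large binary objects, so stores are counted."""

    def __init__(self):
        self._value = None

    def store(self, blob):
        if blob is not None:
            blob.retain()         # retain first so storing the same blob is safe
        old, self._value = self._value, blob
        if old is not None:
            old.release()

    def load(self):
        return self._value


freed = []
slot = Slot()
for _ in range(3):                # tight loop that replaces one large buffer
    slot.store(CountedBlob(1024, freed.append))
freed_during_loop = len(freed)    # each store frees the previous buffer
slot.store(None)
```

Because the blobs hold only bytes they cannot point back at anything, so the count reaching zero is enough to free them without waiting for a tracing collection.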

Allen_Wirfs-Brock at instantiations.com

At 10:27 AM 8/9/2001 -0700, Tim Rowledge wrote:
>There are plenty of ways to arrange memory and garbage collection to
>alter the effect of large objects; obviously they will have an effect on
>other parts of the system. At the most trivial level, if you implement
>a separate space for large objects (a relatively common case in most of
>the commercial Smalltalks) you have a more complex problem when tracking
>inter-space pointers. You also have more of a problem when growing your
>spaces, working out how much is free (if you have 1Mb in space A, 400kb
>in space B and 25kb in space C, does that mean you can allocate a
>1.425Mb object? Doubt it...) and so on.
>
>My experience is that it can often be worth the trouble of having a
>space exclusively for large non-pointer objects such as big bitmaps etc.
>You don't have to scan these for tracing purposes in gcs and that makes
>life much simpler.
>
>Of course, another approach is to avoid allocating huge monolithic
>objects. If you look at your needs it's going to be pretty rare that you
>actually need one.
>
>tim
>--
>Tim Rowledge, tim at sumeru.stanford.edu, http://sumeru.stanford.edu/tim
>The whole is the sum of its parts, plus one or more bugs