[Vm-dev] Trying to understand tenuring

Eliot Miranda eliot.miranda at gmail.com
Fri May 8 17:12:02 UTC 2015


Hi Norbert,

    RRDTool looks nice.  Is it available for Mac OS X?  Was there much
configuration?  Are you willing to make your set-up generally available?

On Fri, May 8, 2015 at 2:31 AM, Bert Freudenberg <bert at freudenbergs.de>
wrote:

>
> On 08.05.2015, at 08:43, Norbert Hartl <norbert at hartl.name> wrote:
> > I would assume that an object needs to stay connected 60 seconds until
> it gets tenured. But all of the objects in the request don't live that long.
>
> The object memory does not track how long an object was alive, or even how
> many incremental GCs it survived. Once the tenuring threshold is reached
> (after X allocations), all objects alive in new space at that point get
> tenured.
>
> This is how it worked before Spur, anyway. Not sure what the tenuring
> policy is in Spur?
>

There are two policies.  The main one is as described in

    An adaptive tenuring policy for generation scavengers
    David Ungar, Frank Jackson ParcPlace Systems
    ACM Transactions on Programming Languages and Systems
    Volume 14 Issue 1, Jan. 1992

This is very simple.  Based on how many objects survived the previous
scavenge (how full the current survivor space is), an "age" above which
objects will be tenured is determined.  If lots of objects (survivor space
>= 90% full) have survived the previous scavenge, a proportion of the
oldest objects in the survivor space will be tenured.  Because scavenging
uses a breadth-first traversal, the order of objects in the survivor and
eden spaces reflects their age.  The oldest are at the start of the spaces,
the youngest at the end.  Hence the age is simply a pointer into the
previous survivor space.

The proportion is read and written via vm parameter 6.  In good times (less
than 90% of the survivor space is full) the proportion is zero, so that
objects are only tenured if the survivor space overflows.  One can set the
size of new space (default 4MB) but the ratios of the spaces are fixed, 5/7
for eden, and 1/7 for each survivor space, as per David's original paper.

The second policy kicks in when the remembered set is very large.  When the
remembered set is greater than a limit dependent on the size of new space
(a 4MB default eden sets a limit of about 750 entries in the remembered
set), or when it is over 3/4 full (whichever is the larger), the scavenger
uses a policy that attempts to shrink the remembered set by half.  The
scavenger identifies those objects in new space that are referenced from
the remembered set using a 3-bit reference count.  It then chooses a
reference count that includes half of that population of new space objects,
and then tenures all objects with at least that reference count.

This policy finds those new space objects that are referenced from many
remembered set entries, and tenures those, hence statistically freeing
those remembered table entries that reference the most new space objects.
This policy may seem a little elaborate, but
- the naïve policy of merely tenuring everything when the remembered set is
full usually tenures lots of objects that themselves contain references to
new objects, which merely refills the remembered set with entries for those
fresh new objects, and so ends up tenuring lots of objects for no gain
- I invented this policy to fix GC behaviour in a real world network
monitoring application running on VisualWorks, so I know it works ;-), and
I designed the Spur object header format to make its implementation a
little simpler than VW's


Right now there /isn't/ a good policy for invoking the global mark-sweep
garbage collector, and its compaction algorithm is slow.  The system merely
remembers how many objects there are in old space, and does a full GC
whenever tenuring results in the number of live objects in old space growing
by 50%.  Of course, the image can decide to run the full GC, and does if a
new: fails (which schedules a scavenge) and fails again after the immediate
scavenge.  But we can do better.  Spur is young and there is lots of scope
for adding intelligent (but please, simpler than VisualWorks') memory
policy.

The thing that will really help is an incremental old space mark-sweep
collector.  I'm looking both at

    Very concurrent mark-&-sweep garbage collection without fine-grain
synchronization
    Lorenz Huelsbergen, Phil Winterbottom
    ISMM '98 Proceedings of the 1st international symposium on Memory
management
    Pages 166 - 175 (ACM SIGPLAN Notices,  Volume 34 Issue 3, March 1999)

and the four-colour incremental mark-sweep in the LuaJIT VM, see
http://wiki.luajit.org/New-Garbage-Collector#Quad-Color-Optimized-Incremental-Mark-&-Sweep
.

The incremental GC would likely run in increments after each scavenge, and,
if there was work to do, in the idle loop.  It could conceivably run in its
own thread, but there are good arguments against that (basically a good GC
costs very little, so making it concurrent doesn't gain much performance,
but introduces complexity).  However, I've not got many cycles to address
this and would love a collaborator who was motivated and knowledgeable to
have a go at either of these, preferably a combination of the two.
-- 
best,
Eliot