[V4][VM] My personal ideas list for the V4 VM.

Scott A Crosby crosby at qwes.math.cmu.edu
Mon Feb 18 15:19:45 UTC 2002


Here's my personal todo/idea list of things to do, or at least consider,
for the V4 VM, along with the estimated or measured gain from each. (Gains
are as judged by profiling, or guessed if no concrete profiling is
available. References available upon request.)

----

VM Performance TODO's

---

Ideas for more performance in the VM (for V4)

--- IMPLEMENTED, NEED INTEGRATION ---------------
30% New methodcache. (Crosby)
 3% Root table overflow. (Raab/Crosby)
? % BC image (with its faster frame creation) ~10-40% (?) (Hannan)
 5% Have primitiveResponse not check the clock 40k/sec. (Either Raab's
    patch, or a reimplementation/variant of it.)
? % Larger GC parameters [*]  ~1-5% (?) (Crosby)
? % Larger or redone #at cache (which pays off more assuming larger GC
    parameters).

---------- SIMPLE UNIMPLEMENTED IDEAS/TODOS -------------
 2% Cost of decoding compact classes array during method dispatch.
 2% Cost of decoding header to get header-length during GC.
? % Simmon's suggestion of a separate thread for timing/interrupts.
? % ITIMER under UNIX (?)
? % Adaptive GC parameter configuration and sizing.


For the first two of these, we seem to be taking a hit with branch
prediction. Simplifying the header to something like what was suggested
in the following message should help:

  Date: Sun, 3 Feb 2002 03:55:25 -0500 (EST)
  From: Scott A Crosby <crosby at qwes.math.cmu.edu>
  Subject: Re: Image format proposals...
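
To illustrate the branch-prediction point for the header-length item, here
is a minimal C sketch. The header layout is an assumption made purely for
illustration (a 2-bit type field in the low bits selecting a 3-, 2- or
1-word header), not necessarily the real image format, and all names are
hypothetical. It contrasts a branchy decode with a table lookup indexed by
the type bits.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: the low 2 bits of the first header word encode the header type. */
enum { HEADER_TYPE_MASK = 0x3 };

/* Branchy decode: each object visited costs a data-dependent branch. */
static int header_words_branchy(uint32_t header)
{
    switch (header & HEADER_TYPE_MASK) {
    case 0:  return 3;   /* hypothetical: size word + class word + base word */
    case 1:  return 2;   /* hypothetical: class word + base word */
    default: return 1;   /* hypothetical: base word only */
    }
}

/* Table decode: one masked load, no data-dependent branch in the source. */
static const uint8_t HEADER_WORDS[4] = { 3, 2, 1, 1 };

static int header_words_table(uint32_t header)
{
    return HEADER_WORDS[header & HEADER_TYPE_MASK];
}

/* Sanity check that both decoders agree for every type value. */
static void check_header_words_agree(void)
{
    for (uint32_t h = 0; h < 16; h++)
        assert(header_words_branchy(h) == header_words_table(h));
}
```

Whether the table form actually wins would need profiling; a compiler may
already turn the switch into a lookup, and a simpler header format could
remove the decode entirely.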


--

[*] Alter GC parameters: make incrGC's happen every 40k allocs (which leads
to latencies of <10ms on a P2-450). This means flushing the method & at
caches 1/10 as often. Since they're flushed less often, we can make them
larger without increasing the per-allocation time spent in cache flushing
(and, as a bonus, get higher hit rates). We could even make the parameter
400k allocs/GC.
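
The amortization argument above can be sketched in C as follows. This is a
rough model, assuming the whole cache is cleared once per incremental GC, so
the flush cost per allocation is simply entries divided by allocations
between GCs. The function name and the cache sizes used below are
hypothetical, and the 4k baseline is only inferred from the "1/10 as often"
remark.

```c
#include <assert.h>

/* Cache entries cleared per allocation, assuming one full flush per GC. */
static double flush_cost_per_alloc(unsigned cache_entries, unsigned allocs_per_gc)
{
    assert(allocs_per_gc > 0);
    return (double)cache_entries / (double)allocs_per_gc;
}
```

Under this model, a hypothetical 512-entry cache flushed every 4k allocs
costs the same per allocation as a 5120-entry cache flushed every 40k
allocs, which is why the caches could grow about tenfold for free.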

-----------  BIG UNIMPLEMENTED IDEAS ---------------------

? % Better/newer GC. ~1-15% (?)

Copy-collector for the oldest generation? Use the train incremental
algorithm for the tenured oldspace. Slower, but no long pauses for the
fullGC.  http://www.daimi.aau.dk/~beta/Papers/Train/train.html


-----------------------------------

Scott





