[Vm-dev] Re: [Pharo-dev] Instance mutation [Was threading in Pharo]

phil at highoctane.be phil at highoctane.be
Tue Mar 25 17:06:35 UTC 2014


Eliot,

On Tue, Mar 25, 2014 at 5:46 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:

> Hi Phil, Hi ClassBuilder people,
>
> On Mar 25, 2014, at 5:16 AM, "phil at highoctane.be" <phil at highoctane.be>
> wrote:
>
> On Tue, Mar 25, 2014 at 1:05 PM, Igor Stasenko <siguctua at gmail.com> wrote:
>
>>
>>
>>
>> On 24 March 2014 22:54, phil at highoctane.be <phil at highoctane.be> wrote:
>>
>>> On Mon, Mar 24, 2014 at 8:23 PM, Alexandre Bergel <
>>> alexandre.bergel at me.com> wrote:
>>>
>>>> >> I am working on a memory model for expandable collections in Pharo.
>>>> Currently, OrderedCollection, Dictionary and other expandable collections
>>>> use an internal array to store their data. My new collection library recycles
>>>> these arrays instead of letting the garbage collector dispose of them. I simply
>>>> insert an array into an ordered collection when it is no longer needed,
>>>> and I remove one when I need one.
>>>> >
>>>> > Hm, is that really going to be worth the trouble?
>>>>
>>>> This technique reduces memory consumption by about 15%.
>>>>
>>>> >> In the end, #add: and #remove: are performed on these pools of
>>>> arrays. I haven't been able to spot any problem regarding concurrency, and I
>>>> made no effort to prevent them. I have a simple global collection, and
>>>> each call site of "OrderedCollection new" can pick an element of my global
>>>> collection.
>>>> >>
>>>> >> I have the impression that I simply need to guard the access to the
>>>> global pool, which is basically guarding #add:, #remove: and #includes:
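>>>> >>
>>>> >> A minimal sketch of such a guarded pool (hypothetical class and selector
>>>> >> names, not the actual library code); the guarding is just a Semaphore
>>>> >> critical: around the pool operations:
>>>> >>
>>>> >> Object subclass: #ArrayPool
>>>> >>     instanceVariableNames: 'arrays guard'
>>>> >>     classVariableNames: ''
>>>> >>     category: 'CollectionExperiments'
>>>> >>
>>>> >> ArrayPool >> initialize
>>>> >>     arrays := OrderedCollection new.
>>>> >>     guard := Semaphore forMutualExclusion
>>>> >>
>>>> >> ArrayPool >> recycle: anArray
>>>> >>     "Return an array to the pool once it is no longer needed."
>>>> >>     guard critical: [ arrays add: anArray ]
>>>> >>
>>>> >> ArrayPool >> takeOfSize: anInteger
>>>> >>     "Answer a recycled array of the requested size, or a fresh one."
>>>> >>     ^ guard critical: [
>>>> >>         | found |
>>>> >>         found := arrays detect: [ :each | each size = anInteger ] ifNone: [ nil ].
>>>> >>         found
>>>> >>             ifNil: [ Array new: anInteger ]
>>>> >>             ifNotNil: [ arrays remove: found. found ] ]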
>>>> >
>>>> > One of the AtomicCollections might be the right thing for you?
>>>>
>>>> I will have a look at it.
>>>>
>>>> >> What is funny is that I did not care at all about multi-threading
>>>> and concurrency, and I have not spotted any problem so far.
>>>> >
>>>> > There isn't any 'multi-threading' like in Java; you get a much more
>>>> controlled version: cooperative within the same priority, preemptive between
>>>> priorities.
>>>> > So, I am not surprised. And, well, these operations are likely not to
>>>> be problematic when they are racy, except when the underlying data structure
>>>> could get into an inconsistent state itself. The overall operations
>>>> (adding/removing/searching) are racy at the application level anyway.
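>>>> >
>>>> > As a hypothetical sketch of that model (not code from this thread): two
>>>> > processes forked at the same priority run cooperatively, so neither can
>>>> > preempt the other in the middle of an #add:; only a higher-priority
>>>> > process can.
>>>> >
>>>> > | shared |
>>>> > shared := OrderedCollection new.
>>>> > [ 1000 timesRepeat: [ shared add: #a ] ] forkAt: Processor userBackgroundPriority.
>>>> > [ 1000 timesRepeat: [ shared add: #b ] ] forkAt: Processor userBackgroundPriority.
>>>> > "With nothing in either block that waits or yields, the first process
>>>> >  normally runs to completion before the second starts; interleaving comes
>>>> >  from higher-priority processes, not from the two peers preempting each other."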
>>>> >
>>>> > However, much more interesting would be to know: what kind of benefit
>>>> do you see from such reuse?
>>>> > And especially, with Spur around the corner, will it still pay off
>>>> then? Or is it an application-specific optimization?
>>>>
>>>> I am exploring a new design for the collection library of Pharo. Not all
>>>> of the (academic) ideas will be worth porting into mainstream Pharo,
>>>> but some of them will be.
>>>>
>>>> Thanks for all your help guys! You're great!
>>>>
>>>> Cheers,
>>>> Alexandre
>>>>
>>>> --
>>>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
>>>> Alexandre Bergel  http://www.bergel.eu
>>>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
>>>>
>>>>
>>>>
>>>>
>>> An interesting method I stumbled upon which may help in understanding
>>> how these things work:
>>>
>>> BlockClosure>>valueUnpreemptively
>>>     "Evaluate the receiver (block), without the possibility of preemption
>>>      by higher priority processes. Use this facility VERY sparingly!"
>>>     "Think about using Block>>valueUninterruptably first, and think about
>>>      using Semaphore>>critical: before that, and think about redesigning your
>>>      application even before that!
>>>      After you've done all that thinking, go right ahead and use it..."
>>>     | activeProcess oldPriority result semaphore |
>>>     activeProcess := Processor activeProcess.
>>>     oldPriority := activeProcess priority.
>>>     activeProcess priority: Processor highestPriority.
>>>     result := self ensure: [activeProcess priority: oldPriority].
>>>     "Yield after restoring the priority to give preempted processes a chance to run."
>>>     Processor yield.
>>>     ^ result
>>>
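>>> A minimal, purely hypothetical usage, just to show the shape of a call
>>> (the comments above make clear it should be a last resort):
>>>
>>> [ Smalltalk at: #SomeGlobal put: 42 ] valueUnpreemptively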
>>>
>> I would not recommend using this method for anything.
>> This method relies heavily on how the process scheduler works, and in case of
>> any changes, it may break everything.
>> For the sake of good programming, one should never assume there is a way to
>> "stop the world while I am busy doing something".
>>
>
> If you are reshaping the world, it makes sense. I was looking at how classes
> are migrated, which is how I found it,
> along with all of the new Pharo way of doing these things.
>
> Hey, it is becoming really cool down there. Martin and Camille have been
> hard at work. Kudos!
>
> migrateClasses: old to: new using: anInstanceModification
>     instanceModification := anInstanceModification.
>     old ifEmpty: [ ^ self ].
>     [
>         1 to: old size do: [ :index |
>             self updateClass: (old at: index) to: (new at: index) ].
>         old elementsForwardIdentityTo: new.
>         "Garbage collect away the zombie instances left behind in garbage
>          memory in #updateInstancesFrom:"
>         "If we don't clean up this garbage, a second update would revive them
>          with a wrong layout!"
>         "(newClass rather than oldClass, since they are now both newClass)"
>         Smalltalk garbageCollect.
>     ] valueUnpreemptively
>
>
>
> The global GC here is pretty unfortunate.  It is there because the VM used
> to leave old instances lying around.  It works like this:
>
>
Uh oh, yes, GCs are expensive things.
At the moment I am working with TimeSeries data, and that's a lot of
entries. And the entries are under development, so they are morphing a lot.
With a 64-bit VM, all hell will break loose, I think. Even on the Java VM, they
advise not to use too much memory with -Xmx for that reason. (Well, the new
GCs are multithreaded, as in:

*Which garbage collector should I use for very large 64-bit heaps?*

The major advantage of a 64-bit Java implementation is to be able to create
and use more Java objects.  It is great to be able to break these 2GB
limits.  Remember, however, that this additional heap must be garbage
collected at various points in your application's life span.   This
additional garbage collection can cause large pauses in your Java
application if you do not take this into consideration.   The Hotspot VM
has a number of garbage collection implementations which are targeted at
Java applications with large heaps.  We recommend enabling one of the
Parallel or Concurrent garbage collectors when running with very large
heaps.  These collectors attempt to minimize the overhead of collection
time by either collecting garbage concurrent with the execution of your
Java application or by utilizing multiple CPUs during collections to get the
job done faster.
For more information on these garbage collection modes and how to select
them, please refer to the Hotspot GC tuning guide, which can be found here:
Tuning Garbage Collection with the 5.0 Java Virtual Machine
<http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html>

in
http://www.oracle.com/technetwork/java/hotspotfaq-138619.html#64bit_performance

and in
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

A) The Parallel GC

With the -XX:+UseParallelOldGC option, the GC is both a multithreaded young
generation collector and multithreaded old generation collector. It is also
a multithreaded compacting collector. HotSpot does compaction only in the
old generation. Young generation in HotSpot is considered a copy collector;
therefore, there is no need for compaction.

Compacting describes the act of moving objects in a way that there are no
holes between objects. After a garbage collection sweep, there may be holes
left between live objects. Compacting moves objects so that there are no
remaining holes. It is possible for a garbage collector to be a
non-compacting collector. Therefore, the difference between a parallel
collector and a parallel compacting collector is that the latter compacts
the space after a garbage collection sweep; the former does not.

B) The Concurrent Mark Sweep (CMS) Collector

The Concurrent Mark Sweep (CMS) collector (also referred to as the
concurrent low pause collector) collects the tenured generation. It
attempts to minimize the pauses due to garbage collection by doing most of
the garbage collection work concurrently with the application threads.
Normally the concurrent low pause collector does not copy or compact the
live objects. A garbage collection is done without moving the live objects.
If fragmentation becomes a problem, allocate a larger heap.

Note: CMS collector on young generation uses the same algorithm as that of
the parallel collector.

Are you looking into such things? (Or maybe they are already in...)

Multicore may be put to good use there I guess.


> we want to reshape instances of class C, e.g. by adding an inst var, and so
>
> 1. create C', which is C plus an inst var
> 2. create a parallel set of instances of class C', one for each instance
> of class C
> 3. for each corresponding pair of instances copy state from the instance
> of C to the instance of C'
> 4. forward-become the instances of C to the instances of C' (now no
> references to the instances of C remain)
> 5. become C to C' (now C' is the new C)
>
> The bug is that the old instances of C are still in the heap.  Because of
> the become in step 5 they look like instances of the new C, but are the wrong
> size; they lack space for the new inst var.  They're not reachable (step 4
> replaced all references to them with references to the instances of C'), but
> they can be resurrected through allInstances (someInstance, nextInstance),
> which works not by following references from the roots (Smalltalk and the
> activeProcess) but by scanning objects in the heap.
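>
> Purely as an illustration of steps 1-5 (hypothetical names; the production
> path is the ClassBuilder code quoted above, and this assumes the new inst var
> is appended after the existing ones):
>
> | oldClass newClass oldOnes newOnes |
> oldClass := Smalltalk at: #C.
> newClass := Smalltalk at: #Cprime.            "1. C plus an extra inst var"
> oldOnes := oldClass allInstances.
> newOnes := oldOnes collect: [ :old |          "2. & 3. parallel, copied instances"
>     | new |
>     new := newClass basicNew.
>     1 to: oldClass instSize do: [ :i | new instVarAt: i put: (old instVarAt: i) ].
>     new ].
> oldOnes elementsForwardIdentityTo: newOnes.   "4. forward-become the instances"
> oldClass becomeForward: newClass              "5. C' now stands in for C"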
>
> However, this was "fixed" in
>
> Name: VMMaker.oscog-eem.254
>  Author: eem
> Time: 11 January 2013, 7:05:37.389 pm
> UUID: 74e6a299-691e-4f7d-986c-1a7d3d0ec02c
>  Ancestors: VMMaker.oscog-eem.253
>
> Fix becomeForward: so that objects whose references are deleted are
>  freed and can no longer be resurrected via allObjects or allInstances.
>
> The change is to free the objects replaced in a forwardBecome so they are
> no longer objects (effectively their class is null (not nil, but 0)).  So
> they can't be resurrected and hence the global GC is unnecessary.  The
> Newspeak folks, in particular Ryan Macnak, spotted this and encouraged me
> to make the change.  It of course speeds up instance mutation considerably.
>
> I say fixed because there was a bug tail:
>
> Name: VMMaker.oscog-eem.258
> Author: eem
>  Time: 18 January 2013, 11:01:23.072 am
> UUID: da1433f1-de50-475f-be33-f462b300a2ea
>  Ancestors: VMMaker.oscog-eem.257
>
> Fix becomeForward: when the rootTable overflows.  There were two
>  bugs here.  One is that initializeMemoryFirstFree: used to clear the
> needGCFlag so if the rootTable overflowed noteAsRoot:headerLoc:'s setting
> of the needGCFlag would be undone after the sweep.
>  The other is that rootTable overflow was indicated by
> rootTableCount >= RootTableSize which could be undone by
>  becomeForward: freeing roots which need to be removed from
> the rootTable.  At some point in becomeForward the rootTable would
>  fill but at a later point a root would be freed, causing the table to
> become not full.
>
> The fix is two fold.  1. Add an explicit rootTableOverflowed flag
> instead of relying on rootTableCount >= RootTableSize.
>  2. move the clearing of the needGCFlag to the GC routines.
> Remove unnecessary senders of needGCFlag: false, and remove
>  the accessor.
>
> Name: VMMaker.oscog-eem.255
> Author: eem
>  Time: 12 January 2013, 6:28:41.398 pm
> UUID: 51e53ec1-8caf-41f6-9293-1088ef4b82d8
>  Ancestors: VMMaker.oscog-eem.254
>
> [New[Co]]ObjectMemory:
> Fix freeing of objects for becomeForward:.  Remove freed young
> roots from the rootsTable.  Filter freed objects pointed to from the
> extraRootsTable (because these locations can change it is wrong
>  to remove entries from the extraRootsTable).
>
> But the bottom line is that, at least on the current Cog VM, that global
> GC is unnecessary.  David, Tim, this still needs to be folded into
> ObjectMemory in the standard interpreter. But doing so is very worthwhile.
>  Monticello loads are noticeably faster.
>

Current version of the VM (I don't know if VMMaker.oscog-eem.258 is in
there; Esteban? Igor?):

Smalltalk vm version 'NBCoInterpreter
NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 Mar 17 2014
NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 Mar 17 2014
https://github.com/pharo-project/pharo-vm.git Commit:
6e08ad296c0df6c1a4215a5dada5380c897dc2fe Date: 2014-03-17 14:45:12 +0100
By: Esteban Lorenzano <estebanlm at gmail.com> Jenkins build #14811



>
> KR
> Phil
>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko.
>>
>
>
> Eliot (phone)
>