[Vm-dev] [Pharo-dev] shallowCopy problem on 64 bit Pharo ?

Clément Bera bera.clement at gmail.com
Wed Mar 1 11:00:00 UTC 2017


Hi,

I tried to analyse the problem and I think I found the cause and a
potential solution.

I have just tried Ben's script on the latest 32-bit VM (Squeak.cog.spur);
the results are in [1] at the bottom of the mail. I modified the script
to print the error codes. The ratios are a bit different from 64 bits, but
the same pattern is present. The primitive fails once every 40-60
allocations in 32 bits instead of every 10-15 allocations in 64 bits, and
after roughly every 15 failures the allocations succeed for a longer
stretch before failing again. The primitive always fails with
'insufficient object memory'.

The allocation strategy is different for objects whose size cannot be
encoded in 16 bits (in our case, arrays with more than 65535 fields). Large
objects are allocated directly in old space. The failures in shallowCopy
happen in this case. I believe the case where many large objects are
allocated in a row is not really optimised because it is supposed to be
uncommon. If it's common in someone's use case, I am pretty sure we can do
something about it.

Because memory is measured in bytes and array fields are twice as large in
64 bits, I would expect the failures to be twice as frequent in 64 bits as
in 32 bits. They seem to be 4 times more frequent, but different people did
the 64-bit measurements on different machines, so other side effects may
need to be considered.
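
The expectation above can be checked with a little arithmetic in a
playground. This is only a sketch: the field count of 65536 is an
illustrative choice (the first size past the 16-bit limit), object headers
are ignored, and the failure intervals are the ones quoted earlier in this
mail.

```smalltalk
| fields bytes32 bytes64 |
fields := 65536. "illustrative: first field count past the 16-bit limit"
bytes32 := fields * 4. "262144 bytes of fields with 4-byte slots"
bytes64 := fields * 8. "524288 bytes of fields with 8-byte slots"
bytes64 / bytes32. "2, the ratio expected from slot size alone"
40 / 10. "4, observed ratio at the low end of the intervals"
60 / 15. "4, observed ratio at the high end of the intervals"
```

So slot size alone accounts for a factor of 2 out of the observed factor
of about 4.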

One solution I see is the following (Pharo version; in Squeak, use
vmParameterAt: directly):

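"25: memory headroom when growing object memory.
 24: memory threshold above which object memory is shrunk."
coef := 2.
Smalltalk vm parameterAt: 25 put: (Smalltalk vm parameterAt: 25) * coef.
Smalltalk vm parameterAt: 24 put: (Smalltalk vm parameterAt: 24) * coef.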

Basically, I change the old space heuristics to allocate bigger segments
and not to shrink too aggressively.
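
If the bigger segments are only wanted around one allocation-heavy phase,
the same two calls can be wrapped so that the previous values are restored
afterwards. A minimal sketch, assuming Pharo; the variable names and the
placeholder comment for the workload are hypothetical, and only
parameterAt: and parameterAt:put: come from the snippet above.

```smalltalk
| coef oldGrowHeadroom oldShrinkThreshold |
coef := 2. "hypothetical value, tune for the workload"
"Remember the current settings so they can be restored."
oldGrowHeadroom := Smalltalk vm parameterAt: 25.
oldShrinkThreshold := Smalltalk vm parameterAt: 24.
[
	Smalltalk vm parameterAt: 25 put: oldGrowHeadroom * coef.
	Smalltalk vm parameterAt: 24 put: oldShrinkThreshold * coef.
	"... run the code that allocates many large objects here ..."
] ensure: [
	"Put the original heuristics back, even if the workload fails."
	Smalltalk vm parameterAt: 25 put: oldGrowHeadroom.
	Smalltalk vm parameterAt: 24 put: oldShrinkThreshold ]
```

Whether restoring is worth it depends on the application; an image that
allocates large objects all the time can simply keep the scaled values.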

With a coef of 2, I see the primitive failing once every 58-87 allocations
instead of once every 40-60 allocations.
With a coef of 10, I see the primitive failing once every 350-700
allocations. The results for coef 10 are in [2] at the bottom of the mail.

Obviously with these settings the image uses a bit more RAM, but I
guess that in Ciprian's use case, where images are 6.8 GB large, wasting
a dozen extra MB does not really matter.

Coef 2 may lead to a waste of ~15 MB.
Coef 10 may lead to a waste of ~150 MB.
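
To get a feeling for these numbers on a given image, one can print how
much the growth headroom (parameter 25) would increase for each coef
before changing anything. This is a rough estimate under the assumption
that the extra RAM is of the same order as the added headroom; the actual
waste depends on the allocation pattern.

```smalltalk
| headroom |
"Read the current headroom; run this before scaling the parameters."
headroom := Smalltalk vm parameterAt: 25.
#(2 10) do: [ :coef |
	Transcript
		show: 'coef ', coef printString, ': about ',
			(headroom * (coef - 1) // (1024 * 1024)) printString,
			' MB of extra headroom';
		cr ]
```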

I don't think there is a generic magic solution for 64 bits. We could
consider making segments twice as large by default in 64 bits. I don't
know if that makes sense.

I have on my TODO list to build a GC object for Pharo (normally
Squeak-compatible) to provide convenient APIs and documentation on how to
adapt the GC policy in Spur for both growing and large heaps. Hopefully I
will do that around June.

[1]
65631 65631 #'insufficient object memory'
65689 58 #'insufficient object memory'
65747 58 #'insufficient object memory'
...
65979 58 #'insufficient object memory'
66616 637 #'insufficient object memory'
66673 57 #'insufficient object memory'
66730 57 #'insufficient object memory'
...
67243 57 #'insufficient object memory'
67698 455 #'insufficient object memory'
67754 56 #'insufficient object memory'
67810 56 #'insufficient object memory'
...
68538 56 #'insufficient object memory'
68817 279 #'insufficient object memory'
68872 55 #'insufficient object memory'
...
99860 38 #'insufficient object memory'

[2]
66720 66720 #'insufficient object memory'
68303 1583 #'insufficient object memory'
69850 1547 #'insufficient object memory'
70231 381 #'insufficient object memory'
70610 379 #'insufficient object memory'
71363 753 #'insufficient object memory'
72107 744 #'insufficient object memory'
72844 737 #'insufficient object memory'
73574 730 #'insufficient object memory'
74296 722 #'insufficient object memory'
74654 358 #'insufficient object memory'
75011 357 #'insufficient object memory'
75719 708 #'insufficient object memory'
76071 352 #'insufficient object memory'
...
98404 816 #'insufficient object memory'
98945 541 #'insufficient object memory'
99214 269 #'insufficient object memory'


On Tue, Feb 7, 2017 at 7:58 PM, Levente Uzonyi <leves at caesar.elte.hu> wrote:

>
> What's the error code when the primitive fails?
>
> Levente
>