[Vm-dev] pinning GC

Wed Jan 12 17:34:45 UTC 2011

On Wed, Jan 12, 2011 at 1:59 AM, Steve Rees <
squeak-vm-dev at vimes.worldonline.co.uk> wrote:

>
> Aren't named ivars only accessed by methods of the receiver?

 In classical Smalltalk, inst vars are notionally accessed directly only by
methods in the receiver's class or in superclasses that have an inst size > 0.
 However, become: can rebind the receiver so that an illegal direct inst var
access can be made.  e.g. create a ByteArray and become it to a Point;
sending x to the point accesses a potentially bogus pointer, the raw bits at
0 in the ByteArray.  So if there are extant activations whose receiver is
changed through become or changeClass one can observe strange effects and/or
crash the system.  The VM may go to some lengths to hide or mitigate these
effects.  In my BrouHaHa Smalltalk-80 VM the JIT (bytecode to threaded code)
did copy-down so that it could know that the class of self in a threaded
code method was constant and so sends to self (~ 40% of all sends) didn't
have to be checked.  But a become could change the class of self in an
activation and so become also had to scan all activations and rebind the
threaded code method in activations on the becommed objects so that self
sends remained correct.
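
A minimal sketch of the hazard, as a toy Python model rather than any real
VM's object layout.  The Obj class, the slot list and the function names are
all hypothetical; point_x stands in for a compiled Point>>x that fetches
slot 0 without checking the receiver's class.

```python
class Obj:
    """Toy heap object: a class tag plus raw slot words (hypothetical layout)."""
    def __init__(self, cls, slots):
        self.cls = cls
        self.slots = slots

def point_x(receiver):
    # Stands in for compiled Point>>x: a direct fetch of slot 0 with no
    # class check, since the compiler assumes self is a Point.
    return receiver.slots[0]

def become(a, b):
    # Two-way become modelled as an in-place swap of class and contents,
    # so every existing reference to a now sees what used to be b.
    a.cls, b.cls = b.cls, a.cls
    a.slots, b.slots = b.slots, a.slots

point = Obj("Point", [3, 4])
raw = Obj("ByteArray", [0xDEADBEEF])
self_in_activation = point           # an extant activation holds the Point as self
become(point, raw)
bogus = point_x(self_in_activation)  # reads the ByteArray's raw bits as if they were x
```

In a real VM those raw bits would be treated as an object pointer, which is
where the strange effects and crashes come from.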

> Or does Squeak use access sends similar to Strongtalk?
>

One can of course use accessors as a style, and easily modify the compiler
to access them this way, but the system (along with most other Smalltalks)
does provide direct access and it is used extensively.

>
> In the former case a check to the topmost frame of each Process' stack, and
> a check for corpsed objects on each method return should be enough, no?
>

Checking on return is very expensive.  See my paper on context management in
VisualWorks 5i.  Instead one could probably scan as part of become/change
class, but scan only activations in the stack zone and defer scanning
contexts in the heap until they were faulted into the stack zone.  Faulting
in is expensive anyway so adding a test for a corpse won't add much
overhead.
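
The split between eager and deferred scanning could look roughly like the
following toy Python model.  ToyVM, Activation, the forward field and
fault_in are hypothetical names chosen for illustration; this is a sketch of
the idea, not how any existing VM implements it.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.forward = None   # set when this object has become a corpse

def follow(obj):
    # Chase forwarding pointers until a live object is reached.
    while obj.forward is not None:
        obj = obj.forward
    return obj

class Activation:
    def __init__(self, receiver):
        self.receiver = receiver

class ToyVM:
    def __init__(self):
        self.stack_zone = []      # activations currently in the stack zone
        self.heap_contexts = []   # activations living as context objects in the heap

    def become_forward(self, old, new):
        old.forward = new                    # old is now a corpse pointing at new
        for act in self.stack_zone:          # eager scan covers the stack zone only
            act.receiver = follow(act.receiver)

    def fault_in(self, ctx):
        # Moving a heap context into the stack zone is already expensive,
        # so the corpse test is added here instead of on every return.
        self.heap_contexts.remove(ctx)
        ctx.receiver = follow(ctx.receiver)
        self.stack_zone.append(ctx)
        return ctx
```

A heap context keeps pointing at the corpse until it is faulted in, at which
point its receiver is fixed up before it can run.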

> In the latter case the class in the inline cache will differ, giving an
> opportunity in the lookup to fixup the original. One might also check the
> corpse bit in ivar accesses to allow an opportunity to replace corpsed
> references ahead of a full GC.
>

Again that kind of check isn't cheap.  Remember that the check must be made
on inst var reads as well as writes.  When I added immutability to
VisualWorks, which tests only writes, the total cost was about 3% to 5%,
which was more than acceptable for the benefit, and it was so low precisely
because an inst var write required a store check and so part of the
immutability test could be folded into the store check, bringing down its
overall cost.  But adding a similar check for corpses to inst var reads
would probably add costs above 10% and that's getting expensive.  Since inst
var access is common and become is relatively rare (we've got to be talking
millions to one in normal code, right?) it makes sense to me to put the cost
in become and rare operations such as faulting contexts into the stack zone.
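
To illustrate what folding part of the immutability test into the store
check can mean, here is a toy Python sketch.  The OLD and IMMUTABLE flag
bits, the remembered list and the function names are assumptions made for
the example and are not the VisualWorks implementation; a real store check
would also look at whether the stored value is young.

```python
OLD = 0x1         # hypothetical header bit: object lives in old space
IMMUTABLE = 0x2   # hypothetical header bit: object is read-only

class ImmutableWriteError(Exception):
    pass

class Obj:
    def __init__(self, flags, nslots):
        self.flags = flags
        self.slots = [None] * nslots

remembered = []   # old objects that have been stored into

def store(obj, index, value):
    # Fast path: a single mask test covers both the store check and the
    # immutability check, so ordinary writes to young mutable objects pay
    # for one test only.
    if obj.flags & (OLD | IMMUTABLE):
        # Slow path: work out which of the two conditions applied.
        if obj.flags & IMMUTABLE:
            raise ImmutableWriteError("attempt to modify a read-only object")
        if obj not in remembered:
            remembered.append(obj)
    obj.slots[index] = value

def fetch(obj, index):
    # Reads carry no check at all; a corpse test would have to be added here.
    return obj.slots[index]
```

A corpse check has no existing test to piggyback on in fetch, which is why
putting it on reads would cost so much more.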

> The only case where I think this might still be a problem is when Cog does
> method inlining and has inlined access sends, though even here it would
> presumably have a type test ahead of the inlined code which the corpse would
> fail because of the changed class, triggering an uncommon branch or falling
> back to a traditional non-inlined send (depending on the approach Cog uses - I
> haven't looked at the code). Both of which give an opportunity to handle the
> corpsed reference, so maybe there won't be a problem here either.
>

Right.  If one is doing adaptive optimization then become/change class has
to take care to preserve optimized code invariants and/or dynamically
deoptimize when it violates them.  But Cog doesn't do this /yet/ :)

>
> I think you can also use this for two-way become by cloning both objects,
> marking each as a corpse and having each corpse refer to the clone of the
> other.
>

Right, noting the caveats we've discussed here.  It's all just a small
matter of programming :)
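
As a rough sketch of the clone-and-corpse scheme for two-way become, in toy
Python (Obj, forward, follow and two_way_become are hypothetical names, and
the caveats about extant activations still apply):

```python
class Obj:
    def __init__(self, cls, slots):
        self.cls = cls
        self.slots = list(slots)
        self.forward = None   # set when this object has become a corpse

def follow(ref):
    # Chase forwarding pointers until a live object is reached.
    while ref.forward is not None:
        ref = ref.forward
    return ref

def clone(obj):
    return Obj(obj.cls, obj.slots)

def two_way_become(a, b):
    # Clone both objects, then turn each original into a corpse that
    # forwards to the clone of the other.
    a_clone, b_clone = clone(a), clone(b)
    a.forward = b_clone
    b.forward = a_clone
```

References to a now resolve to a copy of b and vice versa, and the corpses
can be reclaimed once the GC has updated all references.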

best
Eliot

>
> Regards, Steve
>
>
> On 12/01/2011 09:37, Josh Gargus wrote:
>
>>
>>
>> On Jan 11, 2011, at 11:48 PM, Igor Stasenko wrote:
>>
>>  Eliot, one thing about 'forwarded' objects, which you are calling the
>>> forwarding corpse, is that they can be used not only for pinning
>>> but also with the #becomeForward primitive, making it work a lot faster,
>>> since it does not require scanning the whole heap to update references,
>>> and the update can be done during GC.
>>> The only problem, as you pointed out, is the objects which don't have
>>> enough space for a forwarding pointer. But for this case, I think the
>>> primitive can fall back and use the old slow scheme.
>>>
>> I was thinking the same thing.  But wouldn't there remain the problem that
>> Eliot mentioned about objects with named variables?
>>
>> Cheers,
>> Josh
>>
>>
>>
>>
> --
> You can follow me on twitter at http://twitter.com/smalltalkhacker
>
>