[Vm-dev] Cog VM Crash on Windows

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Mar 13 23:51:14 UTC 2013


2013/3/14 Eliot Miranda <eliot.miranda at gmail.com>:
>
>
>
> On Sun, Mar 10, 2013 at 9:53 AM, Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:
>>
>>
>> I'd like to understand this.
>
>
> Me too.
>
>>
>>
>> MC mecahnism should invoke primitive 116 (primitiveFlushCacheByMethod,
>> CompiledMethod>>flushCache) on OLD CompiledMethod when recompiled in
>> newClass.
>> This will invoke
>>
>> cogit unlinkSendsTo: self stackTop andFreeIf: false
>>
>> where stackTop is the old CM.
>
>
> Can we step back and move to the abstract?  What the Cog VM does in response to aMethod primitiveFlushCacheByMethod is remove all cache entries pointing to aMethod.  That's both in the first-level method lookup cache, and in any inline caches that point to aMethod.  What it won't (and can't do) is alter any existing activations of aMethod.
>

Right, I understand that, but to me, it's not the case I saw.

>>
>> But for some reason - ClassBuilder is relatively complex - the
>> recompilation is invoked with an empty MethodDictionary, so this
>> primitive is not invoked.
>>
>> primitive 119 (primitiveFlushCacheSelective, Symbol>>flushCache) is
>> invoked then, but will apparently not cogit unlinkAnything.
>
>
> But in response to aSelector primitiveFlushCacheSelective Cog will remove all cache entries using aSelector  That's both in the first-level method lookup cache, and in any inline caches that point to a method whose selector is aSelector.   What it won't (and can't do) is alter any existing activations of methods whose selectors are aSelector.
>

Sorry, my bad, I totally missinterpreted  the action of COG just
because primitive 119 is associated to a different selector in
Interpreter and StackInterpreter.

>>
>> The super hammer primitive 89 (primitiveFlushCache,
>> Behavior>>flushCache) would unlink thru flushMethodCache which:
>>
>> cogit unlinkAllSends
>>
>> But, primitive 89 is not invoked in MC createClass process.
>>
>> If I add Object flushCache at the end of ClassBuilder>>mutate:to:, I
>> can download various versions of Compiler without trouble.
>> Is my cache analysis correct and could it be the origin of the problem?
>
>
> We still need to understand what the bug is.  What we saw was that an activation of a Parser method that was compiled for Parser before it was shape-changed was still on the stack after Parser had been shape-changed.  So the old method ended up reading off the end of the Parser instance.  So the first question is
>
> - is the method on the stack there before shape-change or not?
>
> If it is, then the bug is not with the VM but in the system keeping that old method in use whole it shape-changes Parser.
>
> If it is not, then it would seem to be a VM bug or a cache-flushing bug and the question is how did the old method get activated?
>
> The second question would be simple to answer if it was the case that the method's selector didn't get flushed.  But you're saying the class builder did flush the method's selector.
>
> (sorry I haven't had enough time to focus on this)
>

My analysis is this one:
- the obsolete compiled method was NOT on the stack before invocation
- so it was incorrectly invoked by an invalid cache

So I first thought it was a COG bug, but I changed my mind.
It is just that the compiledMethod are re-added to the cache during compilation:
The scenario is quite simple:
- (NewParser>>selectorA) is compiled which triggers
(OldParser>>selectorA) flushCache
- (NewParser>>selectorB) is compiled, bet this uses
(OldParser>>selectorA) which is added to cache again !

Once compilation is finished, ClassBuilder mutates OldParser
becomeForward: NewParser.
Unfortunately, at next compilation the obsolete cache will let us
invoke  (OldParser>>selectorA), BING !

My fix consists in flushing the cache again just before mutation.
But I used primitive 116 only, not primitive 119, so I'm not
absolutely sure it covers all cases, thus my late question.

Nicolas

>>
>> Nicolas
>>
>> 2013/3/8 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> > 2013/3/8 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> >> To illustrate it, see the attached stack trace when I attempt to load
>> >> an OrderedCollection(a MCVersion(Compiler-eem.252)) from an up to date
>> >> trunk image:
>> >>
>> >
>> > Grrr, sorry, I tried so many combinations...
>> > This was not an up-to-date Squeak image, I at least reverted my last
>> > version of Metacello in order to trigger this simple bad behaviour.
>> > So the recipe would be to load first Monticello-fbs.532 then
>> > Compiler-eem.252 from an up-to-date trunk image with a COG VM (no
>> > crash this way back, beacuse we add slots) and attempt the same with
>> > an Interpreter VM (no problem at all).
>> >
>> > Nicolas
>> >
>> >> No obsolete Parser/Compiler compiledMethod sender on the call stack,
>> >> but an obsolete Parser compiledMethod disagreeing on cue position,
>> >> sent AFTER the recompilation of Parser, which is not expected.
>> >>
>> >> No such Behavior with interpreter VM, for me it is clearly a COG VM
>> >> bug, some cache is not properly flushed or something...
>> >>
>> >> Nicolas
>> >>
>> >> 2013/3/8 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> >>> 2013/3/8 Jeff Gonis <jeff.gonis at gmail.com>:
>> >>>>
>> >>>> On Fri, Mar 8, 2013 at 10:49 AM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>> >>>>>
>> >>>>> This is known.  It is a bug in the update. not the VM.  The update
>> >>>>> causes the VM to read off the end of a Parser instance.  It would
>> >>>>> crash the Interpreter VM too.
>> >>>>>
>> >>>>> best,
>> >>>>> Eliot
>> >>>>
>> >>>
>> >>> Hmm, looking at it again, I'm not convinced.
>> >>> First, I tried several times from an Interpreter Vm and that never
>> >>> triggered any bug.
>> >>> Then I don't see how the update could fail.
>> >>>
>> >>> In MCClassDefinition>>createClass we use
>> >>> class := (ClassBuilder new)
>> >>>                         name: name
>> >>>                         inEnvironment: superClass environment
>> >>>                         subclassOf: superClass
>> >>>                         type: type
>> >>>                         instanceVariableNames: self instanceVariablesString
>> >>>                         classVariableNames: self classVariablesString
>> >>>                         poolDictionaries: self sharedPoolsString
>> >>>                         category: category.
>> >>>
>> >>> This is a quite robust process, because it creates a newClass clone
>> >>> first, compile all the methods from oldClass into newClass, and then
>> >>> mutate allInstances of oldClass into instances of newClass.
>> >>>
>> >>> What can fail is having an obsolete CompiledMethod on stack with wrong
>> >>> inst var slots.
>> >>>
>> >>> What I observed with COG VM is different.
>> >>> I change Parser with above ClassBuilder snippet.
>> >>> Then in previous version of createClass, this snippet fails in
>> >>> (Compiler evaluate: ...) part :
>> >>>
>> >>>         composition := Compiler evaluate: (self traitComposition ifNil:['{}']).
>> >>>         (composition isCollection and:[composition isEmpty and:[class
>> >>> traitComposition isEmpty]]) ifFalse:[
>> >>>                 class setTraitComposition: composition asTraitComposition.
>> >>>         ].
>> >>>
>> >>> with an obsolete Parser method...
>> >>> It's like COG VM is still using an old Parser methodDictionary, while
>> >>> we just changed it...
>> >>>
>> >>> Nicolas
>> >>>
>> >>>> Ok thanks, I wasn't sure if this was the same bug and I figured it was
>> >>>> safer to send in a report than not.  Didn't mean to spam the list.
>> >>>>
>> >>>> Thanks for your time,
>> >>>> Jeff
>
>
>
>
> --
> best,
> Eliot
>


More information about the Vm-dev mailing list