[Vm-dev] Re: Reproducible VM crash when loading FFI

Tue Mar 26 16:33:48 UTC 2013

Ok, as a summary. I've tried

1) reusing the existing compact classes array (not recreating it), without
voiding the vm caches: works ok.

2) making another compact classes array and voiding the vm caches
(primitive 214): random crashes :/. Sometimes it works, sometimes it does
not...
Just for the record, I've tried voiding the cache before, after, and both
before and after the becomeForward: of the special objects array:

a)
| newSpecialObjectsArray |
self vm voidCogVMState.
newSpecialObjectsArray := self newSpecialObjectsArray.
self specialObjectsArray becomeForward: newSpecialObjectsArray.

b)
| newSpecialObjectsArray |
newSpecialObjectsArray := self newSpecialObjectsArray.
self specialObjectsArray becomeForward: newSpecialObjectsArray.
self vm voidCogVMState.

c)
| newSpecialObjectsArray |
self vm voidCogVMState.
newSpecialObjectsArray := self newSpecialObjectsArray.
self specialObjectsArray becomeForward: newSpecialObjectsArray.
self vm voidCogVMState.

All three give different results over several runs... :/

voidCogVMState is the following in Pharo:

VirtualMachine class>>voidCogVMState
"Void any internal caches the VM maintains other than the method lookup
caches.
 These comprise
- the stack zone, where method activations are stored, and
- the machine code zone, where the machine code form of CompiledMethods is
held."
<primitive: 214>
^self primitiveFailed

The comment says it does not clean method lookup caches...
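
Since primitive 214 leaves the method lookup caches alone, one further
variant would be to flush those as well. This is a minimal, untested sketch:
it assumes Behavior>>flushCache (primitive 89) is present in the image,
variant d) is hypothetical and not one of the runs reported above, and
nothing here shows that it avoids the crash.

```smalltalk
"d) hypothetical variant: void the Cog state, then also flush the
 method lookup cache. Assumes Behavior>>flushCache (primitive 89) exists
 in this image; untested."
| newSpecialObjectsArray |
newSpecialObjectsArray := self newSpecialObjectsArray.
self specialObjectsArray becomeForward: newSpecialObjectsArray.
self vm voidCogVMState.
Object flushCache.
```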

Any ideas?
Guille

On Tue, Mar 26, 2013 at 11:48 AM, Guillermo Polito <
guillermopolito at gmail.com> wrote:

> Ok, first I'm sorry for the evil upload. That was the first thing at hand
> :$. I'll find a more trustworthy way to do it next time...
> Second, thanks for taking the time to look at it :).
>
> Now, I've some questions:
>
> - this "bug" is particular to Cog, isn't it? I mean, it is only related to
> inlining with the JIT, so a Stack vm is not subject to these rules, right?
> - Why is voiding the cache hackish? I mean, replacing the special objects
> array is a very low level operation, and a very special one which is not
> commonly performed. Voiding the cache looks normal to me in such a
> case...
>
> Tx!
> Guille
>
> On Mon, Mar 25, 2013 at 11:56 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
>>
>>
>>
>> On Mon, Mar 25, 2013 at 3:22 PM, Camillo Bruni <camillobruni at gmail.com> wrote:
>>
>>>
>>> Thanks for the findings!
>>>
>>
>> you're welcome :)
>>
>>
>>>
>>> I opened an issue http://bugs.pharo.org/issues/id/10134 with the contents
>>> of your solution.
>>>
>>> I think for completeness we should implement both solutions, and of course
>>> use the one that does not #becomeForward: in the standard case.
>>>
>>
>> To be clear, becomeForward: is the correct way to install the
>> specialObjectsArray.  The issue is what entries in the specialObjectsArray
>> need to remain constant.  The Character table, the compactClassesArray, all
>> the classes and a few other entries (semaphores, etc) must remain the same.
>>  This could do with better documenting of the method :-/
>>
>>
>>
>>>
>>> On 2013-03-25, at 19:52, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>> > Hi Guille,
>>> >
>>> >    thanks for this.  The problem is that Pharo's
>>> > recreateSpecialObjectsArray uses newCompactClassesArray to recreate the
>>> > compact classes array and on Cog that is not supported.  Cog caches the
>>> > compactClassesArray (specialObjectsArray at: 29) in every jitted method
>>> > prolog to reduce the time taken to derive the receiver's class in message
>>> > lookup.  The Pharo 2.0 code for recreating the specialObjectsArray can
>>> > therefore create a dangling pointer where the only reference to the old
>>> > compactClassesArray is in machine code, and the machine code GC doesn't
>>> > cope with there being roots in machine code.  Note that Cog also caches the
>>> > characterTable and class SmallInteger in machine code.
>>> >
>>> > There are two solutions to this.
>>> > One would be to reuse the compactClassesArray (my recommendation), so that
>>> > recreateSpecialObjectsArray looks like
>>> >
>>> > ...
>>> > "An array of the 255 Characters in ascii order.
>>> > Cog inlines table into machine code at: prim so do not regenerate it."
>>> > newArray at: 25 put: Character characterTable.
>>> > ...
>>> > "A 32-element array with up to 32 classes that have compact instances.
>>> > Cog inlines table into machine code class lookup so do not regenerate it."
>>> > newArray at: 29 put: self compactClassesArray.
>>> > ...
>>> >
>>> > where compactClassesArray, like Character characterTable, answers the
>>> > existing object.
>>> >
>>> > The second solution (rather hackish) is to void all machine code on
>>> > installing the new specialObjectsArray.  e.g.
>>> > recreateSpecialObjectsArray
>>> > "Smalltalk recreateSpecialObjectsArray"
>>> > "To external package developers:
>>> > **** DO NOT OVERRIDE THIS METHOD.  *****
>>> > If you are writing a plugin and need additional special object(s) for your own use,
>>> > use addGCRoot() function and use own, separate special objects registry "
>>> > "The Special Objects Array is an array of objects used by the Squeak virtual machine.
>>> > Its contents are critical and accesses to it by the VM are unchecked, so don't even
>>> > think of playing here unless you know what you are doing."
>>> > "Replace the interpreter's reference in one atomic operation.
>>> > Void machine code to avoid crashing Cog."
>>> > | newSpecialObjectsArray |
>>> > newSpecialObjectsArray := self newSpecialObjectsArray.
>>> > self specialObjectsArray becomeForward: newSpecialObjectsArray.
>>> > Smalltalk vm voidCogVMState
>>> >
>>> >
>>> > this is my fault for not ensuring recreateSpecialObjectsArray was properly
>>> > commented.  I had commented the inlining of Character table, but not the
>>> > inlining of the CompactClasses array. Apologies.
>>> >
>>> > HTH,
>>> > Eliot
>>> >
>>> > On Sat, Mar 23, 2013 at 10:19 AM, Guillermo Polito <
>>> > guillermopolito at gmail.com> wrote:
>>> >
>>> >> Hi!
>>> >>
>>> >> In my quest to crash the vm, I've found an ugly common case :(.
>>> >> I am trying to port the opendbx driver to 2.0, but I'm getting vm crashes
>>> >> when my configuration loads FFI :(. I updated my configuration to load
>>> >> version 1.7 of FFI (which I assume is the latest).
>>> >>
>>> >> I tried to do it in Pharo 2.0 with latest pharovm, and with eliot's Cog
>>> >> 2701 from his website, failing in both cases.
>>> >> The snippet of code that gets a systematic crash is:
>>> >>
>>> >> Gofer it
>>> >> smalltalkhubUser: 'DBXTalk' project: 'DBXTalkDriver';
>>> >> package: 'ConfigurationOfOpenDBXDriver';
>>> >> load.
>>> >> ((Smalltalk at: #ConfigurationOfOpenDBXDriver) project version: #stable)
>>> >> load
>>> >>
>>> >>
>>> >> This snippet always crashes with a segmentation fault (at least the
>>> >> fifteen times I tried :), with several different outputs in the console...
>>> >>
>>> >> Surprisingly, if I load FFI alone and not from the OpenDBXDriver
>>> >> configuration, it loads well... :/
>>> >>
>>> >> So I'm deferring the OpenDBX port a bit longer :(
>>> >>
>>> >> Thanks!
>>> >> Guille
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > best,
>>> > Eliot
>>>
>>>
>>
>>
>> --
>> best,
>> Eliot
>>
>>
>