[Vm-dev] BUG? A problem with callbacks that shows up in 64bits (but is on 32bits too)

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue Mar 14 15:57:03 UTC 2017


Hi Eliot,
For the DisplayScreen bits, since it is a rather large data usually, I
would be in favour to have it pinned in old space, and this could happen
from image side.
For every other use case, I don't know. b) looks simpler to me, but you
seem to say it's a question of performance.
>From what I understand, this could happen thru any kind of callback and
also in threaded primitives/FFI.
I don't have enough neuron available to think of d)

2017-03-14 16:46 GMT+01:00 Eliot Miranda <eliot.miranda at gmail.com>:

>
> Hi Esteban, Hi Igor, Hi All,
>
> On Fri, Mar 10, 2017 at 7:35 AM, Esteban Lorenzano <estebanlm at gmail.com>
> wrote:
>
>>
>> Hi,
>>
>> I’m tumbling into an error in Pharo, because we use callbacks
>> intensively, in Athens(cairo)-to-World conversion in particular, and people
>> is sending always their crash reports… we made the whole conversion a lot
>> more robust since problems started to arise, but now I hit a wall I cannot
>> solve: I think problem is in something in callbacks.
>>
>> And problem is showing very easy on 64bits (while in 32bits it takes time
>> and is more random).
>>
>
>  I responded in the "image not opening" thread, but it's the same
> problem.  I really want to hear what y'all think because I'll happily
> implement a fix, but I want to know which one y'all think is a good idea.
> Here's my reply (edits between [ & ] to add information):
>
>
> On Mon, Mar 13, 2017 at 9:11 PM, Eliot Miranda <eliot.miranda at gmail.com>
>  wrote:
>
> I'm pretty confident [I know] this is to do with bugs in the Athens
> surface code which assumes that callbacks can be made in the existing
> copyBits and warpBits primitive.  They can't do this safely because a GC
> (scavenge) can happen during a callback, which then causes chaos when the
> copyBits primitive tries to access objects that have been moved under its
> feet.
>
> I've done work to fix callbacks so that when there is a failure it is the
> copyBits primitive that fails, instead of apparently the callback return
> primitive.  One of the apparent effects of this fix is to stop the screen
> opening up too small; another is getting the background colour right, and
> yet another is eliminating bogus pixels in the VGTigerDemo demo.  But more
> work is required to fix the copyBits and warpBits primitives.  There are a
> few approaches one might take:
>
> a)  fixing the primitive so that it saves and restores oops around the
> callbacks using the external oop table [InterpreterProxy>>addGCRoot: &
> removeGCRoot:].  That's a pain but possible. [It's a pain because all the
> derived pointers (the start of the destForm, sourceForm, halftoneForm and
> colorMapTable) must be recomputed also, and of course most of the time the
> objects don't move; we only scavenge about once every 2 seconds in normal
> running]
>
> b) fixing the primitive so that it pins the objects it needs before ever
> invoking a callback [this is a pain because pinning an object causes it to
> be tenured to old space if it is in new space; objects can't be pinned in
> new space, so instead the pin operation forwards the new space object to an
> old space copy if required and answers its location in old space, so a
> putative withPinnedObjectsDo: operation for the copyBits primitive looks
> like
> withPinnedFormsDo: aBlock
> <inline: #always>
> self cppIf: SPURVM & false
> ifTrue:
> [| bitBltOopWasPinned destWasPinned sourceWasPinned halftoneWasPinned |
>  (bitBltOopWasPinned := interpreterProxy isPinned: bitBltOop) ifFalse:
> [bitBltOop := interpreterProxy pinObject: bitBltOop].
> (destWasPinned := interpreterProxy isPinned: destForm) ifFalse:
> [destForm := interpreterProxy pinObject: destForm].
> (sourceWasPinned := interpreterProxy isPinned: sourceForm) ifFalse:
> [sourceForm := interpreterProxy pinObject: sourceForm].
> (halftoneWasPinned := interpreterProxy isPinned: halftoneForm) ifFalse:
> [halftoneForm := interpreterProxy pinObject: halftoneForm].
> aBlock value.
>  bitBltOopWasPinned ifFalse: [interpreterProxy unpinObject: bitBltOop].
> destWasPinned ifFalse: [interpreterProxy unpinObject: destForm].
> sourceWasPinned ifFalse: [interpreterProxy unpinObject: sourceForm].
> halftoneWasPinned ifFalse: [interpreterProxy unpinObject: halftoneForm]]
> ifFalse: [aBlock value]
>    and tenuring objects to old space is not ideal because they are only
> collected by a full GC, so doing this would at least tenure the bitBltOop
> which is very likely to be in new space]
>
> c) fixing the primitive so that it uses the scavenge and fullGC counters
> in the VM to detect if a GC occurred during one of the callbacks and would
> fail the primitive [if it detected that a GC had occurred in any of the
> surface functions].   The primitive would then simply be retried.
>
> d) ?
>
> I like c) as it's very lightweight, but it has issues.  It is fine to use
> for callbacks *before* cop[yBits and warpBits move any bits (the
> lockSurface and querySurface functions).  But it's potentially erroneous
> after the unlockSurface primitive.  For example, a primitive which does an
> xor with the screen can't simply be retried as the first, falling pass,
> would have updated the destination bits but not displayed them via
> unlockSurface.  But I think it could be arranged that no objects are
> accessed after unlockSurface, which should naturally be the last call in
> the primitive (or do I mean showSurface?).  So the approach would be to
> check for GCs occurring during querySurface and lockSurface, failing if so,
> and then caching any and all state needed by unlockSurface and showSurface
> in local variables.  This way no object state is accessed to make the
> unlockSurface and showSurface calls, and no bits are moved before the
> queryDurface and lockSurface calls.
>
> If we used a failure code such as #'object may move' then the primitives
> could answer this when a GC during callbacks is detected and then the
> primitive could be retried only when required.
>
>
> [Come on folks, please comment.  I want to know which idea you like best.
> We could fix this quickly.  But right now it feels like I'm talking to
> myself.]
>
>
>> Here is the easiest way to reproduce it (in mac):
>>
>> wget files.pharo.org/get-files/60/pharo64-mac-latest.zip
>> wget files.pharo.org/get-files/60/pharo64.zip
>> wget files.pharo.org/get-files/60/sources.zip
>> unzip pharo64-mac-latest.zip
>> unzip pharo64.zip
>> unzip sources.zip
>> ./Pharo.app/Contents/MacOS/Pharo ./Pharo64-60438.image eval "VGTigerDemo
>> runDemo"
>>
>> eventually (like 5-6 seconds after, if not immediately), you will have a
>> stack like this:
>>
>> SmallInteger(Object)>>primitiveFailed:
>> SmallInteger(Object)>>primitiveFailed
>> SmallInteger(VMCallbackContext64)>>primSignal:andReturnAs:fromContext:
>> GrafPort>>copyBits
>> GrafPort>>image:at:sourceRect:rule:
>> FormCanvas>>image:at:sourceRect:rule:
>> FormCanvas(Canvas)>>drawImage:at:sourceRect:
>> FormCanvas(Canvas)>>drawImage:at:
>> VGTigerDemo>>runDemo
>> VGTigerDemo class>>runDemo
>> UndefinedObject>>DoIt
>> OpalCompiler>>evaluate
>> OpalCompiler(AbstractCompiler)>>evaluate:
>> [ result := Smalltalk compiler evaluate: aStream.
>> self hasSessionChanged
>>         ifFalse: [ self stdout
>>                         print: result;
>>                         lf ] ] in EvaluateCommandLineHandler>>evaluate:
>> in Block: [ result := Smalltalk compiler evaluate: aStream....
>> BlockClosure>>on:do:
>> EvaluateCommandLineHandler>>evaluate:
>> EvaluateCommandLineHandler>>evaluateArguments
>> EvaluateCommandLineHandler>>activate
>> EvaluateCommandLineHandler class(CommandLineHandler class)>>activateWith:
>> [ aCommandLinehandler activateWith: commandLine ] in
>> PharoCommandLineHandler(BasicCommandLineHandler)>>activateSubCommand: in
>> Block: [ aCommandLinehandler activateWith: commandLine ]
>> BlockClosure>>on:do:
>> PharoCommandLineHandler(BasicCommandLineHandler)>>activateSubCommand:
>> PharoCommandLineHandler(BasicCommandLineHandler)>>handleSubcommand
>> PharoCommandLineHandler(BasicCommandLineHandler)>>handleArgument:
>> [ self
>>         handleArgument:
>>                 (self arguments
>>                         ifEmpty: [ '' ]
>>                         ifNotEmpty: [ :arguments | arguments first ]) ]
>> in PharoCommandLineHandler(BasicCommandLineHandler)>>activate in Block:
>> [ self...
>> BlockClosure>>on:do:
>> PharoCommandLineHandler(BasicCommandLineHandler)>>activate
>> PharoCommandLineHandler>>activate
>> PharoCommandLineHandler class(CommandLineHandler class)>>activateWith:
>> [ super activateWith: aCommandLine ] in PharoCommandLineHandler
>> class>>activateWith: in Block: [ super activateWith: aCommandLine ]
>>
>> Any idea?
>>
>> thanks!
>> Esteban
>
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20170314/7931d6ca/attachment.html>


More information about the Vm-dev mailing list