[Vm-dev] [Pharo-dev] Anyone else seen crashes like these ?

Eliot Miranda eliot.miranda at gmail.com
Tue Nov 13 01:53:04 UTC 2018


Hi Ben,
On Mon, Nov 12, 2018 at 8:51 AM Ben Coman <btc at openinworld.com> wrote:

>
>
> On Mon, 12 Nov 2018 at 21:37, Sven Van Caekenberghe <sven at stfx.eu> wrote:
>
>> Hi,
>>
>> I run Pharo 7 64-bit on a macOS laptop, where the images are kept running
>> across sleep/wake cycles.
>>
>
> Just to confirm, you mean your laptop sleep/wake cycle.
> On my Window 10 laptop I do the same quite often with no problems.
>
>
>> For many weeks, it often happens that an image crashes before/after such
>> a sleep/wakeup (not all the time, just regularly).
>>
>> Here is a crash dump from today (fresh image/vm from WE, nothing special
>> loaded).
>>
>> Related to scheduling ? Event handling ?
>
>
> In trying to understand the last few moments, is the recent primitives
> list strongly ordered in time?...
>   Most recent primitives
>   signal
>   nowTick
>   primSignal:atUTCMicroseconds:
>   wait
>   millisecondClockValue
>   @
>   actualScreenSize
>

Yes, it's a log of the most recent 256 named primitives.  Note that they're
irrelevant in this case.  The crash is always rooted in vmIOProcessEvents.
i.e. the VM is responding to some input event, and it calls pumpRunLoop to
do (Objective-C) [NSRunLoop mainRunLoop] runMode:NSDefaultRunLoopMode
beforeDate:[NSDate distantPast], which in Smalltalk would be written
NSRunLoop mainRunLoop runMode: NSDefaultRunLoopMode beforeDate: NSDate
distantPast.  And somewhere within this a display update occurs which
crashes, presumably because we're using stale data that should have been
invalidated oil sleep and refreshed on wake.


>
>
> Starting at #actualScreenSize, instrumenting the code by adding...
>      Transcript crShow: thisContext sender printString.
> indicates the following call chain...
>   DisplayScreen-class>>actualScreenSize
>   MorphicUIManager>>checkForNewDisplaySize
>   DisplayScreen class>>checkForNewScreenSize
>   WorldState>>doOneCycleNowFor:
>   WorldState>>doOneCycleFor:
>   WorldMorph>>doOneCycle
>   WorldMorph class>>doOneCycle
>   MorphicUIManager>>spawnNewProcess
>
> In detail from a static analysis of the code...
>            > WorldState>>doOneCycleNowFor:
>               > DisplayScreen class>>checkForNewScreenSize
>                   > MorphicUIManager>>checkForNewDisplaySize
>                       > DisplayScreen-class>>actualScreenSize
> PRIMITIVE
>                   < MorphicUIManager>>checkForNewDisplaySize
>                < DisplayScreen class>>checkForNewScreenSize
>             < WorldState>>doOneCycleNowFor
>          < WorldState>>doOneCycleFor:
>       < WorldMorph>>doOneCycle
>    < WorldMorph class>>doOneCycle
> < MorphicUIManager>>spawnNewProcess
>    > WorldMorph class>>doOneCycle
>       > WorldMorph>>doOneCycle
>          > WorldState>>doOneCycleNowFor:
>             > WorldState>>interCyclePause:
>                > Time>>millisecondClockValue       PRIMITIVE
>             < WorldState>>interCyclePause:
>                > Delay>>schedule
>                   > DelaySemaphoreScheduler>>schedule:
>                       > Semaphore>>wait
> PRIMITIVE "readyToSchedule variable"
>                       > Semaphore>>signal
>  PRIMITIVE "timingSemaphore variable"   NOT RECORDED
>
>         maybe context change causes this to be recorded later in "Most
> recent primitives"
>                              >
> DelayMicrosecondTicker>>waitForUserSignalled:orExpired:      waking up from
> Semaphore>>wait
>                                     the primSignal:atUTCMicroseconds:
> PRIMITIVE  immediately before the #wait seems to have been logged now in
> "Most recent primitives"
>                              >
> DelaySemaphoreScheduler>>scheduleAtTimingPriority
>                                 > Semaphore>>signal
>  PRIMITIVE  "readyToSchedule variable"  seems to have not been recorded in
> "Most recent primitives"
>                                 >
> DelayBasicScheduler>>scheduleAtTimingPriority
>                              <
> DelaySemaphoreScheduler>>scheduleAtTimingPriority
>                           <
> DelayBasicScheduler>>runBackendLoopAtTimingPriority
>                              > DelayMicrosecondTicker>>nowTick
>  PRIMITIVE
>                              > Delay>>timingPrioritySignalExpired
>                                 > Semaphore>>signal
> PRIMITIVE "delaySemaphore variable"      CRASH
>
>  Smalltalk stack dump:
>     0x7ffee21d7138 M Delay>timingPrioritySignalExpired 0x113e3b138: a(n)
> Delay
>     0x7ffee21d7170 M [] in
> DelaySemaphoreScheduler(DelayBasicScheduler)>runBackendLoopAtTimingPriority
> 0x11453b8c0: a(n) DelaySemaphoreScheduler
>        0x1169f51c8 s BlockClosure>ensure:
>        0x1169f5828 s
> DelaySemaphoreScheduler(DelayBasicScheduler)>runBackendLoopAtTimingPriority
>        0x1169f5c38 s [] in
> DelaySemaphoreScheduler(DelayBasicScheduler)>startTimerEventLoopPriority:
>        0x1169fae10 s [] in BlockClosure>newProcess
>
> The only reason I can think that last semaphore call would be a problem is
> if variable delaySemaphore was not a Semaphore,
> but the C stack backtrace doesn't look like it crashes in the Semaphore
> handling.
> Its perhaps co-incidental with delay scheduling code is invoked fairly
> often, that its the current process when this crash occurs.
>
>
> To hep isolate the failure point, I would considering copying #nowTIck to
> a multiple series like...
> #nowTick1
> #nowTick2
> #nowTick3
> etc so that all of the sender sites can be made distinct.
> And maybe the same for #signal senders within the delay scheduling code.
>
> One side-curiousity is I can't see where the primitive #@ is being called
> from.
> Does it indicate some loop as its trying to determine
>
> To me the "vmIOProcessEvents" in the C backtrace with a fairly deep stack
> into the AppKit framework
> indicates its related to event handling.
>
> cheers -ben
>


-- 
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20181112/ab426476/attachment.html>


More information about the Vm-dev mailing list