[Vm-dev] Difficult to debug VM crash with full blocks and Sista V1

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Sep 23 22:25:12 UTC 2019


Follow up: so we have a problem of repeatability...
I found nothing like rr for macos, and CPU watchpoint capability is too
restricted to be usefull.

So what I did is save a crash.image attempting to crash at resumption time:
    Smalltalk saveAs: 'crash'.
    Smalltalk snapshot: true andQuit: true.
    5 timesRepeat: [SocketTest suite run].

Then I launched the vm under lldb and gathered $pc stats to see if I could
observe some reproducibility
(lldb) p/x $pc - methodZoneBase
0x00000000000a0e94
0x0000000000090f70
0x000000000009f824
0x000000000009bfc8
0x000000000009d030
0x000000000009c7c0
0x00000000000951e0
0x00000000000949c8 <- once
0x0000000000094418
0x00000000000949c8 <— twice
0x000000000009b848
0x00000000000a0e94 <- twice
0x000000000009a176
0x000000006dfff2e0 ???
0x00000000000996d1
0x00000000000949c8 <- thrice good candidate for watchpoint set
0x000000000009ae70
0x000000000009ae70
0x00000000000a18e7
0x0000000000095470
0x000000000009c458
0x00000000000a0a7c
0x0000000000093e89
0x000000000009beb0
0x0000000000095a37
0x0000000000095408
0x00000000000a06bc
0x000000000009cd48
0x0000000000095408 <- twice
0x000000000009ba41
0x0000000000095408 <- thrice new candidate for watchpoint set
0x0000000000045258   <— smallest ever
0x000000000009c7c0
0x0000000000095408 <- fourth ! the watch point was set here, why didn’t it
stop??? argh methodZoneBase was incremented after I set the watchpoint!!!
0x00000000000a12dc
0x00000000000951e0
0x00000000000a0e24
0x000000000009ff8c
0x000000000009c7f6
0x000000000009a626

Then I have chosen to watch the most frequent crash address,
that is methodZoneBase+0x0000000000095408,
Unfortunately, s commented above, I missed one (I set the watchpoint too
early, before methodZoneBase took its definitive address).

And after a few (a bunch of)  trials, here is what I found:

(lldb) run ../../image/crash.image
There is a running process, kill it and restart?: [Y/n]
Process 28175 exited with status = 9 (0x00000009)
Process 28185 launched:
'/Users/nicolas/Smalltalk/OpenSmalltalk/opensmalltalk-vm/build.macos64x64/squeak.cog.spur/Squeak.app/Contents/MacOS/Squeak'
(x86_64)
Squeak was compiled with optimization - stepping may behave oddly;
variables may not be available.
Process 28185 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 6.1
    frame #0: 0x000000010007bc22 Squeak`-[sqSqueakOSXApplication
doHeadlessSetup](self=0x000000010066fc00, _cmd="doHeadlessSetup") at
sqSqueakOSXApplication.m:125 [opt]
   122 //No super call here as we've already set headless from command line
or info.plist
   123 extern BOOL gSqueakHeadless;
   124    // Notice that setActivationPolicy: is available in OSX 10.6 and
later
-> 125    if ([NSApp respondsToSelector:@selector(setActivationPolicy:)]) {
   126        if (gSqueakHeadless) {
   127            [NSApp
setActivationPolicy:NSApplicationActivationPolicyProhibited];
   128        } else {
Target 0: (Squeak) stopped.
(lldb) wa s e -s 1 -- methodZoneBase+0x0000000000095408
Watchpoint created: Watchpoint 33: addr = 0x107c58138 size = 1 state =
enabled type = w
    new value: -3689348814741910324
(lldb) cont
Process 28185 resuming

Watchpoint 33 hit:
old value: -3689348814741910324
new value: -3689348814741910836
Process 28185 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
    frame #0: 0x00000001000731dc
Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
selector=4435118304) at cogitX64SysV.c:6320 [opt]
   6317    sqInt rawHeader;
   6318
   6319 (method->cmType = CMMethod);
-> 6320 (method->objectHeader = nullHeaderForMachineCodeMethod());
   6321 (method->blockSize = size);
   6322 (method->methodObject = methodObj);
   6323
Target 0: (Squeak) stopped.
(lldb)
Process 28185 resuming

Watchpoint 33 hit:
old value: -3689348814741910836
new value: -3689571654825360894
Process 28185 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
    frame #0: 0x0000000100073255
Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
selector=4435118304) at cogitX64SysV.c:6339 [opt]
   6336 (method->selector = selector);
   6337 (method->cmNumArgs = argumentCountOfMethodHeader(methodHeader));
   6338 (method->cmHasMovableLiteral = hasMovableLiteral);
-> 6339 if ((method->cmRefersToYoung = hasYoungReferent)) {
   6340 addToYoungReferrers(method);
   6341 }
   6342 (method->cmUsageCount = initialMethodUsageCount());
Target 0: (Squeak) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
  * frame #0: 0x0000000100073255
Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
selector=4435118304) at cogitX64SysV.c:6339 [opt]
    frame #1: 0x000000010004f90c Squeak`compileCogMethod [inlined]
generateCogMethod(selector=<unavailable>) at cogitX64SysV.c:7098 [opt]
    frame #2: 0x000000010004f7a5
Squeak`compileCogMethod(selector=4435118304) at cogitX64SysV.c:27780 [opt]
    frame #3: 0x000000010004bc54 Squeak`cogselector(aMethodObj=4468481712,
aSelectorOop=4435118304) at cogitX64SysV.c:4833 [opt]
    frame #4: 0x0000000100013d4d Squeak`primitiveClosureValue at
gcc3x-cointerp.c:27237 [opt]
    frame #5: 0x0000000107bd1925
    frame #6: 0x000000010000236b Squeak`interpret at gcc3x-cointerp.c:2767
[opt]
    frame #7: 0x000000010008ed55 Squeak`-[sqSqueakMainApplication
runSqueak](self=0x000000010066fc00, _cmd=<unavailable>) at
sqSqueakMainApplication.m:201 [opt]
    frame #8: 0x00007fff44b077b8 Foundation`__NSFirePerformWithOrder + 360
    frame #9: 0x00007fff4297cf57
CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
+ 23
    frame #10: 0x00007fff4297ce7f CoreFoundation`__CFRunLoopDoObservers +
527
    frame #11: 0x00007fff4295f3f8 CoreFoundation`__CFRunLoopRun + 1240
    frame #12: 0x00007fff4295ec93 CoreFoundation`CFRunLoopRunSpecific + 483
    frame #13: 0x00007fff41c49d96 HIToolbox`RunCurrentEventLoopInMode + 286
    frame #14: 0x00007fff41c49a0f HIToolbox`ReceiveNextEventCommon + 366
    frame #15: 0x00007fff41c49884
HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
    frame #16: 0x00007fff3fef9a73 AppKit`_DPSNextEvent + 2085
    frame #17: 0x00007fff4068fe34 AppKit`-[NSApplication(NSEvent)
_nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
    frame #18: 0x00007fff3feee885 AppKit`-[NSApplication run] + 764
    frame #19: 0x00007fff3febda72 AppKit`NSApplicationMain + 804
    frame #20: 0x00007fff6a88c015 libdyld.dylib`start + 1
(lldb)

Here I don't know how to proceed, because I have no knowledge of this part
of the VM, it's also quite late now...
Eliot, does it ring some bell?

Le dim. 22 sept. 2019 à 11:21, Jan Vrany <jan.vrany at fit.cvut.cz> a écrit :

>
> On 2019-09-22 10:57, Nicolas Cellier wrote:
> > Hi Jan,
> > Thanks, a recording debugger would indeed be a tremendous help!
> > Unfortunately, I'm unable to reproduce the bug under linux, only
> > macosx...
> > Do you know something equivalent to rr for mac?
>
> No, I don't. There might be something though, I never had a mac...
>
> Jan
>
> >
> > Le dim. 22 sept. 2019 à 09:55, Jan Vrany <jan.vrany at fit.cvut.cz> a
> > écrit :
> >
> >> Hi,
> >>
> >>>
> >>> - can anyone think of any other strategies I might take to try and
> >>> reproduce this?
> >>>
> >> at the risk of stating the obvious, in cases similar to this rr (+
> >> VDB)
> >> helped
> >> me many times. See https://www.youtube.com/watch?v=Qbu4ED8wPYE
> >>
> >> Never used with opensmalltalk VM, though.
> >>
> >> HTH, Jan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20190924/ffef4c7d/attachment.html>


More information about the Vm-dev mailing list