[Vm-dev] Difficult to debug VM crash with full blocks and Sista V1

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Sep 23 22:28:29 UTC 2019


Le mar. 24 sept. 2019 à 00:25, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> a écrit :

> Follow up: so we have a problem of repeatability...
> I found nothing like rr for macos, and CPU watchpoint capability is too
> restricted to be usefull.
>
> So what I did is save a crash.image attempting to crash at resumption time:
>     Smalltalk saveAs: 'crash'.
>     Smalltalk snapshot: true andQuit: true.
>     5 timesRepeat: [SocketTest suite run].
>
> Then I launched the vm under lldb and gathered $pc stats to see if I could
> observe some reproducibility
> (lldb) p/x $pc - methodZoneBase
> 0x00000000000a0e94
> 0x0000000000090f70
> 0x000000000009f824
> 0x000000000009bfc8
> 0x000000000009d030
> 0x000000000009c7c0
> 0x00000000000951e0
> 0x00000000000949c8 <- once
> 0x0000000000094418
> 0x00000000000949c8 <— twice
> 0x000000000009b848
> 0x00000000000a0e94 <- twice
> 0x000000000009a176
> 0x000000006dfff2e0 ???
> 0x00000000000996d1
> 0x00000000000949c8 <- thrice good candidate for watchpoint set
> 0x000000000009ae70
> 0x000000000009ae70
> 0x00000000000a18e7
> 0x0000000000095470
> 0x000000000009c458
> 0x00000000000a0a7c
> 0x0000000000093e89
> 0x000000000009beb0
> 0x0000000000095a37
> 0x0000000000095408
> 0x00000000000a06bc
> 0x000000000009cd48
> 0x0000000000095408 <- twice
> 0x000000000009ba41
> 0x0000000000095408 <- thrice new candidate for watchpoint set
> 0x0000000000045258   <— smallest ever
> 0x000000000009c7c0
> 0x0000000000095408 <- fourth ! the watch point was set here, why didn’t it
> stop??? argh methodZoneBase was incremented after I set the watchpoint!!!
> 0x00000000000a12dc
> 0x00000000000951e0
> 0x00000000000a0e24
> 0x000000000009ff8c
> 0x000000000009c7f6
> 0x000000000009a626
>
> Then I have chosen to watch the most frequent crash address,
> that is methodZoneBase+0x0000000000095408,
> Unfortunately, s commented above, I missed one (I set the watchpoint too
> early, before methodZoneBase took its definitive address).
>
> And after a few (a bunch of)  trials, here is what I found:
>
> (lldb) run ../../image/crash.image
> There is a running process, kill it and restart?: [Y/n]
> Process 28175 exited with status = 9 (0x00000009)
> Process 28185 launched:
> '/Users/nicolas/Smalltalk/OpenSmalltalk/opensmalltalk-vm/build.macos64x64/squeak.cog.spur/Squeak.app/Contents/MacOS/Squeak'
> (x86_64)
> Squeak was compiled with optimization - stepping may behave oddly;
> variables may not be available.
> Process 28185 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 6.1
>     frame #0: 0x000000010007bc22 Squeak`-[sqSqueakOSXApplication
> doHeadlessSetup](self=0x000000010066fc00, _cmd="doHeadlessSetup") at
> sqSqueakOSXApplication.m:125 [opt]
>    122 //No super call here as we've already set headless from command
> line or info.plist
>    123 extern BOOL gSqueakHeadless;
>    124    // Notice that setActivationPolicy: is available in OSX 10.6 and
> later
> -> 125    if ([NSApp respondsToSelector:@selector(setActivationPolicy:)])
> {
>    126        if (gSqueakHeadless) {
>    127            [NSApp
> setActivationPolicy:NSApplicationActivationPolicyProhibited];
>    128        } else {
> Target 0: (Squeak) stopped.
> (lldb) wa s e -s 1 -- methodZoneBase+0x0000000000095408
> Watchpoint created: Watchpoint 33: addr = 0x107c58138 size = 1 state =
> enabled type = w
>     new value: -3689348814741910324
> (lldb) cont
> Process 28185 resuming
>
> Watchpoint 33 hit:
> old value: -3689348814741910324
> new value: -3689348814741910836
> Process 28185 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
>     frame #0: 0x00000001000731dc
> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
> selector=4435118304) at cogitX64SysV.c:6320 [opt]
>    6317    sqInt rawHeader;
>    6318
>    6319 (method->cmType = CMMethod);
> -> 6320 (method->objectHeader = nullHeaderForMachineCodeMethod());
>    6321 (method->blockSize = size);
>    6322 (method->methodObject = methodObj);
>    6323
> Target 0: (Squeak) stopped.
> (lldb)
> Process 28185 resuming
>
> Watchpoint 33 hit:
> old value: -3689348814741910836
> new value: -3689571654825360894
> Process 28185 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
>     frame #0: 0x0000000100073255
> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
> selector=4435118304) at cogitX64SysV.c:6339 [opt]
>    6336 (method->selector = selector);
>    6337 (method->cmNumArgs = argumentCountOfMethodHeader(methodHeader));
>    6338 (method->cmHasMovableLiteral = hasMovableLiteral);
> -> 6339 if ((method->cmRefersToYoung = hasYoungReferent)) {
>    6340 addToYoungReferrers(method);
>    6341 }
>    6342 (method->cmUsageCount = initialMethodUsageCount());
> Target 0: (Squeak) stopped.
> (lldb) bt
> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 33
>   * frame #0: 0x0000000100073255
> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
> selector=4435118304) at cogitX64SysV.c:6339 [opt]
>     frame #1: 0x000000010004f90c Squeak`compileCogMethod [inlined]
> generateCogMethod(selector=<unavailable>) at cogitX64SysV.c:7098 [opt]
>     frame #2: 0x000000010004f7a5
> Squeak`compileCogMethod(selector=4435118304) at cogitX64SysV.c:27780 [opt]
>     frame #3: 0x000000010004bc54 Squeak`cogselector(aMethodObj=4468481712,
> aSelectorOop=4435118304) at cogitX64SysV.c:4833 [opt]
>     frame #4: 0x0000000100013d4d Squeak`primitiveClosureValue at
> gcc3x-cointerp.c:27237 [opt]
>     frame #5: 0x0000000107bd1925
>     frame #6: 0x000000010000236b Squeak`interpret at gcc3x-cointerp.c:2767
> [opt]
>     frame #7: 0x000000010008ed55 Squeak`-[sqSqueakMainApplication
> runSqueak](self=0x000000010066fc00, _cmd=<unavailable>) at
> sqSqueakMainApplication.m:201 [opt]
>     frame #8: 0x00007fff44b077b8 Foundation`__NSFirePerformWithOrder + 360
>     frame #9: 0x00007fff4297cf57
> CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
> + 23
>     frame #10: 0x00007fff4297ce7f CoreFoundation`__CFRunLoopDoObservers +
> 527
>     frame #11: 0x00007fff4295f3f8 CoreFoundation`__CFRunLoopRun + 1240
>     frame #12: 0x00007fff4295ec93 CoreFoundation`CFRunLoopRunSpecific + 483
>     frame #13: 0x00007fff41c49d96 HIToolbox`RunCurrentEventLoopInMode + 286
>     frame #14: 0x00007fff41c49a0f HIToolbox`ReceiveNextEventCommon + 366
>     frame #15: 0x00007fff41c49884
> HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
>     frame #16: 0x00007fff3fef9a73 AppKit`_DPSNextEvent + 2085
>     frame #17: 0x00007fff4068fe34 AppKit`-[NSApplication(NSEvent)
> _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
>     frame #18: 0x00007fff3feee885 AppKit`-[NSApplication run] + 764
>     frame #19: 0x00007fff3febda72 AppKit`NSApplicationMain + 804
>     frame #20: 0x00007fff6a88c015 libdyld.dylib`start + 1
> (lldb)
>
> Here I don't know how to proceed, because I have no knowledge of this part
> of the VM, it's also quite late now...
> Eliot, does it ring some bell?
>
>
Ah, forget it, I thought it was overwriting, but it is the first write on
yet virgin methodeZone...
(it's just that the scenario was different from all the previous run)

Le dim. 22 sept. 2019 à 11:21, Jan Vrany <jan.vrany at fit.cvut.cz> a écrit :
>
>>
>> On 2019-09-22 10:57, Nicolas Cellier wrote:
>> > Hi Jan,
>> > Thanks, a recording debugger would indeed be a tremendous help!
>> > Unfortunately, I'm unable to reproduce the bug under linux, only
>> > macosx...
>> > Do you know something equivalent to rr for mac?
>>
>> No, I don't. There might be something though, I never had a mac...
>>
>> Jan
>>
>> >
>> > Le dim. 22 sept. 2019 à 09:55, Jan Vrany <jan.vrany at fit.cvut.cz> a
>> > écrit :
>> >
>> >> Hi,
>> >>
>> >>>
>> >>> - can anyone think of any other strategies I might take to try and
>> >>> reproduce this?
>> >>>
>> >> at the risk of stating the obvious, in cases similar to this rr (+
>> >> VDB)
>> >> helped
>> >> me many times. See https://www.youtube.com/watch?v=Qbu4ED8wPYE
>> >>
>> >> Never used with opensmalltalk VM, though.
>> >>
>> >> HTH, Jan
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20190924/cc1a5723/attachment-0001.html>


More information about the Vm-dev mailing list