[Vm-dev] Difficult to debug VM crash with full blocks and Sista V1

Eliot Miranda eliot.miranda at gmail.com
Fri Oct 4 19:01:38 UTC 2019


On Thu, Oct 3, 2019 at 1:47 PM Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

> If I comment /* signal(SIGIO,forceInterruptCheck) */ in
> platforms//unix/vm/aio.c
> then I don't experiment crashes anymore
>
> One solution if we want to manipulate the rsp safely is
> - to define a specific stack for signal handling via sigaltstack()
> - to use that specific stack via sigaction() with SA_ONSTACK flag
>

That's the right solution as far as I understand these issues.


>
> That's too low level and I prefer some assistance here.
> There are several calls to signal in the VM so they should all be reviewed.
>

Right, I will do so today.


> Le jeu. 3 oct. 2019 à 22:10, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> a écrit :
>
>> I confirm that each crash follows a SIGIO -> forceInterruptCheck() with
>> corrupted stack pointer pointing to methodZone (that is executable zone so
>> program counter).
>> I did:
>> 1) remove static declaration of methodZoneBase and limitAddress in
>> cogitX64Sysv.c so as to make them exported globals
>> 2) declared them as extern sqInt in gcc-cointerp.c
>> 3) instrumented forceInterruptCheck() so as to emit a warning and/or set
>> a breakpoint:
>>
>>     if ( getsp() > methodZoneBase && getsp() <= limitAddress ) {
>>         printf("pb in forceInterruptCheck\n");
>>     }
>>
>> Unlike what I said, forceInterruptCheck() does nothing special and
>> certainly not invoke Smalltalk code nor JITTED code.
>> The interruptCheckChain function that it could invoke is generally null
>> (except for debug/simulation purpose), and always null in the crash i
>> observed.
>> The signal handler also generally operate on same stack (unless
>> specifically asked via sigaction() - which is not the case).
>> My earler diagnostic was driven by the false assumption that handler
>> would use the interrupt stack pointer (IST), and thus that only special
>> stack pointer manipulation happening in trampoline could corrupt it, hmmm
>> that was late night blurred minded ;)
>>
>> What comes to my mind:
>> - the signal SIGIO might happen while we are manipulating the stack
>> pointer, hence the heisenbug
>> - the stack pointer could be offseted by some incorrect code (but why no
>> error would happen before SIGIO then?)
>>
>> I don't have clear understanding of all spur machinery details that would
>> help further diagnosis
>> Eliot, doesn't it sound interesting?
>>
>> Le jeu. 3 oct. 2019 à 08:10, Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>
>>> Maybe I draw conclusions too quickly.
>>> But clearly, the signal handler has a stack pointer pointing in
>>> methodZone.
>>>
>>> (lldb) p/x $rsp
>>> (unsigned long) $211 = 0x0000000107c92da0
>>> (lldb) p/x methodZoneBase
>>> (sqInt) $212 = 0x0000000107c01d30
>>> (lldb) p/x limitAddress
>>> (usqInt) $213 = 0x0000000107d5e000
>>>
>>> It remains to see how that happened.
>>>
>>> Le jeu. 3 oct. 2019 à 05:49, Nicolas Cellier <
>>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>>
>>>> Hi all,
>>>> So I finally got it!
>>>> That's our specific handling of SIGIO with forceInterruptCheck.
>>>> Don't you JIT the signal handler!
>>>>
>>>> The signal handler is using a different stack pointer so as to not
>>>> interfere with the program.
>>>> Until Cog decide to jump into JITTED code from within the signal
>>>> handler.
>>>> It now share a stack pointer pointing somewhere in cog methodZone, and
>>>> anything it does with it's stack (call a function, save registers...) will
>>>> destroy the JITTED code.
>>>> It even happened when I used the debugger which shares the interrupt
>>>> handler stack pointer!
>>>>
>>>> Why does it not happen in linux, and why only Mac OS 64, I have no
>>>> clue...
>>>> But we can't decently live very long with such handling.
>>>>
>>>> Le mer. 2 oct. 2019 à 23:29, Nicolas Cellier <
>>>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>>>
>>>>> Hi Eliot,
>>>>> so I have set watchpoint to 4 previous crash and got this:
>>>>>
>>>>> (lldb) run ../../image/crash.image
>>>>> There is a running process, kill it and restart?: [Y/n]
>>>>> Process 98150 exited with status = 9 (0x00000009)
>>>>> Process 98164 launched:
>>>>> '/Users/nicolas/Smalltalk/OpenSmalltalk/opensmalltalk-vm/build.macos64x64/squeak.cog.spur/SqueakDebug.app/Contents/MacOS/Squeak'
>>>>> (x86_64)
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint
>>>>> 1.1
>>>>>     frame #0: 0x00000001001618ec Squeak`-[sqSqueakMainApplication
>>>>> setupMenus](self=0x000000010076e1a0, _cmd="setupMenus") at
>>>>> sqSqueakMainApplication.m:123
>>>>>    120
>>>>>    121 - (void) setupMenus {
>>>>>    122 //    nothing to do so far since the menu is setup in the
>>>>> MainMenu.nib file
>>>>> -> 123 }
>>>>>    124
>>>>>    125 - (void) setupTimers {
>>>>>    126 extern void SetUpTimers(void);
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb) wa se e -s 1 -- 0x107c992b4
>>>>> Watchpoint created: Watchpoint 35: addr = 0x107c992b4 size = 1 state =
>>>>> enabled type = w
>>>>> (lldb) wa se e -s 1 -- 0x107c929e0
>>>>> Watchpoint created: Watchpoint 34: addr = 0x107c929e0 size = 1 state =
>>>>> enabled type = w
>>>>> (lldb) wa se e -s 1 -- 0x0000000107c945df
>>>>> Watchpoint created: Watchpoint 36: addr = 0x107c945df size = 1 state =
>>>>> enabled type = w
>>>>> (lldb) wa se e -s 1 -- 0x0000000107c92da3
>>>>> Watchpoint created: Watchpoint 37: addr = 0x107c92da3 size = 1 state =
>>>>> enabled type = w
>>>>>     new value: -3689348814741910324
>>>>> (lldb) c
>>>>> Process 98164 resuming
>>>>>
>>>>> Watchpoint 34 hit:
>>>>> new value: -3689348814741910456
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 34
>>>>>     frame #0: 0x0000000100102067
>>>>> Squeak`byteAtPointerput(ptr="H???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????,
>>>>> val=72) at sqMemoryAccess.h:136
>>>>>    133  /* Use static inline functions when the compiler produces
>>>>> efficient code for small accessors.
>>>>>    134     These are preferred because static type checking will
>>>>> prevent inadvertent confusion of pointers and oops. */
>>>>>    135  static inline sqInt byteAtPointer(char *ptr) { return
>>>>> (sqInt)(*((unsigned char *)ptr)); }
>>>>> -> 136  static inline sqInt byteAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((unsigned char *)ptr)= (unsigned char)val); }
>>>>>    137  static inline sqInt shortAtPointer(char *ptr) { return
>>>>> (sqInt)(*((short *)ptr)); }
>>>>>    138  static inline sqInt shortAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((short *)ptr)= (short)val); }
>>>>>    139  static inline sqInt intAtPointer(char *ptr) { return
>>>>> (sqInt)(*((int *)ptr)); }
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb)
>>>>> Process 98164 resuming
>>>>>
>>>>> Watchpoint 37 hit:
>>>>> old value: -3689348814741910324
>>>>> new value: -3689912642157150198
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 37
>>>>>     frame #0: 0x000000010012de84
>>>>> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c92da0, size=1240,
>>>>> selector=4435376352) at cogitX64SysV.c:6321
>>>>>    6318
>>>>>    6319 (method->cmType = CMMethod);
>>>>>    6320 (method->objectHeader = nullHeaderForMachineCodeMethod());
>>>>> -> 6321 (method->blockSize = size);
>>>>>    6322 (method->methodObject = methodObj);
>>>>>    6323
>>>>>    6324 /* If the method has already been cogged (e.g. Newspeak
>>>>> accessors) then
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb)
>>>>> Process 98164 resuming
>>>>>
>>>>> Watchpoint 36 hit:
>>>>> new value: -3689348814741910520
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 36
>>>>>     frame #0: 0x0000000100102067
>>>>> Squeak`byteAtPointerput(ptr="\b???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????,
>>>>> val=8) at sqMemoryAccess.h:136
>>>>>    133  /* Use static inline functions when the compiler produces
>>>>> efficient code for small accessors.
>>>>>    134     These are preferred because static type checking will
>>>>> prevent inadvertent confusion of pointers and oops. */
>>>>>    135  static inline sqInt byteAtPointer(char *ptr) { return
>>>>> (sqInt)(*((unsigned char *)ptr)); }
>>>>> -> 136  static inline sqInt byteAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((unsigned char *)ptr)= (unsigned char)val); }
>>>>>    137  static inline sqInt shortAtPointer(char *ptr) { return
>>>>> (sqInt)(*((short *)ptr)); }
>>>>>    138  static inline sqInt shortAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((short *)ptr)= (short)val); }
>>>>>    139  static inline sqInt intAtPointer(char *ptr) { return
>>>>> (sqInt)(*((int *)ptr)); }
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb)
>>>>> Process 98164 resuming
>>>>>
>>>>> Watchpoint 35 hit:
>>>>> new value: -3689348814741910456
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 35
>>>>>     frame #0: 0x0000000100102067
>>>>> Squeak`byteAtPointerput(ptr="H???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????,
>>>>> val=72) at sqMemoryAccess.h:136
>>>>>    133  /* Use static inline functions when the compiler produces
>>>>> efficient code for small accessors.
>>>>>    134     These are preferred because static type checking will
>>>>> prevent inadvertent confusion of pointers and oops. */
>>>>>    135  static inline sqInt byteAtPointer(char *ptr) { return
>>>>> (sqInt)(*((unsigned char *)ptr)); }
>>>>> -> 136  static inline sqInt byteAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((unsigned char *)ptr)= (unsigned char)val); }
>>>>>    137  static inline sqInt shortAtPointer(char *ptr) { return
>>>>> (sqInt)(*((short *)ptr)); }
>>>>>    138  static inline sqInt shortAtPointerput(char *ptr, int val) {
>>>>> return (sqInt)(*((short *)ptr)= (short)val); }
>>>>>    139  static inline sqInt intAtPointer(char *ptr) { return
>>>>> (sqInt)(*((int *)ptr)); }
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb)
>>>>> Process 98164 resuming
>>>>>
>>>>> Watchpoint 37 hit:
>>>>> old value: -3689912642157150198
>>>>> new value: -5066732099722018553
>>>>> Process 98164 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 37
>>>>>     frame #0: 0x000000010002e801 Squeak`forceInterruptCheck at
>>>>> gcc3x-cointerp.c:64151
>>>>>    64148 /* StackInterpreter>>#forceInterruptCheck */
>>>>>    64149 sqInt
>>>>>    64150 forceInterruptCheck(void)
>>>>> -> 64151 {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>>>>>    64152    void (*iccFunc)();
>>>>>    64153    StackPage *thePage;
>>>>>    64154
>>>>> Target 0: (Squeak) stopped.
>>>>> (lldb) bt
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint
>>>>> 37
>>>>>   * frame #0: 0x000000010002e801 Squeak`forceInterruptCheck at
>>>>> gcc3x-cointerp.c:64151
>>>>>     frame #1: 0x00007fff6ab9af5a libsystem_platform.dylib`_sigtramp +
>>>>> 26
>>>>>     frame #2: 0x0000000107c10699
>>>>>     frame #3: 0x0000000107c9646f
>>>>>     frame #4: 0x0000000107c9633e
>>>>>     frame #5: 0x0000000107c962ac
>>>>>     frame #6: 0x0000000107c961e9
>>>>>     frame #7: 0x0000000107c8fcf8
>>>>>     frame #8: 0x0000000107c98605
>>>>>     frame #9: 0x0000000107c8bbfe
>>>>>     frame #10: 0x0000000107c1100e
>>>>>     frame #11: 0x0000000107c8c4ad
>>>>>     frame #12: 0x0000000107c0cef6
>>>>>     frame #13: 0x0000000107c8c2a5
>>>>>     frame #14: 0x0000000107c8bb4c
>>>>>     frame #15: 0x0000000107c0cef6
>>>>>     frame #16: 0x0000000107c8ba05
>>>>>     frame #17: 0x0000000107c8b85b
>>>>>     frame #18: 0x0000000107c1100e
>>>>>     frame #19: 0x0000000107c8b7e1
>>>>>     frame #20: 0x0000000107c57018
>>>>>     frame #21: 0x0000000107c018c8
>>>>> (lldb) call printCallStack()
>>>>>
>>>>> Watchpoint 34 hit:
>>>>> old value: -3689348814741910456
>>>>> new value: 4435376368
>>>>> error: Execution was interrupted, reason: watchpoint 34.
>>>>> The process has been returned to the state before expression
>>>>> evaluation.
>>>>> (lldb) call printCogMethodFor((void*)0x0000000107c10699)
>>>>>
>>>>> Watchpoint 34 hit:
>>>>> old value: 4435376368
>>>>> new value: 0
>>>>> error: Execution was interrupted, reason: watchpoint 34.
>>>>> The process has been returned to the state before expression
>>>>> evaluation.
>>>>>
>>>>> I don't know how to proceed, but maybe there's enough clue for you...
>>>>>
>>>>> Le lun. 30 sept. 2019 à 01:28, Eliot Miranda <eliot.miranda at gmail.com>
>>>>> a écrit :
>>>>>
>>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> On Wed, Sep 25, 2019 at 12:32 AM Nicolas Cellier <
>>>>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> I noticed that many crash happen in testSendTimeout.
>>>>>>> This test is organizing a race:
>>>>>>> - Smalltalk fill the socket send buffer
>>>>>>> - the OS tries to drain
>>>>>>>
>>>>>>> The other thing I noticed is that [SocketTest suite run] takes 7 to
>>>>>>> 8s on macos while only 2 to 3s on ubuntu and windows
>>>>>>> So it might be that some test times out on macos, while it doesn't
>>>>>>> on other OS.
>>>>>>>
>>>>>>> This would explain that many crash also happen in JITted Timer loop,
>>>>>>> and that we cannot observe it in other OSes.
>>>>>>>
>>>>>>
>>>>>> Yes, that is, I think, an important clue.  The other one is that,
>>>>>> given that the crash always happens only early in a run, plus the address
>>>>>> at which it crashes in machine code, suggests to me that the crash occurs
>>>>>> when the interpreter invokes a block in the JITted startTimerLoop.
>>>>>>
>>>>>> What worries me is that the JITted code for startTimerLoop looks
>>>>>> completely corrupted, not merely moved or overwritten (the method header
>>>>>> looks OK up to the selector field, which is the last field in the header).
>>>>>> So I'm wondering whether it could be a bug in support code overwriting
>>>>>> machine code, and hence that not might be related to a 32-bit vs 64-bit
>>>>>> code issue.  I was hoping your recent sqUnixSocket.c changes might work but
>>>>>> I'm sure they have no bearing.
>>>>>>
>>>>>> The access to OS ressources and the race may also explain some
>>>>>>> randomization of the crash...
>>>>>>>
>>>>>>> So one idea would be to make the test timeout in linux too, see if
>>>>>>> we can make it crash then try using rr.
>>>>>>> Maybe it's possible with a huge Socket buffer, a smaller image side
>>>>>>> buffer (we could reduce the size from 1000 to 1 so as to increase overhead).
>>>>>>>
>>>>>>
>>>>>> Yes, that's a good idea.  But I do think we need to set up a reliable
>>>>>> remote server, som thing that acts like an infinite sink for receiving a
>>>>>> par4ticular large Monticello package, so that ev very time the example runs
>>>>>> the same package is indeed copied to a remote server.  Would it be possible
>>>>>> to construct such a fake Monticello server?
>>>>>>
>>>>>> What I'm wanting the server to do is
>>>>>> a) take a while to respond to the Monticello package contents request
>>>>>> b) to never include a particular package which the example would
>>>>>> always try and upload
>>>>>>
>>>>>> The test case would be to copy a particular version of VMMaker from
>>>>>> package-cache to the remote "fake" server
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> It's just theories I tried to construct while biking, not in front
>>>>>>> of a keyboard, and I won't be able to verify until this evening...
>>>>>>> But the watchpoint strategy is too much vain shooting in the dark,
>>>>>>> we need more ellaborated plans ;)
>>>>>>>
>>>>>>
>>>>>> Does anyone see the crash on macOS in 32 bits?
>>>>>>
>>>>>> Le mar. 24 sept. 2019 à 00:28, Nicolas Cellier <
>>>>>>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Le mar. 24 sept. 2019 à 00:25, Nicolas Cellier <
>>>>>>>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>>>>>>>
>>>>>>>>> Follow up: so we have a problem of repeatability...
>>>>>>>>> I found nothing like rr for macos, and CPU watchpoint capability
>>>>>>>>> is too restricted to be usefull.
>>>>>>>>>
>>>>>>>>> So what I did is save a crash.image attempting to crash at
>>>>>>>>> resumption time:
>>>>>>>>>     Smalltalk saveAs: 'crash'.
>>>>>>>>>     Smalltalk snapshot: true andQuit: true.
>>>>>>>>>     5 timesRepeat: [SocketTest suite run].
>>>>>>>>>
>>>>>>>>> Then I launched the vm under lldb and gathered $pc stats to see if
>>>>>>>>> I could observe some reproducibility
>>>>>>>>> (lldb) p/x $pc - methodZoneBase
>>>>>>>>> 0x00000000000a0e94
>>>>>>>>> 0x0000000000090f70
>>>>>>>>> 0x000000000009f824
>>>>>>>>> 0x000000000009bfc8
>>>>>>>>> 0x000000000009d030
>>>>>>>>> 0x000000000009c7c0
>>>>>>>>> 0x00000000000951e0
>>>>>>>>> 0x00000000000949c8 <- once
>>>>>>>>> 0x0000000000094418
>>>>>>>>> 0x00000000000949c8 <— twice
>>>>>>>>> 0x000000000009b848
>>>>>>>>> 0x00000000000a0e94 <- twice
>>>>>>>>> 0x000000000009a176
>>>>>>>>> 0x000000006dfff2e0 ???
>>>>>>>>> 0x00000000000996d1
>>>>>>>>> 0x00000000000949c8 <- thrice good candidate for watchpoint set
>>>>>>>>> 0x000000000009ae70
>>>>>>>>> 0x000000000009ae70
>>>>>>>>> 0x00000000000a18e7
>>>>>>>>> 0x0000000000095470
>>>>>>>>> 0x000000000009c458
>>>>>>>>> 0x00000000000a0a7c
>>>>>>>>> 0x0000000000093e89
>>>>>>>>> 0x000000000009beb0
>>>>>>>>> 0x0000000000095a37
>>>>>>>>> 0x0000000000095408
>>>>>>>>> 0x00000000000a06bc
>>>>>>>>> 0x000000000009cd48
>>>>>>>>> 0x0000000000095408 <- twice
>>>>>>>>> 0x000000000009ba41
>>>>>>>>> 0x0000000000095408 <- thrice new candidate for watchpoint set
>>>>>>>>> 0x0000000000045258   <— smallest ever
>>>>>>>>> 0x000000000009c7c0
>>>>>>>>> 0x0000000000095408 <- fourth ! the watch point was set here, why
>>>>>>>>> didn’t it stop??? argh methodZoneBase was incremented after I set the
>>>>>>>>> watchpoint!!!
>>>>>>>>> 0x00000000000a12dc
>>>>>>>>> 0x00000000000951e0
>>>>>>>>> 0x00000000000a0e24
>>>>>>>>> 0x000000000009ff8c
>>>>>>>>> 0x000000000009c7f6
>>>>>>>>> 0x000000000009a626
>>>>>>>>>
>>>>>>>>> Then I have chosen to watch the most frequent crash address,
>>>>>>>>> that is methodZoneBase+0x0000000000095408,
>>>>>>>>> Unfortunately, s commented above, I missed one (I set the
>>>>>>>>> watchpoint too early, before methodZoneBase took its definitive address).
>>>>>>>>>
>>>>>>>>> And after a few (a bunch of)  trials, here is what I found:
>>>>>>>>>
>>>>>>>>> (lldb) run ../../image/crash.image
>>>>>>>>> There is a running process, kill it and restart?: [Y/n]
>>>>>>>>> Process 28175 exited with status = 9 (0x00000009)
>>>>>>>>> Process 28185 launched:
>>>>>>>>> '/Users/nicolas/Smalltalk/OpenSmalltalk/opensmalltalk-vm/build.macos64x64/squeak.cog.spur/Squeak.app/Contents/MacOS/Squeak'
>>>>>>>>> (x86_64)
>>>>>>>>> Squeak was compiled with optimization - stepping may behave oddly;
>>>>>>>>> variables may not be available.
>>>>>>>>> Process 28185 stopped
>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>>>>>> breakpoint 6.1
>>>>>>>>>     frame #0: 0x000000010007bc22 Squeak`-[sqSqueakOSXApplication
>>>>>>>>> doHeadlessSetup](self=0x000000010066fc00, _cmd="doHeadlessSetup") at
>>>>>>>>> sqSqueakOSXApplication.m:125 [opt]
>>>>>>>>>    122 //No super call here as we've already set headless from
>>>>>>>>> command line or info.plist
>>>>>>>>>    123 extern BOOL gSqueakHeadless;
>>>>>>>>>    124    // Notice that setActivationPolicy: is available in OSX
>>>>>>>>> 10.6 and later
>>>>>>>>> -> 125    if ([NSApp respondsToSelector:@selector(setActivationPolicy:)])
>>>>>>>>> {
>>>>>>>>>    126        if (gSqueakHeadless) {
>>>>>>>>>    127            [NSApp
>>>>>>>>> setActivationPolicy:NSApplicationActivationPolicyProhibited];
>>>>>>>>>    128        } else {
>>>>>>>>> Target 0: (Squeak) stopped.
>>>>>>>>> (lldb) wa s e -s 1 -- methodZoneBase+0x0000000000095408
>>>>>>>>> Watchpoint created: Watchpoint 33: addr = 0x107c58138 size = 1
>>>>>>>>> state = enabled type = w
>>>>>>>>>     new value: -3689348814741910324
>>>>>>>>> (lldb) cont
>>>>>>>>> Process 28185 resuming
>>>>>>>>>
>>>>>>>>> Watchpoint 33 hit:
>>>>>>>>> old value: -3689348814741910324
>>>>>>>>> new value: -3689348814741910836
>>>>>>>>> Process 28185 stopped
>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>>>>>> watchpoint 33
>>>>>>>>>     frame #0: 0x00000001000731dc
>>>>>>>>> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
>>>>>>>>> selector=4435118304) at cogitX64SysV.c:6320 [opt]
>>>>>>>>>    6317    sqInt rawHeader;
>>>>>>>>>    6318
>>>>>>>>>    6319 (method->cmType = CMMethod);
>>>>>>>>> -> 6320 (method->objectHeader = nullHeaderForMachineCodeMethod());
>>>>>>>>>    6321 (method->blockSize = size);
>>>>>>>>>    6322 (method->methodObject = methodObj);
>>>>>>>>>    6323
>>>>>>>>> Target 0: (Squeak) stopped.
>>>>>>>>> (lldb)
>>>>>>>>> Process 28185 resuming
>>>>>>>>>
>>>>>>>>> Watchpoint 33 hit:
>>>>>>>>> old value: -3689348814741910836
>>>>>>>>> new value: -3689571654825360894
>>>>>>>>> Process 28185 stopped
>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>>>>>> watchpoint 33
>>>>>>>>>     frame #0: 0x0000000100073255
>>>>>>>>> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
>>>>>>>>> selector=4435118304) at cogitX64SysV.c:6339 [opt]
>>>>>>>>>    6336 (method->selector = selector);
>>>>>>>>>    6337 (method->cmNumArgs =
>>>>>>>>> argumentCountOfMethodHeader(methodHeader));
>>>>>>>>>    6338 (method->cmHasMovableLiteral = hasMovableLiteral);
>>>>>>>>> -> 6339 if ((method->cmRefersToYoung = hasYoungReferent)) {
>>>>>>>>>    6340 addToYoungReferrers(method);
>>>>>>>>>    6341 }
>>>>>>>>>    6342 (method->cmUsageCount = initialMethodUsageCount());
>>>>>>>>> Target 0: (Squeak) stopped.
>>>>>>>>> (lldb) bt
>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>>>>>> watchpoint 33
>>>>>>>>>   * frame #0: 0x0000000100073255
>>>>>>>>> Squeak`fillInMethodHeadersizeselector(method=0x0000000107c58130, size=544,
>>>>>>>>> selector=4435118304) at cogitX64SysV.c:6339 [opt]
>>>>>>>>>     frame #1: 0x000000010004f90c Squeak`compileCogMethod [inlined]
>>>>>>>>> generateCogMethod(selector=<unavailable>) at cogitX64SysV.c:7098 [opt]
>>>>>>>>>     frame #2: 0x000000010004f7a5
>>>>>>>>> Squeak`compileCogMethod(selector=4435118304) at cogitX64SysV.c:27780 [opt]
>>>>>>>>>     frame #3: 0x000000010004bc54
>>>>>>>>> Squeak`cogselector(aMethodObj=4468481712, aSelectorOop=4435118304) at
>>>>>>>>> cogitX64SysV.c:4833 [opt]
>>>>>>>>>     frame #4: 0x0000000100013d4d Squeak`primitiveClosureValue at
>>>>>>>>> gcc3x-cointerp.c:27237 [opt]
>>>>>>>>>     frame #5: 0x0000000107bd1925
>>>>>>>>>     frame #6: 0x000000010000236b Squeak`interpret at
>>>>>>>>> gcc3x-cointerp.c:2767 [opt]
>>>>>>>>>     frame #7: 0x000000010008ed55 Squeak`-[sqSqueakMainApplication
>>>>>>>>> runSqueak](self=0x000000010066fc00, _cmd=<unavailable>) at
>>>>>>>>> sqSqueakMainApplication.m:201 [opt]
>>>>>>>>>     frame #8: 0x00007fff44b077b8
>>>>>>>>> Foundation`__NSFirePerformWithOrder + 360
>>>>>>>>>     frame #9: 0x00007fff4297cf57
>>>>>>>>> CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
>>>>>>>>> + 23
>>>>>>>>>     frame #10: 0x00007fff4297ce7f
>>>>>>>>> CoreFoundation`__CFRunLoopDoObservers + 527
>>>>>>>>>     frame #11: 0x00007fff4295f3f8 CoreFoundation`__CFRunLoopRun +
>>>>>>>>> 1240
>>>>>>>>>     frame #12: 0x00007fff4295ec93
>>>>>>>>> CoreFoundation`CFRunLoopRunSpecific + 483
>>>>>>>>>     frame #13: 0x00007fff41c49d96
>>>>>>>>> HIToolbox`RunCurrentEventLoopInMode + 286
>>>>>>>>>     frame #14: 0x00007fff41c49a0f HIToolbox`ReceiveNextEventCommon
>>>>>>>>> + 366
>>>>>>>>>     frame #15: 0x00007fff41c49884
>>>>>>>>> HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 64
>>>>>>>>>     frame #16: 0x00007fff3fef9a73 AppKit`_DPSNextEvent + 2085
>>>>>>>>>     frame #17: 0x00007fff4068fe34 AppKit`-[NSApplication(NSEvent)
>>>>>>>>> _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044
>>>>>>>>>     frame #18: 0x00007fff3feee885 AppKit`-[NSApplication run] + 764
>>>>>>>>>     frame #19: 0x00007fff3febda72 AppKit`NSApplicationMain + 804
>>>>>>>>>     frame #20: 0x00007fff6a88c015 libdyld.dylib`start + 1
>>>>>>>>> (lldb)
>>>>>>>>>
>>>>>>>>> Here I don't know how to proceed, because I have no knowledge of
>>>>>>>>> this part of the VM, it's also quite late now...
>>>>>>>>> Eliot, does it ring some bell?
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Ah, forget it, I thought it was overwriting, but it is the first
>>>>>>>> write on yet virgin methodeZone...
>>>>>>>> (it's just that the scenario was different from all the previous
>>>>>>>> run)
>>>>>>>>
>>>>>>>> Le dim. 22 sept. 2019 à 11:21, Jan Vrany <jan.vrany at fit.cvut.cz> a
>>>>>>>>> écrit :
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019-09-22 10:57, Nicolas Cellier wrote:
>>>>>>>>>> > Hi Jan,
>>>>>>>>>> > Thanks, a recording debugger would indeed be a tremendous help!
>>>>>>>>>> > Unfortunately, I'm unable to reproduce the bug under linux, only
>>>>>>>>>> > macosx...
>>>>>>>>>> > Do you know something equivalent to rr for mac?
>>>>>>>>>>
>>>>>>>>>> No, I don't. There might be something though, I never had a mac...
>>>>>>>>>>
>>>>>>>>>> Jan
>>>>>>>>>>
>>>>>>>>>> >
>>>>>>>>>> > Le dim. 22 sept. 2019 à 09:55, Jan Vrany <jan.vrany at fit.cvut.cz>
>>>>>>>>>> a
>>>>>>>>>> > écrit :
>>>>>>>>>> >
>>>>>>>>>> >> Hi,
>>>>>>>>>> >>
>>>>>>>>>> >>>
>>>>>>>>>> >>> - can anyone think of any other strategies I might take to
>>>>>>>>>> try and
>>>>>>>>>> >>> reproduce this?
>>>>>>>>>> >>>
>>>>>>>>>> >> at the risk of stating the obvious, in cases similar to this
>>>>>>>>>> rr (+
>>>>>>>>>> >> VDB)
>>>>>>>>>> >> helped
>>>>>>>>>> >> me many times. See https://www.youtube.com/watch?v=Qbu4ED8wPYE
>>>>>>>>>> >>
>>>>>>>>>> >> Never used with opensmalltalk VM, though.
>>>>>>>>>> >>
>>>>>>>>>> >> HTH, Jan
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> _,,,^..^,,,_
>>>>>> best, Eliot
>>>>>>
>>>>>

-- 
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20191004/039bef04/attachment-0001.html>


More information about the Vm-dev mailing list