[squeak-dev] Difficult to debug VM crash with full blocks and Sista V1

Nicola Mingotti nmingotti at gmail.com
Sat Sep 14 04:03:30 UTC 2019


I can help a bit, but only on this point:
"- is there a way of introducing network delays in Mac OS that might 
help me induce the bug?"

Yes, in theory it is possible. Some time ago I read the documentation of 
'dummynet', the traffic shaper for the 'ipfw' firewall in FreeBSD. It seemed 
very interesting, but I never had occasion to use it.

Now, Apple's Unix is in large part taken from FreeBSD, so I checked whether 
they also took dummynet:
macOS> apropos dummynet
dummynet(4) ....

So, yes, it is there.
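
If configuring dummynet turns out to be inconvenient, a userspace alternative 
is a small TCP proxy that sleeps before forwarding each chunk of data. The 
sketch below is only an illustration of that idea, not something I have run 
against the Monticello server; the names start_delay_proxy, _handle and _pump 
are made up for this example, and the delay value is an arbitrary assumption.

```python
import socket
import threading
import time


def _pump(src, dst, delay_s):
    """Copy bytes from src to dst, sleeping delay_s before forwarding each chunk."""
    try:
        while True:
            data = src.recv(4096)
            if not data:          # peer closed its sending side
                break
            time.sleep(delay_s)   # the artificial network delay
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # propagate end-of-stream
        except OSError:
            pass


def _handle(client, target, delay_s):
    """Connect to the real server and relay traffic in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    with client, upstream:
        pumps = [
            threading.Thread(target=_pump, args=(a, b, delay_s), daemon=True)
            for a, b in ((client, upstream), (upstream, client))
        ]
        for t in pumps:
            t.start()
        for t in pumps:
            t.join()


def start_delay_proxy(target_host, target_port, delay_s, listen_port=0):
    """Start a background proxy on 127.0.0.1 and return the port it listens on."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", listen_port))
    listener.listen()

    def accept_loop():
        while True:
            client, _ = listener.accept()
            threading.Thread(
                target=_handle,
                args=(client, (target_host, target_port), delay_s),
                daemon=True,
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener.getsockname()[1]
```

For example, start_delay_proxy("source.squeak.org", 80, 5.0, 8080) would 
listen on 127.0.0.1:8080 and hold every chunk for five seconds in each 
direction. Note that an HTTP client pointed at the proxy sends a different 
Host header than one talking to the server directly, which some servers care 
about, and that the daemon threads stop when the Python process exits.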

HTH

bye
Nicola









On 9/13/19 8:15 PM, Eliot Miranda wrote:
> Hi All,
>
>     there is a VM bug in 64-bit Spur with the Sista V1 bytecode set 
> and full blocks.  The symptom is that when waiting for a remote 
> Monticello repository to update and/or deliver a package version the 
> system crashes in JITTED code after what appears to be some kind of wait.
>
> This is a reliably occurring bug but maddeningly difficult to 
> reproduce.  The bug reliably occurs when interacting with a remote 
> repository (e.g. http://source.squeak.org/VMMaker) when the server is 
> "cold", and hence makes the image wait.  Every time I have tried to 
> repeat the failing sequence the crash has not occurred, I think 
> because the server is now "hot" and serves up the version quickly.  
> Today I even tried shutting down my machine for over an hour and 
> rebooting.  But I could not get the crash to occur even though it 
> seems to me that every time I try it for the first time in the day it 
> does crash.
>
> This is an important bug to fix.  If it cannot be fixed then full 
> blocks and Sista V1 are not ready for use in the upcoming Squeak 
> release.  I am looking for help in debugging this.
>
> - is anyone else using the 64-bit VM with full blocks and Sista V1 
> who sees hard VM crashes?  If so, under what circumstances?
>
> - is it possible to flush caches in the 
> http://source.squeak.org/VMMaker server, or could people tolerate me 
> rebooting the server?
>
> - is there a way of introducing network delays in Mac OS that might 
> help me induce the bug?
>
> - can anyone think of any other strategies I might take to try and 
> reproduce this?
>
> I may have to try and reproduce the bug in the simulator to have a 
> chance of identifying the bug. Does anyone have a good enough mental 
> model of the Monticello server interaction and have energy to help me 
> figure this one out?
>
> Here is some information from the last crash I did see in the debugger 
> (alas it is incomplete; there are a number of additional pieces of 
> info I could have collected).
>
> (lldb) thr b
>
> * thread #1, queue = 'com.apple.main-thread', stop reason = 
> EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
>
> * frame #0: 0x000000010de5700a
>
>   frame #1: 0x000000010dd7b174
>
>   frame #2: 0x000000010dd45f1c
>
>   frame #3: 0x000000010dd44534
>
>   frame #4: 0x000000010dd44c60
>
> (lldb) x/10i 0x000000010de5700a
>
>
> (lldb) call printStackCallStackOf($rbp)
>
>   0x7ffeefbdfc30 M Heap>upHeap: 0x11273ca90: a(n) Heap
>
>   0x7ffeefbdfc68 M Heap>add: 0x11273ca90: a(n) Heap
>
>   0x7ffeefbdfca0 M Delay class>scheduleDelay:from: 0x1123ebfb8: a(n) 
> Delay class
>
>   0x7ffeefbdfcf0 M Delay class>handleTimerEvent 0x1123ebfb8: a(n) 
> Delay class
>
>   0x7ffeefbdfd20 M Delay class>runTimerEventLoop 0x1123ebfb8: a(n) 
> Delay class
>
>
> (lldb) x/10i 0x000000010dd7b174
>
>   0x10dd7b174: 48 8b 55 10  movq   0x10(%rbp), %rdx
>
>   0x10dd7b178: 48 89 ec     movq   %rbp, %rsp
>
>   0x10dd7b17b: 5d           popq   %rbp
>
>   0x10dd7b17c: c2 10 00     retq   $0x10
>
>   0x10dd7b17f: cc           int3
>
>   0x10dd7b180: cc           int3
>
>   0x10dd7b181: cc           int3
>
>   0x10dd7b182: cc           int3
>
>   0x10dd7b183: cc           int3
>
>   0x10dd7b184: cc           int3
>
> (lldb) print whereIs(0x000000010dd7b174)
>
> (char *) $0 = 0x00000001000f83ff " is in generated methods"
>
> (lldb) call printCogMethodFor((void *)0x000000010dd7b174)
>
>     0x10dd7afc0 <->        0x10dd7b198: method:        0x112f23c10 
> selector: 0x112232c20 add:
>
> (lldb) print whereIs(0x000000010de5700a)
>
> (char *) $1 = 0x00000001000f83ff " is in generated methods"
>
> (lldb) call printCogMethodFor((void *)0x000000010de5700a)
>
>     0x10de56ba0 <->        0x10de57078: method:        0x1126ec218 
> prim 23856 selector: 0x7ffeefbf3d20
>
>
> this method ends up being the jitted version of Delay 
> class>>startTimerEventLoop
> _,,,^..^,,,_
> best, Eliot
>
