<div dir="ltr"><div>Hi Eliot,</div><div>thanks for the heads up. Also remember that I was not able to produce such a crash on Windows or Linux spur64, and I don't remember anyone reporting it...<br></div><div>This makes it not obvious that we can produce it in the Simulator, which is yet another machine...</div><div>I also never saw such a problem in spur32.</div><div><br></div><div>On the low-level side (lldb/gdb), it is possible to set a watchpoint on this code zone:</div><div>watchpoint set expression --size 10 -- <font face="arial, sans-serif">0x10cefe70a</font></div><div><font face="arial, sans-serif">w s e -s 2 -- <font face="arial, sans-serif">0x10cefe70a</font></font></div><div><font face="arial, sans-serif"><font face="arial, sans-serif">I don't know if this is feasible (is there any compaction in the cog method zone? is the debugger fast enough with the instrumentation?)<br></font></font></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 21:38, Eliot Miranda <<a href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><font face="arial, sans-serif">Hi All,<br><br> two further reports. One is that attempting to simulate a SocketTest suite run with a small code zone to provoke code compactions did not produce a crash. Sad, because I left it running overnight. There's still more work to do with simulation here though because the simulation is far from perfect; only two out of 14 tests pass. 
At least they run ;-)<br><br>14 run, 2 passes, 0 expected failures, 6 failures, 6 errors, 0 unexpected passes<br>squeak> <br><br>The other report is that I have much more information on the nature of the crash and reason to suspect something to do with code zone management. The JITted version of the Delay class>>#startTimerEventLoop method is badly corrupted from its last field on:<br><br>(lldb) call printCogMethodFor((void *)0x10cefe70a)<br> 0x10cefe288 <-> 0x10cefe760: method: 0x1117a96a8 selector: 0x7fff58e9af6d " is no where obvious"<br><br>*(CogMethod *)0x10cefe288<br>(CogMethod) $14 = {<br> objectHeader = 0x008000000a000035<br> cmNumArgs = 0<br> cmType = 2 CMMethod<br> cmRefersToYoung = false<br> cpicHasMNUCaseOrCMIsFullBlock = false<br> cmUsageCount = 1<br> cmUsesPenultimateLit = false<br> cbUsesInstVars = false<br> cmHasMovableLiteral = true<br> cmUnusedFlag = 0x00000001<br> stackCheckOffset = 116/0x74<br> blockSize = 1240/0x4d8<br> blockEntryOffset = 1154/0x482 (1149/0x479 in simulator)<br> methodObject = 0x1117a96a8<br> methodHeader = 0x00000000000000b1<br> selector = 0x00007fff58e9af6d<br>}<br><br>Everything is fine up to the selector field, which is well out of range of the heap (an odd value ending in 16rD) and is followed by garbage instructions:<br><br>(lldb) print/x (CogMethod *)0x10cefe288 + 1<br>(CogMethod *) $24 = 0x000000010cefe2b0<br>(lldb) x/10i 0x000000010cefe2b0<br> 0x10cefe2b0: d0 0c bd ef fe 7f 00 rorb 0x7ffeef(,%rdi,4)<br> 0x10cefe2b7: 00 48 89 addb %cl, -0x77(%rax)<br> 0x10cefe2ba: d0 48 83 rorb -0x7d(%rax)<br> 0x10cefe2bd: e0 07 loopne 0x10cefe2c6<br> 0x10cefe2bf: 75 09 jne 0x10cefe2ca<br> 0x10cefe2c1: 48 8b 02 movq (%rdx), %rax<br> 0x10cefe2c4: 48 25 ff ff 3f 00 andq $0x3fffff, %rax ; imm = 0x3FFFFF<br> 0x10cefe2ca: 48 39 c8 cmpq %rcx, %rax<br> 0x10cefe2cd: 75 e4 jne 0x10cefe2b3<br> 0x10cefe2cf: 4c 8b de movq %rsi, %r11<br><br>The crash occurs, I believe, when the sort block for SuspendedDelays created in Delay 
class>>#startTimerEventLoop is activated via the code in CoInterpreter>>#executeCogBlock:closure:mayContextSwitch: which jumps to the blockEntry code for the method (code that dispatches between the two blocks in <span style="color:rgb(0,0,0)">startTimerEventLoop). This dispatch code is also corrupted:</span></font></div><div dir="ltr"><font face="arial, sans-serif"><span style="color:rgb(0,0,0)"><br></span></font></div><div dir="ltr"><div dir="ltr"><font face="arial, sans-serif">(lldb) print/x 0x10cefe288 + 0x482 (CogMethod for <span style="color:rgb(0,0,0)">startTimerEventLoop + its blockEntryOffset)</span></font></div><div dir="ltr"><font face="arial, sans-serif">(long) $36 = 0x000000010cefe70a</font></div><div><font face="arial, sans-serif"><br></font></div></div><div dir="ltr"><div dir="ltr"><font face="arial, sans-serif">(lldb) x/10i 0x10cefe70a</font></div><div dir="ltr"><font face="arial, sans-serif"> 0x10cefe70a: e7 ef outl %eax, $0xef</font></div><div dir="ltr"><font face="arial, sans-serif"> 0x10cefe70c: 0c 01 orb $0x1, %al</font></div><div dir="ltr"><font face="arial, sans-serif"> 0x10cefe70e: 00 00 addb %al, (%rax)</font></div><div dir="ltr"><font face="arial, sans-serif"> 0x10cefe710: 00 00 addb %al, (%rax)</font></div><div dir="ltr"><font face="arial, sans-serif"> 0x10cefe712: 00 00 addb %al, (%rax)</font></div><div><font face="arial, sans-serif"><br></font></div></div><div dir="ltr"><font face="arial, sans-serif" color="#000000"><span>whereas it should look like</span></font></div><div dir="ltr"><div dir="ltr"><font face="arial, sans-serif">00002c74: xorq %r9, %r9 : 4D 31 C9 </font></div><div dir="ltr"><font face="arial, sans-serif">00002c77: jmp .+0x3 (0x2c7c=startTimerEventLoop@47C) : EB 03 </font></div><div dir="ltr"><font face="arial, sans-serif">blockEntry:</font></div><div dir="ltr"><font face="arial, sans-serif">00002c79: movq %rdx, %r9 : 49 89 D1 </font></div><div dir="ltr"><font face="arial, sans-serif">00002c7c: movq methodDict@16(%rdx), 
%rax : 48 8B 42 10 </font></div><div dir="ltr"><font face="arial, sans-serif">00002c80: cmpq $0x661, %rax : 48 3D 61 06 00 00 </font></div><div dir="ltr"><font face="arial, sans-serif">00002c86: jle .-0xE4 (0x2ba8=startTimerEventLoop@3A8) : 0F 8E 1C FF FF FF </font></div><div dir="ltr"><font face="arial, sans-serif">block startpc: CB/659</font></div><div dir="ltr"><font face="arial, sans-serif">00002c8c: jmp .-0x66 (0x2c28=startTimerEventLoop@428) : EB 9A </font></div><div dir="ltr"><font face="arial, sans-serif">block startpc: DF/6F9</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">So my current best guess is that the <span style="color:rgb(0,0,0)">JITted version of the Delay class>>#startTimerEventLoop method is corrupted by some bug in code zone compaction. </span></font></div><div><span style="color:rgb(0,0,0)"><font face="arial, sans-serif"><br></font></span></div><div><font face="arial, sans-serif" color="#000000"><span>A second route is indicated by the prim trace log, which strangely shows two stack overflow events, not one, very soon before the crash; this could indicate a bug in the interpreter/JIT control flow on context switch:</span></font></div><div><font face="arial, sans-serif" color="#000000"><span><br></span></font></div><div><p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">(lldb) call dumpPrimTraceLog()</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">…</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">forceDisplayUpdate</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">findNextUnwindContextUpTo:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">tempAt:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">tempAt:put:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">tempAt:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">terminateTo:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">tempAt:put:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">findNextUnwindContextUpTo:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">terminateTo:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">basicNew:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">primSocketConnectionStatus:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">primSocketSendDone:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">primSocket:sendData:startIndex:count:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">primSocketSendDone:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">primSocketConnectionStatus:</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">**StackOverflow**</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">**StackOverflow**</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">wait</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">signal</font></p>
<p style="margin:0px;font-stretch:normal;line-height:normal"><font face="arial, sans-serif">utcMicrosecondClock</font><span style="color:rgb(0,0,0);font-family:-webkit-standard">(newMethod -> 0x10cec3230 : 0x111f565a8)</span></p><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">This log shows non-jitted primitive invocations and context switch/code zone reclamation events outside of JITted code (for obvious performance reasons we don't record such events in JITted code).</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">So, still more digging to narrow this down, and possibly fixing simulation of the suite is worthwhile. But what I really need is a more reproducible case in the real VM. Sigh :-)</font></div><font face="arial, sans-serif">_,,,^..^,,,_</font></div><font face="arial, sans-serif">best, Eliot</font></div></div></div></div></div></div></div></div></div></div>
</blockquote></div>
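<div>As a sanity check on the addresses in the quoted lldb session, here is a minimal Python sketch that redoes the arithmetic: the method's end address from blockSize, the block entry address from blockEntryOffset, and the header size implied by (CogMethod *)0x10cefe288 + 1. The constants are copied from the session; the helper names are made up for illustration and are not part of the VM or of lldb.</div>

```python
# Values copied from the quoted lldb session.
COG_METHOD = 0x10cefe288         # start of the CogMethod header
BLOCK_SIZE = 0x4d8               # blockSize field (1240)
BLOCK_ENTRY_OFFSET = 0x482       # blockEntryOffset field (1154)
FIRST_INSTRUCTION = 0x10cefe2b0  # result of (CogMethod *)0x10cefe288 + 1


def method_end(start, block_size):
    """Address one past the last byte of the method in the code zone."""
    return start + block_size


def block_entry(start, offset):
    """Address of the block dispatch code inside the method."""
    return start + offset


def header_size(start, first_instruction):
    """Size in bytes of the header that precedes the machine code."""
    return first_instruction - start


def bytes_to_end(addr, start, block_size):
    """Bytes from addr to the end of the method; addr must lie inside it."""
    end = method_end(start, block_size)
    if addr not in range(start, end):
        raise ValueError("address is outside the method")
    return end - addr


if __name__ == "__main__":
    entry = block_entry(COG_METHOD, BLOCK_ENTRY_OFFSET)
    print(hex(method_end(COG_METHOD, BLOCK_SIZE)))
    print(hex(entry))
    print(header_size(COG_METHOD, FIRST_INSTRUCTION))
    print(bytes_to_end(entry, COG_METHOD, BLOCK_SIZE))
```

<div>The results agree with the session: the method spans 0x10cefe288 to 0x10cefe760, the block entry is at 0x10cefe70a, and 86 bytes separate the block entry from the end of the method. An x86-64 hardware watchpoint covers at most 8 bytes, so a single watchpoint at the block entry would only catch writes to the first few instructions of the dispatch code.</div>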