[Vm-dev] A new ready-to-crash image is available

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Sun Feb 9 17:25:29 UTC 2020


So it seems that this second failure, related to a wrong B3D_FACE_ACTIVE
flag, was caused by my own fix.
The bug should have disappeared after
https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/36a1f1e2ef637347ed3b81a2f4cf8df347e4d803

Without a proper understanding, this must be considered a "workaround"
rather than a proper fix.
It means that it cures the symptoms, but maybe not the root cause...
I consider that it's nice to have a working Squeak3D plugin.
But remember that it uses the CPU rather than the GPU, so it should
presumably be superseded by something more up to date.
If only we could properly document the algorithm, that would also avoid
wild guesses, workarounds and incomplete patches...
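
For readers skimming the thread, here is a minimal sketch of the guarded
removal suggested in the quoted message below. It is not the plugin code:
the Face and FillList types and all function names are hypothetical,
simplified stand-ins for a doubly linked fill list, and the guard uses a
plain linear scan.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the plugin's face and fill list. */
typedef struct Face {
    struct Face *prev, *next;
} Face;

typedef struct FillList {
    Face *first, *last;
} FillList;

/* Linear scan: returns 1 if face is linked into list, 0 otherwise. */
static int is_in_fill_list(const FillList *list, const Face *face) {
    for (const Face *f = list->first; f != NULL; f = f->next)
        if (f == face) return 1;
    return 0;
}

/* Link face at the head of the list. */
static void add_front_fill(FillList *list, Face *face) {
    face->prev = NULL;
    face->next = list->first;
    if (list->first) list->first->prev = face;
    else list->last = face;
    list->first = face;
}

/* Unlink face; only valid if face really is in the list. */
static void remove_fill(FillList *list, Face *face) {
    if (face->prev) face->prev->next = face->next;
    else list->first = face->next;
    if (face->next) face->next->prev = face->prev;
    else list->last = face->prev;
    face->prev = face->next = NULL;
}

/* Guarded "move to front": unlink only when the face is present,
   so an absent face can no longer corrupt the list. */
static void move_to_front(FillList *list, Face *face) {
    if (is_in_fill_list(list, face))
        remove_fill(list, face);
    add_front_fill(list, face);
}
```

The guard costs a list traversal per call and only hides the broken
invariant, which is why it counts as a workaround rather than a fix.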

On Sun, Feb 9, 2020 at 12:38, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

> With instrumentation, I see another instance of removal of an absent face
> from the fill list, this time in b3dToggleTopFills.
> The logic here seems to be that we expect a face flagged B3D_FACE_ACTIVE
> to be on the fill list, and we toggle both.
> So there is another broken invariant.
>
> The poor man's correction is just a correction of symptoms, not of the
> root cause.
> I'd prefer the latter if ever we can...
>
> On Sun, Feb 9, 2020 at 11:20, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
>
>> I have instrumented the fill list machinery a bit more, and here is a
>> logic error I caught:
>>
>>   * frame #0: 0x14ad0fee Squeak3D`b3dAbort(msg="Trying to remove a face
>> not in fillList") at b3dMain.c:87:2 [opt]
>>     frame #1: 0x14ae774c Squeak3D`b3dRemoveFill(fillList=0x06852cc8,
>> aFace=0x068175a8) at b3dMain.c:938:54 [opt]
>>     frame #2: 0x14aecbe2 Squeak3D`b3dMainLoop(state=0x14b195ac,
>> stopReason=0) at b3dMain.c:1379:7 [opt]
>>     frame #3: 0x14aa5b43 Squeak3D`b3dStartRasterizer at
>> Squeak3D.c:1701:12 [opt]
>>
>> If we remove a face which is not in the list, then we are going to
>> corrupt the fill list...
>> Where does that happen?
>>
>>    1376 if(leftEdge == lastIntersection) {
>>    1377 /* Special case if this is an intersection edge */
>>    1378 assert(fillList->firstFace == leftEdge->leftFace);
>> -> 1379 b3dRemoveFill(fillList, leftEdge->rightFace);
>>    1380 b3dAddFrontFill(fillList, leftEdge->rightFace);
>>    1381 } else {
>>
>> Ah, a special case of intersection edge...
>> Why would the rightFace already be, or not be, in the fillList?
>> Is it really a loop invariant?
>> Hmm, hard to answer without a deeper understanding of the whole loop...
>> I don't even have an idea of what a left/right/top face is, so no
>> semantic clue.
>>
>> What I suggest as a poor man's correction is to protect the removal with
>> an if(b3dIsInFillList(fillList,rightFace)) condition...
>>
>> On Sat, Feb 8, 2020 at 21:54, Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>
>>>
>>>
>>> On Sat, Feb 8, 2020 at 21:45, Nicolas Cellier <
>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Sat, Feb 8, 2020 at 01:35, Stéphane Rollandin <
>>>> lecteur at zogotounga.net> wrote:
>>>>
>>>>>
>>>>> > Why only with fast VM? It might be yet another case of Undefined
>>>>> > Behavior (UB)...
>>>>> > I have thus recompiled the VM with UB sanitizer, and there is indeed
>>>>> > some UB reported:
>>>>> >
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dMain.c:1252:29: runtime error: left shift of negative value -760
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dMain.c:1254:25: runtime error: left shift of negative value -751
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dDraw.c:317:33: runtime error: left shift of negative value -802
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dDraw.c:318:33: runtime error: left shift of negative value -802
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dDraw.c:316:33: runtime error: left shift of negative value -114
>>>>> > ../../platforms/Cross/plugins/Squeak3D/b3dMain.c:829:61: runtime error: left shift of negative value -2
>>>>> >
>>>>> > Though, the instrumented fast VM does not fail...
>>>>> > It might be that some aggressive optimizations assuming the absence
>>>>> > of UB do not occur with all the instrumentation stuff embedded...
>>>>>
>>>>> This is very dark magic.
>>>>>
>>>>> > IMO, declaring a left shift of negative int UB is sort of FOOLISH.
>>>>>
>>>>> Tell me where to vote and I'll vote for you.
>>>>>
>>>> You cannot yet vote for opinions, except on some social networks ;)
>>>>
>>>>> > We will have to protect each and every left shift in b3d with a
>>>>> > cast...
>>>>>
>>>>> To see a good side in this, stumbling upon this kind of error at this
>>>>> point must mean the 3D code in itself is quite sound. Indeed I had
>>>>> only a couple of similar crashes for hours of testing (well, playing).
>>>>>
>>>>> What I saw also a couple times, and which is more difficult to report,
>>>>> is the VM hanging at 100% CPU on its core and having to be killed
>>>>> externally. Could it be the same nasal demons at work?
>>>>>
>>>> Hard to say...
>>>> I think that you can send a SIGUSR1 to dump stacks, or attach a
>>>> debugger to the running VM....
>>>>
>>>> Unfortunately, I also had another crash:
>>>>
>>>> ../../platforms/Cross/plugins/Squeak3D/b3dMain.c:954:43: runtime error:
>>>> member access within null pointer of type 'B3DPrimitiveFace' (aka 'struct
>>>> B3DPrimitiveFace')
>>>>
>>>> Segmentation fault Sat Feb  8 21:19:37 2020
>>>>
>>>> VM: 202002050212 nicolas at MBP-de-Nicolas
>>>> :Smalltalk/OpenSmalltalk/opensmalltalk-vm
>>>> Date: Tue Feb 4 18:12:07 2020 CommitHash: 0f974af6a
>>>> Plugins: 202002050212 nicolas at MBP-de-Nicolas
>>>> :Smalltalk/OpenSmalltalk/opensmalltalk-vm
>>>>
>>>> C stack backtrace & registers:
>>>> eax 0x00000018 ebx 0x00000000 ecx 0x040835a8 edx 0x00000000
>>>> edi 0x040835a8 esi 0x040835a8 ebp 0xbfeec978 esp 0xbfeec940
>>>> eip 0x0f0584b5
>>>> 0   Squeak3D                            0x0f0584b5 b3dAddFrontFill + 118
>>>> 1   Squeak                              0x00275ea4 reportStackState +
>>>> 870
>>>> 2   Squeak                              0x00276862 sigsegv + 353
>>>> 3   libsystem_platform.dylib            0xa7dffbae _sigtramp + 46
>>>> 4   ???                                 0xffffffff 0x0 + 4294967295
>>>> 5   Squeak3D                            0x0f05a0ee b3dToggleTopFills +
>>>> 604
>>>> 6   Squeak3D                            0x0f05cdc0 b3dMainLoop + 7239
>>>> 7   Squeak3D                            0x0f017adb b3dStartRasterizer +
>>>> 1668
>>>>
>>>
>>> And I see that this one more closely matches your crash.dmp report...
>>> So the negativeInt<<shift was another problem created by my clang
>>> version on OSX.
>>> I ran a few times without a crash, so I thought it was fixed, but your
>>> crash is not yet fixed...
>>>
>>>
>>>>> Stef
>>>>>
>>>>
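
As an illustration of the "protect each left shift with a cast" idea
discussed in the thread above, here is a minimal sketch. The helper name
shl_i32 is hypothetical and is not part of the plugin. It assumes a shift
count below 32 and a two's complement target, where converting the
unsigned result back to int32_t is implementation-defined rather than
undefined and yields the expected bit pattern on common compilers.

```c
#include <stdint.h>

/* Hypothetical helper: shift in the unsigned domain, where the result is
   well defined (modulo 2^32), then convert back to a signed value.
   The caller must ensure shift < 32. */
static inline int32_t shl_i32(int32_t value, unsigned shift) {
    return (int32_t)((uint32_t)value << shift);
}
```

For example, shl_i32(-760, 12) takes the place of -760 << 12, the kind of
expression flagged by the sanitizer (the shift count 12 is only an
assumption for the example). Multiplying by a power of two is another
option when overflow cannot occur.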