On 2/1/06, Cees De Groot <cdegroot@gmail.com> wrote:
On IRC, Jecel Assumpção remarks that the Semaphore is 'at rest' in both cases: excessSignals is 1, where 0 would be expected while the critical section is active. And -1 when two critical sections are active :-)
After more debugging, it turns out that AccessProtect's excessSignals can indeed become > 1 in the application. Whether that is due to the many times we terminate processes inside delays, I don't know. The fact is that in my current image, the application starts up, does a lot of work exercising Delay, and then hits this condition. It must be some sort of race condition.
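To make the invariant concrete, here is a toy, single-threaded model of a counting semaphore used as a mutex. It is written in Python only so it can be run standalone; it is not the Squeak implementation, and the names `ModelSemaphore` and `simulate` are made up for illustration. It assumes a mutex starts with excessSignals = 1 and shows what one stray extra signal does to mutual exclusion.

```python
class ModelSemaphore:
    """Toy single-threaded model of a counting semaphore used as a mutex."""

    def __init__(self, excess_signals=1):
        self.excess_signals = excess_signals  # 1 means "at rest" for a mutex
        self.waiting = 0                      # callers that would be blocked

    def wait(self):
        """Return True if the caller may enter, False if it would block."""
        if self.excess_signals > 0:
            self.excess_signals -= 1
            return True
        self.waiting += 1
        return False

    def signal(self):
        """Release one waiter if there is one, otherwise bank the signal."""
        if self.waiting > 0:
            self.waiting -= 1
        else:
            self.excess_signals += 1


def simulate(stray_signals):
    """Apply some stray signals, then let three callers try to enter.

    Returns (number of callers admitted at once, remaining excess signals).
    """
    mutex = ModelSemaphore()
    for _ in range(stray_signals):
        mutex.signal()  # a signal with no matching wait
    entered = sum(1 for _ in range(3) if mutex.wait())
    return entered, mutex.excess_signals
```

With no stray signal only one caller is admitted. With a single stray signal excessSignals reaches 2 and two callers are admitted together, which is the broken state described above.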
On IRC, Craig suggested I move down to the VM level for debugging. However, I'm not sure how to proceed. I could presumably trap excessSignals becoming 2 on all semaphores, but only some of them are used as mutexes, so that breakpoint would be hit spuriously (unless someone knows how to formulate a breakpoint that only hits on semaphores set up as mutexes...). And then, the culprit could very well be the first signal on the semaphore, while I can only set a breakpoint on the second. Manually checking every breakpoint hit is not feasible: hundreds of Delays are scheduled before the bug hits.
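One possible way around both problems, sketched here in Python rather than as VM or Squeak code, is to instrument only the semaphores known to be mutexes and keep a short history of recent wait/signal events with their callers. When excessSignals exceeds 1, the history then still shows the earlier signal, not just the one that tripped the check. Everything in this sketch (`TracedMutex`, `overflow_history`, the history length) is hypothetical, blocked waits are not modelled, and whether such instrumentation would disturb the race is unknown.

```python
import traceback
from collections import deque


class TracedMutex:
    """Model of a mutex-style semaphore that remembers its recent events."""

    def __init__(self, name, history=16):
        self.name = name
        self.excess_signals = 1               # at rest
        self.events = deque(maxlen=history)   # ring buffer of recent events
        self.overflow_history = None          # set once the invariant breaks

    def _record(self, kind):
        # Drop the two innermost frames (_record and wait/signal) so the
        # last remaining frame is whoever called wait or signal.
        stack = traceback.extract_stack()[:-2]
        caller = stack[-1].name if stack else "<unknown>"
        self.events.append((kind, self.excess_signals, caller))

    def wait(self):
        if self.excess_signals > 0:
            self.excess_signals -= 1
        self._record("wait")

    def signal(self):
        self.excess_signals += 1
        self._record("signal")
        if self.excess_signals > 1 and self.overflow_history is None:
            # Snapshot the events leading up to the violation.
            self.overflow_history = list(self.events)
```

Each recorded event is a tuple of (kind, excessSignals after the event, calling function name), so the snapshot taken at the violation can be read backwards to find the unmatched signal.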
Any suggestions on how we could proceed?
Thanks,
Cees