Hi guys,
Thanks, Eliot, and please accept my apologies -- when I later rebooted the image it still only had 1182, and so I'm not 100% certain I had updated that image or simply hadn't saved after I updated. I am going to update and save everything and keep moving forward normally to see it happens again. The 201807260206 was flawless for me in the first big stress test, this time I'll be sure to save my updated trunk from the get go.
Thanks for that reminder about kill SIGUSR1, Levente. I don't use post-it notes but that's so useful I probably should get one for that.. :)
Best, Chris
On Thu, Aug 2, 2018 at 6:52 AM, Levente Uzonyi leves@caesar.elte.hu wrote:
Hi Chris,
Did you try to send SIGUSR1 to this image before killing it? (assuming it didn't happen on Windows)
Levente
On Wed, 1 Aug 2018, Chris Muller wrote:
My image just locked after it tried to unwind normally from a network timeout. I had to kill it from the OS.
Eliot would you guide me how to get my trunk image rolled back to a state without this fix so I can try to get back to where I was to see if its related?
Should I just load Kernel-eem.1184, an earlier one, or something else?
On Fri, Jul 27, 2018 at 1:08 PM, commits@source.squeak.org wrote:
A new version of Kernel was added to project The Inbox: http://source.squeak.org/inbox/Kernel-eem.1185.mcz
==================== Summary ====================
Name: Kernel-eem.1185 Author: eem Time: 27 July 2018, 11:08:12.641836 am UUID: 7296ad00-708d-4cef-b6bc-ceb24d897a70 Ancestors: Kernel-eem.1184
In releaseCriticalSection: use isUnwindContext instead of the more fragile context selector == #ensure:. Now it will handle ifCurtailed: too. Thanks Tobias!
Reformat Process>>#terminate. It was baaad.
=============== Diff against Kernel-eem.1184 ===============
Item was changed:
----- Method: Process>>releaseCriticalSection: (in category 'private')
releaseCriticalSection: runnable "Figure out if we are terminating a process that is in the ensure: block of a critical section. In this case, if the block has made progress, pop the suspendedContext so that we leave the ensure: block inside the critical: without signaling the semaphore/exiting the primitive section, since presumably this has already happened. But if it hasn't made progress but is beyond the
wait (which we can tell by the oldList being one of the runnable
lists, i.e. a LinkedList, not a
wait (which we can tell my the oldList being one of the runnable
lists, i.e. a LinkedList, not a Semaphore or Mutex, et al), then the ensure: block needs to be run." | selectorJustSent | (suspendedContext method pragmaAt: #criticalSection) ifNil: [^self]. selectorJustSent := suspendedContext selectorJustSentOrSelf.
"Receiver and/or argument blocks of ensure: in
Semaphore>>critical: or Mutex>>#critical:" suspendedContext isClosureContext ifTrue:
[suspendedContext sender isUnwindContext ifTrue:
[suspendedContext sender selector == #ensure: ifTrue: [| notWaitingButMadeNoProgress | "Avoid running the ensure: block twice, popping
it if it has already been run. If runnable but at the wait, leave it in place. N.B. No need to check if the block receiver of ensure: has not started to run (via suspendedContext pc = suspendedContext startpc) because ensure: uses valueNoContextSwitch, and so there is no suspension point before the wait." notWaitingButMadeNoProgress := runnable and: [selectorJustSent == #wait and: [suspendedContext sender selectorJustSentOrSelf == #valueNoContextSwitch]]. notWaitingButMadeNoProgress ifFalse: [suspendedContext := suspendedContext home]]. ^self].
"Either Semaphore>>critical: or Mutex>>#critical:. Is the
process still blocked? If so, nothing further to do." runnable ifFalse: [^self].
"If still at the wait the ensure: block has not been activated,
so signal to restore." selectorJustSent == #wait ifTrue: [suspendedContext receiver signal].
"If still at the lock primitive and the lock primitive just
acquired ownership (indicated by it answering false) then the ensure block has not been activated, so explicitly primitiveExitCriticalSection to unlock." (selectorJustSent == #primitiveEnterCriticalSection or: [selectorJustSent == #primitiveTestAndSetOwnershipOfCriticalSection]) ifTrue: [(suspendedContext stackPtr > 0 and: [suspendedContext top == false]) ifTrue: [suspendedContext receiver primitiveExitCriticalSection]]!
Item was changed:
----- Method: Process>>terminate (in category 'changing process state')
terminate "Stop the process that the receiver represents forever. Unwind to execute pending ensure:/ifCurtailed: blocks before terminating. If the process is in the middle of a critical: critical section, release it properly."
| ctxt unwindBlock oldList |
self isActiveProcess ifTrue:
[ctxt := thisContext.
[ctxt := ctxt findNextUnwindContextUpTo: nil.
ctxt ~~ nil] whileTrue:
[(ctxt tempAt: 2) ifNil:
["N.B. Unlike Context>>unwindTo: we do
not set complete (tempAt: 2) to true."
unwindBlock := ctxt tempAt: 1.
thisContext terminateTo: ctxt.
unwindBlock value]].
self isActiveProcess ifTrue: [
ctxt := thisContext.
[ ctxt := ctxt findNextUnwindContextUpTo: nil.
ctxt isNil
] whileFalse: [
(ctxt tempAt: 2) ifNil:[
ctxt tempAt: 2 put: nil.
unwindBlock := ctxt tempAt: 1.
thisContext terminateTo: ctxt.
unwindBlock value].
]. thisContext terminateTo: nil. self suspend.
"If the process is resumed this will provoke a
cannotReturn: error.
Would self debug: thisContext title: 'Resuming a
terminated process' be better?"
^self].
] ifFalse:[
"Always suspend the process first so it doesn't
accidentally get woken up.
N.B. If oldList is a LinkedList then the process is
runnable. If it is a Semaphore/Mutex et al
then the process is blocked, and if it is nil then the
process is already suspended."
oldList := self suspend.
suspendedContext ifNotNil:
["Release any method marked with the
<criticalSection> pragma.
The argument is whether the process is
runnable."
self releaseCriticalSection: (oldList isNil or:
[oldList class == LinkedList]).
"Always suspend the process first so it doesn't accidentally get
woken up.
N.B. If oldList is a LinkedList then the process is runnable. If
it is a Semaphore/Mutex et al
then the process is blocked, and if it is nil then the process
is already suspended."
oldList := self suspend.
suspendedContext ifNotNil:
["Release any method marked with the <criticalSection>
pragma.
The argument is whether the process is runnable."
self releaseCriticalSection: (oldList isNil or: [oldList
class == LinkedList]).
"If terminating a process halfways through an
unwind, try to complete that unwind block first."
(suspendedContext findNextUnwindContextUpTo: nil)
ifNotNil:
[:outer|
(suspendedContext
findContextSuchThat:[:c| c closure == (outer tempAt: 1)]) ifNotNil:
[:inner| "This is an unwind block
currently under evaluation"
suspendedContext
runUntilErrorOrReturnFrom: inner]].
"If terminating a process halfways through an unwind, try
to complete that unwind block first."
(suspendedContext findNextUnwindContextUpTo: nil)
ifNotNil:
[:outer|
(suspendedContext findContextSuchThat:[:c| c
closure == (outer tempAt: 1)]) ifNotNil:
[:inner| "This is an unwind block
currently under evaluation"
suspendedContext
runUntilErrorOrReturnFrom: inner]].
ctxt := self popTo: suspendedContext bottomContext.
ctxt == suspendedContext bottomContext ifFalse:
[self debug: ctxt title: 'Unwind error during
termination'].
"Set the context to its endPC for the benefit of
isTerminated."
ctxt pc: ctxt endPC]!
ctxt := self popTo: suspendedContext
bottomContext.
ctxt == suspendedContext bottomContext ifFalse:
[self debug: ctxt title: 'Unwind error
during termination'].
"Set the context to its endPC for the benefit of
isTerminated."
ctxt pc: ctxt endPC]]!