[squeak-dev] trunk image locked (was: The Inbox: Kernel-eem.1185.mcz)
Eliot Miranda
eliot.miranda at gmail.com
Thu Aug 2 05:54:10 UTC 2018
Hi Chris,
On Wed, Aug 1, 2018 at 6:55 PM, Chris Muller <asqueaker at gmail.com> wrote:
> My image just locked after it tried to unwind normally from a network
> timeout. I had to kill it from the OS.
>
> Eliot would you guide me how to get my trunk image rolled back to a
> state without this fix so I can try to get back to where I was to see
> if its related?
>
> Should I just load Kernel-eem.1184, an earlier one,
> or something else?
>
1182 is the last one before any of the critical: termination changes.
1183 has the critical: termination changes but doesn't change
Semaphore>>critical: to the simple version.
1184 has the simpler Semaphore>>critical:
I would like to have a look at this image. My changes look good to me and
others. One possible issue is that the send of releaseCriticalSection: in
Process>>terminate considers a double attempt at terminating a process (for
example sending terminate within an unwind block that is being run during
terminate) an attempt to terminate a live process. That could be a
mistake, and instead the send should only consider the process live if
suspend answers a LinkedList. So you could experiment with changing the
line
self releaseCriticalSection: (oldList isNil or: [oldList class ==
LinkedList]).
to
self releaseCriticalSection: oldList class == LinkedList.
But if it is possible to reproduce this lookup then I want to have a look
and try and see what is going on.
>
>
>
> On Fri, Jul 27, 2018 at 1:08 PM, <commits at source.squeak.org> wrote:
> > A new version of Kernel was added to project The Inbox:
> > http://source.squeak.org/inbox/Kernel-eem.1185.mcz
> >
> > ==================== Summary ====================
> >
> > Name: Kernel-eem.1185
> > Author: eem
> > Time: 27 July 2018, 11:08:12.641836 am
> > UUID: 7296ad00-708d-4cef-b6bc-ceb24d897a70
> > Ancestors: Kernel-eem.1184
> >
> > In releaseCriticalSection: use isUnwindContext instead of the more
> fragile context selector == #ensure:. Now it will handle ifCurtailed:
> too. Thanks Tobias!
> >
> > Reformat Process>>#terminate. It was baaad.
> >
> > =============== Diff against Kernel-eem.1184 ===============
> >
> > Item was changed:
> > ----- Method: Process>>releaseCriticalSection: (in category
> 'private') -----
> > releaseCriticalSection: runnable
> > "Figure out if we are terminating a process that is in the
> ensure: block of a critical section.
> > In this case, if the block has made progress, pop the
> suspendedContext so that we leave the
> > ensure: block inside the critical: without signaling the
> semaphore/exiting the primitive section,
> > since presumably this has already happened. But if it hasn't
> made progress but is beyond the
> > + wait (which we can tell by the oldList being one of the
> runnable lists, i.e. a LinkedList, not a
> > - wait (which we can tell my the oldList being one of the
> runnable lists, i.e. a LinkedList, not a
> > Semaphore or Mutex, et al), then the ensure: block needs to be
> run."
> > | selectorJustSent |
> > (suspendedContext method pragmaAt: #criticalSection) ifNil:
> [^self].
> > selectorJustSent := suspendedContext selectorJustSentOrSelf.
> >
> > "Receiver and/or argument blocks of ensure: in
> Semaphore>>critical: or Mutex>>#critical:"
> > suspendedContext isClosureContext ifTrue:
> > + [suspendedContext sender isUnwindContext ifTrue:
> > - [suspendedContext sender selector == #ensure: ifTrue:
> > [| notWaitingButMadeNoProgress |
> > "Avoid running the ensure: block twice, popping
> it if it has already been run. If runnable
> > but at the wait, leave it in place. N.B. No
> need to check if the block receiver of ensure: has
> > not started to run (via suspendedContext pc =
> suspendedContext startpc) because ensure:
> > uses valueNoContextSwitch, and so there is no
> suspension point before the wait."
> > notWaitingButMadeNoProgress :=
> > runnable
> > and: [selectorJustSent == #wait
> > and: [suspendedContext sender
> selectorJustSentOrSelf == #valueNoContextSwitch]].
> > notWaitingButMadeNoProgress ifFalse:
> > [suspendedContext := suspendedContext
> home]].
> > ^self].
> >
> > "Either Semaphore>>critical: or Mutex>>#critical:. Is the
> process still blocked? If so, nothing further to do."
> > runnable ifFalse: [^self].
> >
> > "If still at the wait the ensure: block has not been activated,
> so signal to restore."
> > selectorJustSent == #wait ifTrue:
> > [suspendedContext receiver signal].
> >
> > "If still at the lock primitive and the lock primitive just
> acquired ownership (indicated by it answering false)
> > then the ensure block has not been activated, so explicitly
> primitiveExitCriticalSection to unlock."
> > (selectorJustSent == #primitiveEnterCriticalSection
> > or: [selectorJustSent == #primitiveTestAndSetOwnershipOfCriticalSection])
> ifTrue:
> > [(suspendedContext stackPtr > 0
> > and: [suspendedContext top == false]) ifTrue:
> > [suspendedContext receiver
> primitiveExitCriticalSection]]!
> >
> > Item was changed:
> > ----- Method: Process>>terminate (in category 'changing process
> state') -----
> > terminate
> > "Stop the process that the receiver represents forever.
> > Unwind to execute pending ensure:/ifCurtailed: blocks before
> terminating.
> > If the process is in the middle of a critical: critical
> section, release it properly."
> >
> > | ctxt unwindBlock oldList |
> > + self isActiveProcess ifTrue:
> > + [ctxt := thisContext.
> > + [ctxt := ctxt findNextUnwindContextUpTo: nil.
> > + ctxt ~~ nil] whileTrue:
> > + [(ctxt tempAt: 2) ifNil:
> > + ["N.B. Unlike Context>>unwindTo: we do
> not set complete (tempAt: 2) to true."
> > + unwindBlock := ctxt tempAt: 1.
> > + thisContext terminateTo: ctxt.
> > + unwindBlock value]].
> > - self isActiveProcess ifTrue: [
> > - ctxt := thisContext.
> > - [ ctxt := ctxt findNextUnwindContextUpTo: nil.
> > - ctxt isNil
> > - ] whileFalse: [
> > - (ctxt tempAt: 2) ifNil:[
> > - ctxt tempAt: 2 put: nil.
> > - unwindBlock := ctxt tempAt: 1.
> > - thisContext terminateTo: ctxt.
> > - unwindBlock value].
> > - ].
> > thisContext terminateTo: nil.
> > self suspend.
> > + "If the process is resumed this will provoke a
> cannotReturn: error.
> > + Would self debug: thisContext title: 'Resuming a
> terminated process' be better?"
> > + ^self].
> > - ] ifFalse:[
> > - "Always suspend the process first so it doesn't
> accidentally get woken up.
> > - N.B. If oldList is a LinkedList then the process is
> runnable. If it is a Semaphore/Mutex et al
> > - then the process is blocked, and if it is nil then the
> process is already suspended."
> > - oldList := self suspend.
> > - suspendedContext ifNotNil:
> > - ["Release any method marked with the
> <criticalSection> pragma.
> > - The argument is whether the process is
> runnable."
> > - self releaseCriticalSection: (oldList isNil or:
> [oldList class == LinkedList]).
> >
> > + "Always suspend the process first so it doesn't accidentally get
> woken up.
> > + N.B. If oldList is a LinkedList then the process is runnable.
> If it is a Semaphore/Mutex et al
> > + then the process is blocked, and if it is nil then the process
> is already suspended."
> > + oldList := self suspend.
> > + suspendedContext ifNotNil:
> > + ["Release any method marked with the <criticalSection>
> pragma.
> > + The argument is whether the process is runnable."
> > + self releaseCriticalSection: (oldList isNil or:
> [oldList class == LinkedList]).
> > - "If terminating a process halfways through an
> unwind, try to complete that unwind block first."
> > - (suspendedContext findNextUnwindContextUpTo:
> nil) ifNotNil:
> > - [:outer|
> > - (suspendedContext
> findContextSuchThat:[:c| c closure == (outer tempAt: 1)]) ifNotNil:
> > - [:inner| "This is an unwind
> block currently under evaluation"
> > - suspendedContext
> runUntilErrorOrReturnFrom: inner]].
> >
> > + "If terminating a process halfways through an unwind,
> try to complete that unwind block first."
> > + (suspendedContext findNextUnwindContextUpTo: nil)
> ifNotNil:
> > + [:outer|
> > + (suspendedContext findContextSuchThat:[:c| c
> closure == (outer tempAt: 1)]) ifNotNil:
> > + [:inner| "This is an unwind block
> currently under evaluation"
> > + suspendedContext
> runUntilErrorOrReturnFrom: inner]].
> > +
> > + ctxt := self popTo: suspendedContext bottomContext.
> > + ctxt == suspendedContext bottomContext ifFalse:
> > + [self debug: ctxt title: 'Unwind error during
> termination'].
> > + "Set the context to its endPC for the benefit of
> isTerminated."
> > + ctxt pc: ctxt endPC]!
> > - ctxt := self popTo: suspendedContext
> bottomContext.
> > - ctxt == suspendedContext bottomContext ifFalse:
> > - [self debug: ctxt title: 'Unwind error
> during termination'].
> > - "Set the context to its endPC for the benefit of
> isTerminated."
> > - ctxt pc: ctxt endPC]]!
> >
> >
>
>
--
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20180801/1c140a5b/attachment.html>
More information about the Squeak-dev
mailing list
|