[squeak-dev] trunk image locked (was: The Inbox: Kernel-eem.1185.mcz)

Eliot Miranda eliot.miranda at gmail.com
Thu Aug 2 05:54:10 UTC 2018


Hi Chris,

On Wed, Aug 1, 2018 at 6:55 PM, Chris Muller <asqueaker at gmail.com> wrote:

> My image just locked after it tried to unwind normally from a network
> timeout.   I had to kill it from the OS.
>
> Eliot would you guide me how to get my trunk image rolled back to a
> state without this fix so I can try to get back to where I was to see
> if its related?
>
> Should I just load Kernel-eem.1184, an earlier one,
> or something else?
>

1182 is the last one before any of the critical: termination changes.
1183 has the critical: termination changes but doesn't change
Semaphore>>critical: to the simple version.
1184 has the simpler Semaphore>>critical:

I would like to have a look at this image.  My changes look good to me and
others.  One possible issue is that the send of releaseCriticalSection: in
Process>>terminate considers a double attempt at terminating a process (for
example sending terminate within an unwind block that is being run during
terminate) an attempt to terminate a live process.  That could be a
mistake, and instead the send should only consider the process live if
suspend answers a LinkedList.  So you could experiment with changing the
line

self releaseCriticalSection: (oldList isNil or: [oldList class ==
LinkedList]).

to
self releaseCriticalSection: oldList class == LinkedList.

But if it is possible to reproduce this lookup then I want to have a look
and try and see what is going on.



>
>
>
> On Fri, Jul 27, 2018 at 1:08 PM,  <commits at source.squeak.org> wrote:
> > A new version of Kernel was added to project The Inbox:
> > http://source.squeak.org/inbox/Kernel-eem.1185.mcz
> >
> > ==================== Summary ====================
> >
> > Name: Kernel-eem.1185
> > Author: eem
> > Time: 27 July 2018, 11:08:12.641836 am
> > UUID: 7296ad00-708d-4cef-b6bc-ceb24d897a70
> > Ancestors: Kernel-eem.1184
> >
> > In releaseCriticalSection: use isUnwindContext instead of the more
> fragile context selector == #ensure:.  Now it will handle ifCurtailed:
> too.  Thanks Tobias!
> >
> > Reformat Process>>#terminate.  It was baaad.
> >
> > =============== Diff against Kernel-eem.1184 ===============
> >
> > Item was changed:
> >   ----- Method: Process>>releaseCriticalSection: (in category
> 'private') -----
> >   releaseCriticalSection: runnable
> >         "Figure out if we are terminating a process that is in the
> ensure: block of a critical section.
> >          In this case, if the block has made progress, pop the
> suspendedContext so that we leave the
> >          ensure: block inside the critical: without signaling the
> semaphore/exiting the primitive section,
> >          since presumably this has already happened.  But if it hasn't
> made progress but is beyond the
> > +        wait (which we can tell by the oldList being one of the
> runnable lists, i.e. a LinkedList, not a
> > -        wait (which we can tell my the oldList being one of the
> runnable lists, i.e. a LinkedList, not a
> >          Semaphore or Mutex, et al), then the ensure: block needs to be
> run."
> >         | selectorJustSent |
> >         (suspendedContext method pragmaAt: #criticalSection) ifNil:
> [^self].
> >         selectorJustSent := suspendedContext selectorJustSentOrSelf.
> >
> >         "Receiver and/or argument blocks of ensure: in
> Semaphore>>critical: or Mutex>>#critical:"
> >         suspendedContext isClosureContext ifTrue:
> > +               [suspendedContext sender isUnwindContext ifTrue:
> > -               [suspendedContext sender selector == #ensure: ifTrue:
> >                         [| notWaitingButMadeNoProgress |
> >                         "Avoid running the ensure: block twice, popping
> it if it has already been run. If runnable
> >                          but at the wait, leave it in place. N.B. No
> need to check if the block receiver of ensure: has
> >                          not started to run (via suspendedContext pc =
> suspendedContext startpc) because ensure:
> >                          uses valueNoContextSwitch, and so there is no
> suspension point before the wait."
> >                          notWaitingButMadeNoProgress :=
> >                                 runnable
> >                                 and: [selectorJustSent == #wait
> >                                 and: [suspendedContext sender
> selectorJustSentOrSelf == #valueNoContextSwitch]].
> >                          notWaitingButMadeNoProgress ifFalse:
> >                                 [suspendedContext := suspendedContext
> home]].
> >                  ^self].
> >
> >         "Either Semaphore>>critical: or Mutex>>#critical:.  Is the
> process still blocked?  If so, nothing further to do."
> >         runnable ifFalse: [^self].
> >
> >         "If still at the wait the ensure: block has not been activated,
> so signal to restore."
> >         selectorJustSent == #wait ifTrue:
> >                 [suspendedContext receiver signal].
> >
> >         "If still at the lock primitive and the lock primitive just
> acquired ownership (indicated by it answering false)
> >          then the ensure block has not been activated, so explicitly
> primitiveExitCriticalSection to unlock."
> >         (selectorJustSent == #primitiveEnterCriticalSection
> >          or: [selectorJustSent == #primitiveTestAndSetOwnershipOfCriticalSection])
> ifTrue:
> >                 [(suspendedContext stackPtr > 0
> >                   and: [suspendedContext top == false]) ifTrue:
> >                         [suspendedContext receiver
> primitiveExitCriticalSection]]!
> >
> > Item was changed:
> >   ----- Method: Process>>terminate (in category 'changing process
> state') -----
> >   terminate
> >         "Stop the process that the receiver represents forever.
> >          Unwind to execute pending ensure:/ifCurtailed: blocks before
> terminating.
> >          If the process is in the middle of a critical: critical
> section, release it properly."
> >
> >         | ctxt unwindBlock oldList |
> > +       self isActiveProcess ifTrue:
> > +               [ctxt := thisContext.
> > +                [ctxt := ctxt findNextUnwindContextUpTo: nil.
> > +                 ctxt ~~ nil] whileTrue:
> > +                       [(ctxt tempAt: 2) ifNil:
> > +                               ["N.B. Unlike Context>>unwindTo: we do
> not set complete (tempAt: 2) to true."
> > +                                unwindBlock := ctxt tempAt: 1.
> > +                                thisContext terminateTo: ctxt.
> > +                                unwindBlock value]].
> > -       self isActiveProcess ifTrue: [
> > -               ctxt := thisContext.
> > -               [       ctxt := ctxt findNextUnwindContextUpTo: nil.
> > -                       ctxt isNil
> > -               ] whileFalse: [
> > -                       (ctxt tempAt: 2) ifNil:[
> > -                               ctxt tempAt: 2 put: nil.
> > -                               unwindBlock := ctxt tempAt: 1.
> > -                               thisContext terminateTo: ctxt.
> > -                               unwindBlock value].
> > -               ].
> >                 thisContext terminateTo: nil.
> >                 self suspend.
> > +               "If the process is resumed this will provoke a
> cannotReturn: error.
> > +                Would self debug: thisContext title: 'Resuming a
> terminated process' be better?"
> > +               ^self].
> > -       ] ifFalse:[
> > -               "Always suspend the process first so it doesn't
> accidentally get woken up.
> > -                N.B. If oldList is a LinkedList then the process is
> runnable. If it is a Semaphore/Mutex et al
> > -                then the process is blocked, and if it is nil then the
> process is already suspended."
> > -               oldList := self suspend.
> > -               suspendedContext ifNotNil:
> > -                       ["Release any method marked with the
> <criticalSection> pragma.
> > -                         The argument is whether the process is
> runnable."
> > -                        self releaseCriticalSection: (oldList isNil or:
> [oldList class == LinkedList]).
> >
> > +       "Always suspend the process first so it doesn't accidentally get
> woken up.
> > +        N.B. If oldList is a LinkedList then the process is runnable.
> If it is a Semaphore/Mutex et al
> > +        then the process is blocked, and if it is nil then the process
> is already suspended."
> > +       oldList := self suspend.
> > +       suspendedContext ifNotNil:
> > +               ["Release any method marked with the <criticalSection>
> pragma.
> > +                 The argument is whether the process is runnable."
> > +                self releaseCriticalSection: (oldList isNil or:
> [oldList class == LinkedList]).
> > -                       "If terminating a process halfways through an
> unwind, try to complete that unwind block first."
> > -                       (suspendedContext findNextUnwindContextUpTo:
> nil) ifNotNil:
> > -                               [:outer|
> > -                               (suspendedContext
> findContextSuchThat:[:c| c closure == (outer tempAt: 1)]) ifNotNil:
> > -                                       [:inner| "This is an unwind
> block currently under evaluation"
> > -                                       suspendedContext
> runUntilErrorOrReturnFrom: inner]].
> >
> > +               "If terminating a process halfways through an unwind,
> try to complete that unwind block first."
> > +               (suspendedContext findNextUnwindContextUpTo: nil)
> ifNotNil:
> > +                       [:outer|
> > +                        (suspendedContext findContextSuchThat:[:c| c
> closure == (outer tempAt: 1)]) ifNotNil:
> > +                               [:inner| "This is an unwind block
> currently under evaluation"
> > +                                suspendedContext
> runUntilErrorOrReturnFrom: inner]].
> > +
> > +               ctxt := self popTo: suspendedContext bottomContext.
> > +               ctxt == suspendedContext bottomContext ifFalse:
> > +                       [self debug: ctxt title: 'Unwind error during
> termination'].
> > +               "Set the context to its endPC for the benefit of
> isTerminated."
> > +               ctxt pc: ctxt endPC]!
> > -                       ctxt := self popTo: suspendedContext
> bottomContext.
> > -                       ctxt == suspendedContext bottomContext ifFalse:
> > -                               [self debug: ctxt title: 'Unwind error
> during termination'].
> > -                       "Set the context to its endPC for the benefit of
> isTerminated."
> > -                       ctxt pc: ctxt endPC]]!
> >
> >
>
>


-- 
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20180801/1c140a5b/attachment.html>


More information about the Squeak-dev mailing list