[Vm-dev] Delay machinery (was Re: [Pharo-dev] Suspending a Process)

Nicolai Hess nicolaihess at web.de
Sun Jul 27 23:14:59 UTC 2014


2014-07-25 20:05 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:

>
> Hi Nicolai, Hi Ben,
>
>
> On Fri, Jul 25, 2014 at 10:55 AM, Nicolai Hess <nicolaihess at web.de> wrote:
>
>>
>>
>> Hi Ben,
>>
>> I am on Windows too :(
>> So the fixes do not always work on Windows either. At least
>> they make it less likely to occur, but it still happens.
>> The most distracting thing is, after the first UI lock, pressing alt+dot,
>> closing the debuggers, pressing alt+dot ....
>> and trying to close the very first debugger, after that, it all works.
>> The UI is responsive again and suspending the process does
>> not block the UI anymore.
>> It "looks like" suspending the process reactivates another process that
>> blocks the UI. And as soon as I terminate this
>> process (alt+dot, close debugger ...) all works.
>> But I really don't know.
>>
>
> If you can run a Unix machine (in a VM?) then remember that kill -USR1 pid
> will cause the VM to print out a stack backtrace of all processes in the
> image.  That can be very useful in debugging lockups like this.
>
> HTH
>

OK, but I don't know if this helps; at least it does not look very helpful
to me :)


SIGUSR1 Mon Jul 28 01:06:16 2014


pharo VM version: 3.9-7 #1 Tue May  6 08:30:23 UTC 2014 gcc 4.8.2
[Production ITHB VM]
Built from: NBCoInterpreter NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 May  6 2014
With: NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 May  6 2014
Revision: https://github.com/pharo-project/pharo-vm.git Commit:
ef5832e6f70e5b24e8b9b1f4b8509a62b6c88040 Date: 2014-01-26 15:34:28 +0100
By: Esteban Lorenzano <estebanlm at gmail.com> Jenkins build #14794
Build host: Linux chindi08 2.6.24-32-xen #1 SMP Mon Dec 3 16:12:25 UTC 2012
i686 i686 i686 GNU/Linux
plugin path: /usr/lib/pharo-vm/ [default: /usr/lib/pharo-vm/]


C stack backtrace:
/usr/lib/pharo-vm/pharo-vm[0x809ad23]
/usr/lib/pharo-vm/pharo-vm[0x809af6e]
[0xf7784410]
[0xf7784425]
/lib/i386-linux-gnu/libc.so.6(__select+0x2d)[0xf763691d]
/usr/lib/pharo-vm/pharo-vm(aioPoll+0x13d)[0x809748d]
/usr/lib/pharo-vm/vm-display-X11.so(+0xdc85)[0xf71d2c85]
/usr/lib/pharo-vm/pharo-vm(ioRelinquishProcessorForMicroseconds+0x17)[0x8099b57]
/usr/lib/pharo-vm/pharo-vm[0x8070685]
[0xb6f8dbc3]
[0xb6f89700]
[0xb7a2650e]
[0xb6f895c0]


All Smalltalk process stacks (active first):
Process 0xb88376fc priority 10
0xff76c830 M ProcessorScheduler class>idleProcess 0xb7306b08: a(n)
ProcessorScheduler class
0xff76c850 I [] in ProcessorScheduler class>startUp 0xb7306b08: a(n)
ProcessorScheduler class
0xff76c870 I [] in BlockClosure>newProcess 0xb8837620: a(n) BlockClosure

Process 0xb8838c78 priority 50
0xff768830 I WeakArray class>finalizationProcess 0xb7306cd8: a(n) WeakArray
class
0xff768850 I [] in WeakArray class>restartFinalizationProcess 0xb7306cd8:
a(n) WeakArray class
0xff768870 I [] in BlockClosure>newProcess 0xb8838b9c: a(n) BlockClosure

Process 0xb9148c20 priority 40
0xff7907b8 M [] in Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7907d8 M BlockClosure>ensure: 0xb91502e0: a(n) BlockClosure
0xff7907f8 M Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff790814 M Delay>schedule 0xb91501e4: a(n) Delay
0xff79082c M Delay>wait 0xb91501e4: a(n) Delay
0xff790850 I [] in BackgroundWorkDisplayMorph>initialize 0xb91488b0: a(n)
BackgroundWorkDisplayMorph
0xff790870 I [] in BlockClosure>newProcess 0xb9148b40: a(n) BlockClosure

Process 0xb7902630 priority 40
0xff764784 M [] in Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7647a4 M BlockClosure>ensure: 0xb916b7a4: a(n) BlockClosure
0xff7647c4 M Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7647e0 M Delay>schedule 0xb916b6a8: a(n) Delay
0xff7647f8 M Delay>wait 0xb916b6a8: a(n) Delay
0xff764818 M WorldState>interCyclePause: 0xb75e8fd8: a(n) WorldState
0xff764834 M WorldState>doOneCycleFor: 0xb75e8fd8: a(n) WorldState
0xff764850 M WorldMorph>doOneCycle 0xb75e8fa4: a(n) WorldMorph
0xff764870 I [] in MorphicUIManager()>? 0xb770ac38: a(n) MorphicUIManager
0xb78cb554 s [] in BlockClosure()>?

Process 0xb82f9078 priority 80
0xff765858 M Delay class>handleTimerEvent 0xb8684d08: a(n) Delay class
0xff765870 M Delay class()>? 0xb8684d08: a(n) Delay class
0xb8623474 s [] in Delay class()>?
0xb82f9018 s [] in BlockClosure>newProcess

Process 0xb883735c priority 60
0xff76680c M InputEventFetcher>waitForInput 0xb72f059c: a(n)
InputEventFetcher
0xff766830 M InputEventFetcher>eventLoop 0xb72f059c: a(n) InputEventFetcher
0xff766850 I [] in InputEventFetcher>installEventLoop 0xb72f059c: a(n)
InputEventFetcher
0xff766870 I [] in BlockClosure>newProcess 0xb8837280: a(n) BlockClosure

Process 0xb8837534 priority 60
0xb8837568 s SmalltalkImage>lowSpaceWatcher
0xb9127478 s [] in SmalltalkImage>installLowSpaceWatcher
0xb88374d4 s [] in BlockClosure>newProcess

Most recent primitives
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
~ 200 times
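
For comparison, a similar (much shorter) process list can be collected from
inside the image, as long as the UI still responds, for example from a
debugger opened with alt+dot. This is only a sketch: the variable name
report and the output format are made up here, and it shows only the top
context of each process, not the full stacks the VM prints above.

```smalltalk
"Sketch: one line per live process, priority plus top suspended context."
| report |
report := (Process allSubInstances reject: [ :each | each isTerminated ])
    collect: [ :each |
        each priority printString , ' ' ,
            (each suspendedContext
                ifNil: [ '(active)' ]              "the running process has no suspended context"
                ifNotNil: [ :ctx | ctx printString ]) ].
report inspect
```

The result goes to an inspector rather than the Transcript, so the sketch
does not depend on the Transcript's own critical section.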



>
> Nicolai
>>
>>
>>
>> 2014-07-25 16:56 GMT+02:00 Ben Coman <btc at openinworld.com>:
>>
>>>
>>>  Over the last few days I have been looking deeper into the image
>>> locking when suspending a process. It is an interesting rabbit hole [1]
>>> that leads to pondering the Delay machinery, that leads to some VM
>>> questions.
>>>
>>> When pressing the interrupt key, it seems to always open the debugger
>>> with the following call stack.
>>> Semaphore>>critical:   'self wait'
>>> BlockClosure>>ensure:     'self valueNoContextSwitch'
>>> Semaphore>>critical:      'ensure: [ caught ifTrue: [self signal]]'
>>> Delay>>schedule         'AccessProtect critical: ['
>>> Delay>>wait              'self schedule'
>>> WorldState>>interCyclePause:
>>>
>>> I notice...
>>>     Delay class >> initialize
>>>         TimingSemaphore := (Smalltalk specialObjectsArray at: 30).
>>> and...
>>>     Delay class >> startTimerEventLoop
>>>         TimingSemaphore := Semaphore new.
>>> which seems incongruous, in that TimingSemaphore is set differently in
>>> each.  So while I presume this critical stuff all works fine, just in an
>>> exotic way, my entropy-guarding-neuron would just like to confirm this is so.
>>>
>>> --------------
>>>
>>> In Delay class >> handleTimerEvent the comment says...
>>>     "Handle a timer event....
>>>           -a timer signal (not explicitly specified)"
>>> ...is that event perhaps a 'tick' generated periodically by the VM via
>>> that item from specialObjectsArray?  Or is there some other mechanism?
>>>
>>> --------------
>>>
>>> [1] http://www.urbandictionary.com/define.php?term=Rabbit+Hole
>>> cheers -ben
>>>
>>>
>>> P.S. I've left the following for some initial context as I change the
>>> subject.  btw Nicolai, I confirm that my proposed fixes only work on
>>> Windows, not Mavericks (and I haven't checked Linux).
>>>
>>> Nicolai Hess wrote:
>>>
>>>
>>> Hi ben, thank you for looking at this.
>>>
>>> 2014-07-22 20:17 GMT+02:00 <btc at openinworld.com>:
>>>
>>>>  I thought this might be interesting to learn, so I've given it a go.
>>>> I had some success at the end, but I'll give a progressive report.
>>>>
>>>> First I thought I'd try moving the update of StringMorph outside the
>>>> worker-process using a Morph's #step method as follows...
>>>>
>>>> Morph subclass: #BackgroundWorkDisplayMorph
>>>>     instanceVariableNames: 'interProcessString stringMorph'
>>>>     classVariableNames: ''
>>>>     category: 'BenPlay'
>>>>     "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>initializeMorph
>>>>     self color: Color red.
>>>>     stringMorph := StringMorph new.
>>>>     self addMorphBack: stringMorph.
>>>>     self extent:(300@50).
>>>>     "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>newWorkerProcess
>>>>     ^[
>>>>         | work |
>>>>         work := 0.
>>>>         [     20 milliSeconds asDelay wait.
>>>>             work := work + 1.
>>>>             interProcessString := work asString.
>>>>         ] repeat.
>>>>     ] newProcess.
>>>>     "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>step
>>>>     stringMorph contents: interProcessString.
>>>>     "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>stepTime
>>>>     ^50
>>>>     "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>>     | workerProcess running |
>>>>     super initialize.
>>>>     self initializeMorph.
>>>>
>>>>     workerProcess := self newWorkerProcess.
>>>>     running := false.
>>>>
>>>>     self on: #mouseUp send: #value to:
>>>>     [      (running := running not)
>>>>             ifTrue: [  workerProcess resume. self color: Color green.  ]
>>>>             ifFalse: [ workerProcess suspend. self color: Color red. ]
>>>>     ]
>>>>     "---------"
>>>>
>>>>
>>>>
>>>> But evaluating "BackgroundWorkDisplayMorph new openInWorld" I found this
>>>> exhibited the same problematic behavior you reported... Clicking on the
>>>> morph worked a few times and then froze the UI until Cmd-. was pressed a
>>>> few times.
>>>>
>>>
>>>> However I found the following never locked the GUI.
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>>     | workerProcess running |
>>>>     super initialize.
>>>>     self initializeMorph.
>>>>
>>>>     workerProcess := self newWorkerProcess.
>>>>     running := false.
>>>>
>>>>     [ [      (running := running not)
>>>>             ifTrue: [  workerProcess resume. self color: Color green  ]
>>>>             ifFalse: [ workerProcess suspend. self color: Color red ].
>>>>         10 milliSeconds asDelay wait.
>>>>     ] repeat ] fork.
>>>>     "---------"
>>>>
>>>>
>>>  This locks the UI as well. Not every time, though. I did this 5 times,
>>> every time in a freshly loaded image, and it happened two times.
>>>
>>>
>>>
>>>> So the problem seemed to not be with #suspend/#resume or with the
>>>> shared variable /interProcessString/.  Indeed, since in the worker thread
>>>> /interProcessString/ is atomically assigned a copy via #asString, and the
>>>> String never updated, I think there is no need to surround use of it with a
>>>> critical section.
>>>>
>>>> The solution then was to move the "#resume/#suspend" away from the
>>>> "#on: #mouseUp send: #value to:" as follows...
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>>     | workerProcess running lastRunning |
>>>>     super initialize.
>>>>     self initializeMorph.
>>>>
>>>>     workerProcess := self newWorkerProcess.
>>>>     lastRunning := running := false.
>>>>
>>>>     [ [    lastRunning = running ifFalse:
>>>>         [    running
>>>>                 ifTrue: [  workerProcess resume  ]
>>>>                 ifFalse: [ workerProcess suspend ].
>>>>             lastRunning := running.
>>>>         ].
>>>>         10 milliSeconds asDelay wait.
>>>>     ] repeat ] fork.
>>>>
>>>>     self on: #mouseUp send: #value to:
>>>>     [      (running := running not)
>>>>             ifTrue: [  self color: Color green.  ]
>>>>             ifFalse: [ self color: Color red. ]
>>>>     ]
>>>>     "---------"
>>>>
>>>
>>>  And this too :(
>>>
>>>
>>>
>>>>
>>>> And finally remove the busy loop.
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>>     | workerProcess running lastRunning semaphore |
>>>>     super initialize.
>>>>     self initializeMorph.
>>>>
>>>>     workerProcess := self newWorkerProcess.
>>>>     lastRunning := running := false.
>>>>     semaphore := Semaphore new.
>>>>
>>>>     [ [    semaphore wait.
>>>>         running
>>>>             ifTrue: [  workerProcess resume  ]
>>>>             ifFalse: [ workerProcess suspend ].
>>>>     ] repeat ] fork.
>>>>
>>>>     self on: #mouseUp send: #value to:
>>>>     [      (running := running not)
>>>>             ifTrue: [  self color: Color green.  ]
>>>>             ifFalse: [ self color: Color red. ].
>>>>         semaphore signal.
>>>>     ]
>>>>     "---------"
>>>>
>>>>
>>>
>>>  And this locks the UI too. (Loaded the code 20 times, every time after
>>> a fresh image start up. Two times I got a locked
>>>  UI after the first two clicks).
>>>  And I don't understand this code :)
>>>
>>>
>>>
>>>> Now I can't say how close that is to how it "should" be done.  It's the
>>>> first time I've used semaphores and it's just what I discovered hacking around.
>>>> But hey! it works :)
>>>>
>>>> cheers -ben
>>>>
>>>>
>>>>
>>>> Nicolai Hess wrote:
>>>>
>>>>  I am still struggling with it.
>>>>
>>>>  Any ideas?
>>>>
>>>>
>>>> 2014-07-09 11:19 GMT+02:00 Nicolai Hess <nicolaihess at web.de>:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-07-09 2:07 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>>>>>
>>>>>  Hi Nicolai,
>>>>>>
>>>>>>
>>>>>>  On Tue, Jul 8, 2014 at 7:19 AM, Nicolai Hess <nicolaihess at web.de>
>>>>>> wrote:
>>>>>>
>>>>>>>  I want to create a process doing some work and call #changed on a
>>>>>>> Morph.
>>>>>>> I want to start/suspend/resume or stop this process.
>>>>>>> But sometimes, suspending the process locks the UI-Process,
>>>>>>> and I don't know why. Did I miss something or do I have to care when
>>>>>>> to call suspend?
>>>>>>>
>>>>>>>  Wrapping the "morph changed" call in
>>>>>>>  UIManager default defer:[ morph changed].
>>>>>>>  Does not change anything.
>>>>>>>
>>>>>>> Here is an example to reproduce it.
>>>>>>> Create the process,
>>>>>>> call resume, call suspend. It works, most of the time,
>>>>>>> but sometimes, calling suspend locks the ui.
>>>>>>>
>>>>>>> p:=[[true] whileTrue:[ Transcript crShow: (DateAndTime now
>>>>>>> asString). 30 milliSeconds asDelay wait]] newProcess.
>>>>>>>
>>>>>>> p resume.
>>>>>>> p suspend.
>>>>>>>
>>>>>>
>>>>>>   If you simply suspend this process at random from a user-priority
>>>>>> process you'll never be able to damage the Delay machinery you're using,
>>>>>> but chances are you'll suspend the process inside the critical section that
>>>>>> Transcript uses to make itself thread-safe, and that'll lock up the
>>>>>> Transcript.
>>>>>>
>>>>>
>>>>>  Thank you Eliot
>>>>>  Yes, I guessed it locks up the critical section, but I hoped this
>>>>> would not happen if I use the UIManager defer call.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>  ThreadSafeTranscript>>nextPutAll: value
>>>>>>   accessSemaphore
>>>>>>  critical: [stream nextPutAll: value].
>>>>>>  ^value
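
A minimal sketch of the lock-up described above, assuming a plain mutex
stands in for the Transcript's accessSemaphore. The names mutex, holder,
waiter and entered are made up for illustration, and the yielding loop only
exists so that holder can be suspended while it is inside the critical
section.

```smalltalk
| mutex holder waiter entered |
mutex := Semaphore forMutualExclusion.
entered := false.

"holder takes the mutex and keeps yielding while inside the critical section"
holder := [ mutex critical: [ [ true ] whileTrue: [ Processor yield ] ] ] fork.
Processor yield.        "let holder run until its first yield"
holder suspend.         "suspended while it still owns the mutex"

"waiter blocks on the mutex, because the suspended holder never signals it"
waiter := [ mutex critical: [ entered := true ] ] fork.
Processor yield.        "let waiter run until it blocks"

"inspect entered here: it should still be false, since waiter never got in"

"clean up so that no stuck processes are left behind"
waiter terminate.
holder terminate.
```

If waiter were the UI process instead of a forked one, this would be the
frozen UI: it waits for a signal that only the suspended process can send.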
>>>>>>
>>>>>>  So instead you need to use a semaphore.  e.g.
>>>>>>
>>>>>>  | p s wait |
>>>>>> s := Semaphore new.
>>>>>> p:=[[true] whileTrue:[wait ifTrue: [s wait]. Transcript crShow:
>>>>>> (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess.
>>>>>> wait := true.
>>>>>> 30 milliSeconds asDelay wait.
>>>>>>  wait := false.
>>>>>> s signal
>>>>>>
>>>>>>  etc...
>>>>>>
>>>>>
>>>>>  Is this a common pattern I can find in Pharo's classes? Or I need
>>>>> some help understanding this. Is the semaphore
>>>>>  wait/signal used instead of process resume/suspend?
>>>>>
>>>>> What I want is a process repeatedly doing some computation,
>>>>> calling or triggering an update on a morph, and I want to suspend and
>>>>> resume this process.
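
One possible reading of the semaphore suggestion above, as a minimal sketch
for a pausable worker: instead of being suspended from outside, the worker
parks itself on a semaphore at a point where it holds no lock. The names
gate, paused, work and worker are illustrative only, and the counter is a
stand-in for the real computation and morph update.

```smalltalk
| gate paused work worker |
gate := Semaphore new.
paused := true.                 "start in the paused state"
work := 0.

worker := [
    [ "park here while paused; no lock is held at this point"
      [ paused ] whileTrue: [ gate wait ].
      work := work + 1.         "stand-in for the real computation"
      20 milliSeconds asDelay wait ] repeat ] newProcess.
worker resume.                  "started once and never suspended from outside"

"to run: clear the flag, then wake the worker if it is parked"
paused := false.
gate signal.

"to pause: set the flag; the worker parks itself on its next pass"
paused := true.
```

Since #suspend is never sent, the worker cannot be stopped halfway through
Delay>>schedule or a Transcript write. A stray extra signal is harmless,
because the whileTrue: loop rechecks paused after every wake-up.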
>>>>>
>>>>> I would stop this discussion if someone tells me, "No, you are doing
>>>>> it wrong, go this way ..",  BUT what strikes me:
>>>>> in this example, that reproduces my problem more closely:
>>>>>
>>>>> |p m s running|
>>>>> running:=false.
>>>>> m:=Morph new color:Color red.
>>>>> s:= StringMorph new.
>>>>> m addMorphBack:s.
>>>>> p:=[[true]whileTrue:[20 milliSeconds asDelay wait. s
>>>>> contents:(DateAndTime now asString). m changed]] newProcess.
>>>>> m on:#mouseUp send:#value to:[
>>>>>     running ifTrue:[p suspend. m color:Color red.]
>>>>>     ifFalse:[p resume.m color:Color green.].
>>>>>     running := running not].
>>>>> m extent:(300@50).
>>>>> m openInWorld
>>>>>
>>>>>
>>>>>  clicking on the morph will stop or resume the process, if it locks
>>>>> up I can still press alt+dot ->
>>>>>  - a Debugger opens but the UI is still not responsive. I can click
>>>>> with the mouse on the debuggers close icon.
>>>>>  - nothing happens, as the UI is still blocked.
>>>>>  - pressing alt+Dot again, the mouse click on the close icon is
>>>>> processed and the first debugger window closes
>>>>> - maybe other debuggers open.
>>>>>
>>>>> Repeating these steps, at some time the system is *fully* responsive
>>>>> again!
>>>>>  And miraculously, it works after that without further blockages.
>>>>> What's happening here?
>>>>>
>>>>>
>>>>>  Nicolai
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>  HTH
>>>>>>
>>>>>>   regards
>>>>>>>  Nicolai
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>> best,
>>>>>> Eliot
>>>>>>
>>>>> --
> Aloha,
> Eliot
>
>