[Vm-dev] Delay machinery (was Re: [Pharo-dev] Suspending a
Process)
Nicolai Hess
nicolaihess at web.de
Sun Jul 27 23:14:59 UTC 2014
2014-07-25 20:05 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>
> Hi Nicolai, Hi Ben,
>
>
> On Fri, Jul 25, 2014 at 10:55 AM, Nicolai Hess <nicolaihess at web.de> wrote:
>
>>
>>
>> Hi Ben,
>>
>> I am on Windows too :(
>> So, the fixes does not work (not always) on winddows too. But at least
>> they make it less probable to occure, but it still happens.
>> The most distracting thing is, after the first ui lock, pressing alt+dot,
>> closing the debuggers, pressing alt+dot ....
>> and trying to close the very first debugger, after that, it all works.
>> The UI is responsive again and suspending the process does
>> not block the ui anymore.
>> It "looks like" supsending the process reactivates another process that
>> blocks the UI. And as soon as I terminate this
>> process (alt+dot, close debugger ...) all works.
>> But I really don't know.
>>
>
> if you can run a unix machine (in a VM?) then remember that kill -USR1 pid
> will cause the VM to print out a stack backtrace of all processes in the
> image. That can be very useful in debuggng lockups like this.
>
> HTH
>
Ok, but I don't know if this helps, at least it does not look very helpful
to me:)
SIGUSR1 Mon Jul 28 01:06:16 2014
pharo VM version: 3.9-7 #1 Tue May 6 08:30:23 UTC 2014 gcc 4.8.2
[Production ITHB VM]
Built from: NBCoInterpreter NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 May 6 2014
With: NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
acc98e51-2fba-4841-a965-2975997bba66 May 6 2014
Revision: https://github.com/pharo-project/pharo-vm.git Commit:
ef5832e6f70e5b24e8b9b1f4b8509a62b6c88040 Date: 2014-01-26 15:34:28 +0100
By: Esteban Lorenzano <estebanlm at gmail.com> Jenkins build #14794
Build host: Linux chindi08 2.6.24-32-xen #1 SMP Mon Dec 3 16:12:25 UTC 2012
i686 i686 i686 GNU/Linux
plugin path: /usr/lib/pharo-vm/ [default: /usr/lib/pharo-vm/]
C stack backtrace:
/usr/lib/pharo-vm/pharo-vm[0x809ad23]
/usr/lib/pharo-vm/pharo-vm[0x809af6e]
[0xf7784410]
[0xf7784425]
/lib/i386-linux-gnu/libc.so.6(__select+0x2d)[0xf763691d]
/usr/lib/pharo-vm/pharo-vm(aioPoll+0x13d)[0x809748d]
/usr/lib/pharo-vm/vm-display-X11.so(+0xdc85)[0xf71d2c85]
/usr/lib/pharo-vm/pharo-vm(ioRelinquishProcessorForMicroseconds+0x17)[0x8099b57]
/usr/lib/pharo-vm/pharo-vm[0x8070685]
[0xb6f8dbc3]
[0xb6f89700]
[0xb7a2650e]
[0xb6f895c0]
All Smalltalk process stacks (active first):
Process 0xb88376fc priority 10
0xff76c830 M ProcessorScheduler class>idleProcess 0xb7306b08: a(n)
ProcessorScheduler class
0xff76c850 I [] in ProcessorScheduler class>startUp 0xb7306b08: a(n)
ProcessorScheduler class
0xff76c870 I [] in BlockClosure>newProcess 0xb8837620: a(n) BlockClosure
Process 0xb8838c78 priority 50
0xff768830 I WeakArray class>finalizationProcess 0xb7306cd8: a(n) WeakArray
class
0xff768850 I [] in WeakArray class>restartFinalizationProcess 0xb7306cd8:
a(n) WeakArray class
0xff768870 I [] in BlockClosure>newProcess 0xb8838b9c: a(n) BlockClosure
Process 0xb9148c20 priority 40
0xff7907b8 M [] in Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7907d8 M BlockClosure>ensure: 0xb91502e0: a(n) BlockClosure
0xff7907f8 M Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff790814 M Delay>schedule 0xb91501e4: a(n) Delay
0xff79082c M Delay>wait 0xb91501e4: a(n) Delay
0xff790850 I [] in BackgroundWorkDisplayMorph>initialize 0xb91488b0: a(n)
BackgroundWorkDisplayMorph
0xff790870 I [] in BlockClosure>newProcess 0xb9148b40: a(n) BlockClosure
Process 0xb7902630 priority 40
0xff764784 M [] in Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7647a4 M BlockClosure>ensure: 0xb916b7a4: a(n) BlockClosure
0xff7647c4 M Semaphore>critical: 0xb82f8ef4: a(n) Semaphore
0xff7647e0 M Delay>schedule 0xb916b6a8: a(n) Delay
0xff7647f8 M Delay>wait 0xb916b6a8: a(n) Delay
0xff764818 M WorldState>interCyclePause: 0xb75e8fd8: a(n) WorldState
0xff764834 M WorldState>doOneCycleFor: 0xb75e8fd8: a(n) WorldState
0xff764850 M WorldMorph>doOneCycle 0xb75e8fa4: a(n) WorldMorph
0xff764870 I [] in MorphicUIManager()>? 0xb770ac38: a(n) MorphicUIManager
0xb78cb554 s [] in BlockClosure()>?
Process 0xb82f9078 priority 80
0xff765858 M Delay class>handleTimerEvent 0xb8684d08: a(n) Delay class
0xff765870 M Delay class()>? 0xb8684d08: a(n) Delay class
0xb8623474 s [] in Delay class()>?
0xb82f9018 s [] in BlockClosure>newProcess
Process 0xb883735c priority 60
0xff76680c M InputEventFetcher>waitForInput 0xb72f059c: a(n)
InputEventFetcher
0xff766830 M InputEventFetcher>eventLoop 0xb72f059c: a(n) InputEventFetcher
0xff766850 I [] in InputEventFetcher>installEventLoop 0xb72f059c: a(n)
InputEventFetcher
0xff766870 I [] in BlockClosure>newProcess 0xb8837280: a(n) BlockClosure
Process 0xb8837534 priority 60
0xb8837568 s SmalltalkImage>lowSpaceWatcher
0xb9127478 s [] in SmalltalkImage>installLowSpaceWatcher
0xb88374d4 s [] in BlockClosure>newProcess
Most recent primitives
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
~ 200 times
>
> Nicolai
>>
>>
>>
>> 2014-07-25 16:56 GMT+02:00 Ben Coman <btc at openinworld.com>:
>>
>>>
>>> Over the last few days I have been looking deeper into the image
>>> locking when suspending a process. It is an interesting rabbit hole [1]
>>> that leads to pondering the Delay machinery, that leads to some VM
>>> questions.
>>>
>>> When pressing the interrupt key it seems to always opens the debugger
>>> with the following call stack.
>>> Semaphore>>critical: 'self wait'
>>> BlockClosure>>ensure: 'self valueNoContextSwitch'
>>> Semaphore>>critical: 'ensure: [ caught ifTrue: [self signal]]
>>> Delay>>schedule 'AccessProtect critical: ['
>>> Delay>>wait 'self schedule'
>>> WorldState>>interCyclePause:
>>>
>>> I notice...
>>> Delay class >> initialize
>>> TimingSemaphore := (Smalltalk specialObjectsArray at: 30).
>>> and...
>>> Delay class >> startTimerEventLoop
>>> TimingSemaphore := Semaphore new.
>>> which seems incongruous that TimingSemaphore is set in differently. So
>>> while I presume this critical stuff all works fine, just in an exotic way,
>>> my entropy-guarding-neuron would just like confirm this is so.
>>>
>>> --------------
>>>
>>> In Delay class >> handleTimerEvent the comment says...
>>> "Handle a timer event....
>>> -a timer signal (not explicitly specified)"
>>> ...is that event perhaps a 'tick' generated periodically by the VM via
>>> that item from specialObjectArray ? Or is there some other mechanism ?
>>>
>>> --------------
>>>
>>> [1] http://www.urbandictionary.com/define.php?term=Rabbit+Hole
>>> cheers -ben
>>>
>>>
>>> P.S. I've left the following for some initial context as I change the
>>> subject. btw Nicolai, I confirm that my proposed fixes only work on
>>> Windows, not Mavericks (and I haven't checked Linux).
>>>
>>> Nicolai Hess wrote:
>>>
>>>
>>> Hi ben, thank you for looking at this.
>>>
>>> 2014-07-22 20:17 GMT+02:00 <btc at openinworld.com>:
>>>
>>>> I thought this might be interesting to learn, so I've gave it a go.
>>>> I had some success at the end, but I'll give a progressive report.
>>>>
>>>> First I thought I'd try moving the update of StringMorph outside the
>>>> worker-process using a Morph's #step method as follows...
>>>>
>>>> Morph subclass: #BackgroundWorkDisplayMorph
>>>> instanceVariableNames: 'interProcessString stringMorph'
>>>> classVariableNames: ''
>>>> category: 'BenPlay'
>>>> "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>initializeMorph
>>>> self color: Color red.
>>>> stringMorph := StringMorph new.
>>>> self addMorphBack: stringMorph.
>>>> self extent:(300 at 50).
>>>> "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>newWorkerProcess
>>>> ^[
>>>> | work |
>>>> work := 0.
>>>> [ 20 milliSeconds asDelay wait.
>>>> work := work + 1.
>>>> interProcessString := work asString.
>>>> ] repeat.
>>>> ] newProcess.
>>>> "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>step
>>>> stringMorph contents: interProcessString.
>>>> "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>stepTime
>>>> ^50
>>>> "---------"
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>> | workerProcess running |
>>>> super initialize.
>>>> self initializeMorph.
>>>>
>>>> workerProcess := self newWorkerProcess.
>>>> running := false.
>>>>
>>>> self on: #mouseUp send: #value to:
>>>> [ (running := running not)
>>>> ifTrue: [ workerProcess resume. self color: Color green. ]
>>>> ifFalse: [ workerProcess suspend. self color: Color red. ]
>>>> ]
>>>> "---------"
>>>>
>>>>
>>>>
>>>> But evaluating "BackgroundWorkDisplayMorph new openInWorld" found this
>>>> exhibited the same problematic behavior you reported... Clicking on the
>>>> morph worked a few times and then froze the UI until Cmd-. pressed a few
>>>> times.
>>>>
>>>
>>>> However I found the following never locked the GUI.
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>> "BackgroundWorkDisplayMorph new openInWorld"
>>>> | workerProcess running |
>>>> super initialize.
>>>> self initializeMorph.
>>>>
>>>> workerProcess := self newWorkerProcess.
>>>> running := false.
>>>>
>>>> [ [ (running := running not)
>>>> ifTrue: [ workerProcess resume. self color: Color green ]
>>>> ifFalse: [ workerProcess suspend. self color: Color red ].
>>>> 10 milliSeconds asDelay wait.
>>>> ] repeat ] fork.
>>>> "---------"
>>>>
>>>>
>>> This locks the UI as well. Not every timet hough. I did this 5 times,
>>> every time in a freshly loaded image and it happens two times.
>>>
>>>
>>>
>>>> So the problem seemed to not be with #suspend/#resume or with the
>>>> shared variable /interProcessString/. Indeed, since in the worker thread
>>>> /interProcessString/ is atomically assigned a copy via #asString, and the
>>>> String never updated, I think there is no need to surround use of it with a
>>>> critical section.
>>>>
>>>> The solution then was to move the "#resume/#suspend" away from the
>>>> "#on: #mouseUp send: #value to:" as follows...
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>> "BackgroundWorkDisplayMorph new openInWorld"
>>>> | workerProcess running lastRunning |
>>>> super initialize.
>>>> self initializeMorph.
>>>>
>>>> workerProcess := self newWorkerProcess.
>>>> lastRunning := running := false.
>>>>
>>>> [ [ lastRunning = running ifFalse:
>>>> [ running
>>>> ifTrue: [ workerProcess resume ]
>>>> ifFalse: [ workerProcess suspend ].
>>>> lastRunning := running.
>>>> ].
>>>> 10 milliSeconds asDelay wait.
>>>> ] repeat ] fork.
>>>>
>>>> self on: #mouseUp send: #value to:
>>>> [ (running := running not)
>>>> ifTrue: [ self color: Color green. ]
>>>> ifFalse: [ self color: Color red. ]
>>>> ]
>>>> "---------"
>>>>
>>>
>>> And this too :(
>>>
>>>
>>>
>>>>
>>>> And finally remove the busy loop.
>>>>
>>>> BackgroundWorkDisplayMorph>>initialize
>>>> "BackgroundWorkDisplayMorph new openInWorld"
>>>> | workerProcess running lastRunning semaphore |
>>>> super initialize.
>>>> self initializeMorph.
>>>>
>>>> workerProcess := self newWorkerProcess.
>>>> lastRunning := running := false.
>>>> semaphore := Semaphore new.
>>>>
>>>> [ [ semaphore wait.
>>>> running
>>>> ifTrue: [ workerProcess resume ]
>>>> ifFalse: [ workerProcess suspend ].
>>>> ] repeat ] fork.
>>>>
>>>> self on: #mouseUp send: #value to:
>>>> [ (running := running not)
>>>> ifTrue: [ self color: Color green. ]
>>>> ifFalse: [ self color: Color red. ].
>>>> semaphore signal.
>>>> ]
>>>> "---------"
>>>>
>>>>
>>>
>>> And this locks the UI too. (Loaded the code 20 times, every time after
>>> a fresh image start up. Two times I got a locked
>>> ui after the first two clicks).
>>> And I don't understand this code :)
>>>
>>>
>>>
>>>> Now I can't say how close that is to how it "should" be done. Its the
>>>> first time I used sempahores and just what I discovered hacking around.
>>>> But hey! it works :)
>>>>
>>>> cheers -ben
>>>>
>>>>
>>>>
>>>> Nicolai Hess wrote:
>>>>
>>>> I am still struggling with it.
>>>>
>>>> Any ideas?
>>>>
>>>>
>>>> 2014-07-09 11:19 GMT+02:00 Nicolai Hess <nicolaihess at web.de>:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-07-09 2:07 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>>>>>
>>>>> Hi Nicolai,
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 8, 2014 at 7:19 AM, Nicolai Hess <nicolaihess at web.de>
>>>>>> wrote:
>>>>>>
>>>>>>> I want to create a process doing some work and call #changed on a
>>>>>>> Morph.
>>>>>>> I want to start/suspend/resume or stop this process.
>>>>>>> But sometimes, suspending the process locks the UI-Process,
>>>>>>> and I don't know why. Did I miss something or do I have to care when
>>>>>>> to call suspend?
>>>>>>>
>>>>>>> Wrapping the "morph changed" call in
>>>>>>> UIManager default defer:[ morph changed].
>>>>>>> Does not change anything.
>>>>>>>
>>>>>>> Here is an example to reproduce it.
>>>>>>> Create the process,
>>>>>>> call resume, call supsend. It works, most of the time,
>>>>>>> but sometimes, calling suspend locks the ui.
>>>>>>>
>>>>>>> p:=[[true] whileTrue:[ Transcript crShow: (DateAndTime now
>>>>>>> asString). 30 milliSeconds asDelay wait]] newProcess.
>>>>>>>
>>>>>> p resume.
>>>>>>> p suspend.
>>>>>>>
>>>>>>
>>>>>> If you simply suspend this process at random form a user-priority
>>>>>> process you'll never be able to damage the Delay machinery you're using,
>>>>>> but chances are you'll suspend the process inside the critical section that
>>>>>> Transcript uses to make itself thread-safe, and that'll lock up the
>>>>>> Transcript.
>>>>>>
>>>>>
>>>>> Thank you Eliot
>>>>> yes I guessed it locks up the critical section, but I hoped with
>>>>> would not happen if I the use UIManager defer call.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> ThreadSafeTranscript>>nextPutAll: value
>>>>>> accessSemaphore
>>>>>> critical: [stream nextPutAll: value].
>>>>>> ^value
>>>>>>
>>>>>> So instead you need to use a semaphore. e.g.
>>>>>>
>>>>>> | p s wait |
>>>>>> s := Semaphore new.
>>>>>> p:=[[true] whileTrue:[wait ifTrue: [s wait]. Transcript crShow:
>>>>>> (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess.
>>>>>> wait := true.
>>>>>> 30 milliSeconds asDelay wait.
>>>>>> wait := false.
>>>>>> s signal
>>>>>>
>>>>>> etc...
>>>>>>
>>>>>
>>>>> Is this a common pattern I can find in pharos classes. Or I need
>>>>> some help understanding this. The semaphore
>>>>> wait/signal is used instead of process resume/suspend?
>>>>>
>>>>> What I want is a process doing repeatly some computation,
>>>>> calls or triggers an update on a morph, and I want to suspend and
>>>>> resume this process.
>>>>>
>>>>> I would stop this discussion if someone tells me, "No your are doing
>>>>> it wrong, go this way ..", BUT what strikes me:
>>>>> in this example, that reproduces my problem more closely:
>>>>>
>>>>> |p m s running|
>>>>> running:=false.
>>>>> m:=Morph new color:Color red.
>>>>> s:= StringMorph new.
>>>>> m addMorphBack:s.
>>>>> p:=[[true]whileTrue:[20 milliSeconds asDelay wait. s
>>>>> contents:(DateAndTime now asString). m changed]] newProcess.
>>>>> m on:#mouseUp send:#value to:[
>>>>> running ifTrue:[p suspend. m color:Color red.]
>>>>> ifFalse:[p resume.m color:Color green.].
>>>>> running := running not].
>>>>> m extent:(300 at 50).
>>>>> m openInWorld
>>>>>
>>>>>
>>>>> clicking on the morph will stop or resume the process, if it locks
>>>>> up I can still press alt+dot ->
>>>>> - a Debugger opens but the UI is still not responsive. I can click
>>>>> with the mouse on the debuggers close icon.
>>>>> - nothing happens, as the UI is still blocked.
>>>>> - pressing alt+Dot again, the mouse click on the close icon is
>>>>> processed and the first debugger window closes
>>>>> - maybe other debuggers open.
>>>>>
>>>>> Repeating this steps, at some time the system is *fully* responsive
>>>>> again!
>>>>> And miraculously, it works after that without further blockages.
>>>>> What's happening here?
>>>>>
>>>>>
>>>>> Nicolai
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> regards
>>>>>>> Nicolai
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> best,
>>>>>> Eliot
>>>>>>
>>>>> --
> Aloha,
> Eliot
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20140728/7880f8d9/attachment-0001.htm
More information about the Vm-dev
mailing list