[squeak-dev] Re: New scheduling policies [Was: Re: Re: Suspending process fix]

Wed Apr 29 21:09:53 UTC 2009

2009/4/29 Eliot Miranda <eliot.miranda at gmail.com>:
>
>
> On Wed, Apr 29, 2009 at 11:51 AM, Igor Stasenko <siguctua at gmail.com> wrote:
>>
>> 2009/4/29 Andreas Raab <andreas.raab at gmx.de>:
>> > Igor Stasenko wrote:
>> >>
>> >> I got a new VM build, which is ready to be tested with the new
>> >> scheduler (after I implement it).
>> >
>> > One thing you should do is to implement the current scheduling policy
>> > and
>> > compare the overhead when implementing it in user-land. If the overhead
>> > is
>> > not too bad I think it would be worthwhile thinking about pulling this
>> > in
>> > for real (I have some thoughts about how to make this backwards
>> > compatible
>> > too).
>> >
>> The current VM retains all the backward-compatible stuff.
>> But there are places where it checks whether the new scheduler is in place:
>>
>> hasNewScheduler
>>        "the old scheduler using just two instance variables"
>>        ^ (self lastPointerOf: self schedulerPointer) >=
>> (ProcessActionIndex*BytesPerWord + BaseHeaderSize)
>>
>> You're right about overhead. If it's too heavyweight, then we may need
>> some additional primitives. But I am strongly against making scheduling
>> dependent on early-bound VM behavior again :)
>>
>> Also, a new scheduler is not obliged to use an 80-long array of lists.
>> It can use a more optimized structure, like a Heap, to maintain a list of
>> scheduled processes sorted by priority. Then the list iteration could be
>> shortened, and we could use any priority value for a process (not
>> just in the range 1-80) and still schedule them correctly.
>
> Just to clear-up any confusion, the current VM is not limited to 80; it will
> use any size.

Sure. But in my experience, such limits change very rarely, if ever.
There are many constants scattered around everywhere.
Many of them were invented simply because there was a need to choose a
'reasonable' number, like SemaphoresToSignalSize (for the hated
semaphoresToSignalA and semaphoresToSignalB twins), or the 80 lists for
the scheduler.
Other constants are based on empirical evidence.
But it would be best to write code which requires no constants, or as
few as possible. I bet that such code would serve much longer than
code which relies on constants: as hardware speed improves, it improves
with it, without the need to tune values which were valid 20 years ago :)
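
To make the point about constants concrete, here is a minimal workspace
sketch of the heap-based ready queue mentioned above. It assumes Squeak's
Heap class (a SortedCollection with the same sort block should behave the
same way, only slower on insert). All variable names, process names and
priority values are made up for illustration; this is not code from the VM
or from my image. Entries are (priority, sequence number, name) triples, and
the sequence number keeps FIFO order among processes of equal priority.

```smalltalk
"Sketch only: a ready queue with no fixed number of priority levels."
| ready seq schedule next order |
seq := 0.

"Higher priority sorts first; equal priorities keep arrival order."
ready := Heap sortBlock: [:a :b |
    (a at: 1) = (b at: 1)
        ifTrue: [(a at: 2) <= (b at: 2)]
        ifFalse: [(a at: 1) > (b at: 1)]].

"Any number is a valid priority, there is no 1-80 range to respect."
schedule := [:name :priority |
    seq := seq + 1.
    ready add: (Array with: priority with: seq with: name)].

"Answer the name of the next process to run, or nil if none is ready."
next := [ready isEmpty
    ifTrue: [nil]
    ifFalse: [ready removeFirst at: 3]].

schedule value: #logger value: 5.
schedule value: #ui value: 1000.
schedule value: #timer value: 1000.
schedule value: #gc value: -3.

order := OrderedCollection new.
[ready isEmpty] whileFalse: [order add: next value].
```

After this runs, order holds the names from highest to lowest priority,
with #ui ahead of #timer because it was scheduled first.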

> To shorten the list iteration I've done the following, first in VisualWorks
> and now in the StackVM and Cog.  Simply maintain a "high-tide" which is the
> highest current runnable process priority.  The list search only has to
> start from this value rather than the highest priority, which most of the
> time saves scanning 40 empty lists on every wakeHighestPriority.
> I've attached the changes (just two methods) except for the initialization
> of highestRunnableProcessPriority to zero in the relevant
> initializeInterpreter:.
>

Yes, this is the simplest thing that can be done to minimize the looping.
Of course a loop of 80 iterations is hardly noticeable in compiled C
code on modern machinery. But big buildings are made of small bricks.
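
For readers without the attachment, the high-tide idea can be sketched in a
workspace roughly like this. It is only my reading of the description above,
not Eliot's actual two methods: the variable names, the 80-list size and the
choice to lower the mark while scanning are all assumptions made for the
example.

```smalltalk
"Sketch only: 80 priority lists plus a high-tide mark."
| lists highTide scanned makeRunnable wakeHighest first |
lists := (1 to: 80) collect: [:i | OrderedCollection new].
highTide := 0.
scanned := 0.

"Making a process runnable can only raise the mark."
makeRunnable := [:name :priority |
    (lists at: priority) addLast: name.
    highTide := highTide max: priority].

"Scan downward from the mark instead of from priority 80,
 lowering the mark past every empty list that is stepped over."
wakeHighest := [
    | found |
    found := nil.
    [found isNil and: [highTide > 0]] whileTrue: [
        scanned := scanned + 1.
        (lists at: highTide) isEmpty
            ifTrue: [highTide := highTide - 1]
            ifFalse: [found := (lists at: highTide) removeFirst]].
    found].

makeRunnable value: #ui value: 40.
makeRunnable value: #background value: 30.
first := wakeHighest value.
```

Here the first wake-up inspects a single list (priority 40) rather than
walking down from 80, which is the saving described above.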

>> > Cheers,
>> >  - Andreas
>> >
>> >> I rewrote the external signaling stuff & interrupt checking.
>> >> Now it does not signal any semaphores. Instead, I added a primitive which
>> >> explicitly fetches all pending signals into an array and flushes the VM's
>> >> internal buffer of pending signals. Then in the interrupt checker I simply
>> >> switch the active process to a special 'interrupt process' (or scheduler
>> >> process - Andreas), if there are any pending signals to handle.
>> >>
>> >> What does it mean for the language side?
>> >> It means a very cool thing: you are no longer obliged to use
>> >> semaphores to respond to signals!
>> >> You can register any object in external objects table.
>> >> And new scheduler will simply do:
>> >>
>> >> externalObjects := Smalltalk externalObjects.
>> >> signalIndexes do: [:index |
>> >>    (externalObjects at: index) handleExternalSignal.
>> >> ]
>> >>
>> >> so, as long as your registered object responds to
>> >> #handleExternalSignal, you are free to choose what to do in response
>> >> to signal.
>> >> Semaphores, of course will signal themselves.
>> >>
>> >> After replacing scheduler with new model, the VM will no longer need
>> >> to know anything about semaphores. This is because any scheduling
>> >> related stuff will become 100% language-side specific.
>> >>
>> >> So, with the new model, multiple primitives become obsolete:
>> >>
>> >> primitiveYield
>> >> primitiveWait
>> >> primitiveSuspend
>> >> primitiveSignal
>> >> primitiveResume
>> >>
>> >> instead of them there are two new primitives:
>> >>
>> >> primitiveTransferToProcess
>> >>        "sets an ActiveProcess to new process,
>> >>        sets an InterruptedProcess to the process which was active
>> >>        set a ProcessAction to anAction object
>> >>        "
>> >>
>> >> primitiveFetchPendingSignals
>> >>        "Fills an array (the first argument) with the special objects
>> >>        indexes that need to be signaled.
>> >>        Returns the number of signals stored,
>> >>        or a negative number indicating that the array is not big enough
>> >>        to fetch all signals at once.
>> >>        The primitive fails if the first argument is not an array.
>> >>        "
>> >>
>> >> 2009/4/29 Igor Stasenko <siguctua at gmail.com>:
>> >>>
>> >>> 2009/4/29 Andreas Raab <andreas.raab at gmx.de>:
>> >>>>
>> >>>> Igor Stasenko wrote:
>> >>>>>
>> >>>>> I came to an idea you might be interested in.
>> >>>>> As many of us know, some CPUs have a special mode - interrupt
>> >>>>> mode.
>> >>>>> What if we introduce an interrupt mode for the scheduler?
>> >>>>
>> >>>> [... snip ...]
>> >>>>>
>> >>>>> Now I am trying to imagine how the basic stuff might look (please
>> >>>>> correct me if it's an utterly wrong way ;), if we are able to use
>> >>>>> interrupt mode.
>> >>>>
>> >>>> This is actually along similar lines of thought that I had when I was
>> >>>> thinking of how to get rid of the builtin VM scheduling behavior. The
>> >>>> main
>> >>>> thought that I had was that the VM may have a "special" process - the
>> >>>> scheduler process (duh!) which it runs when it doesn't know what else
>> >>>> to
>> >>>> do.
>> >>>> The VM would then not directly schedule processes after semaphore
>> >>>> signals
>> >>>> but rather put them onto a "ready" queue that can be read by the
>> >>>> scheduler
>> >>>> process and switch to the scheduler process. The scheduler process
>> >>>> decides
>> >>>> what to run next and resumes the process via a primitive. Whenever an
>> >>>> external signal comes in, the VM automatically activates the
>> >>>> scheduler
>> >>>> process and the scheduler process then decides whether to resume the
>> >>>> previously running process or to switch to a different process.
>> >>>>
>> >>>> In a way this folds the timer process into the scheduler (which makes
>> >>>> good
>> >>>> sense from my perspective because much of the work in the timer is
>> >>>> stuff
>> >>>> that could more effectively take place in the scheduler). The
>> >>>> implementation should be relatively straightforward - just add a
>> >>>> scheduler
>> >>>> process and a ready list to the special objects, and wherever the VM
>> >>>> would
>> >>>> normally process switch you just switch to the scheduler. Voila,
>> >>>> there
>> >>>> is
>> >>>> your user-manipulable scheduler ;-) And obviously, anything that is
>> >>>> run
>> >>>> out
>> >>>> of the scheduler process is by definition non-interruptable because
>> >>>> there is
>> >>>> simply nothing to switch to!
>> >>>>
>> >>> Very nice indeed. That's even better than my first proposal.
>> >>> ProcessorScheduler>>schedulingProcessLoop
>> >>> [
>> >>>  self handlePendingSignalsAndActions.
>> >>>  activeProcess
>> >>>      ifNil: [ self idle ]
>> >>>      ifNotNil: [ self primitiveTransferControlTo: activeProcess ].
>> >>> ]  repeat.
>> >>>
>> >>> and when any process somehow stops running
>> >>> (suspend/wait/terminate/interrupted etc.), the VM will again switch to
>> >>> the scheduler process loop.
>> >>>
>> >>> What is important about having it is the guarantee of not being
>> >>> preempted by anything. Simply by having this, many
>> >>> concurrency/scheduling related problems can be solved by a language-side
>> >>> implementation, without fear of gotchas from the VM side.
>> >>>
>> >>> Also, the VM doesn't need to know details about priorities, suspending,
>> >>> etc. - which means that we can simplify the VM considerably and
>> >>> implement the same parts on the language side, where everything is late
>> >>> bound :)
>> >>>
>> >>> As for moving to multi-cores... yes, as Gulik suggests, it's like adding
>> >>> a new dimension:
>> >>>  - local scheduler for each core
>> >>>  - single global scheduler for freezing everything
>> >>>
>> >>> This is, of course, if we can afford to run the same object memory over
>> >>> multiple cores. Handling interpreter/object memory state(s) with
>> >>> multiple cores is not a trivial thing.
>> >>>
>> >>> If we are going to keep a more isolated model (islands, hydra) then we need
>> >>> no/minimal changes to the scheduler - each scheduler serves its own island and
>> >>> receives asynchronous signals from its colleagues through a shared
>> >>> queue.
>> >>>
>> >>>> Cheers,
>> >>>>  - Andreas
>> >>>>
>> >>>
>> >>> --
>> >>> Best regards,
>> >>> Igor Stasenko AKA sig.
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>
>
>
>
>

-- 
Best regards,
Igor Stasenko AKA sig.