#fork and deterministic resumption of the resulting process

Thu Feb 7 23:14:35 UTC 2008

Igor Stasenko wrote:
> Please correct me if I'm wrong:
> Andreas is proposing that the following code will work:
> 
> workerProcess := [ ... ] fork.
> 
> where, inside the block, you access the workerProcess variable and expect it
> to be initialized.
> 
> Now, if that's true, then some of us might assume that we can do
> something like this:
> 
> MyClass>>forkAndReturnForkedProcess
> 
> ^ [ .... ] fork.
> 
> and in another method write:
> 
> MyAnotherClass>>forkProcess
> ^ forkedProcess := myClass forkAndReturnForkedProcess.
> 
> and he should expect that this should work as well?
> 

As I understand it, the code above will work,
because forkedProcess won't get a chance to start until activeProcess
blocks on a Semaphore wait.
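
To make the pattern under discussion concrete, here is a minimal workspace sketch. It assumes a Squeak image; sawItself is a hypothetical flag added only for illustration, and the delay length is arbitrary. It answers true only if the assignment to workerProcess completed before the forked block started running, which is the ordering a deterministic #fork is meant to guarantee.

```smalltalk
"Workspace sketch; sawItself is a hypothetical flag, not part of any patch."
| workerProcess sawItself |
workerProcess := [
	"Runs in the forked process and reads the variable the parent assigns."
	sawItself := workerProcess == Processor activeProcess ] fork.
"Block the active process so the worker gets a chance to run."
(Delay forMilliseconds: 50) wait.
"true if workerProcess was already assigned when the block ran."
sawItself
```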

> And then a third developer comes along and writes another method that
> relies on this behavior, and so on.
> So where can this chain of 'determinism' be stopped? Where do you no longer
> have any guarantee about whether a forked process is accessing
> initialized or uninitialized state?
> How can I estimate where to stop with this kind of determinism?
> Wouldn't it be better to have 'determinism' in the following sense: if you
> fork a process, you should make sure that everything is properly
> initialized before doing the fork?
> 

The simpler, the better. That's why I like Andreas's solution.

>> With different API contracts and constraints a lot of code would break.
>>
>> Either, there will be a major rewrite of libraries, or you will have to
>> provide a backward compatibility for elder API.
>>
> 
> Hmm, why should we pursue a major rewrite each time?
> Why not call it #determinatedFork (or whatever) and leave the
> original method untouched?
>

A #randomFork API has no value per se.
A #deterministicFork can simplify coding.

Of course, in a multi-processor VM, randomFork would be the default, and 
deterministicFork might still exist but be more costly (no parallelism).

>> What matters is that code is bad today, because programmer expectations
>> do not meet Kernel implementation. And Andreas solved this nicely I think.
>>
> I don't agree with that. I never expected that fork should work the way
> Andreas is proposing.
> 

Then Andreas's patch won't change anything for you.
It's an easy way to make some existing code work.

I perfectly understand your point.
You are saying we had better change the existing code.

Because Andreas's implementation relies on correct sequencing based only
on Process priority, it cannot be extended to the multi-processor case.
So at the very least, this deterministicFork would not work there without a
rewrite.

It would be easier to explicitly mark the existing code that relies on this
feature, and rewrite only those parts later...

With Andreas's change, we choose the other alternative: be lazy now, and
defer the hard work of reviewing every #fork until such a VM is
available.

>> Maybe we are all wrong because you are able to provide a clever
>> concurrent parallel multithread implementation 99% compatible with
>> existing code base, even core BlockContext and Process class, maybe you
>> are a few years ahead, but until this is proven, we'll assume Andreas is
>> right.
>>
> 
> I don't think it's the case, where we need to break compatibility.
> 

I do not believe that you will be able to provide image compatibility,
simply because a lot of code has been written with a single-processor
model in mind.

But maybe I'm wrong; you have looked at this part of the code longer than I have.

> 
> And finally, do you really think that hiding concurrency issues from
> the eyes of the developer helps him write good thread-safe code?
> I think it does exactly the opposite: it makes him feel safe and comfortable.

I do not like this argument. It's like saying that coding is complex, and we
must keep it complex to prevent the vulgum pecus from poking its nose in. That
is not the Smalltalk philosophy as I understand it.

If you say that using the proposed explicit #resume solution where needed would
be just as simple, more portable and more future-proof, that's an argument I'm
more inclined to hear.
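
For comparison, a minimal sketch of what such an explicit #resume style could look like, assuming it means creating the process suspended with #newProcess and starting it only after the variable is assigned. That reading is my assumption; the exact form proposed earlier in the thread is not reproduced here.

```smalltalk
"Workspace sketch; assumes the explicit style is newProcess followed by resume."
| workerProcess |
"newProcess answers a suspended Process; the block does not run yet."
workerProcess := [
	"By the time this runs, workerProcess has already been assigned,
	 whatever order the scheduler picks after resume."
	workerProcess == Processor activeProcess ] newProcess.
"Start the worker only now that the variable is initialized."
workerProcess resume.
```

Written this way, the code does not depend on how #fork sequences the two processes, which is why it would also carry over to a scheduler that runs them in parallel.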

> But then, when the problem finally knocks at his door, he will not
> know what to do, because the API gives only a 99% guarantee that something
> will work, leaving the remaining 1% to whom???
> 
Now that it can be 100% with 5 lines of code, I say we should afford
ourselves that facility.

It all depends on how soon a multi-processor VM with all atomicity
problems resolved becomes available. I tend not to believe it will happen
anytime soon, and I tend not to believe it would run an image without
some parts being rewritten. That's maybe where we disagree most, because of
the hard work you have put into it.

Cheers

Nicolas