Fear and loathing of the "perification" of Smalltalk

Fri Sep 14 12:29:35 UTC 2007

Paolo Bonzini wrote:
>
>>> but I agree with Damien that I would *not* call it a block.  
>> Why not? Because curly braces are implemented as a fixed kind of 
>> macro in Squeak? If that's the case then you're stuck with a 
>> preconception that is limiting your thinking.
>
> No, because I see a Block as a single indivisible evaluation.
Your statement above reveals that you have a fixed notion of what a 
Block is. With the collecting-evaluator the block is still a single 
indivisible evaluation; it just collects up the results of its work.
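
To make the difference concrete, here is a rough workspace sketch. It 
is not the evaluator itself: the statements are split by hand into an 
array of blocks as a stand-in, and the temporary names are only for 
illustration.

```smalltalk
"Stand-in for the statements of [ 1 + 2. 3 * 4. 5 - 6 ]."
| statements last all |
statements := Array with: [ 1 + 2 ] with: [ 3 * 4 ] with: [ 5 - 6 ].

"Ordinary evaluation: run every statement, keep only the last result."
last := statements inject: nil into: [ :old :each | each value ].

"Collecting evaluation: run every statement, keep every result."
all := statements collect: [ :each | each value ].
```

Either way every statement runs exactly once and in order; the only 
thing that changes is how much of the work is handed back.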

Parallel execution is really an entirely different subject from what 
curly braces and blocks with a collecting-evaluator do. Besides, 
parallel execution is valid too, but you don't see it that way, as 
your own statement about how you see blocks indicates.

> Parallel execution would be implemented on top of an *array of 
> blocks*, for example.
>
>>>   BlockClosure>>#value
>>>       ^self blocksForEachStatement inject: nil into:
>>>           [ <lambda> :old :each | each value ]
>>>
>>>   BlockClosure>>#values
>>>       ^self blocksForEachStatement collect: [ <lambda> :each | each 
>>> value ]
>>>
>>>   LambdaBlockClosure>>#value
>>>       <primitive: ...>
>>>
>>>   LambdaBlockClosure>>#values
>>>       ^Array with: self value
>>
>> Yes, one could do it that way; however, it's not as flexible since 
>> you're making the block with a "<lambda>" marker of some kind. What 
>> is the "<lambda>" statement doing in there, precisely? Primitives 
>> don't currently work that way, so you are suggesting further syntax 
>> changes too. How would that work?
>
> It would make a "single statement" block, and it is (conjectured to 
> be) needed to avoid infinite recursion (notice that 
> BlockClosure>>#value uses blocks).  In these two cases it is not 
> necessary, because the blocks I'm marking are also single-statement.  
> But it is not necessarily true that all the blocks involved in the 
> implementation of blocks are single statement.

There are many issues involved in parallelizing blocks for N-Core 
processors, especially N-Core processors where the communications cost 
is lower than the memory storage costs. The point, however, wasn't an 
in-depth analysis of that. The point was that altering the core 
evaluator in Smalltalk so that there are two, the standard one and the 
collecting-evaluator, is powerful; and when that thought was expressed, 
a third evaluator option opened up. I'm not saying that I'd implement 
it as an evaluator, but the point is that people are thinking about 
what it means to have other evaluators. I'd likely implement the 
parallelism as a block statement splitter operation that splits the 
statements in a block so that the following would work.

[a. b. c] forkStatementsInParallel.

The above would get converted into three blocks that are executed as 
separate threads/processes; the results would then be joined and the 
result of "c" returned. Of course many options are needed, so there 
would be many methods in this protocol, including one that collects the 
results of all the forked statements. Naturally you'd also want to be 
able to access, assign and otherwise control the processes and 
synchronize them. This really is a complex area due to the concurrency 
issues; it needs a bit of research and some deep thought. Fortunately 
lots of work has been done in this area and there are excellent papers 
out there.
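
Until such a splitter exists, the effect can be approximated by hand. 
The following is only a sketch under a few assumptions: the statements 
are written out as separate blocks, the dialect has true closures and 
Squeak-style #fork and Semaphore, and there is no error handling. 
#forkStatementsInParallel itself remains hypothetical.

```smalltalk
"Hand-split stand-in for [ a. b. c ] forkStatementsInParallel."
| statements results done |
statements := Array with: [ 3 + 4 ] with: [ 10 factorial ] with: [ 6 * 7 ].
results := Array new: statements size.
done := Semaphore new.

"Fork one process per statement; each stores its result and signals."
statements keysAndValuesDo: [ :index :each |
    [ results at: index put: each value.
      done signal ] fork ].

"Join: wait once for every forked statement."
statements size timesRepeat: [ done wait ].

"Answer the result of the last statement, as [ a. b. c ] value would."
results last
```

Keeping the whole results array around is what a collecting variant of 
the same method would answer instead of only the last element.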

>
>>> I will even make a bold statement about performance; you could 
>>> implement #value via a BlockClosure>>#asLambda method and cache its 
>>> result; then the result would probably not even be much slower than 
>>> the current state of things!
>>
>> Ok, how do you see "asLambda" working? What do you mean by 
>> "asLambda"? How do you see the result of #asLambda being different 
>> from a block?
>
> #asLambda would coalesce all the statements into a single indivisible 
> block-as-we-know-it-in-current-Smalltalk.
>

Oh, maybe you mean that a list of blocks within a block would be merged 
into one. Well, that is a useful operation, but why obscure it with 
weird, cryptic and misleading terminology from functional languages 
such as #asLambda? Why not something clearer such as #mergeBlocks or 
#mergeTopLevelBlocks? I'd want many options in such a protocol, 
including splitting a block's top-level statements into separate blocks 
and forking them to run on different processor nodes. I fully support 
block manipulation methods as being very useful in certain circumstances.

With the "<lambda>" tagging of blocks used in your examples you've 
clearly altered Smalltalk syntax in a dramatic way, yet you've not 
answered any questions about it. Please explain why the lambda tag is there.

>>    [ a. b. c ] valuesOn: aStream
>>
>> Note that this level of parameterization with Blocks isn't possible 
>> with the curly braces syntax
>
> Sure it is.  Not that I'd endorse it.
>
> Array>>#valuesOn:
>     aStream nextPutAll: (self collect: [ :each | each value ])
>
I sit corrected. As I said, I can learn from you. Yes, since the curly 
braces are simply an initialization macro the underlying collection 
object can receive messages. I suppose that is a useful side effect of 
the curly braces, however they still are a new syntax when one isn't 
needed to get the job done.

>
>> [
>>     Object subclass: #Person.
>>
>>    Person
>>        addInstanceVariable: #firstName;
>>        addInstanceVariable: #middleName;
>>        addInstanceVariable: #lastName.
>>
>>    "Block form."
>>    Person addInstanceMethod: [firstName: aString | firstName := 
>> aString ].
>
> Not so easy.  
Yes, so very easy.

> How do you guarantee that "firstName" is in scope when the block is 
> compiled?
Why would you? Smalltalk is a dynamic language. Besides, it's no 
different from what happens when filing in code now anyhow, at least no 
different at the language level. You define a class with its variables 
and then load the methods. Sure, the evaluator may need to be a bit more 
flexible than existing code, but that is the nature of progress. Change 
happens. Adapt or fall behind or take a different path. 
It's your choice.

> You would have to store a parse tree for the method, or something like 
> that, and so far so good.  
Yes.

> But much worse, you would have to turn *each* and *every* undefined 
> variable appearing in a block from a compile-time error to a run-time 
> error a la #doesNotUnderstand:.
Nope.

>
> Because...
>
>>    "Yes, the intent is to be able to do this flexibly storing the 
>> block first and even using it as needed if you want."
>>    aBlock := [middleName: aString | middleName := aString ].
>>    Person addInstanceMethod: aBlock.
>
> ... there is no guarantee that a block will end up #value'd rather 
> than #addInstanceMethod:'ed to a class that does not even exist.
Yes, perfectly true and desired. This is part of the more expressive future.

>
>>    "Yes, the intent is to be able to do this as well."
>>    aBlock := [:aString | lastName := aString ].
>>    Person addInstanceMethod: aBlock named: #lastName:.
>
> Same here: you could also do
>
>     Animal addInstanceMethod: aBlock named: #lastName:
>     someCollection do: aBlock
>
> and both of these would be errors.  So, you cannot verify undefined 
> variables of blocks until run-time.
Well, the case with variables isn't an issue when they are defined in 
the context of an existing method for the class. So we can rule out many 
situations right there. For the rest, moving into the future isn't easy 
for the critic as it takes vision to make the great leaps. Yes, you 
raise valid points. There are solutions.

>
>> No more need for legacy Smalltalk chunk format with its weird syntax 
>> - it can now be deprecated!
>
> Indeed, 
Thank you for the one sorta positive word about the notion. Man, some 
critics are really tough.

> but you can also do that with a declarative syntax as is implemented 
> in Stef's Sapphire or in GST 3.0 betas.

ALL META OPERATIONS CAN BE DONE WITH STANDARD SMALLTALK SYNTAX
Adding the variables to the class in my examples is a declarative 
statement! It's just expressed in the language of messaging itself! Why 
invent other languages or special syntaxes when we have a perfectly good 
syntax with unary, binary and keyword messages? There is no need for a
special declaration syntax, none; not when it can be done with message 
syntax. You just have to adjust your thinking and how it's implemented. 
I invite you to open your mind to new possibilities beyond the horizon 
of the critic within who knows very well how it's done now; engage the 
visionary who sees a new possibility and creates the future.

All meta operations can be done with the standard Smalltalk syntax of 
unary, binary and keyword messages! There is almost zero need for any 
new syntax to do the job when Smalltalk-style messaging syntax is the 
most potent syntax.
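
For comparison, here is a sketch of how far plain messages already go 
today. It assumes the Squeak-style class definition message and 
ClassDescription>>#compile:classified:; the category name is made up 
for the example, and the method body still has to travel as a string 
rather than as a block.

```smalltalk
"Define the class and its variables with one keyword message."
Object subclass: #Person
    instanceVariableNames: 'firstName middleName lastName'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketch-People'.

"Add a method with another message. The class is looked up at run
 time because Person does not exist yet when this text is compiled."
(Smalltalk at: #Person)
    compile: 'firstName: aString
    firstName := aString'
    classified: 'accessing'.
```

The #addInstanceMethod: form in my examples would replace the string 
argument with a block, which is the only part that is new.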

>
>> Part of moving a language forward is deleting that which isn't needed 
>> anymore. It turns out that by unifying Block and Method syntax we can 
>> simplify the language and eliminate unneeded syntax!
>
> Are you still sure after my objection above?
Yes, if the legacy chunk format can be replaced with a messaging based 
version, then yes, I'm still sure. See the crucial section above and 
welcome to the future of language design where meta and normal 
operations are unified in one elegant syntax using Smalltalk style 
messaging.

>
>> Yes, at runtime Blocks and Methods are implemented differently and 
>> there is little need, that I can see, to change that. This change 
>> takes place at the source code level for the most part.
>
> They are more similar than you'd think.  The main difference is in how 
> they are activated, not in how they are stored.
That is what I'm referring to, in part.
>
>> Yes, a "conversion" between Block and Method needs to take place. 
>> Yes, it's possible not all Blocks may be able to be converted and 
>> vice versa; that's what exceptions are for. (I'd have to think that 
>> one through in detail).
>
> That might be more or less the same problem I hinted at above.
Yes.

>
>> Yes, those are technical terms; for an in depth discussion on them 
>> this article is excellent: 
>
> Thanks, I'm reading it.
What are your thoughts on it?

>
>> Maybe we could collect together all "extensions" or "variants" from 
>> all the Smalltalk versions into one place for easy reference? Who'd 
>> be up to helping with that?
>
> AFAIK, these are
>
>    {...} RHS    (GNU, Squeak)
>    {...} LHS    (old Squeak only?)
>    #{...}       (GNU, VW)
>    namespaces   (GNU, VW, VA, ST/X)
>    #[...]       (all?)
>    ##(...)      (GNU, VA, Dolphin)
>    pragmas      (many but with different implementation details)
That's a good start. Cool. Oh, extensions to primitives by VW.

>
> Sorry for snipping so much of the rest, it was useful to understand 
> your POV but it is not something that I can "answer" or "object to"; 
> just a note:
>
>>> The second example is Unicode; you can force everybody to write 
>>> "16r1000 asCharacter" (which is also less efficient than $A) or try 
>>> to find a simple extension to the syntax, for example $<16r1000> 
>>> could be an idea.
>> Or just $16r1000. Why bother with "<" and ">"?
>
> Because $8r40 is weird but valid Smalltalk (sends r40 to the 
> character 8), while $<8r40> would be the same as the space character 
> (ASCII 32). Anyway, you see, this is a syntax extension.  :-)
Well, you raise some valid points. However, I really like the $16r1000 
format for creating the character (not the message send). In a way 
Smalltalk has an anomaly in that numbers can be confusing. Why should 
16r1000 work in one context as a number but not in another? So, while it 
might be a minor syntax change, it doesn't add new weird syntax (e.g. 
"<16r1000>"); it just corrects a minor issue in the existing EBNF of 
the language so that numbers are interpreted uniformly in the context 
of creating a character with the "$" syntax (and other places as 
needed). I'll look into this deeper.
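
To show the anomaly with what works today, here are a few workspace 
expressions. This assumes a dialect with Character class>>#value: and 
Character class>>#space, as in Squeak.

```smalltalk
"A radix literal is an ordinary number wherever a number is expected."
8r40 = 32.                                   "true"
16r41 = 65.                                  "true"

"After $ it is not: the number has to become a character via a message."
(Character value: 8r40) = Character space.   "true"
(Character value: 16r41) = $A.               "true"
```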

All the best,

Peter