Eliot Miranda uploaded a new version of Compiler to project The Trunk: http://source.squeak.org/trunk/Compiler-eem.480.mcz
==================== Summary ====================
Name: Compiler-eem.480 Author: eem Time: 22 November 2022, 7:10:27.324796 pm UUID: 3e5ba19e-c44a-4390-9004-de1246736cbc Ancestors: Compiler-eem.479
Do not warn of an uninitialized temporary if it is being sent ifNil: or ifNotNil:.
=============== Diff against Compiler-eem.479 ===============
Item was changed:
  ----- Method: Parser>>primaryExpression (in category 'expression types') -----
  primaryExpression
      hereType == #word
          ifTrue: [parseNode := self variable.
+             (parseNode isUndefTemp
+              and: [(#('ifNil:' 'ifNotNil:') includes: here) not
+              and: [self interactive]])
+                 ifTrue:
+                     [self queryUndefined].
-             (parseNode isUndefTemp and: [self interactive])
-                 ifTrue: [self queryUndefined].
              parseNode nowHasRef.
              ^ true].
      hereType == #leftBracket
          ifTrue: [self advance.
              self blockExpression.
              ^true].
      hereType == #leftBrace
          ifTrue: [self braceExpression.
              ^true].
      hereType == #leftParenthesis
          ifTrue: [self advance.
              self expression ifFalse: [^self expected: 'expression'].
              (self match: #rightParenthesis)
                  ifFalse: [^self expected: 'right parenthesis'].
              ^true].
      (hereType == #string or: [hereType == #number or: [hereType == #literal or: [hereType == #character]]])
          ifTrue: [parseNode := encoder encodeLiteral: self advance.
              ^true].
      (here == #- and: [tokenType == #number and: [1 + hereEnd = mark]])
          ifTrue: [self advance.
              parseNode := encoder encodeLiteral: self advance negated.
              ^true].
      ^false!
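For readers skimming the diff, the new guard can be restated as a standalone predicate. This is only a workspace sketch, not the Parser code: `shouldWarn` and `quietSelectors` are hypothetical names, and the three block arguments stand in for `parseNode isUndefTemp`, the parser's `here` token, and `self interactive`.

```smalltalk
| quietSelectors shouldWarn |
"Selectors that suppress the warning in Compiler-eem.480 (taken from the diff)."
quietSelectors := #('ifNil:' 'ifNotNil:').
"Hypothetical restatement of the guard in Parser>>primaryExpression."
shouldWarn := [:isUndefTemp :nextToken :interactive |
	isUndefTemp
		and: [(quietSelectors includes: nextToken) not
		and: [interactive]]].
shouldWarn value: true value: 'ifNil:' value: true.	"false: the temp is being sent ifNil:"
shouldWarn value: true value: 'isNil' value: true.	"true: any other token still warns"
shouldWarn value: true value: 'isNil' value: false.	"false: non-interactive compile"
```

Because the check looks only at the token that directly follows the variable, a send such as `g perform: #ifNotNil: with: [...]` still takes the warning path.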
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Consider the following examples:
| a b c d e f g h |
a ifNil: [a := 1].
c := b.
c ifNil: [c := 3].
#(1 2 3) sorted: d.
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].
g perform: #ifNotNil: with: [b := g].
h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
This behavior leads to a mental model that distinguishes between null and undefined, similar to JavaScript, which I have never found helpful.
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
Looking forward to your opinion!
Best,
Christoph
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Best,
Marcel
Hi Marcel,
Nah, this is just a tooling change, not a syntactical one.
Yes, but the compiler is one of the main tools (maybe *the* main tool?) for learning about Smalltalk syntax. If we encode certain idioms/patterns in it, users may learn them as a syntactical part of the language, may they not?
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
So build a comprehensive list of "nil-aware selectors"? I don't know ... What about #notNil? #isEmptyOrNil? What if Etoys also added a #test:ifNil:ifNotNil: besides the existing #test:ifTrue:ifFalse:? By encoding such information in the compiler, we make the language design less flexible. I'm reminded of the recent debate on keyboard shortcuts for ifFalse:/ifTrue: in the SmalltalkEditor. :-)
Best,
Christoph
Hi Christoph, Hi Marcel,
apologies about the font size mismatches...
On Wed, Nov 23, 2022 at 2:25 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
+1
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Good idea. I'll include it.
Best, Marcel
On 23.11.2022 11:00:43, Thiede, Christoph <christoph.thiede@student.hpi.uni-potsdam.de> wrote:
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Please indulge me. It's f***ing irritating to be told by the compiler that a temp var appears to be uninitialized when one is intentionally using the fact that temps are initialized to nil. And that temp vars are initialized to nil is a) essential knowledge and b) a good thing (no uninitialized local variables a la C, a sensible value to initialize a variable with).
BTW, I find it more than sad (a little alarming in fact) that some Smalltalkers don't know that the value of several conditionals that take blocks is nil when the condition doesn't select the block, e.g. false ifTrue: [self anything] is nil. I see "expr ifNotNil: [...] ifNil: [nil]" and it strikes me as illiterate. I recently visited code written by a strong programmer who open coded a lot of point arithmetic, decomposing e.g. a * b into (a x * b x) @ (a y * b y). It's bad. It gradually degrades the code base in that it isn't always an exemplar of best practice.
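The rule about conditionals can be checked in a workspace. A minimal sketch (the temp names r1 to r3 are only for illustration):

```smalltalk
| r1 r2 r3 |
"Each condition fails to select its block, so each expression answers nil."
r1 := false ifTrue: [#anything].
r2 := nil ifNotNil: [:x | x printString].
r3 := 3 > 4 ifTrue: ['yes'].
{r1. r2. r3}	"all three elements are nil"
```

So "expr ifNotNil: [...]" already answers nil for a nil receiver, and the trailing "ifNil: [nil]" adds nothing.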
Consider the following examples:
| a b c d e f g h |
a ifNil: [a := 1].
c := b.
c ifNil: [c := 3].
#(1 2 3) sorted: d.
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].
g perform: #ifNotNil: with: [b := g].
h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
No. It's a hard-and-fast rule that all temp vars are initialized to nil. And initializing a variable (to other than nil) is done by assigning it. In the above, a through h are declared within the vertical bars. They are initialized in the assignments. I want a warning for the usage of b in "c := b", d in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]". I never want to see "ifNotNilDo", ever ;-) (* note that a couple of years back we fixed a bad bug in the compiler where block-local temps were not (re)initialized to nil on each iteration, leaking their values from previous iterations, breaking the "all temp vars are initialized to nil" rule, and revealing implementation details in the compiler's inlining of to:[by:]do: forms)
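The footnote about block-local temps can be illustrated with a small workspace sketch. It assumes an image that contains the fix mentioned above; `log` and `t` are hypothetical names.

```smalltalk
| log |
log := OrderedCollection new.
1 to: 3 do: [:i | | t |
	"t is declared inside the block, so it starts out nil on every iteration."
	t ifNil: [log add: i].
	t := i].
log	"holds 1, 2 and 3 when t is reinitialized each time round the loop"
```

With the old bug, the value assigned at the end of one iteration would have leaked into the next, and only the first index would have been logged.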
This behavior leads to a mental model that disambiguates between null and undefined similar to JavaScript which I never have found helpful.
I don't see how that applies. Smalltalk has no undefined. It has nil & zero, and these values are used to initialize any and all variables. This is not an artifact of the implementation. It is a fundamental part of the language design. It results in no dangling referents or uninitialized variables. The language used in Parser>>#queryUndefined is problematic. It should be "unassigned", not "undefined". There is nothing undefined about these variables. But they are indeed unassigned. In some cases (see my idiomatic implementation of subsequences: and substrings) this can (and *should*) be used to advantage. And all Smalltalk programming courses should explain that variables are always initialized (either to nil or zero, & hence by extension 0.0, Character null, Color transparent, et al), and may need assignment before their referents get sent messages.
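As an illustration of using an unassigned temp to advantage, here is a hypothetical word splitter. It is not the subsequences:/substrings code referred to above, only a sketch in the same spirit: `start` is never assigned up front, and nil means "not currently inside a word".

```smalltalk
| text words start |
text := 'nil is  an object'.
words := OrderedCollection new.
1 to: text size do: [:i |
	(text at: i) = Character space
		ifTrue: [
			"A space ends the current word, if there is one."
			start ifNotNil: [:s |
				words add: (text copyFrom: s to: i - 1).
				start := nil]]
		ifFalse: [
			"The first non-space character opens a new word."
			start ifNil: [start := i]]].
"Flush a word that runs to the end of the string."
start ifNotNil: [:s | words add: (text copyFrom: s to: text size)].
words	"'nil' 'is' 'an' 'object'"
```

No explicit "start := nil" is needed before the loop, which is exactly the kind of code that the change stops warning about.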
I see the same kind of sloppiness in people not knowing that conditionals that take blocks typically evaluate to nil when the condition doesn't select the block. So always "expr ifNotNil: [...]", never "expr ifNotNil: [...] ifNil: [nil]", or "expr ifNotNil: [...] ifNil: []". I recently cleaned up code by a strong programmer who had open coded point arithmetic (e.g. a * b written as (a x * b x) @ (a y * b y) ). This is really bad: it's exemplifying poor practice, it's verbose, it takes away at least as much understanding as it conveys, it leads to more difficult to manage code.
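The point arithmetic case is easy to verify. A minimal sketch (the temp names are only for illustration):

```smalltalk
| a b terse openCoded |
a := 2 @ 3.
b := 4 @ 5.
"Point multiplication is already component-wise."
terse := a * b.
"The open-coded form spells out the same computation by hand."
openCoded := (a x * b x) @ (a y * b y).
terse = openCoded	"true"
```

Both expressions answer the same point, so the longer form buys nothing.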
If we fail to teach the language properly we start on a slippery slope to duplication (which is an awful evil, leading to much increased maintenance effort, and brittleness), and rendering perfectly good, well thought-out idioms mysterious. It's not like Smalltalk has a lot of rules; the number, compared to C & C++ et al is tiny. And terseness has not just aesthetic benefit, but real practical benefit in terms of readability & maintainability.
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
But that is a MISTAKE!! The language designers didn't arrange for temps to be initialized to nil just because that's the only default. They did it to ensure that there is no such thing as an uninitialized variable in Smalltalk. That's why nil is an object, with a class, not just nil. That's why nil ~~ false. It's carefully thought out and not just some artifact of the implementation. And that rationale (read the blue book carefully) and its implications, should be taught/learned/known, and especially exemplified by the core code of Squeak trunk, and hence supported by the compiler.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
I disagree. You're advocating for absurdities such as
| colors |
colors := ColorArray new: 256.
colors atAllPut: Color transparent
This is the kind of thinking that leads to cycling wearing American Football clothes. It won't keep you from being run over by a truck, but it'll make you so slow and reduce your peripheral vision so much, not to mention give you a false sense of security, that you'll be much more likely to be run over by a truck...
Looking forward to your opinion!
:-) Hope I'm not too strident :-)
Best,
Christoph
I won't quote it all again but what Eliot wrote is important. There are good solid reasons why Smalltalk has a rigorously defined UndefinedObject. We demand rigorously defined areas of doubt and uncertainty!
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- His page was intentionally left blank.
Yet, nil is only seldom a good domain object. -t
+1 Even for a loop-based algorithm, clarity would improve if the initial case would be flagged with a symbol, not nil:
x := #start. [ ... ] whileTrue: [ x = #start ifTrue: [x := ... ]. ... ].
Best,
Marcel
Hi Marcel,
On Nov 24, 2022, at 1:03 AM, Marcel Taeumel marcel.taeumel@hpi.de wrote:
+1 Even for a loop-based algorithm, clarity would improve if the initial case would be flagged with a symbol, not nil:
x := #start. [ ... ] whileTrue: [ x = #start ifTrue: [x := ... ]. ... ].
Please don’t take this personally; I find the above pretentious in the extreme, and, given how much thought Smalltalk-80’s designers gave to initializing with nil (again it’s in the blue book), displaying and/or celebrating ignorance. So -1,000.
Hehe. Of course you do. 😉 I just wanted to highlight the fact that nil is no domain object. Exceptional usage should be done by intention. In general, ifNil checks interfere with readability of domain logic. Still, I see no harm in such ifNil checks. Who would agree on a domain-specific symbol for start anyway 😅
On Nov 24, 2022, at 11:15 AM, Taeumel, Marcel Marcel.Taeumel@hpi.de wrote:
Hehe. Of course you do. 😉 I just wanted to highlight the fact that nil is no domain object. Exceptional usage should be done by intention. In general, ifNil checks interfere with readability of domain logic. Still, I see no harm in such ifNil checks. Who would agree on a domain-specific symbol for start anyway 😅
😅 <3
On Nov 23, 2022, at 12:23 PM, Tobias Pape Das.Linux@gmx.de wrote:
Yet, nil is only seldom a good domain object.
Precisely. Being disjoint from any domain it is the ideal “I am not a domain object” marker. So when one wants a variable to range over a domain and the singleton “not a member of the domain” nil is a great choice. And that’s exactly how I use it below.
Hi Tobi,
let me try again (https://youtu.be/Cj8n4MfhjUc)…
On Nov 23, 2022, at 12:23 PM, Tobias Pape Das.Linux@gmx.de wrote:
Yet, nil is only seldom a good domain object.
Precisely. Being disjoint from any domain it is the ideal “I am not a domain object” marker. So when one wants a variable to range over a domain and the singleton “not a member of the domain” nil is a great choice. And that’s exactly how I use it below.
There is another excellent marker of a non-domain object, and that is a newly instantiated object. That object is known to not be any other object, since objects are unique. So if code is searching for something (eg applying a block to every literal in the system), having the newly instantiated object that implements the search use itself as the “I’m not in the domain of all pre-existing objects” is a sensible choice. This is the pattern InstructionStream uses when scanning for selectors.
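A small sketch of the fresh-object marker idea, assuming a plain Dictionary lookup rather than the InstructionStream code mentioned above; `marker`, `record` and the keys are hypothetical names.

```smalltalk
| marker record closing auditor |
"A newly created object is identical to nothing that existed before it."
marker := Object new.
record := Dictionary new.
record at: #closingDate put: nil.
"The key is present and its value happens to be nil."
closing := record at: #closingDate ifAbsent: [marker].
"The key is absent, so the marker comes back."
auditor := record at: #auditor ifAbsent: [marker].
{closing == marker. auditor == marker}	"false, then true"
```

Here nil stays available as an ordinary stored value, while the marker alone signals "not found".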
_,,,^..^,,,_ (phone)
Hi Eliot
First things first, I did not raise objections in my admittedly short quip.
On 24. Nov 2022, at 20:23, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Tobi,
let me try again (https://youtu.be/Cj8n4MfhjUc)…
:D
I already got my comfy chair!
On Nov 23, 2022, at 12:23 PM, Tobias Pape Das.Linux@gmx.de wrote:
Yet, nil is only seldom a good domain object.
Precisely. Being disjoint from any domain it is the ideal “I am not a domain object” marker. So when one wants a variable to range over a domain and the singleton “not a member of the domain” nil is a great choice. And that’s exactly how I use it below.
Second things second, I got that.
There is another excellent marker of a non-domain object, and that is a newly instantiated object. That object is known to not be any other object, since objects are unique. So if code is searching for something (eg applying a block to every literal in the system), having the newly instantiated object that implements the search use itself as the “I’m not in the domain of all pre-existing objects” is a sensible choice. This is the pattern InstructionStream uses when scanning for selectors.
And #someObject/#nextObject. I get that. And it is actually a beautiful thing you cannot do everywhere[0].
My fear is as follows:
I hope we can agree that "nil" is part of the "system domain" of Smalltalk, or - said differently - the meta-level.
So are the concepts of variables, classes etc.
The non-meta, base, or "domain proper" layer can be anything you want to computationally achieve.[1] Let's arbitrarily choose finance[2].
A domain object would be an account, a transaction, an account holder, etc. (Note that we can choose how to represent each, and we do not necessarily need objects for each, but I digress).
My take: in code dealing with such domain objects, nil should appear next to never, because it is an object from the Metalevel.
The problems with accepting nil as the general "nothing to see" marker include:
- There are too many. In our example, an account holder could have an instVar "account" which could be nil when not having an account yet, BUT ALSO an account could have a "closingDate" for when the account folded, which is "nil" when the account is still open, AND ALSO, a transaction could have an "auditor" which is nil as long as no audit has taken place etc. Just like that, nil takes _different roles_ just by being convenient.
- Nil has problematic provenance. When somewhere during debugging (the well-known MNU for UndefinedObject) a nil pops up, it is often reallllly hard to say whence it came. So dealing with a lot of nil-bearing code will send you down one rabbit hole after another.
- Nil begets polymorphism, nil defies polymorphism. It is one of the most awesome feats of Smalltalk that nil is NOT like NULL, in that it can respond to messages. That is exceptionally powerful and has given Smalltalk a lot of resilience. But cluttering UndefinedObject with custom, even domain-specific methods is a really bad idea. However, that means it is often unwise to just send arbitrary messages to an object that could be nil. Hence a multitude of #isNil/#ifNil: checks. Proper domain objects that model absence, pre-valid state, or error conditions can deal much better with that.
- Nil is in a collection-superposition (just like your good old USB-A plug which you have to turn at least twice to fit). You only know whether nil _actually_ could be a collection when you know that a non-nil object in its place is a collection [3]. Said differently: in contrast to LISPy languages, our nil is _by design_ no collection, while LISPy null _by design_ IS the empty list. This makes for funny messages like #isEmptyOrNil, which bails on non-nil-non-collection objects. So every time you have to deal with nil, you automatically at least once have to answer the question "could the non-nil version of this object be a collection"?
There are a lot of interesting approaches to each or combinations of these issues. This includes Null-Object patterns, Sane-default-initializers, exceptions, or explicit models of multiplicity[4].
But back to the beginning.
In code that does, for example
blaFooAnAccount
    | fooAccount |
    self accounts processBla: [:ea |
        fooAccount ifNil: [fooAccount := ea].
        fooAccount := (ea doesBork: fooAccount)
            ifTrue: [fooAccount]
            ifFalse: [ea]].
    ^ fooAccount
we find two things:
First, we could inadvertently return nil from that method. But this is technical and I think most here can deal with that.
But second, the line "fooAccount ifNil: [fooAccount := ea]." ACTUALLY says
"if fooAccount is an uninitialized temporary variable, populate it".
This is technically correct, but conflates domains. In our world of finance, the idea of a "temporary variable" does not make sense. It is part of the meta-level domain, the system.
I don't say this is wrong _per se_ but people reading, and even more so, people writing such code MUST be aware that they are crossing domains, and especially, entering a meta level.
That's why I think these warnings are really ok. I won't fight the commit "Compiler-eem.480.mcz", especially since it more or less is descriptive of a pervasive style of writing Smalltalk of Squeak Core contributors.
I hope people find these ideas useful.
Best regards -Tobias
[0]: I've used it in Python code to much joy.
[1]: Caveat lector: the "domain layer" can surely be the "system layer". In fact that is what a lot of system code deals with. But this is messy for our considerations above, so let's treat it as _exceptional_.
[2]: Semi-arbitrarily, just because I received my tax returns :P
[3]: Yes, that is a strange sentence. It's late. Also, Ardbeg.
[4]: For example, as in the work of Steimann (https://dl.acm.org/doi/10.1145/2509578.2509582). It seems they had a Smalltalk implementation in 2017.
-t
On 23. Nov 2022, at 19:34, tim Rowledge tim@rowledge.org wrote:
I won't quote it all again but what Eliot wrote is important. There are good solid reasons why Smalltalk has a rigorously defined UndefinedObject. We demand rigorously defined areas of doubt and uncertainty!
tim
_,,,^..^,,,_ (phone)
On Thu, Nov 24, 2022 at 09:18:33PM +0100, Tobias Pape wrote:
I hope people find these ideas useful.
Yes, thank you. And thanks Eliot also for the conversation. I was expecting this thread to be a flame war but instead I find it instructive and thought provoking.
Dave
Hi Tobi,
On Thu, Nov 24, 2022 at 12:18 PM Tobias Pape Das.Linux@gmx.de wrote:
Hi Eliot
First things first, I did not raise objections in my admittedly short quip.
On 24. Nov 2022, at 20:23, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Tobi,
let me try again (https://youtu.be/Cj8n4MfhjUc)…
:D
I already got my comfy chair!
On Nov 23, 2022, at 12:23 PM, Tobias Pape Das.Linux@gmx.de wrote:
Yet, nil is only seldom a good domain object.
Precisely. Being disjoint from any domain it is the ideal “I am not a domain object” marker. So when one wants a variable to range over a domain and the singleton “not a member of the domain”, nil is a great choice. And that’s exactly how I use it below.
Second things second, I got that.
There is another excellent marker of a non-domain object, and that is a newly instantiated object. That object is known to not be any other object, since objects are unique. So if code is searching for something (eg applying a block to every literal in the system), having the newly instantiated object that implements the search use itself as the “I’m not in the domain of all pre-existing objects” is a sensible choice. This is the pattern InstructionStream uses when scanning for selectors.
And #someObject/#nextObject. I get that. And it is actually a beautiful thing you cannot do everywhere[0].
My fear is as follows:
I hope we can agree that "nil" is part of the "system domain" of Smalltalk, or - said differently - the meta-level.
So are the concepts of variables, classes etc.
Yes, and equally, for example, collections. The Set instance in a set of accounts is not part of the domain because it's not specific to the domain. It comes from the Smalltalk library.
But so what? This feels like a straw man to me. One could argue that an account object itself is not part of the domain, but part of the model of the domain, etc.
In the end nil is an object that is useful and well-defined. It isn't special, just like true & false aren't special, or classes being objects isn't special. Instead these are all functional relationships that allow us to construct a system that is malleable and comprehensible.
The non-meta, base, or "domain proper" layer can be anything you want to computationally achieve.[1] Let's arbitrarily choose finance[2].
A domain object would be an account, a transaction, an account holder, etc. (Note that we can choose how to represent each, and we do not necessarily need objects for each, but I digress).
My take: in code dealing with such domain objects, nil should appear next to never, because it is an object from the meta-level.
I don't agree. nil can serve as an absence, and absences are important.
In any query which can yield no results we have three choices:
- answer nil (e.g. at end of stream)
- raise an exception
- require passing in a continuation to take in the event of an error (e.g. the block argument to at:ifAbsent: or the second one to detect:ifNone: etc)
Often nil is lighter-weight, much more concise, and hence more comprehensible, hence to be preferred.
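The three choices can be sketched in a Squeak workspace as follows. The dictionary contents and variable names are hypothetical, and the behaviour shown (a ReadStream answering nil at its end, Dictionary>>at: signalling an error for a missing key) is what current Squeak does; other dialects may differ:

```smalltalk
| accounts stream viaNil viaException viaBlock |
accounts := Dictionary new.
accounts at: #alice put: 100.

"1. Answer nil: a ReadStream at its end answers nil from next."
stream := ReadStream on: #().
viaNil := stream next.

"2. Raise an exception: at: signals an error for a missing key,
 which the caller has to catch somewhere."
viaException := [accounts at: #bob]
    on: Error
    do: [:ex | ex return: #noSuchAccount].

"3. Pass a continuation: the block runs only when the key is absent."
viaBlock := accounts at: #bob ifAbsent: [0].
```

After evaluation viaNil is nil, viaException is #noSuchAccount and viaBlock is 0; the first form is visibly the shortest, which is the conciseness argument made above.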
The problems with accepting nil as the general "nothing to see" marker include:
- There are too many. In our example, an account holder could have an instVar "account" which could be nil when not having an account yet, BUT ALSO an account could have a "closingDate" for when the account folded, which is "nil" when the account is still open, AND ALSO, a transaction could have an "auditor" which is nil as long as no audit has taken place etc. Just like that, nil takes _different roles_ just by being convenient.
As do the integers, symbols, collections. As do classes. Should we have different kinds of classes for the class library, the meta level, the domain model? Or is it sufficient to have classes, metaclasses and traits? i.e. do we need to label things according to their use, or is it sufficient to differentiate them purely by function? I think the latter.
- Nil has problematic provenance. When somewhere during debugging (the well-known MNU for UndefinedObject) a nil pops up, it is often reallllly hard to say whence it came. So dealing with a lot of nil-bearing code will send you down one rabbit hole after another.
Whereas NaN doesn't have problematical provenance? Or getting an identically-valued instance with a different identity, or a host of other potential evils.... These are not specific to nil, and not specific to using nil as a marker of absence, or as bottom. Programming is tricky; we have lots of ways of doing things; things happen very fast; code does what you tell it, not what you want it to do. We find we make mistakes all the time. This is not specific to the use of nil in our domain models.
- Nil begets polymorphism, nil defies polymorphism. It is one of the most awesome feats of Smalltalk that nil is NOT like NULL, in that it can respond to messages. That is exceptionally powerful and has given Smalltalk a lot of resilience. But cluttering UndefinedObject with custom, even domain-specific methods is a really bad idea. However, that means it is often unwise to just send arbitrary messages to an object that could be nil. Hence a multitude of #isNil/#ifNil: checks. Proper domain objects that model absence, pre-valid state, or error conditions can deal much better with that.
When it's possible then fine. But in the query example above we must be able to deal with absence/have an element disjoint from a domain. And nil functions ideally for this case.
- Nil is in a collection-superposition (just like your good old USB-A plug which you have to turn at least twice to fit). You only know whether nil _actually_ could be a collection when you know that a non-nil object in its place is a collection [3]. Said differently: in contrast to LISPy languages, our nil is _by design_ no collection, while LISPy null _by design_ IS the empty list. This makes for funny messages like #isEmptyOrNil, which bails on non-nil-non-collection objects. So every time you have to deal with nil, you automatically at least once have to answer the question "could the non-nil version of this object be a collection"?
Many uses of isEmptyOrNil are in response to FillInTheBlank which can return nil on cancel or an empty string if the user doesn't type anything. This is a natural consequence of the affordances of FillInTheBlank, and isEmptyOrNil is simply a pragmatic response, hardly a symptom of some deep problem.
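As a small illustration of that pragmatic use, here is a workspace sketch in which three sample replies stand in for what a FillInTheBlank request might answer. No dialog is actually opened, and the sample strings are invented:

```smalltalk
| replies answers |
"nil for a cancelled request, '' for an empty entry, and a real answer."
replies := {nil. ''. 'Squeak'}.
"isEmptyOrNil folds the two 'nothing was entered' cases into one test."
answers := replies collect: [:reply | reply isEmptyOrNil].
"answers is {true. true. false}"
```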
There are a lot of interesting approaches to each or combinations of these issues. This includes Null-Object patterns, Sane-default-initializers, exceptions, or explicit models of multiplicity[4].
But back to the beginning.
In code that does, for example
blaFooAnAccount
    | fooAccount |
    self accounts processBla: [:ea |
        fooAccount ifNil: [fooAccount := ea].
        fooAccount := (ea doesBork: fooAccount)
            ifTrue: [fooAccount]
            ifFalse: [ea]].
    ^ fooAccount
we find two things:
I think we find three things. The things you list below, plus the fact that this should have been written using detect:ifNone: ;-)
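One possible reading of that third point, as a hedged sketch: if the intent of the loop were simply "pick the first element that passes a test" (the thread does not say what processBla: and doesBork: actually do, so this is an assumption), the temporary never needs a nil check. The balances and the test are invented for illustration:

```smalltalk
| balances firstOverdrawn |
balances := #(120 -35 80 -5).
"detect:ifNone: makes the absent case explicit at the call site,
 so no uninitialized temporary has to be tested with ifNil:."
firstOverdrawn := balances
    detect: [:each | each < 0]
    ifNone: [nil].
"firstOverdrawn is -35; with no negative balance it would be nil."
```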
First, we could inadvertently return nil from that method. But this is technical and I think most here can deal with that.
But second, the line "fooAccount ifNil: [fooAccount := ea]." ACTUALLY says
"if fooAccount is an uninitialized temporary variable, populate it".
This is technically correct, but conflates domains. In our world of finance, the idea of a "temporary variable" does not make sense. It is part of the meta-level domain, the system.
I don't say this is wrong _per se_ but people reading, and even more so, people writing such code MUST be aware that they are crossing domains, and especially, entering a meta level.
That's why I think these warnings are really ok. I won't fight the commit "Compiler-eem.480.mcz", especially since it more or less is descriptive of a pervasive style of writing Smalltalk of Squeak Core contributors.
I hope people find these ideas useful.
Best regards -Tobias
[0]: I've used it in Python code to much joy.
[1]: Caveat lector: the "domain layer" can surely be the "system layer". In fact that is what a lot of system code deals with. But this is messy for our considerations above, so let's treat it as _exceptional_.
[2]: Semi-arbitrarily, just because I received my tax returns :P
[3]: Yes, that is a strange sentence. It's late. Also, Ardbeg.
[4]: For example, as in the work of Steimann (https://dl.acm.org/doi/10.1145/2509578.2509582). It seems they had a Smalltalk implementation in 2017.
-t
On Thu, Nov 24, 2022 at 12:18 PM Tobias Pape Das.Linux@gmx.de wrote:
[0]: I've used it in Python code to much joy.
I have to wave my little flag of disagreement here; having had to do a bunch of Python stuff recently I claim with some emphasis that nothing about Python could possibly bring much joy. Really - it's so disappointingly meh.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim If you never try anything new, you'll miss out on many of life's great disappointments
if you could dump obj1 or ( Error something )<—-[ you create it and fill it but you do not signal it you just return it ] or some notNil into the data pathway which could be tested at some one single place or a few places like if the pathway was like a channel in a water shed with all the channels coming together in a river at the sea and then you check for isError or something at the mouth of the river like More and Just in Haskell but way more expressive than More and Just then you know exactly what went wrong and where maybe and that could be good maybe better than signal in this case if there would be a lot of signals deprecated and a lot of on:do:s all gone into one testing at the river ending
if you put all your UndefinedObject>>methods into a Category named by the name of your Package or Project that should be ok it’s easy to hide it all then. Like Category KEGGenerator in UndefinedObject hey that’s a good idea i will do that nil is an empty KEGGenerator and that is convenient but if nil is an empty Collection also then is nil a Gen or a Coll so then you better use #( )asGen instead if that is a problem or maybe it’s fine sometimes it is fine sometimes it is not Object>>asGen ^self asGenerator
if KEGGenerator g and g next isNil then any g next later isNil also until ( g reset ). This is essential to KEGGenerator if not the whole thing quits. But i use KEGFooArray as a not nil end of something also. Probably KEGFooArray could get a way better name because i just named it with out thinking and it stuck. i guess i could use KEGGeneratorTail or something instead of nil but nil is central and has a lot of testers which would have to be isomorphically recreated slow so just use nil
I guess i could just rename KEGFooArray and the refactoring would change it everywhere in one single stroke so probably i should do this try this but what name should i choose and then the #fooArray local variables would look funny so but what could it be called instead of KEGFooArray that means nil like terminator Array in case you want to stick things into it like you do with Errors but without creating another Error Class or something KEGnotNilTerminator ? although KEGFooArray can be used for more than just terminating sequences and once you know it it sticks in your craw in your brain and you never forget it at least i never did so but KEGFooArray has a nil like quality to it. i could look at all the KEGFooArray references and see if it is mostly a not nil subsequence terminator
does anybody want KEGGenerator besides me i remember telling a co worker you should use Generators by Timothy Budd for that and he said i don’t want to use Generators for anything i just want to use to:do: to:by:do: do: collect: select: reject: and detect: ok? Generators away. So i am guessing that this is the general censure. the general opinion. Actually Generators don’t do to:do: and to:by:do: as far as i know yet but some could do. Otherwise they are similar to Collections and you can do aCollection asGen or anObject asGen or aStream asGen etc. they are between Collections and Streams so you do not have a lot of intermediate Collections created when you do ((((( aCollection asGen ) collect: […] ) select: […] ) reject: […] ) detect: […] ifNone: […] ) ifNotNil: [ :g | g asArray ] there are no intermediate Collections created none and asArray sends ( g next ) over and over so execution proceeds g next collect: select: reject: reject: detect:ifNone: etc g next collect: select: reject: detect:ifNone: etc etc etc sort of and there is a new semaphore SharedQueue thing which does the java or something version of Generators from inside of regular loops whileTrue: and so you can have the […] above communicating with each other in very convenient ways which come up a lot once you know to look and this highly collapses the code as versus the Collection way is my opinion. But KEGGenerators has got as big or bigger than Xtreams i believe i know nothing and the combination of KEGGenerators and Xtreams aught to be pretty good although i haven’t tried it yet pretty good i would think but nobody wants it and nobody is me just me alone apparently
i use it a lot just like LittleSmalltalk uses its predecessor Generator a lot also look at the Icon language book and the implementation of n Queens therein it’s like one line or something because everything in Icon is a Generator practically or something
oh god another thing to learn by spelunking no
well actually KEGGenerators is highly documented by Package comments Class comments and Method comments completely Not corporate or consistent. well it may be consistent mostly but not wholly and it’s definitely not corporate or the ancient IBM standard accounting to which i say : what me worry?
oh god another huge pile up of pages to read yes
and the older comments are highly convoluted i have found unfortunately but the newer ones are pretty good clear i am hoping well future me is hoping
i want to put the latest versions of KEGGenerators into Github but it is rather large with many optional addon Packages and there are many Experimental Methods as well which could be Categorized KEGExperimental and or KEGUnfinished and it’s Dolphin so there are Methods in multiple Categories. So i was going to try to break it up into more Packages so you just get what you want like a Core and some addons or something or should i just dump the whole thing now and break it up later if i live that long well seeing as nobody wants it and nobody is just me only then dump it otherwise it should be highly portable to all different ANSI Smalltalk just as soon as the other Smalltalks stop the prehistoric insanity and allow Method M to be in multiple Categories or something and then allow system wide searching on a Category
i could attempt allowing M in multiple Categories in Squeak if i could get some help from somebody or something it should be easy why isn’t it easy? is some prehistoric rubble is blocking? SSSSSSS HOCK people hate the idea? Now why in hell would you do that unless i wake up and go oh no i can’t do that I’ve got to do all of this clean the gutters watch a million hours of tv etc
Hi Kjell,
On Nov 26, 2022, at 12:01 PM, Kjell Godo squeaklist@gmail.com wrote:
if you could dump obj1 or ( Error something )<—-[ you create it and fill it but you do not signal it you just return it ] or some notNil into the data pathway which could be tested at some one single place or a few places like if the pathway was like a channel in a water shed with all the channels coming together in a river at the sea and then you check for isError or something at the mouth of the river like More and Just in Haskell but way more expressive than More and Just then you know exactly what went wrong and where maybe and that could be good maybe better than signal in this case if there would be a lot of signals deprecated and a lot of on:do:s all gone into one testing at the river ending
Yes, that’s a great pattern too. Sort of related to sentinels right?
if you put all your UndefinedObject>>methods into a Category named by the name of your Package or Project that should be ok it’s easy to hide it all then.
I agree. And if we had selector namespaces then collisions would be avoidable.
Like Category KEGGenerator in UndefinedObject hey that’s a good idea i will do that nil is an empty KEGGenerator and that is convenient but if nil is an empty Collection also then is nil a Gen or a Coll so then you better use #( )asGen instead if that is a problem or maybe it’s fine sometimes it is fine sometimes it is not Object>>asGen ^self asGenerator
if KEGGenerator g and g next isNil then any g next later isNil also until ( g reset ). This is essential to KEGGenerator if not the whole thing quits. But i use KEGFooArray as a not nil end of something also. Probably KEGFooArray could get a way better name because i just named it with out thinking and it stuck. i guess i could use KEGGeneratorTail or something instead of nil but nil is central and has a lot of testers which would have to be isomorphically recreated slow so just use nil
I guess i could just rename KEGFooArray and the refactoring would change it everywhere in one single stroke so probably i should do this try this but what name should i choose and then the #fooArray local variables would look funny so but what could it be called instead of KEGFooArray that means nil like terminator Array in case you want to stick things into it like you do with Errors but without creating another Error Class or something KEGnotNilTerminator ? although KEGFooArray can be used for more than just terminating sequences and once you know it it sticks in your craw in your brain and you never forget it at least i never did so but KEGFooArray has a nil like quality to it. i could look at all the KEGFooArray references and see if it is mostly a not nil subsequence terminator
does anybody want KEGGenerator besides me i remember telling a co worker you should use Generators by Timothy Budd for that and he said i don’t want to use Generators for anything i just want to use to:do: to:by:do: do: collect: select: reject: and detect: ok? Generators away. So i am guessing that this is the general censure. the general opinion. Actually Generators don’t do to:do: and to:by:do: as far as i know yet but some could do. Otherwise they are similar to Collections and you can do aCollection asGen or anObject asGen or aStream asGen etc. they are between Collections and Streams so you do not have a lot of intermediate Collections created when you do ((((( aCollection asGen ) collect: […] ) select: […] ) reject: […] ) detect: […] ifNone: […] ) ifNotNil: [ :g | g asArray ] there are no intermediate Collections created none and asArray sends ( g next ) over and over so execution proceeds g next collect: select: reject: reject: detect:ifNone: etc g next collect: select: reject: detect:ifNone: etc etc etc
Yes, great to avoid this kind of thing.
sort of and there is a new semaphore SharedQueue thing which does the java or something version of Generators from inside of regular loops whileTrue: and so you can have the […] above communicating with each other in very convenient ways which come up a lot once you know to look and this highly collapses the code as versus the Collection way is my opinion. But KEGGenerators has got as big or bigger than Xtreams i believe i know nothing and the combination of KEGGenerators and Xtreams ought to be pretty good although i haven’t tried it yet pretty good i would think but nobody wants it and nobody is me just me alone apparently
Having objects that encode functions/transformations/operations is also very much Smalltalk style (related to “every noun can be verbed”). BitBlt is one such. In VisualWorks we did compounds for querying. So instead of browseAllCallsOn:and:localToPackage: you could do (((query allCallsOn: a) | (query allCallsOn: b)) & (query localToPackage: p)) browse
Very powerful.
i use it a lot just like LittleSmalltalk uses its predecessor Generator a lot also look at the Icon language book and the implementation of n Queens therein it’s like one line or something because everything in Icon is a Generator practically or something
oh god another thing to learn by spelunking no
well actually KEGGenerators is highly documented by Package comments Class comments and Method comments completely Not corporate or consistent. well it may be consistent mostly but not wholly and it’s definitely not corporate or the ancient IBM standard accounting to which i say : what me worry?
oh god another huge pile up of pages to read yes
and the older comments are highly convoluted i have found unfortunately but the newer ones are pretty good clear i am hoping well future me is hoping
i want to put the latest versions of KEGGenerators into Github but it is rather large with many optional addon Packages and there are many Experimental Methods as well which could be Categorized KEGExperimental and or KEGUnfinished and it’s Dolphin so there are Methods in multiple Categories. So i was going to try to break it up into more Packages so you just get what you want like a Core and some addons or something or should i just dump the whole thing now and break it up later if i live that long well seeing as nobody wants it and nobody is just me only then dump it otherwise it should be highly portable to all different ANSI Smalltalk just as soon as the other Smalltalks stop the prehistoric insanity and allow Method M to be in multiple Categories or something and then allow system wide searching on a Category
i could attempt allowing M in multiple Categories in Squeak if i could get some help from somebody or something it should be easy why isn’t it easy? is some prehistoric rubble blocking? SSSSSSS HOCK people hate the idea? Now why in hell would you do that unless i wake up and go oh no i can’t do that I’ve got to do all of this clean the gutters watch a million hours of tv etc
On Sat, Nov 26, 2022 at 08:58 Eliot Miranda eliot.miranda@gmail.com wrote: Hi Tobi,
On Thu, Nov 24, 2022 at 12:18 PM Tobias Pape Das.Linux@gmx.de wrote: Hi Eliot
First things first, I did not raise objections in my admittedly short quip.
On 24. Nov 2022, at 20:23, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Tobi,
let me try again (https://youtu.be/Cj8n4MfhjUc)…
:D
I already got my comfy chair!
On Nov 23, 2022, at 12:23 PM, Tobias Pape Das.Linux@gmx.de wrote:
Yet, nil is only seldom a good domain object.
Precisely. Being disjoint from any domain it is the ideal “I am not a domain object” marker. So when one wants a variable to range over a domain and the singleton “not a member of the domain” nil is a great choice. And that’s exactly how I use it below.
Second things second, I got that.
There is another excellent marker of a non-domain object, and that is a newly instantiated object. That object is known to not be any other object, since objects are unique. So if code is searching for something (eg applying a block to every literal in the system), having the newly instantiated object that implements the search use itself as the “I’m not in the domain of all pre-existing objects” is a sensible choice. This is the pattern InstructionStream uses when scanning for selectors.
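A minimal workspace sketch of this fresh-object-as-marker idea, assuming a current Squeak image. The names settings and missing are invented for illustration, and this is not how InstructionStream itself is written; it only shows why a newly created object can stand in for "not in the domain" even when nil is a legitimate value:

```smalltalk
| settings missing |
settings := Dictionary new.
settings at: #timeout put: nil.	"hypothetical key; nil is a legitimate stored value here"
missing := Object new.	"a fresh object is identical to nothing that existed before"

"Key present, value nil: the marker is not answered."
(settings at: #timeout ifAbsent: [missing]) == missing.	"false"

"Key absent: the marker comes back, so absence is unambiguous."
(settings at: #retries ifAbsent: [missing]) == missing.	"true"
```

The comparison uses == (identity) rather than = because the whole point of the marker is that no other object is that object.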
And #someObject/#nextObject. I get that. And it is actually a beautiful thing you cannot do everywhere[0].
My fear is as follows:
I hope we can agree that "nil" is part of the "system domain" of Smalltalk, or - said differently - the meta-level.
So are the concepts of variables, classes etc.
Yes, and equally, for example, collections. The Set instance in a set of accounts is not part of the domain because it's not specific to the domain. It comes from the Smalltalk library.
But so what? This feels like a straw man to me. One could argue that an account object itself is not part of the domain, but part of the model of the domain, etc.
In the end nil is an object that is useful and well-defined. It isn't special, just like true & false aren't special, or that classes being objects aren't special. Instead these are all functional relationships that allow us to construct a system that is malleable and comprehensible.
The non-meta, base, or "domain proper" layer can be anything you want to computationally achieve.[1] Let's arbitrarily choose finance[2].
A domain object would be an account, a transaction, an account holder, etc. (Note that we can choose how to represent each, and we do not necessarily need objects for each, but I digress).
My take: in code dealing with such domain objects, nil should appear next to never, because it is an object from the Metalevel.
I don't agree. nil can serve as an absence, and absences are important.
In any query which can yield no results we have three choices:
- answer nil (e.g. at end of stream)
- raise an exception
- require passing in a continuation to take in the event of an error (e.g. the block argument to at:ifAbsent: or the second one to detect:ifNone: etc)
Often nil is lighter-weight, much more concise, and hence more comprehensible, hence to be preferred.
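The three options listed above can be sketched in a workspace as follows, assuming a current Squeak image. The accounts dictionary and its keys are made up for illustration:

```smalltalk
| accounts stream |
accounts := Dictionary new.
accounts at: #alice put: 100.	"hypothetical data"

"1. Answer nil: a read stream at its end answers nil from next."
stream := ReadStream on: #().
stream next.	"nil"

"2. Raise an exception: at: signals an error for a missing key."
[accounts at: #bob] on: Error do: [:ex | ex return: nil].

"3. Pass a continuation: the block runs only when nothing is found."
accounts at: #bob ifAbsent: [0].	"0"
#(1 2 3) detect: [:each | each > 5] ifNone: [nil].	"nil"
```

Option 1 is the shortest at the call site; options 2 and 3 make the absent case explicit at the cost of more syntax.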
The problems with accepting nil as the general "nothing to see" marker include:
- There are too many. In our example, an account holder could have an instVar "account" which could be nil when not having an account yet, BUT ALSO an account could have a "closingDate" for when the account folded, which is "nil" when the account is still open, AND ALSO, a transaction could have an "auditor" which is nil as long as no audit has taken place etc. Just like that, nil takes _different roles_ just by being convenient.
As do the integers, symbols, collections. As do classes. Should we have different kinds of classes for the class library, the meta level, the domain model? Or is it sufficient to have classes, metaclasses and traits? i.e. do we need to label things according to their use, or is it sufficient to differentiate them purely by function? I think the latter.
- Nil has problematic provenance. When somewhere during debugging (the all-known MNU for UndefinedObject) a nil pops up, it is often reallllly hard to say whence it came from. So dealing with a lot of nil-bearing code will send you down one rabbit hole after the other.
Whereas NaN doesn't have problematical provenance? Or getting an identically-valued instance with a different identity, or a host of other potential evils.... These are not specific to nil, and not specific to using nil as a marker of absence, or as bottom. Programming is tricky; we have lots of ways of doing things; things happen very fast; code does what you tell it, not what you want it to do. We find we make mistakes all the time. This is not specific to the use of nil in our domain models.
- Nil begets polymorphism, nil defies polymorphism. It is one of the most awesome feats of Smalltalk that nil is NOT like NULL, in that it can respond to messages. That is exceptionally powerful and has given Smalltalk a lot of resilience. But cluttering UndefinedObject with custom, even domain-specific methods is a really bad idea. However, that means it is often unwise to just have objects that could be nil be sent arbitrary messages. Hence a multitude of #isNil/#ifNil-Checks. Proper domain objects that model absence, pre-valid state, or error conditions can deal much better with that.
When it's possible then fine. But in the query example above we must be able to deal with absence/have an element disjoint from a domain. And nil functions ideally for this case.
- Nil is in a collection-superposition (just like your good old USB-A plug which you have to turn at least twice to fit). You only know whether nil _actually_ could be a collection when you know that a non-nil object in its place is a collection [3]. Said differently: in contrast to LISPy languages, our nil is _by design_ no collection, while LISPy null _by design_ IS the empty list. This makes for funny messages like #isEmptyOrNil, which bails on non-nil-non-collection objects. So every time you have to deal with nil, you automatically at least once have to answer the question "could the non-nil version of this object be a collection"?
Many uses of isEmptyOrNil are in response to FillInTheBlank which can return nil on cancel or an empty string if the user doesn't type anything. This is a natural consequence of the affordances of FillInTheBlank, and isEmptyOrNil is simply a pragmatic response, hardly a symptom of some deep problem.
There are a lot of interesting approaches to each or combinations of these issues. This includes Null-Object patterns, Sane-default-initializers, exceptions, or explicit models of multiplicity[4].
But back to the beginning.
In code that does, for example
blaFooAnAccount
    | fooAccount |
    self accounts processBla: [:ea |
        fooAccount ifNil: [fooAccount := ea].
        fooAccount := (ea doesBork: fooAccount)
            ifTrue: [fooAccount]
            ifFalse: [ea]].
    ^ fooAccount
we find two things:
I think we find three things. The things you list below, plus the fact that this should have been written using detect:ifNone: ;-)
First, we could inadvertently return nil from that method. But this is technical and I think most here can deal with that.
But second, the line "fooAccount ifNil: [fooAccount := ea]." ACTUALLY says
"if fooAccount is an uninitialized temporary variable, populate it".
This is technically correct, but conflates domains. In our world of finance, the idea of a "temporary variable" does not make sense. It is part of the meta-level domain, the system.
I don't say this is wrong _per se_ but people reading, and even more so, people writing such code MUST be aware that they are crossing domains, and especially, entering a meta level.
That's why I think these warnings are really ok. I won't fight the commit "Compiler-eem.480.mcz", especially since it more or less is descriptive of a pervasive style of writing Smalltalk of Squeak Core contributors.
I hope people find these ideas useful.
Best regards -Tobias
[0]: I've used it in Python code to much joy.
[1]: Caveat lector: the "domain layer" can surely be the "system layer". In fact that is what a lot of system code deals with. But this is messy for our considerations above and let's treat it as _exceptional_.
[2]: Semi-arbitrarily, just because I received my tax returns :P
[3]: yes, that is a strange sentence. It's late. Also, Ardbeg.
[4]: For example, as in the work of Steimann (https://dl.acm.org/doi/10.1145/2509578.2509582 ). It seems they had a Smalltalk implementation in 2017.
-t
On 23. Nov 2022, at 19:34, tim Rowledge tim@rowledge.org wrote:
I won't quote it all again but what Eliot wrote is important. There are good solid reasons why Smalltalk has a rigorously defined UndefinedObject. We demand rigorously defined areas of doubt and uncertainty!
tim
_,,,^..^,,,_ (phone)
-- _,,,^..^,,,_ best, Eliot
Hi Eliot,
thanks for the great discussion. Your reply cleared a lot of things up for me -- I very much agree that the default value of any variable is not "undefined", but specifically nil.
What I still find an interesting trade-off is that of "implicit vs explicit" aka "terseness vs readability". I think its solution depends on the typical readers of the code I am writing. If they are Squeak Core developers or VM developers, I can assume that they know the default value for any variable. For inexperienced programmers, I hope they do know. Still, it's easier for anyone to forget if it's not explicitly written down. Yes, I can translate | a | to | a := nil | in my head, but that also might be a tiny portion of mental overhead.
Example 2, ColorArray: My first take would indeed be to favor ColorArray new: 256 withAll: Color transparent over ColorArray new: 256, because I have never used this class before and would not be sure whether the values default to black or transparent or anything else. I could look it up, but I would likely forget it several times before remembering. Why should I do this to myself (and to any other non-ColorArray expert)?
Example 3, point arithmetic: Here I'm completely with you, but maybe I'm also biased because I have already spent some time working with visual domains in Squeak. :-)
Example 4, coreutils: ls, cat, and less are all lisping, catastrophic, or at least less intuitive names IMHO. Most of us will have got familiar with them, but from a Smalltalk perspective that is seeking for readability, I still would favor them to be named ListFiles, PrintFile, and OpenFileForRead. On the contrary, many people who are used to the present nomenclature would dismiss these proposals as hard to write and maybe even hard to read. Is "terseness vs readability" at this syntactical point really a matter of data (code) quality (that must be ensured by the writer), or more a matter of the right tooling (that can be adjusted by the editor/reader)?
---
I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]".
Thank you for clarification. So the idea of #queryUndefined (will rename that) is to fix human forgetfulness. In the second group of examples, you explicitly consider these variables being nil, so you don't need a reminder from the compiler (or with regards to Tobi's argument, you're already dealing with the meta-level). In the first group of examples, one could say that you are covering tracks of your own possible forgetfulness and spreading possibly unassigned values, so it's more important for the tooling to point you to that possible slip. Yes, I guess now that makes more sense to me. :-)
Still, the scope of this warning remains a bit blurry for me. Maybe that's because we are approximating a type analysis engine here with a *very* rough heuristic. For instance:
| a | a ifNotNil: [a := a + 1]. ^ a
In this example, I do *not* want a warning for "a + 1" but I *do* want a warning for "^ a" as a still might be unassigned. Currently, the reality is just the other way around. But that is probably out of scope for the current architecture ...
(As a side note, some other confusing thing around this notification for me is the fact that our compiler essentially integrates a few rudimentary linter tools (namely, UndefinedVariable and UnknownSelector) and forces the user to interact with them in a modal fashion. To be honest, I never liked that "modal linter style" and often wish we had some more contemporary annotation-/wiggle-line-based tooling for that. My typical interaction with #queryUndefined looks like this: me: accept new code, compiler: you did not assign foobar, me: oops, you're right, let me fix that; or alternatively: yes, that was intended, let me explicate that. Ignoring and proceeding from this warning has never felt acceptable for me as this would put the same confusion on any future editor of the method.)
:-) Hope I'm not too strident :-)
No problem, it was very interesting and I learn a lot from your replies. :-)
Best, Christoph
--- Sent from Squeak Inbox Talk
On 2022-11-22T23:36:09-08:00, eliot.miranda@gmail.com wrote:
Hi Christoph, Hi Marcel,
apologies about the font size mismatches...
On Wed, Nov 23, 2022 at 2:25 AM Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
+1
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Good idea. I'll include it.
Best, Marcel
On 23.11.2022 at 11:00:43, Thiede, Christoph <christoph.thiede at student.hpi.uni-potsdam.de> wrote:
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Please indulge me. It's f***ing irritating to be told by the compiler that a temp var appears to be uninitialized when one is intentionally using the fact that temps are initialized to nil. And that temp vars are initialized to nil is a) essential knowledge and b) a good thing (no uninitialized local variables a la C, a sensible value to initialize a variable with).
BTW, I find it more than sad (a little alarming in fact) that some Smalltalkers don't know that the value of several conditionals that take blocks is nil when the condition doesn't select the block. e.g. false ifTrue: [self anything] is nil. I see "expr ifNotNil: [...] ifNil: [nil]" and it strikes me as illiterate. I recently visited code written by a strong programmer who open coded a lot of point arithmetic, decomposing e.g. a * b into (a x * b x) @ (a y * b y). It's bad. It gradually degrades the code base in that it isn't always an exemplar of best practice.
Consider the following examples:
| a b c d e f g h |
a ifNil: [a := 1].
c := b.
c ifNil: [c := 3].
#(1 2 3) sorted: d.
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].
g perform: #ifNotNil: with: [b := g].
h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
No. It's a hard-and-fast rule that all temp vars are initialized to nil. And initializing a variable (to other than nil) is done by assigning it. In the above a through h are declared within the vertical bars. They are initialized in the assignments. I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]". I never want to see "ifNotNilDo", ever ;-) (* note that a couple of years back we fixed a bad bug in the compiler where block local temps were not (re)initialized to nil on each iteration, leaking their values from previous iterations, breaking the "all temp vars are initialized to nil" rule, and revealing implementation details in the compiler's inlining of to:[by:]do: forms)
This behavior leads to a mental model that disambiguates between null and undefined similar to JavaScript which I never have found helpful.
I don't see how that applies. Smalltalk has no undefined. It has nil & zero, and these values are used to initialize any and all variables. This is not an artifact of the implementation. It is a fundamental part of the language design. It results in no dangling referents or uninitialized variables. The language used in Parser>>#queryUndefined is problematic. It should be "unassigned", not "undefined". There is nothing undefined about these variables. But they are indeed unassigned. In some cases (see my idiomatic implementation of subsequences: and substrings) this can (and *should*) be used to advantage. And all Smalltalk programming courses should explain that variables are always initialized (either to nil or zero, & hence by extension 0.0, Character null, Color transparent, et al), and may need assignment before their referents get sent messages.
I see the same kind of sloppiness in people not knowing that conditionals that take blocks typically evaluate to nil when the condition doesn't select the block. So always "expr ifNotNil: [...]", never "expr ifNotNil: [...] ifNil: [nil]", or "expr ifNotNil: [...] ifNil: []". I recently cleaned up code by a strong programmer who had open coded point arithmetic (e.g. a * b written as (a x * b x) @ (a y * b y) ). This is really bad: it's exemplifying poor practice, it's verbose, it takes away at least as much understanding as it conveys, it leads to more difficult to manage code.
If we fail to teach the language properly we start on a slippery slope to duplication (which is an awful evil, leading to much increased maintenance effort, and brittleness), and rendering perfectly good, well thought-out idioms mysterious. It's not like Smalltalk has a lot of rules; the number, compared to C & C++ et al is tiny. And terseness has not just aesthetic benefit, but real practical benefit in terms of readability & maintainability.
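A short workspace sketch of the two idioms mentioned here, assuming a current Squeak image. Each line can be evaluated on its own with "print it"; the literal values are arbitrary:

```smalltalk
"A conditional whose block is not selected answers nil,
 so a trailing ifNil: [nil] adds nothing."
(false ifTrue: [#selected]) isNil.	"true"
(nil ifNotNil: [:x | x printString]) isNil.	"true"
(3 ifNotNil: [:x | x + 1]) = 4.	"true"

"Point arithmetic is already componentwise; no need to open code it."
((2 @ 3) * (4 @ 5)) = (8 @ 15).	"true"
((2 @ 3) * (4 @ 5)) = (((2 @ 3) x * (4 @ 5) x) @ ((2 @ 3) y * (4 @ 5) y)).	"true"
```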
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
But that is a MISTAKE!! The language designers didn't arrange for temps to be initialized to nil just because that's the only default. They did it to ensure that there is no such thing as an uninitialized variable in Smalltalk. That's why nil is an object, with a class, not just nil. That's why nil ~~ false. It's carefully thought out and not just some artifact of the implementation. And that rationale (read the blue book carefully) and its implications, should be taught/learned/known, and especially exemplified by the core code of Squeak trunk, and hence supported by the compiler.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
I disagree. You're advocating for absurdities such as
| colors |
colors := ColorArray new: 256.
colors atAllPut: Color transparent
This is the kind of thinking that leads to cycling wearing American Football clothes. It won't keep you from being run over by a truck, but it'll make you so slow and reduce your peripheral vision so much, not to mention give you a false sense of security, that you'll be much more likely to be run over by a truck...
Looking forward to your opinion!
:-) Hope I'm not too strident :-)
Best,
Christoph
*From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org> *Sent:* Wednesday, 23 November 2022 04:10:30 *To:* squeak-dev at lists.squeakfoundation.org; packages at lists.squeakfoundation.org *Subject:* [squeak-dev] The Trunk: Compiler-eem.480.mcz
Eliot Miranda uploaded a new version of Compiler to project The Trunk: http://source.squeak.org/trunk/Compiler-eem.480.mcz
==================== Summary ====================
Name: Compiler-eem.480 Author: eem Time: 22 November 2022, 7:10:27.324796 pm UUID: 3e5ba19e-c44a-4390-9004-de1246736cbc Ancestors: Compiler-eem.479
Do not warn of an uninitialized temporary if it is being sent ifNil: or ifNotNil:.
=============== Diff against Compiler-eem.479 ===============
Item was changed:
----- Method: Parser>>primaryExpression (in category 'expression types') -----
  primaryExpression
      hereType == #word
          ifTrue:
              [parseNode := self variable.
+             (parseNode isUndefTemp
+              and: [(#('ifNil:' 'ifNotNil:') includes: here) not
+              and: [self interactive]])
+                 ifTrue:
+                     [self queryUndefined].
-             (parseNode isUndefTemp and: [self interactive])
-                 ifTrue: [self queryUndefined].
              parseNode nowHasRef.
              ^ true].
      hereType == #leftBracket
          ifTrue:
              [self advance.
              self blockExpression.
              ^true].
      hereType == #leftBrace
          ifTrue:
              [self braceExpression.
              ^true].
      hereType == #leftParenthesis
          ifTrue:
              [self advance.
              self expression ifFalse: [^self expected: 'expression'].
              (self match: #rightParenthesis)
                  ifFalse: [^self expected: 'right parenthesis'].
              ^true].
      (hereType == #string or: [hereType == #number or: [hereType == #literal or: [hereType == #character]]])
          ifTrue:
              [parseNode := encoder encodeLiteral: self advance.
              ^true].
      (here == #- and: [tokenType == #number and: [1 + hereEnd = mark]])
          ifTrue:
              [self advance.
              parseNode := encoder encodeLiteral: self advance negated.
              ^true].
      ^false!
-- _,,,^..^,,,_ best, Eliot
Hi Christoph,
On Fri, Nov 25, 2022 at 7:47 AM christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi Eliot,
thanks for the great discussion. Your reply cleared a lot of things up for me -- I very much agree that the default value of any variable is not "undefined", but specifically nil.
What I still find an interesting trade-off is that of "implicit vs explicit" aka "terseness vs readability". I think its solution depends on the typical readers of the code I am writing. If they are Squeak Core developers or VM developers, I can assume that they know the default value for any variable. For inexperienced programmers, I hope they do know. Still, it's easier for anyone to forget if it's not explicitly written down. Yes, I can translate | a | to | a := nil | in my head, but that also might be a tiny portion of mental overhead.
Right. One of the design principles of Smalltalk, one of its more important design principles, is to use the least amount of rules possible (but no fewer) and apply them consistently. This applies, for example, to Maxwell's equations, right? It is a critical attribute of good scientific models, good legal systems, good road traffic laws, not just good programming languages. Finding a small set of easily memorable rules, or principles, that are orthogonal and cohere to provide a calculus which can be used to construct or predict behaviour or consequence, is a delight and a blessing.
There was one rule in Smalltalk that was awful; that was a result of the implementation showing through, (that IIRC got into the ANSI standard), that we fixed. That was that the value of an empty block that took arguments, was the value of the last argument. So [:a :b|] value: 1 value: 2 would evaluate to 2. Thankfully this has been fixed and the value of an empty block is always nil, irrespective of its arguments. ColorArray new: n is not something that needs to be fixed, or alarmed at. It can be experienced (I discovered it this last week, I was not astonished). I wager you will never forget that the default value of a ColorArray indexed instance variable is Color transparent. You may even use it in your teaching as an amusing and interesting example for the initialization rule. I doubt that it would tax any new Smalltalker that has already encountered bits collections, and would serve to help them remember the initialize-with-zero sub-rule.
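As a quick check of the empty-block rule described above, here is a two-line sketch. It assumes a current Squeak image in which the fix is present; the argument values are arbitrary:

```smalltalk
"An empty block answers nil, with or without arguments."
[] value isNil.	"true"
([:a :b | ] value: 1 value: 2) isNil.	"true once the fix described above is in the image"
```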
Example 2, ColorArray: My first take would indeed be to favor ColorArray new: 256 withAll: Color transparent over ColorArray new: 256, because I have never used this class before and would not be sure whether the values default to black or transparent or anything else. I could look it up, but I would likely forget it several times before remembering. Why should I do this to myself (and to any other non-ColorArray expert)?
The thing about ColorArray that's important to our discussion is that a new instance is initialized; it doesn't contain random crap. This is the point about initialization that applies to all Smalltalk variables, temporary, global, named instance, and indexed instance. We have one rule, if the variable holds pointers it is initialized by nil, if the variable holds bits it is initialized with the all-zeros bit string. This has some unexpected consequences such as (ColorArray new: 10) anyOne = Color transparent, but the nice thing is that the rule is very easy to learn, and so when one discovers (ColorArray new: 10) anyOne = Color transparent one is not astonished. (the principle of least astonishment)
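The one-rule claim above can be tried out in a workspace. This is a minimal sketch assuming a current Squeak image; the variable name temp and the sizes are arbitrary, and the ColorArray line simply repeats the expression quoted in the message:

```smalltalk
| temp |
"Pointer variables start out as nil."
temp ifNil: [#unassigned].	"#unassigned"
(Array new: 3) first isNil.	"true"

"Bit variables start out as all zeros, read through the class's own lens."
(ByteArray new: 3) first = 0.	"true"
(String new: 2) first = (Character value: 0).	"true"
(ColorArray new: 10) anyOne = Color transparent.	"true, as quoted above"
```

The temporary is tested with ifNil: rather than isNil so that, with Compiler-eem.480 loaded, evaluating the snippet interactively should not raise the unassigned-temp query.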
Example 3, point arithmetic: Here I'm completely with you, but maybe I'm also biased because I have already spent some time working with visual domains in Squeak. :-)
:-) indeed!
Example 4, coreutils: ls, cat, and less are all lisping, catastrophic, or at least less intuitive names IMHO. Most of us will have got familiar with them, but from a Smalltalk perspective that is seeking for readability, I still would favor them to be named ListFiles, PrintFile, and OpenFileForRead. On the contrary, many people who are used to the present nomenclature would dismiss these proposals as hard to write and maybe even hard to read. Is "terseness vs readability" at this syntactical point really a matter of data (code) quality (that must be ensured by the writer), or more a matter of the right tooling (that can be adjusted by the editor/reader)?
Learning to play a musical instrument, learning to wield a golf club, learning to drive a car, ride a bike, accumulate any useful skill, involves a tradeoff. Masterful use always drops the training wheels. Eventually we always advance to using a functionally efficient interface and/or set of skills. This makes life hard for the learner. They have to acquire the skills that allow them to reach competence on their way to mastery. But we do not serve ourselves by preventing mastery, by protecting us against mastery. If we do, we end up with inexpressive music, short and unexciting drives, 30 kph autobahns, 70 kg bikes that don't lean in the corners.
The Prussian Model of Education was set up under Bismarck, its intent being to produce a generation of easily ruled consumers and cannon fodder. Its means were to teach badly, to render its subjects insecure, consciously inferior to those educated in elite private schools. This model was adopted by the US Senate immediately after the civil war as the blueprint for the US's public education system. It serves as the ideal instrument of oppression in faux democracies. [I was incredibly fortunate that it was not used in my public primary school; unsurprisingly, teachers that love children hate the Prussian Model].
Driving a child to school, rather than having them walk, and/or catch a bus, results in an adult less at ease moving around the city. It infantilises them. It constrains them their whole lives.
Wittgenstein's realisation was that language means what we collectively decide it means, not what it is defined to mean, and the fact that a substantial number of American adults think that "inflammable" means the opposite of "flammable" is alarming, an example that militates for better teaching, not Newspeak.
If we are to live meaningful, rich, lives that enrich those of others, we must enculturate, we must take off the training wheels, we must fashion that which is concise, powerful and call it elegant.
---
I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]".
Thank you for clarification. So the idea of #queryUndefined (will rename that) is to fix human forgetfulness.
Yes.
In the second group of examples, you explicitly consider these variables being nil, so you don't need a reminder from the compiler (or with regards to Tobi's argument, you're already dealing with the meta-level). In the first group of examples, one could say that you are covering tracks of your own possible forgetfulness and spreading possibly unassigned values, so it's more important for the tooling to point you to that possible slip. Yes, I guess now that makes more sense to me. :-)
Cool.
Still, the scope of this warning remains a bit blurry for me. Maybe that's because we are approximating a type analysis engine here with a *very* rough heuristic.
I think that's exactly why.
For instance:
| a | a ifNotNil: [a := a + 1]. ^ a
In this example, I do *not* want a warning for "a + 1" but I *do* want a warning for "^ a" as a still might be unassigned. Currently, the reality is just the other way around. But that is probably out of scope for the current architecture ...
I love the pragmatic design principle from the LRG/SCG at PARC that I was taught on arrival at ParcPlace Systems: don't let the perfect be the enemy of the good. It's ok to have 95% of the solution.
(As a side note, some other confusing thing around this notification for me is the fact that our compiler essentially integrates a few rudimentary linter tools (namely, UndefinedVariable and UnknownSelector) and forces the user to interact with them in a modal fashion. To be honest, I never liked that "modal linter style" and often wish we had some more contemporary annotation-/wiggle-line-based tooling for that.
+1000 !!!!! Thanks for pointing this out. This is an excellent suggestion. I *hate* clicking on the "declare as a foo variable" dialog for a sequence of as-yet-undeclared temp vars, only to hit a lint rule, or syntax error, and have all those declarations discarded.
My typical interaction with #queryUndefined looks like this: me: accept new code, compiler: you did not assign foobar, me: oops, you're right, let me fix that; or alternatively: yes, that was intended, let me explicate that. Ignoring and proceeding from this warning has never felt acceptable for me as this would put the same confusion on any future editor of the method.)
:-) Hope I'm not too strident :-)
No problem, it was very interesting and I learn a lot from your replies. :-)
Best, Christoph
Sent from Squeak Inbox Talk https://github.com/hpi-swa-lab/squeak-inbox-talk
On 2022-11-22T23:36:09-08:00, eliot.miranda@gmail.com wrote:
Hi Christoph, Hi Marcel,
apologies about the font size mismatches...
On Wed, Nov 23, 2022 at 2:25 AM Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
+1
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Good idea. I'll include it.
Best, Marcel
On 23.11.2022 11:00:43, Thiede, Christoph <christoph.thiede at student.hpi.uni-potsdam.de> wrote:
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Please indulge me. It's f***ing irritating to be told by the compiler that a temp var appears to be uninitialized when one is intentionally using the fact that temps are initialized to nil. And that temp vars are initialized to nil is a) essential knowledge and b) a good thing (no uninitialized local variables a la C, a sensible value to initialize a variable with).
BTW, I find it more than sad (a little alarming in fact) that some Smalltalkers don't know that the value of several conditionals that take blocks is nil when the condition doesn't select the block. e.g. false ifTrue: [self anything] is nil. I see "expr ifNotNil: [...] ifNil: [nil]" and it strikes me as illiterate. I recently visited code written by a strong programmer who open coded a lot of point arithmetic, decomposing e.g. a * b into (a x * b x) @ (a y * b y). It's bad. It gradually degrades the code base in that it isn't always an exemplar of best practice.
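A workspace sketch of the point about conditionals, assuming a current Squeak image; the temp names r1, r2 and r3 are illustrative and not from the thread. It shows that a conditional whose block is not selected answers nil, so an explicit ifNil: [nil] arm adds nothing.

```smalltalk
| r1 r2 r3 |
"The block is not selected, so the conditional answers nil."
r1 := false ifTrue: [#anything].
"The receiver is nil, so the block is not evaluated and the answer is nil."
r2 := nil ifNotNil: [:x | x printString].
"The same answer, spelled out redundantly."
r3 := nil ifNotNil: [:x | x printString] ifNil: [nil].
{r1. r2. r3}
```

Evaluating this with "print it" should answer an Array of three nils, which is why the shorter form is enough.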
Consider the following examples:
| a b c d e f g h | a ifNil: [a := 1]. c := b. c ifNil: [c := 3]. #(1 2 3) sorted: d. e := 5. (e isNil or: [f isNil]) ifTrue: [e := f := 6]. g perform: #ifNotNil: with: [b := g]. h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
No. It's a hard-and-fast rule that all temp vars are initialized to nil. And initializing a variable (to other than nil) is done by assigning it. In the above a through h are declared within the vertical bars. They are initialized in the assignments. I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]". I never want to see "ifNotNilDo", ever ;-) (* note that a couple of years back we fixed a bad bug in the compiler where block local temps were not (re)initialized to nil on each iteration, leaking their values from previous iterations, breaking the "all temp vars are initialized to nil" rule, and revealing implementation details in the compiler's inlining of to:[by:]do: forms)
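A small sketch of the block-local temp rule mentioned in the note above, assuming an image that contains that compiler fix; the names seen and t are hypothetical. The temp is read through ifNil:, which is the usage Compiler-eem.480 stops warning about.

```smalltalk
| seen |
seen := OrderedCollection new.
1 to: 3 do: [:i | | t |
	"t should be nil at the start of every pass, even though to:do: is inlined."
	seen add: (t ifNil: [#unassigned]).
	"Assign t at the end of the pass; this value must not leak into the next one."
	t := i].
seen asArray
```

If the temp leaked between iterations, the second and third elements would be 1 and 2 rather than #unassigned.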
This behavior leads to a mental model that disambiguates between null and undefined similar to JavaScript which I never have found helpful.
I don't see how that applies. Smalltalk has no undefined. It has nil & zero, and these values are used to initialize any and all variables. This is not an artifact of the implementation. It is a fundamental part of the language design. It results in no dangling referents or uninitialized variables. The language used in Parser>>#queryUndefined is problematic. It should be "unassigned", not "undefined". There is nothing undefined about these variables. But they are indeed unassigned. In some cases (see my idiomatic implementation of subsequences: and substrings) this can (and *should*) be used to advantage. And all Smalltalk programming courses should explain that variables are always initialized (either to nil or zero, & hence by extension 0.0, Character null, Color transparent, et al), and may need assignment before their referents get sent messages.
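A workspace sketch of the "always initialized" rule, assuming a current Squeak image; the temp name t is illustrative. It reads a temp and freshly allocated collections before anything has been assigned or stored.

```smalltalk
| t |
{t ifNil: [#unassigned].                          "an unassigned temp is nil"
 (Array new: 2) first.                            "pointer slots start as nil"
 (ByteArray new: 2) first.                        "byte slots start as zero"
 (String new: 1) first = (Character value: 0)}    "a zero byte read as a character"
```

Evaluating this with "print it" should answer {#unassigned . nil . 0 . true}: every slot has a defined value before any assignment.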
I see the same kind of sloppiness in people not knowing that conditionals that take blocks typically evaluate to nil when the condition doesn't select the block. So always "expr ifNotNil: [...]", never "expr ifNotNil: [...] ifNil: [nil]", or "expr ifNotNil: [...] ifNil: []". I recently cleaned up code by a strong programmer who had open coded point arithmetic (e.g. a * b written as (a x * b x) @ (a y * b y) ). This is really bad: it's exemplifying poor practice, it's verbose, it takes away at least as much understanding as it conveys, it leads to more difficult to manage code.
If we fail to teach the language properly we start on a slippery slope to duplication (which is an awful evil, leading to much increased maintenance effort, and brittleness), and rendering perfectly good, well thought-out idioms mysterious. It's not like Smalltalk has a lot of rules; the number, compared to C & C++ et al is tiny. And terseness has not just aesthetic benefit, but real practical benefit in terms of readability & maintainability.
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
But that is a MISTAKE!! The language designers didn't arrange for temps to be initialized to nil just because that's the only default. They did it to ensure that there is no such thing as an uninitialized variable in Smalltalk. That's why nil is an object, with a class, not just nil. That's why nil ~~ false. It's carefully thought out and not just some artifact of the implementation. And that rationale (read the blue book carefully) and its implications, should be taught/learned/known, and especially exemplified by the core code of Squeak trunk, and hence supported by the compiler.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
I disagree. You're advocating for absurdities such as
| colors | colors := ColorArray new: 256. colors atAllPut: Color transparent
This is the kind of thinking that leads to cycling wearing American Football clothes. It won't keep you from being run over by a truck, but it'll make you so slow and reduce your peripheral vision so much, not to mention give you a false sense of security, that you'll be much more likely to be run over by a truck...
Looking forward to your opinion!
:-) Hope I'm not too strident :-)
Best,
Christoph
*From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org>
*Sent:* Wednesday, 23 November 2022 04:10:30 *To:* squeak-dev at lists.squeakfoundation.org; packages at lists.squeakfoundation.org *Subject:* [squeak-dev] The Trunk: Compiler-eem.480.mcz
Eliot Miranda uploaded a new version of Compiler to project The Trunk: http://source.squeak.org/trunk/Compiler-eem.480.mcz
==================== Summary ====================
Name: Compiler-eem.480 Author: eem Time: 22 November 2022, 7:10:27.324796 pm UUID: 3e5ba19e-c44a-4390-9004-de1246736cbc Ancestors: Compiler-eem.479
Do not warn of an uninitialized temporary if it is being sent ifNil: or ifNotNil:.
=============== Diff against Compiler-eem.479 ===============
Item was changed: ----- Method: Parser>>primaryExpression (in category 'expression types') -----
primaryExpression hereType == #word ifTrue: [parseNode := self variable.
+ (parseNode isUndefTemp
+ and: [(#('ifNil:' 'ifNotNil:') includes: here) not
+ and: [self interactive]])
+ ifTrue:
+ [self queryUndefined].
- (parseNode isUndefTemp and: [self interactive])
- ifTrue: [self queryUndefined].
parseNode nowHasRef. ^ true]. hereType == #leftBracket ifTrue: [self advance. self blockExpression. ^true]. hereType == #leftBrace ifTrue: [self braceExpression. ^true]. hereType == #leftParenthesis ifTrue: [self advance. self expression ifFalse: [^self expected: 'expression']. (self match: #rightParenthesis) ifFalse: [^self expected: 'right parenthesis']. ^true]. (hereType == #string or: [hereType == #number or: [hereType == #literal or: [hereType == #character]]]) ifTrue: [parseNode := encoder encodeLiteral: self advance. ^true]. (here == #- and: [tokenType == #number and: [1 + hereEnd = mark]]) ifTrue: [self advance. parseNode := encoder encodeLiteral: self advance negated. ^true]. ^false!
_,,,^..^,,,_ best, Eliot
On 2022-11-26, at 9:37 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
a new instance is initialized; it doesn't contain random crap.
This is a really, really, important fact that gets far too little appreciation. We (as in those of us that have done significant amounts of living in the bowels of VM implementations) had to put non-trivial effort into making that happen correctly and efficiently, because in the Olden Times (when computers did maybe one million instructions a second if you could afford a professional workstation class machine) it was *painful* to spend that time and we had to cheat in creative ways. The pay-off is that we get to not worry about the potential for horrifying things happening in the way it can with so many other programming systems. It makes garbage collection more practical, which in turn makes other parts of programming life nicer.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- If she was any dumber, she'd be a green plant.
squeak-dev@lists.squeakfoundation.org