Hi Eliot,
thanks for the great discussion. Your reply cleared a lot of things up for me -- I very much agree that the default value of any variable is not "undefined", but specifically nil.
What I still find an interesting trade-off is that of "implicit vs explicit" aka "terseness vs readability". I think its solution depends on the typical readers of the code I am writing. If they are Squeak Core developers or VM developers, I can assume that they know the default value for any variable. For inexperienced programmers, I hope they do know. Still, it's easier for anyone to forget if it's not explicitly written down. Yes, I can translate | a | to | a := nil | in my head, but that also might be a tiny portion of mental overhead.
Example 2, ColorArray: My first take would indeed be to favor ColorArray new: 256 withAll: Color transparent over ColorArray new: 256, because I have never used this class before and would not be sure whether the values default to black or transparent or anything else. I could look it up, but I would likely forget it several times before remembering. Why should I do this to myself (and to any other non-ColorArray expert)?
Example 3, point arithmetic: Here I'm completely with you, but maybe I'm also biased because I have already spent some time working with visual domains in Squeak. :-)
Example 4, coreutils: ls, cat, and less are all lisping, catastrophic, or at least less intuitive names IMHO. Most of us will have become familiar with them, but from a Smalltalk perspective that seeks readability, I would still prefer them to be named ListFiles, PrintFile, and OpenFileForRead. On the other hand, many people who are used to the present nomenclature would dismiss these proposals as hard to write and maybe even hard to read. Is "terseness vs readability" at this syntactical level really a matter of data (code) quality (that must be ensured by the writer), or more a matter of the right tooling (that can be adjusted by the editor/reader)?
---
I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]".
Thank you for the clarification. So the idea of #queryUndefined (will rename that) is to fix human forgetfulness. In the second group of examples, you explicitly consider these variables being nil, so you don't need a reminder from the compiler (or with regard to Tobi's argument, you're already dealing with the meta-level). In the first group of examples, one could say that you are covering the tracks of your own possible forgetfulness and spreading possibly unassigned values, so it's more important for the tooling to point you to that possible slip. Yes, I guess now that makes more sense to me. :-)
Still, the scope of this warning remains a bit blurry for me. Maybe that's because we are approximating a type analysis engine here with a *very* rough heuristic. For instance:
| a | a ifNotNil: [a := a + 1]. ^ a
In this example, I do *not* want a warning for "a + 1" but I *do* want a warning for "^ a" as a still might be unassigned. Currently, the reality is just the other way around. But that is probably out of scope for the current architecture ...
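To make the wished-for behavior concrete, here is the same snippet with annotations. This is only a sketch: the comments restate the preference described above and are not the output of any existing tool.

```smalltalk
| a |
a ifNotNil: [a := a + 1].   "a + 1 runs only when a is non-nil: no warning wanted here"
^ a                          "a may still be unassigned here: this is where a warning would help"
```

Evaluated as written, the snippet answers nil, because a is never assigned and the block is therefore skipped.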
(As a side note, some other confusing thing around this notification for me is the fact that our compiler essentially integrates a few rudimentary linter tools (namely, UndefinedVariable and UnknownSelector) and forces the user to interact with them in a modal fashion. To be honest, I never liked that "modal linter style" and often wish we had some more contemporary annotation-/wiggle-line-based tooling for that. My typical interaction with #queryUndefined looks like this: me: accept new code, compiler: you did not assign foobar, me: oops, you're right, let me fix that; or alternatively: yes, that was intended, let me explicate that. Ignoring and proceeding from this warning has never felt acceptable for me as this would put the same confusion on any future editor of the method.)
:-) Hope I'm not too strident :-)
No problem, it was very interesting and I learned a lot from your replies. :-)
Best, Christoph
--- Sent from Squeak Inbox Talk
On 2022-11-22T23:36:09-08:00, eliot.miranda@gmail.com wrote:
Hi Christoph, Hi Marcel,
apologies about the font size mismatches...
On Wed, Nov 23, 2022 at 2:25 AM Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
+1
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Good idea. I'll include it.
Best, Marcel
On 23.11.2022 11:00:43, Thiede, Christoph < christoph.thiede at student.hpi.uni-potsdam.de> wrote:
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Please indulge me. It's f***ing irritating to be told by the compiler that a temp var appears to be uninitialized when one is intentionally using the fact that temps are initialized to nil. And that temp vars are initialized to nil is a) essential knowledge and b) a good thing (no uninitialized local variables a la C, a sensible value to initialize a variable with).
BTW, I find it more than sad (a little alarming in fact) that some Smalltalkers don't know that the value of several conditionals that take blocks is nil when the condition doesn't select the block. e.g. false ifTrue: [self anything] is nil. I see "expr ifNotNil: [...] ifNil: [nil]" and it strikes me as illiterate. I recently visited code written by a strong programmer who open coded a lot of point arithmetic, decomposing e.g. a * b into (a x * b x) @ (a y * b y). It's bad. It gradually degrades the code base in that it isn't always an exemplar of best practice.
Consider the following examples:
| a b c d e f g h |
a ifNil: [a := 1].
c := b.
c ifNil: [c := 3].
#(1 2 3) sorted: d.
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].
g perform: #ifNotNil: with: [b := g].
h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
No. It's a hard-and-fast rule that all temp vars are initialized to nil. And initializing a variable (to other than nil) is done by assigning it. In the above a through h are declared within the vertical bars. They are initialized in the assignments. I want a warning for the usage of b in "c := b", "d" in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]". I never want to see "ifNotNilDo", ever ;-) (* note that a couple of years back we fixed a bad bug in the compiler where block local temps were not (re)initialized to nil on each iteration, leaking their values from previous iterations, breaking the "all temp vars are initialized to nil" rule, and revealing implementation details in the compiler's inlining of to:[by:]do: forms)
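As a sketch, the wanted and unwanted warnings can be annotated directly on the example. The comments only restate the preferences listed above; they do not describe what the compiler currently reports, and the ifNotNilDo: line is left out.

```smalltalk
| a b c d e f g |
a ifNil: [a := 1].                               "no warning: a is deliberately tested for nil"
c := b.                                          "warning: b is read before any assignment"
c ifNil: [c := 3].                               "no warning: c is deliberately tested for nil"
#(1 2 3) sorted: d.                              "warning: d is passed along unassigned"
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].   "no warning: e and f are tested for nil"
g perform: #ifNotNil: with: [b := g].            "warning: g is used unassigned; the selector is only a literal argument here"
```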
This behavior leads to a mental model that disambiguates between null and undefined similar to JavaScript which I never have found helpful.
I don't see how that applies. Smalltalk has no undefined. It has nil & zero, and these values are used to initialize any and all variables. This is not an artifact of the implementation. It is a fundamental part of the language design. It results in no dangling referents or uninitialized variables. The language used in Parser>>#queryUndefined is problematic. It should be "unassigned", not "undefined". There is nothing undefined about these variables. But they are indeed unassigned. In some cases (see my idiomatic implementation of subsequences: and substrings) this can (and *should*) be used to advantage. And all Smalltalk programming courses should explain that variables are always initialized (either to nil or zero, & hence by extension 0.0, Character null, Color transparent, et al), and may need assignment before their referents get sent messages.
I see the same kind of sloppiness in people not knowing that conditionals that take blocks typically evaluate to nil when the condition doesn't select the block. So always "expr ifNotNil: [...]", never "expr ifNotNil: [...] ifNil: [nil]", or "expr ifNotNil: [...] ifNil: []". I recently cleaned up code by a strong programmer who had open coded point arithmetic (e.g. a * b written as (a x * b x) @ (a y * b y) ). This is really bad: it's exemplifying poor practice, it's verbose, it takes away at least as much understanding as it conveys, it leads to more difficult to manage code.
If we fail to teach the language properly we start on a slippery slope to duplication (which is an awful evil, leading to much increased maintenance effort, and brittleness), and rendering perfectly good, well thought-out idioms mysterious. It's not like Smalltalk has a lot of rules; the number, compared to C & C++ et al is tiny. And terseness has not just aesthetic benefit, but real practical benefit in terms of readability & maintainability.
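A minimal workspace sketch of the "nil or zero" rule, assuming a standard Squeak image; the class choices are illustrative and the comments give the value of each expression:

```smalltalk
| t |
t ifNil: [#unassigned].                  "#unassigned: a temp is nil until it is assigned"
(Array new: 3) first isNil.              "true: pointer slots start out as nil"
(ByteArray new: 3) first = 0.            "true: byte slots start out as zero"
(String new: 1) first asInteger = 0.     "true: the character with value 0"
```

In each case the fresh variable or slot already holds a well-defined object, which is why "unassigned" describes the situation better than "undefined".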
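A short workspace sketch of both idioms, assuming a standard Squeak image; the literal values are only illustrative and the comments give the value of each expression:

```smalltalk
false ifTrue: [#selected].             "nil: the block is not selected"
nil ifNotNil: [:x | x printString].    "nil: no ifNil: [nil] branch is needed"
3 ifNotNil: [:x | x + 1].              "4"
(2 @ 3) * (4 @ 5).                     "8@15, the same as (2 * 4) @ (3 * 5)"
```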
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
But that is a MISTAKE!! The language designers didn't arrange for temps to be initialized to nil just because that's the only default. They did it to ensure that there is no such thing as an uninitialized variable in Smalltalk. That's why nil is an object, with a class, not just nil. That's why nil ~~ false. It's carefully thought out and not just some artifact of the implementation. And that rationale (read the blue book carefully) and its implications, should be taught/learned/known, and especially exemplified by the core code of Squeak trunk, and hence supported by the compiler.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
I disagree. You're advocating for absurdities such as
| colors | colors := ColorArray new: 256. colors atAllPut: Color transparent
This is the kind of thinking that leads to cycling wearing American Football clothes. It won't keep you from being run over by a truck, but it'll make you so slow and reduce your peripheral vision so much, not to mention give you a false sense of security, that you'll be much more likely to be run over by a truck...
Looking forward to your opinion!
:-) Hope I'm not too strident :-)
Best,
Christoph
*From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org> *Sent:* Wednesday, November 23, 2022 04:10:30 *To:* squeak-dev at lists.squeakfoundation.org; packages at lists.squeakfoundation.org *Subject:* [squeak-dev] The Trunk: Compiler-eem.480.mcz
Eliot Miranda uploaded a new version of Compiler to project The Trunk: http://source.squeak.org/trunk/Compiler-eem.480.mcz
==================== Summary ====================
Name: Compiler-eem.480 Author: eem Time: 22 November 2022, 7:10:27.324796 pm UUID: 3e5ba19e-c44a-4390-9004-de1246736cbc Ancestors: Compiler-eem.479
Do not warn of an uninitialized temporary if it is being sent ifNil: or ifNotNil:.
=============== Diff against Compiler-eem.479 ===============
Item was changed:
----- Method: Parser>>primaryExpression (in category 'expression types')
  primaryExpression
      hereType == #word
          ifTrue: [parseNode := self variable.
+             (parseNode isUndefTemp
+              and: [(#('ifNil:' 'ifNotNil:') includes: here) not
+              and: [self interactive]])
+                 ifTrue:
+                     [self queryUndefined].
-             (parseNode isUndefTemp and: [self interactive])
-                 ifTrue: [self queryUndefined].
              parseNode nowHasRef.
              ^ true].
      hereType == #leftBracket
          ifTrue: [self advance. self blockExpression. ^true].
      hereType == #leftBrace
          ifTrue: [self braceExpression. ^true].
      hereType == #leftParenthesis
          ifTrue: [self advance.
              self expression ifFalse: [^self expected: 'expression'].
              (self match: #rightParenthesis)
                  ifFalse: [^self expected: 'right parenthesis'].
              ^true].
      (hereType == #string or: [hereType == #number or: [hereType == #literal or: [hereType == #character]]])
          ifTrue: [parseNode := encoder encodeLiteral: self advance. ^true].
      (here == #- and: [tokenType == #number and: [1 + hereEnd = mark]])
          ifTrue: [self advance.
              parseNode := encoder encodeLiteral: self advance negated.
              ^true].
      ^false!
-- _,,,^..^,,,_ best, Eliot