Hi Christoph,
On Fri, Nov 25, 2022 at 7:47 AM christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi Eliot,
thanks for the great discussion. Your reply cleared a lot of things up for me -- I very much agree that the default value of any variable is not "undefined", but specifically nil.
What I still find an interesting trade-off is that of "implicit vs explicit" aka "terseness vs readability". I think its solution depends on the typical readers of the code I am writing. If they are Squeak Core developers or VM developers, I can assume that they know the default value for any variable. For inexperienced programmers, I hope they do know. Still, it's easier for anyone to forget if it's not explicitly written down. Yes, I can translate | a | to | a *:= nil* | in my head, but that also might be a tiny portion of mental overhead.
Right. One of the design principles of Smalltalk, one of its more important design principles, is to use the fewest rules possible (but no fewer) and apply them consistently. This applies, for example, to Maxwell's equations, right? It is a critical attribute of good scientific models, good legal systems, good road traffic laws, not just good programming languages. Finding a small set of easily memorable rules, or principles, that are orthogonal and cohere to provide a calculus which can be used to construct or predict behaviour or consequence, is a delight and a blessing.
There was one rule in Smalltalk that was awful, a result of the implementation showing through (one that, IIRC, got into the ANSI standard), and we fixed it. The rule was that the value of an empty block that took arguments was the value of its last argument. So [:a :b|] value: 1 value: 2 would evaluate to 2. Thankfully this has been fixed and the value of an empty block is always nil, irrespective of its arguments.
ColorArray new: n is not something that needs to be fixed, or to be alarmed at. It can be experienced (I discovered it this last week, I was not astonished). I wager you will never forget that the default value of a ColorArray indexed instance variable is Color transparent. You may even use it in your teaching as an amusing and interesting example for the initialization rule. I doubt that it would tax any new Smalltalker that has already encountered bits collections, and would serve to help them remember the initialize-with-zero sub-rule.
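To make the empty-block rule mentioned above concrete, here is a small workspace sketch. It assumes a current Squeak image in which the fix is present; the temp names are only illustrative.

```smalltalk
"Workspace sketch; assumes an image that includes the empty-block fix."
| twoArgResult noArgResult |
"An empty block that takes arguments answers nil, not its last argument."
twoArgResult := [:a :b | ] value: 1 value: 2.
"An empty block without arguments answers nil as well."
noArgResult := [] value.
{twoArgResult. noArgResult}
```

In an image that predates the fix, twoArgResult would instead be 2, which is the behaviour described as awful above.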
Example 2, ColorArray: My first take would indeed be to favor ColorArray new: 256 withAll: Color transparent over ColorArray new: 256, because I have never used this class before and would not be sure whether the values default to black or transparent or anything else. I could look it up, but I would likely forget it several times before remembering. Why should I do this to myself (and to any other non-ColorArray expert)?
The thing about ColorArray that's important to our discussion is that a new instance is initialized; it doesn't contain random crap. This is the point about initialization that applies to all Smalltalk variables: temporary, global, named instance, and indexed instance. We have one rule: if the variable holds pointers it is initialized with nil; if the variable holds bits it is initialized with the all-zeros bit string. This has some unexpected consequences such as (ColorArray new: 10) anyOne = Color transparent, but the nice thing is that the rule is very easy to learn, and so when one discovers (ColorArray new: 10) anyOne = Color transparent one is not astonished (the principle of least astonishment).
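As a rough illustration of the one rule stated above (pointers start as nil, bits start as all zeros), the following workspace sketch can be evaluated in a Squeak image. The temp names are made up for the example, and the ColorArray line simply restates the observation from this thread.

```smalltalk
"Workspace sketch of the initialization rule; temp names are illustrative only."
| temp pointerSlot byteSlot colorSlot |
"temp is never assigned, so it still holds its initial value, nil."
pointerSlot := (Array new: 3) first.        "pointer slots start as nil"
byteSlot := (ByteArray new: 3) first.       "bits slots start as all zeros, i.e. 0"
colorSlot := (ColorArray new: 10) anyOne.   "all-zero bits read back as Color transparent"
{temp. pointerSlot. byteSlot. colorSlot}
```

Nothing here needs an explicit withAll: or atAllPut:, since each value follows from the same rule.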
Example 3, point arithmetic: Here I'm completely with you, but maybe I'm also biased because I have already spent some time working with visual domains in Squeak. :-)
:-) indeed!
Example 4, coreutils: ls, cat, and less are all lisping, catastrophic, or at least less intuitive names IMHO. Most of us will have become familiar with them, but from a Smalltalk perspective that seeks readability, I would still favor naming them ListFiles, PrintFile, and OpenFileForRead. On the other hand, many people who are used to the present nomenclature would dismiss these proposals as hard to write and maybe even hard to read. Is "terseness vs readability" at this syntactical level really a matter of data (code) quality (that must be ensured by the writer), or more a matter of the right tooling (that can be adjusted by the editor/reader)?
Learning to play a musical instrument, learning to wield a golf club, learning to drive a car, ride a bike, acquire any useful skill, involves a tradeoff. Masterful use always drops the training wheels. Eventually we always advance to using a functionally efficient interface and/or set of skills. This makes life hard for the learner. They have to acquire the skills that allow them to reach competence on their way to mastery. But we do not serve ourselves by preventing mastery, by protecting ourselves against mastery. If we do, we end up with inexpressive music, short and unexciting drives, 30 kph autobahns, 70 kg bikes that don't lean in the corners.
The Prussian Model of Education was set up under Bismarck, its intent being to produce a generation of easily ruled consumers and cannon fodder. Its means were to teach badly, to render its subjects insecure, consciously inferior to those educated in elite private schools. This model was adopted by the US Senate immediately after the civil war as the blueprint for the US's public education system. It serves as the ideal instrument of oppression in faux democracies. [I was incredibly fortunate that it was not used in my public primary school; unsurprisingly, teachers that love children hate the Prussian Model].
Driving a child to school, rather than having them walk, and/or catch a bus, results in an adult less at ease moving around the city. It infantilises them. It constrains them their whole lives.
Wittgenstein's realisation was that language means what we collectively decide it means, not what it is defined to mean, and the fact that a substantial number of American adults think that "inflammable" means the opposite of "flammable" is alarming, an example that militates for better teaching, not Newspeak.
If we are to live meaningful, rich, lives that enrich those of others, we must enculturate, we must take off the training wheels, we must fashion that which is concise, powerful and call it elegant.
---
I want a warning for the usage of b in "c := b", d in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]".
Thank you for clarification. So the idea of #queryUndefined (will rename that) is to fix human forgetfulness.
Yes.
In the second group of examples, you explicitly consider these variables being nil, so you don't need a reminder from the compiler (or with regards to Tobi's argument, you're already dealing with the meta-level). In the first group of examples, one could say that you are covering the tracks of your own possible forgetfulness and spreading possibly unassigned values, so it's more important for the tooling to point you to that possible slip. Yes, I guess now that makes more sense to me. :-)
Cool.
Still, the scope of this warning remains a bit blurry for me. Maybe that's because we are approximating a type analysis engine here with a *very* rough heuristic.
I think that's exactly why.
For instance:
| a | a ifNotNil: [a *:=* a + 1]. ^ a
In this example, I do *not* want a warning for "a + 1" but I *do* want a warning for "^ a" as a still might be unassigned. Currently, the reality is just the other way around. But that is probably out of scope for the current architecture ...
I love the pragmatic design principle from the LRG/SCG at PARC that I was taught on arrival at ParcPlace Systems: don't let the perfect be the enemy of the good. It's ok to have 95% of the solution.
(As a side note, another confusing thing around this notification for me is the fact that our compiler essentially integrates a few rudimentary linter tools (namely, UndefinedVariable and UnknownSelector) and forces the user to interact with them in a modal fashion. To be honest, I never liked that "modal linter style" and often wish we had some more contemporary annotation-/wiggle-line-based tooling for that.
+1000 !!!!! Thanks for pointing this out. This is an excellent suggestion. I *hate* clicking on the "declare as a foo variable" dialog for a sequence of as-yet-undeclared temp vars, only to hit a lint rule, or syntax error, and have all those declarations discarded.
My typical interaction with #queryUndefined looks like this: me: accept new code, compiler: you did not assign foobar, me: oops, you're right, let me fix that; or alternatively: yes, that was intended, let me explicate that. Ignoring and proceeding from this warning has never felt acceptable for me as this would put the same confusion on any future editor of the method.)
:-) Hope I'm not too strident :-)
No problem, it was very interesting and I learn a lot from your replies. :-)
Best, Christoph
*Sent from Squeak Inbox Talk <https://github.com/hpi-swa-lab/squeak-inbox-talk>*
On 2022-11-22T23:36:09-08:00, eliot.miranda@gmail.com wrote:
Hi Christoph, Hi Marcel,
apologies about the font size mismatches...
On Wed, Nov 23, 2022 at 2:25 AM Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Christoph --
IMHO, it unnecessarily complicates the simple Smalltalk syntax. [...]
Nah, this is just a tooling change, not a syntactical one.
+1
Yes, I would like to have this info skipped for #isNil as well. Note that one should not use #ifNotNilDo: anymore.
Good idea. I'll include it.
Best, Marcel
On 23.11.2022 at 11:00:43, Thiede, Christoph <christoph.thiede at student.hpi.uni-potsdam.de> wrote:
Hi Eliot, hi all,
I'm skeptical about this change, as it creates or expands a special role of the selectors #ifNil:, #ifNotNil:, and their combinations. IMHO, it unnecessarily complicates the simple Smalltalk syntax. While I know and sometimes dislike these UndefinedVariable notifications, too, I don't know whether differentiating them by the selector is the right strategy to improve this situation.
Please indulge me. It's f***ing irritating to be told by the compiler that a temp var appears to be uninitialized when one is intentionally using the fact that temps are initialized to nil. And that temp vars are initialized to nil is a) essential knowledge and b) a good thing (no uninitialized local variables a la C, a sensible value to initialize a variable with).
BTW, I find it more than sad (a little alarming in fact) that some Smalltalkers don't know that the value of several conditionals that take blocks is nil when the condition doesn't select the block, e.g. false ifTrue: [self anything] is nil. I see "expr ifNotNil: [...] ifNil: [nil]" and it strikes me as illiterate. I recently visited code written by a strong programmer who open coded a lot of point arithmetic, decomposing e.g. a * b into (a x * b x) @ (a y * b y). It's bad. It gradually degrades the code base in that it isn't always an exemplar of best practice,
Consider the following examples:
| a b c d e f g h |
a ifNil: [a := 1].
c := b.
c ifNil: [c := 3].
#(1 2 3) sorted: d.
e := 5.
(e isNil or: [f isNil]) ifTrue: [e := f := 6].
g perform: #ifNotNil: with: [b := g].
h ifNotNilDo: [h := 8].
How would you explain to a naive Smalltalker which of these variables will be marked as undefined at this point and why? (Of course, you can explain it by pointing to the implementation, but I think that's a significantly less intuitive explanation than just saying "you must declare any variable before using it".)
No. It's a hard-and-fast rule that all temp vars are initialized to nil. And initializing a variable (to other than nil) is done by assigning it. In the above, a through h are declared within the vertical bars. They are initialized in the assignments. I want a warning for the usage of b in "c := b", d in "#(1 2 3) sorted: d", g in "g perform: #ifNotNil: with: [b := g]". I *don't* want to be told about a in "a ifNil: [a := 1]", c in "c ifNil: [c := 3]", or e & f in "(e isNil or: [f isNil]) ifTrue: [e := f := 6]". I never want to see "ifNotNilDo", ever ;-)
(* note that a couple of years back we fixed a bad bug in the compiler where block local temps were not (re)initialized to nil on each iteration, leaking their values from previous iterations, breaking the "all temp vars are initialized to nil" rule, and revealing implementation details in the compiler's inlining of to:[by:]do: forms)
This behavior leads to a mental model that disambiguates between null and undefined similar to JavaScript which I never have found helpful.
I don't see how that applies. Smalltalk has no undefined. It has nil & zero, and these values are used to initialize any and all variables. This is not an artifact of the implementation. It is a fundamental part of the language design. It results in no dangling referents or uninitialized variables. The language used in Parser>>#queryUndefined is problematic. It should be "unassigned", not "undefined". There is nothing undefined about these variables. But they are indeed unassigned. In some cases (see my idiomatic implementation of subsequences: and substrings) this can (and *should*) be used to advantage. And all Smalltalk programming courses should explain that variables are always initialized (either to nil or zero, & hence by extension 0.0, Character null, Color transparent, et al), and may need assignment before their referents get sent messages.
I see the same kind of sloppiness in people not knowing that conditionals that take blocks typically evaluate to nil when the condition doesn't select the block. So always "expr ifNotNil: [...]", never "expr ifNotNil: [...] ifNil: [nil]", or "expr ifNotNil: [...] ifNil: []". I recently cleaned up code by a strong programmer who had open coded point arithmetic (e.g. a * b written as (a x * b x) @ (a y * b y) ). This is really bad: it's exemplifying poor practice, it's verbose, it takes away at least as much understanding as it conveys, it leads to more difficult to manage code.
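The two idioms above can be checked in a workspace. The following is only a sketch with made-up temp names and sample points; it assumes a standard Squeak image.

```smalltalk
"Workspace sketch; temp names and sample values are illustrative only."
| unselected skipped direct openCoded |
"A conditional whose block is not selected evaluates to nil."
unselected := false ifTrue: [#anything].
"ifNotNil: on nil also answers nil, so a trailing ifNil: [nil] adds nothing."
skipped := nil ifNotNil: [:x | x printString].
"Point arithmetic already works component-wise..."
direct := (2 @ 3) * (4 @ 5).
"...so the open-coded form says the same thing with more noise."
openCoded := (2 * 4) @ (3 * 5).
{unselected. skipped. direct = openCoded}
```

Both spellings of the product give the same point; the shorter one states the intent directly.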
If we fail to teach the language properly we start on a slippery slope to duplication (which is an awful evil, leading to much increased maintenance effort, and brittleness), and rendering perfectly good, well thought-out idioms mysterious. It's not like Smalltalk has a lot of rules; the number, compared to C & C++ et al is tiny. And terseness has not just aesthetic benefit, but real practical benefit in terms of readability & maintainability.
Also, with this change, the compiler leaks the default value of any temporary variable, which we previously were able to hide at least partially.
But that is a MISTAKE!! The language designers didn't arrange for temps to be initialized to nil just because that's the only default. They did it to ensure that there is no such thing as an uninitialized variable in Smalltalk. That's why nil is an object, with a class, not just nil. That's why nil ~~ false. It's carefully thought out and not just some artifact of the implementation. And that rationale (read the blue book carefully) and its implications, should be taught/learned/known, and especially exemplified by the core code of Squeak trunk, and hence supported by the compiler.
In many cases, I think explicitly setting a temporary variable to nil before it is initialized within some non-trivial conditional complex would be more explicit, thus more readable, and something which we should generally encourage programmers to do.
I disagree. You're advocating for absurdities such as
| colors |
colors := ColorArray new: 256.
colors atAllPut: Color transparent
This is the kind of thinking that leads to cycling while wearing American Football clothes. It won't keep you from being run over by a truck, but it'll make you so slow and reduce your peripheral vision so much, not to mention give you a false sense of security, that you'll be much more likely to be run over by a truck...
Looking forward to your opinion!
:-) Hope I'm not too strident :-)
Best,
Christoph
*From:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org>
*Sent:* Wednesday, 23 November 2022 04:10:30
*To:* squeak-dev at lists.squeakfoundation.org; packages at lists.squeakfoundation.org
*Subject:* [squeak-dev] The Trunk: Compiler-eem.480.mcz
Eliot Miranda uploaded a new version of Compiler to project The Trunk: http://source.squeak.org/trunk/Compiler-eem.480.mcz
==================== Summary ====================
Name: Compiler-eem.480
Author: eem
Time: 22 November 2022, 7:10:27.324796 pm
UUID: 3e5ba19e-c44a-4390-9004-de1246736cbc
Ancestors: Compiler-eem.479
Do not warn of an uninitialized temporary if it is being sent ifNil: or ifNotNil:.
=============== Diff against Compiler-eem.479 ===============
Item was changed:
  ----- Method: Parser>>primaryExpression (in category 'expression types')
  primaryExpression
      hereType == #word
          ifTrue:
              [parseNode := self variable.
+             (parseNode isUndefTemp
+              and: [(#('ifNil:' 'ifNotNil:') includes: here) not
+              and: [self interactive]])
+                 ifTrue:
+                     [self queryUndefined].
-             (parseNode isUndefTemp and: [self interactive])
-                 ifTrue: [self queryUndefined].
              parseNode nowHasRef.
              ^ true].
      hereType == #leftBracket
          ifTrue:
              [self advance.
              self blockExpression.
              ^true].
      hereType == #leftBrace
          ifTrue:
              [self braceExpression.
              ^true].
      hereType == #leftParenthesis
          ifTrue:
              [self advance.
              self expression ifFalse: [^self expected: 'expression'].
              (self match: #rightParenthesis)
                  ifFalse: [^self expected: 'right parenthesis'].
              ^true].
      (hereType == #string or: [hereType == #number or: [hereType == #literal or: [hereType == #character]]])
          ifTrue:
              [parseNode := encoder encodeLiteral: self advance.
              ^true].
      (here == #- and: [tokenType == #number and: [1 + hereEnd = mark]])
          ifTrue:
              [self advance.
              parseNode := encoder encodeLiteral: self advance negated.
              ^true].
      ^false!
_,,,^..^,,,_ best, Eliot