New subject: Proposal3: Make $_ a valid identifier character

30 May 2000


      On Tue, 30 May 2000, Richard A. O'Keefe wrote:
...
Richard, you misunderstood me. I questioned that the *reason*, the
   *motivation*, for Smalltalk not having underscores in identifiers was that
   ASCII lacked them, not that ASCII lacked them.
I already *asked* in this thread whether anyone who was there at the time
knows about this.
So point me to your received answer which allows you to state, with the
absolute confidence that you do, that the *original* reason for not
including the underscore, *and* the reason for it not being included in
Smalltalk-80, was its original unavailability in ASCII. I'm prepared to
believe it, obviously, but only with some suitable confirmation.
...
Given that Smalltalk developed on computers custom built, and mostly
   intended for internal use, I find this a touch hard to believe. A
   reference would be welcome.
However, I'd put it the other way around.  The identity of the old Smalltalk
character set with old ASCII is too much of a coincidence to accept as the
result of some special Smalltalk decisions, especially since the Blue Book
(p650) refers to the keyboard delivering ASCII codes (when ALTO codes were
also available).  We're told on p114 of the Blue Book that
"Each [character] is associated with a code in an extended ASCII
character set".  So how come the *only* non-ASCII characters shown in
the Blue Book are *precisely* the old versions of the two ASCII characters
that are *not shown*?  Presumably if they were inherited from old ASCII,
and the extensions were other characters again.
Nice speculation. But, while plausible, it doesn't establish your
case.
...
For the record, I don't think your linguistics based argument is nearly as
   air tight as you seemed to think it was. I notice you didn't say a word
   about the fact that I take "multi word" identifiers to be single words,
   not collections of them.
Well, that's a rather unusual way to view them.  My native language (English)
It's my native language too.
...
is not an agglutinating one, so it's practically impossible for me to
Well, one of us must be horribly idiosyncratic. Which, regardless of
issues of taste, makes us, without establishing which of our
psycholinguistic structures is deviant (and they *both* could be), bad
bases for determining what, rationally speaking, should be the case here.
...
mentally process multi-word identifiers as single "words", just as it's
very hard for me to hear "r" and "l" as vowels.
So you have trouble with 'groundhog'?
[snip]
...
Furthermore, I didn't see anything above about
   the linguistic data on the superiority of $- to $_ as separators.
I don't have any.  I'd expect the hyphen to be the winner.
Lisp gets away with it because it doesn't have any infix operators.
None of the Fortran family (Fortran, Algol, Pascal, Ada, ...) has
adopted hyphen as word separator, not that I know of, and for the
obvious reason that it would be ambiguous with infix minus.
Others have pointed out the infix _ in Squeak as it stands, and the
possibility of distinguishing the minus and the hyphen.
Needless to say Sequencable-Collection is still very bad, IMHO. But, let
us not debate this further.
[snip]
...
So would I.  However, I note that the fact that Sequenceable_Collection
is visibly broken into chunks seems to make it easier to see and check
each of the chunks.  There was so _much_ text in one great fat lump in
"SquencableCollection" that it was hard to see the typo.
Which is why the lovely correction facilities of a Smalltalk IDE make life
so much easier. Sicne I ma prune to tipos evan in spcede dilminted words,
I don't take this as at all decisive.
...
of course, we might just look at extant bodies of Smalltalk code with and
without $_ed identifiers. That might provide a few clues.
Given the existence of a free VisualWorks and a free Squeak, it might
actually be possible to _do_ such experiments.
Of course. Realize that my strong opposition stems from engaging to port
some Smalltalk code which made fairly heavy use of _s from VisualWorks to
Squeak. The identifiers which used _s tended, imho, to do so pointlessly.
The identifiers were bad in themselves (imho) and the underscores
merely...er..underscored that fact.
A fully underscored set of identifiers would not recognizably be
Smalltalk. Nor would a fully hyphenated one. It might *be* Smalltalk, but
it would be difficult to recognize as such. I think that is a
consideration against full underscoration. I know you do not, so I'll let
it drop here.
...
What would be particularly interesting to know, of course, would be why
some of the more commercial (NOT the same as more valuable!) Smalltalks
have already made the transition.
I'll have to see if the draft ANSI spec has a rationale for this. I'll
note that AFAICT, systematic underscorification doesn't seem to have
occurred.
Cheers,
Bijan Parsia.
