Proposal3: Make $_ a valid identifier character

Bijan Parsia bparsia at email.unc.edu
Tue May 30 16:25:59 UTC 2000


On Tue, 30 May 2000, Richard A. O'Keefe wrote:

> 	Richard, you misunderstood me. I questioned that the *reason*, the
> 	*motivation*, for Smalltalk not having underscores in identifiers was that
> 	ASCII lacked them, not that ASCII lacked them.
> 	
> I already *asked* in this thread whether anyone who was there at the time
> knows about this.

So point me to your received answer which allows you to state, with the
absolute confidence that you do, that the *original* reason for not
including the underscore, *and* the reason for it not being included in
Smalltalk-80, was its original unavailability in ASCII. I'm prepared to
believe it, obviously, but only with some suitable confirmation.

> 	Given that Smalltalk developed on computers custom built, and mostly
> 	intended for internal use, I find this a touch hard to believe. A
> 	reference would be welcome.
> 	
> However, I'd put it the other way around.  The identity of the old Smalltalk
> character set with old ASCII is too much of a coincidence to accept as the
> result of some special Smalltalk decisions, especially since the Blue Book
> (p650) refers to the keyboard delivering ASCII codes (when ALTO codes were
> also available).  We're told on p114 of the Blue Book that
> "Each [character] is associated with a code in an extended ASCII
> character set".  So how come the *only* non-ASCII characters shown in
> the Blue Book are *precisely* the old versions of the two ASCII characters
> that are *not shown*?  Presumably they were inherited from old ASCII,
> and the extensions were other characters again.

Nice speculation. But, while plausible, it doesn't establish your
case.

> 	For the record, I don't think your linguistics based argument is nearly as
> 	air tight as you seemed to think it was. I notice you didn't say a word
> 	about the fact that I take "multi word" identifiers to be single words,
> 	not collections of them.
> 
> Well, that's a rather unusual way to view them.  My native language (English)

It's my native language too.

> is not an agglutinating one, so it's practically impossible for me to

Well, one of us must be horribly idiosyncratic. Which, regardless of
issues of taste, makes us, without establishing which of our
psycholinguistic structures is deviant (and they *both* could be), bad
bases for determining what, rationally speaking, should be the case here.

> mentally process multi-word identifiers as single "words", just as it's
> very hard for me to hear "r" and "l" as vowels.

So you have trouble with 'groundhog'?

[snip]
> 	Furthermore, I didn't see anything above about
> 	the linguistic data on the superiority of $- to $_ as separators.
> 
> I don't have any.  I'd expect the hyphen to be the winner.
> Lisp gets away with it because it doesn't have any infix operators.
> None of the Fortran family (Fortran, Algol, Pascal, Ada, ...) has
> adopted hyphen as word separator, not that I know of, and for the
> obvious reason that it would be ambiguous with infix minus.

Others have pointed out the infix _ in Squeak as it stands, and the
possibility of distinguishing the minus and the hyphen.

Needless to say Sequencable-Collection is still very bad, IMHO. But, let
us not debate this further.

[snip]
> So would I.  However, I note that the fact that Sequenceable_Collection
> is visibly broken into chunks seems to make it easier to see and check
> each of the chunks.  There was so _much_ text in one great fat lump in
> "SquencableCollection" that it was hard to see the typo.

Which is why the lovely correction facilities of a Smalltalk IDE make life
so much easier. Sicne I ma prune to tipos evan in spcede dilminted words,
I don't take this as at all decisive.

> 	of course, we might just look at extant bodies of Smalltalk code with and
> 	without $_ed identifiers. That might provide a few clues.
> 	
> Given the existence of a free VisualWorks and a free Squeak, it might
> actually be possible to _do_ such experiments.

Of course. Realize that my strong opposition stems from engaging to port
some Smalltalk code which made fairly heavy use of _s from VisualWorks to
Squeak. The identifiers which used _s tended, imho, to do so pointlessly.
The identifiers were bad in themselves (imho) and the underscores
merely...er..underscored that fact.

A fully underscored set of identifiers would not recognizably be
Smalltalk. Nor would a fully hyphenated one. It might *be* Smalltalk, but
it would be difficult to recognize as such. I think that is a
consideration against full underscoration. I know you do not, so I'll let
it drop here.

> What would be particularly interesting to know, of course, would be why
> some of the more commercial (NOT the same as more valuable!) Smalltalks
> have already made the transition.	

I'll have to see if the draft ANSI spec has a rationale for this. I'll
note that AFAICT, systematic underscorification doesn't seem to have
occurred.

Cheers,
Bijan Parsia.




