Proposal3: Make $_ a valid identifier character

Richard A. O'Keefe ok at atlas.otago.ac.nz
Wed May 31 00:39:27 UTC 2000


I wrote:
	> I already *asked* in this thread whether anyone who was there at the time
	> knows about this.
Bijan Barsia (for whom I have the highest respect) wrote:
	So point me to your received answer which allows you to state, with the
	absolute confidence that you do, that the *original* reason for not
	including the underscore, *and* the reason for it not being included in
	Smalltalk-80, was its original unavailability in ASCII.

All I said was that I had *asked*.  I didn't ever say that I had a reply.
However, as I've stated in another message, it is *certain* that the BCPL
character set used on the Alto was compatible with ASCII-63 and incompatible
with ASCII-67.

Now the Alto system supported bitmapped fonts; indeed, it supported the
first WYSIWYG editor I ever heard of: Bravo.  So it would not have been
beyond the powers of the Smalltalk designers to build a special Smalltalk
font (or set of fonts) that was consistent with ASCII-67.  Mind you,
they'd still have had to contend with whatever the Alto keycaps were.
Presumably it was easier to go along with the same characters that BCPL
and Mesa used, and no-one can deny that the left arrow and up arrow are
*useful* in programming.

I think it's time for anyone who disputes this to go and find some Alto
manuals for themselves.

	> So how come the *only* non-ASCII characters shown in
	> the Blue Book are *precisely* the old versions of the two ASCII characters
	> that are *not shown*?  Presumably if they were inherited from old ASCII,
	> and the extensions were other characters again.
	
	Nice speculation. But, while plausible, it doesn't establish your
	case.
	
Presumably only a signed testimonial from one of the Smalltalk designers
would do that.  But I *have* established that

	- *none* of the characters present in ASCII-67 but not in
	  (October 1963) ASCII-63 are used in Smalltalk-80 (as of the Blue
	  Book)

	- *all* of the characters used in Smalltalk-80 that are not
	  in ASCII-67 *are* in ASCII-63.

	- other languages (BCPL, Mesa) on the Alto also appear to have
	  been based on (October 1963) ASCII-63 (reference manual for
	  BCPL, Mesa fragments in other documents), even in 1979.

	- at least for BCPL, prior coding conventions were altered to
	  conform to the same characters as ASCII-63 rather than adapting
	  the character set to the language.
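
To make the character-set point concrete, here is a minimal sketch (in
Python, purely for illustration) of the two code positions that changed
between the 1963 and 1967 revisions: 0x5E went from up arrow to circumflex,
and 0x5F from left arrow to low line.  The table names, the render function
and the sample byte string are hypothetical, not taken from any Alto or
Smalltalk source.

```python
# Glyphs assigned to the two code positions that changed between the
# 1963 and 1967 revisions of ASCII (0x5E and 0x5F).
ASCII_63_GLYPHS = {0x5E: "\u2191", 0x5F: "\u2190"}  # up arrow, left arrow
ASCII_67_GLYPHS = {0x5E: "^", 0x5F: "_"}            # circumflex, low line


def render(codes: bytes, glyphs: dict) -> str:
    """Display a byte string using the glyphs of one ASCII revision."""
    return "".join(glyphs.get(c, chr(c)) for c in codes)


# Hypothetical stored source: an assignment followed by a return.
source = bytes([0x78, 0x20, 0x5F, 0x20, 0x33, 0x2E, 0x20, 0x5E, 0x78])

print(render(source, ASCII_63_GLYPHS))  # x ← 3. ↑x
print(render(source, ASCII_67_GLYPHS))  # x _ 3. ^x
```

The same stored bytes read as arrows under the older table and as
circumflex and low line under the newer one, so the choice of font alone
decides which pair the programmer sees.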

Don't forget, in 1979 at least one 6-bit character set was still (just
barely) in use (BCL) and US-TTY (a 5-bit code) is still in use today.
There has _never_ been a time when everyone used the same character set,
even in America, and Smalltalk's designers certainly had other concerns
than compatibility with C.

	Well, one of us must be horribly idiosyncratic. Which, regardless of
	issues of taste, makes us, without establishing which of our
	psycholinguistic structures is deviant (and they *both* could be), bad
	bases for determining what, rationally speaking, should be the case here.
	
It is really frustrating having the point of my argument ignored.
It is not *my* psycholinguistic structure or *my* anything which is
at issue here.  The Ada Quality and Style Guidelines (the most thorough
style manual I've ever seen for _any_ language; would that there were a
similar book for Smalltalk) and the other software engineering books I've
checked _all_ recommend separated_words (AQ&S recommends capitals after
the low-lines as well).

	> mentally process multi-word identifiers as single "words", just as it's
	> very hard for me to hear "r" and "l" as vowels.
	
	So you have trouble with 'groundhog'?
	
I didn't say I can't hear "r" and "l", I said I can't hear them AS VOWELS.
That is, I have trouble with words like Krshna (no, I didn't leave out an
"i", that's the point).

	> Given the existence of a free VisualWorks and a free Squeak, it might
	> actually be possible to _do_ such experiments.
	
	Of course. Realize that my strong opposition stems from engaging to port
	some Smalltalk code which made fairly heavy use of _s from VisualWorks to
	Squeak. The identifiers which used _s tended, imho, to do so pointlessly.
	The identifiers were bad in themselves (imho) and the underscores
	merely...er..underscored that fact.
	
I'm a wee bit baffled here.  You are _opposing_ a change which would
have made _your_ life easier?

Most identifiers in most programs are bad.  Sturgeon's Law.
We've seen a low-line-less example in this mailing list recently,
where someone used #rep: where #repeat: would have been better.

	A fully underscored set of identifiers would not recognizably be
	Smalltalk.

To some people it would.  We're back in the area of personal taste, here.

	> What would be particularly interesting to know, of course, would be why
	> some of the more commercial (NOT the same as more valuable!) Smalltalks
	> have already made the transition.	
	
	I'll have to see if the draft ANSI spec has a rationale for this. I'll
	note that AFAICT, systematic underscorification doesn't seem to have
	occurred.
	
First question of interest:  who moved first?  Did the commercial systems
adopt low-line first, and then the standard adopt it because it _was_
the de facto standard?  Or did ANSI Smalltalk adopt low-line first, and the
commercial systems move to conform to the standard?




