Case-insensitive selectors

Joe Davison jwdavison at lucent.com
Wed Jan 28 15:30:03 UTC 1998


Mark Wai said: 
> The reason I brought this thing up is because I don't just use a
> language.  I also want to understand why a particular "feature" was
> implemented.  This can help me to more effectively use a language if I
> understand the design and the philosophy behind it.  It is a pity if the
> existence of a language's feature is due to the fact that traditionally
> that particular feature is implemented the same way in other languages
> without considering the language's own unique capabilities and nature. In
> order to improve a language, it needs to be evolved to a better one.  In
> this case, I just don't understand why Smalltalk method has to be case
> sensitive while I can think of a stronger case for it being
> case-insensitive. That's why I ask for design rationale behind it.  This
> is probably a tiny little issue for 99.9% of people that is not worth
> discussing but in my work the case sensitivity and its ambiguity affect
> some of my design decisions.
> 


I can't speak as to why it was done with Smalltalk, but if I were designing
a language, I'd recognize that:

a) Most users are human, and most human languages are case-insensitive.
   
b) It is useful to know the scope of a variable, and, for humans,
   recognition is easier than recall -- thus, using the case of the initial
   letter to indicate local/global eases recognition of the scope.
   
c) In case-insensitive computer languages, there may be several different
   case-spellings of the same identifier, which can cause confusion to
   someone used to case-sensitive languages.  Many of us are C programmers
   as well as Smalltalk programmers.  C is definitely case-sensitive.
   Thus, many of us would expect aMethodSelector and amethodselector to be
   different.  Not a big issue, but it can be annoying.


I can probably come up with a few contrived examples where two legitimate
selectors might differ only in capitalization on word boundaries -- that
is, the selectors would consist of several words, with the boundaries
delimited by capitalization -- if the letter sequence is the same but the
word boundaries differ, they would be indistinguishable: #useLess and
#useless being a trivial example -- I've encountered more interesting
examples, but they're harder to generate.
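
To make the collision concrete, here is a minimal sketch in Python (used
only for convenience; the selector list is made up for illustration, and
this is not how any Smalltalk system actually stores selectors).  It groups
selectors by their case-folded spelling and reports the groups that contain
more than one distinct spelling:

```python
from collections import defaultdict


def case_collisions(selectors):
    """Group selectors that become identical once case is ignored."""
    groups = defaultdict(list)
    for selector in selectors:
        groups[selector.casefold()].append(selector)
    # Keep only folded spellings shared by two or more distinct selectors.
    return {
        folded: names
        for folded, names in groups.items()
        if len(set(names)) > 1
    }


# Hypothetical selector names, chosen only for illustration.
selectors = ["useLess", "useless", "printOn:", "size"]
collisions = case_collisions(selectors)
print(collisions)
```

Under a case-insensitive rule, every group reported here would name a
single method, whatever the word boundaries were meant to be.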


If one were to go for case insensitivity, it might be useful to have the
case preserved with the symbol, and a mechanism be provided to flag variant
capitalizations when encountered -- so inadvertent differences, such as
#asUpperCase and #asUppercase, would at least be correctable.
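
One way such a mechanism might look, again as a hedged Python sketch rather
than anything a real Smalltalk image does: the class name
CasePreservingSymbolTable and its intern method are hypothetical.  Lookup
ignores case, the first spelling seen is kept as the canonical one, and any
later spelling that differs only in case is recorded so it can be reviewed:

```python
class CasePreservingSymbolTable:
    """Case-insensitive lookup that remembers the first spelling seen."""

    def __init__(self):
        self._canonical = {}  # folded spelling -> spelling first interned
        self.variants = []    # (canonical, variant) pairs flagged for review

    def intern(self, name):
        """Return the canonical spelling, flagging case-only variants."""
        key = name.casefold()
        canonical = self._canonical.setdefault(key, name)
        if canonical != name and (canonical, name) not in self.variants:
            self.variants.append((canonical, name))
        return canonical


table = CasePreservingSymbolTable()
first = table.intern("asUpperCase")
second = table.intern("asUppercase")  # same symbol, different capitalization
print(first, second, table.variants)
```

Both calls resolve to the same symbol, and the variants list gives a tool
(or a person) something to act on when the spellings drift apart.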

It seems to me that hardware and historical considerations lead to
case-insensitivity -- Lisp, Fortran, and their peers were invented at a
time when lowercase characters were, in general, unavailable on computing
equipment -- a 64-character code was used.  Once the 128-character code
became widely available, lowercase was initially only allowed in comments.
It is, however, much easier to read programs in mixed case, or lowercase
(SOMEHOW THIS SEEMS LIKE SHOUTING), so there's a reason for allowing lower
case -- then one has to deal with compatibility issues, and
case-insensitivity seems like a reasonable approach.

C and Smalltalk came into existence after the 128-character codes were
common, and thus case sensitivity could be used for semantic purposes...

The introduction of Unicode for "modern" languages may make it even more
important to revisit this question.  Likewise the ready availability of
fonts -- should keywords be in boldface, as was the case for the Algol-60
publication language?  If so, the same letter string, not in bold, is free
to use as an identifier...  Gag!

joe





More information about the Squeak-dev mailing list