String refactoring

Andreas Raab andreas.raab at gmx.de
Tue Apr 12 08:07:56 UTC 2005


Hi Yoshiki,

>   Ah, not denying, but we haven't had the privilege of having Russian
> eToys projects floating around on the net yet.  For the future
> releases, #capitalize should try to capitalize a character if the
> Unicode Consortium thinks the character is capitalizable.
> 
>   A dirtier way to keep backward compatibility is for #capitalize
> to change its behavior when the primary language setting is
> Japanese or Korean, and only when the image is dealing with
> projects from older images.

I wish we had a way of just fixing the symbols as they come in... the 
more I think about it the more it seems the Right Solution to me... 
it would allow us to stay internally clean and yet deal with any 
existing projects. The point you make (about primary language etc.) is a 
good one - I need to think about it.

>>I think I remember what you refer to and as far as it goes I will say 
>>that I agree that using upper/lower case for semantic distinctions (such 
>>as whether a character is allowed as the first character of a global 
>>name) is not a good idea. That said, we (that is "we western guys") 
>>understand fairly little about the possible cultural constraints that 
>>other languages may impose.
> 
>   It is not really a cultural constraint.  Take Java as an example.
> Class names can begin with any "letter" (close to Unicode definition +
> some chars like "_"), instance variable names (for, err, instance) can
> begin with any "letter" (close to Unicode definition + some chars like
> "_"), etc., etc.  We just shouldn't assume that every "letter"
> necessarily has a counterpart in the other case.  German has "eszett"
> so it isn't that alien...

Oh yes - it's a perfectly legal letter, just one that cannot be 
capitalized. But notice that Character deals ... or used to deal? I 
haven't looked at this in quite a while but there was code that dealt 
with it ;-) ... anyway, let's just say that Character *can* deal with 
this just fine. "eszett capitalized" will just answer the same thing 
since there ain't no capitalized version.
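
To make the "answers the same thing" behavior concrete, here is a rough 
sketch in Python rather than Squeak. It is not the actual Character code, 
and the helper name `capitalized` is made up for illustration. It assumes 
that a character "has a capital" only when its uppercase form is a single 
character; Python's full case mapping turns eszett into the two-character 
string "SS", which the sketch treats as "no capitalized version".

```python
def capitalized(ch: str) -> str:
    """Return the one-character uppercase form of ch, or ch itself if none exists.

    Hypothetical helper: mimics "capitalize answers the receiver when there
    is no capitalized version", not Squeak's real implementation.
    """
    upper = ch.upper()
    # Full case mapping may expand to several characters (e.g. eszett -> "SS");
    # in that case there is no single capital character to answer.
    return upper if len(upper) == 1 else ch


if __name__ == "__main__":
    # Latin, Cyrillic, eszett, and a Japanese hiragana character.
    for ch in ("a", "я", "ß", "あ"):
        print(repr(ch), "->", repr(capitalized(ch)))
```

Latin and Cyrillic letters come back capitalized, while eszett and the 
hiragana character come back unchanged.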

>   In Squeak, one could make an instance variable that begins with
> eszett, but not a class name or other globals.  This kind of
> restriction will be too strict in general in other languages.

True. And yet, culturally speaking that is exactly the right thing to 
do! An eszett can *never* start a word so disallowing this would be one 
of the "cultural constraints" I talked about. I will note, though, that 
German is a bad starting point for this particular discussion, since in 
German all nouns are written with an initial capital letter, so it makes 
a *ton* of sense to have the restriction that globals (which are always 
nouns) must begin with a capital letter ;-)
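
The two rules being weighed here can be compared with a small Python 
sketch. Both function names are hypothetical and neither is Squeak's 
actual check. The first models "globals must begin with a capital 
letter"; the second models the looser Java-style rule "may begin with any 
letter" (ignoring extras such as "_").

```python
def starts_with_capital(name: str) -> bool:
    """Strict rule (hypothetical): the first character must be an uppercase letter."""
    return name[:1].isupper()


def starts_with_letter(name: str) -> bool:
    """Loose rule (hypothetical): the first character may be any Unicode letter."""
    return name[:1].isalpha()


if __name__ == "__main__":
    # An English name, a name starting with eszett, a Japanese name, a digit-led name.
    for name in ("Object", "ßeta", "日本語", "3d"):
        print(name, starts_with_capital(name), starts_with_letter(name))
```

Under the strict rule only "Object" qualifies; under the loose rule the 
eszett-initial and Japanese names qualify as well, and only "3d" is 
rejected by both.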

Cheers,
   - Andreas


