UTC-8 (was Re: Celeste encoding (was: Duplicate messages in Celeste))

AGREE at CarltonFields.com AGREE at CarltonFields.com
Thu Mar 16 20:11:25 UTC 2000


> I dunno, these are just some of the issues that spring to mind.  It
> would be awesome to do the switch while Squeak is still relatively
> young, but it ain't trivial.

Of course it ain't trivial, but perhaps there's an interim, if not ad hoc, solution that serves every relevant purpose?  It seems to me that the Number hierarchy is proof positive that widely disparate, differently sized and incomparable models with similar features can be resolved into a seamless whole.

In a sense, isn't a pure ASCII string just a subset of UTF-8?  Can't a hierarchy with built-in coercion be used to preserve ALL of the efficiencies of the status quo, while still permitting (or at least paving the way toward) the full generality of UTF-8 and Unicode?

Why can't the ASCII string be the SmallInteger of a new STRINGTHING hierarchy, where operations within the string world are seamless?  Every time I raise this point, there are countless objections about things Squeak so configured could not do (the biggest deal was auto-reversing Hebrew/Anglo-Numeric text), but it seems that we could still accommodate many of the advantages of Unicode, integrate the whole into Squeak, while preserving ALL of the efficiencies of the present ASCII world for unmixed ASCII and Character stuff.
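
To make the idea concrete, here is a rough sketch of the kind of coercion I have in mind, written in Python rather than Smalltalk purely for illustration.  Every name in it (StringThing, ByteString, WideString, generality, coerce) is hypothetical; none of this is existing Squeak code.  It only shows the shape of the thing: unmixed ASCII stays in its compact form, and the wider form appears only when an operation actually mixes the two.

```python
class StringThing:
    """Hypothetical abstract root, playing the role Number plays for arithmetic."""

    generality = 0  # higher means "can represent more"

    def __add__(self, other):
        if not isinstance(other, StringThing):
            return NotImplemented
        if self.generality == other.generality:
            # Same representation: no conversion cost at all.
            return self._concat(other)
        if self.generality < other.generality:
            # Lift the receiver to the argument's representation first.
            return other.coerce(self)._concat(other)
        # Lift the argument to the receiver's representation.
        return self._concat(self.coerce(other))


class ByteString(StringThing):
    """One byte per character, 7-bit ASCII only (the 'SmallInteger' case)."""

    generality = 1

    def __init__(self, data):
        if any(b > 127 for b in data):
            raise ValueError("ByteString holds 7-bit ASCII only")
        self.data = bytes(data)

    def _concat(self, other):
        return ByteString(self.data + other.data)

    def text(self):
        return self.data.decode("ascii")


class WideString(StringThing):
    """One integer code point per character (the 'LargeInteger' case)."""

    generality = 2

    def __init__(self, codepoints):
        self.codepoints = list(codepoints)

    def coerce(self, other):
        if isinstance(other, WideString):
            return other
        # ASCII byte values are already valid code points.
        return WideString(other.data)

    def _concat(self, other):
        return WideString(self.codepoints + other.codepoints)

    def text(self):
        return "".join(chr(c) for c in self.codepoints)


ascii_part = ByteString(b"abc")
hebrew_part = WideString([0x05E9, 0x05DC])

print(type(ascii_part + ascii_part).__name__)   # stays in the compact form
print(type(ascii_part + hebrew_part).__name__)  # widened only because it is mixed
```

Note that this covers only representation and concatenation.  It says nothing about the harder objections, such as bidirectional text, which would still need their own answer.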

Or at least we should try real hard to think (or hack) through the question before doing nothing because of an apparent lack of purity.




