UTC-8 (was Re: Celeste encoding (was: Duplicate messages in Celeste))
AGREE at CarltonFields.com
Thu Mar 16 22:42:14 UTC 2000
Fair warning -- I'm going to do this if nobody else does. But for the next few days, I have to wind up my legal practice, and then I have about a week off. While it isn't a hot date, I am going to be doing a few all-nighters more than I would like and all the time I will be far away from my favorite Squeak machine. Then I have to finish my Squeakbook chapter. So for someone else who is looking for a deep, fun and seminal project whose result if done right is likely to get your progeny in the SqC release on an important module, this is the one. But don't wait too long, for I'll have some idle time soon. So speak up. Any takers?
> -----Original Message-----
> From: MIME :Dan.Ingalls at disney.com
> Sent: Thursday, March 16, 2000 5:26 PM
> To: squeak at cs.uiuc.edu
> Subject: RE: UTC-8 (was Re: Celeste encoding (was: Duplicate messages
> AGREE at CarltonFields.com wrote...
> >Of course it ain't trivial, but perhaps there's an interim, if not ad hoc solution that serves every relevant purpose? It seems to me that the Number hierarchy is proof positive that widely disparate, differently sized and incomparable models with similar features can be resolved into a seamless whole.
> >In a sense, isn't a pure ASCII string just a subset of UTC-8? Can't a hierarchy with built-in coercion be used to preserve ALL of the efficiencies of the status quo, while still permitting (or at least paving the way) toward the full generality of UTC-8 and Unicode?
> >Why can't the ASCII string be the SmallInteger of a new STRINGTHING hierarchy, where operations within the string world would be seamless? Every time I raised this point, there were countless objections about things Squeak so configured could not do (the biggest deal was auto-reversing Hebrew/Anglo-Numeric text), but it seems that we could still accommodate many of the advantages of Unicode, integrate the whole into Squeak, while preserving ALL of the efficiencies of the present ASCII world for unmixed ASCII and Character stuff.
> I agree with this approach entirely. It's a great Squeak Samurai project (I would do it tonight, but I've got a hot date ;-). Just put StringThing between ArrayedCollection and String, move all of String's methods up a level, leaving only those that have to do with String's primitive behavior. It shouldn't take more than an hour, and everything should still work.
> Then... define, say, String16 (*) that uses 16 bits and produces characters with codes up to 65535. Make one up like 'Squ<999>eak', and see if it prints. Then see if it displays. Etc. Lots of things will break, but that's half the fun. You'll find out if text display handles characters that are not in the font, and you'll have to decide whether all characters will still be unique, but this is what life on the frontier is all about.
> When in doubt, try it out.
> - Dan
> (*) It's probably worth starting with the most general expansion first. Then from there on, it's only optimization and engineering to do the others -- the interfaces will have all been worked out.
> PS: I'm not saying SqC will embrace unicode, I'm just saying that it may only take a couple of days to understand most of what is involved.
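
For readers who want to see the shape of the idea quoted above, here is a rough sketch. It is written in Python rather than Squeak, purely as an illustration, and it is not anyone's actual implementation. StringThing and String16 are the names used in the thread; ByteString (standing in for the existing 8-bit String), from_text, max_code and typecode are made up for this example. The point it tries to show is the SmallInteger-style coercion: narrow strings stay narrow, and the result only widens when a wide string is mixed in.

```python
from array import array


class StringThing:
    """Shared string protocol; the concrete subclasses only choose the storage width."""

    typecode = None  # array typecode, set by each subclass (hypothetical attribute)
    max_code = None  # largest character code the subclass can hold (hypothetical)

    def __init__(self, codes=()):
        codes = list(codes)
        for code in codes:
            if not 0 <= code <= self.max_code:
                raise ValueError(f"code {code} does not fit in {type(self).__name__}")
        self._codes = array(self.typecode, codes)

    @classmethod
    def from_text(cls, text):
        # Pick the narrowest class that can hold every character.
        codes = [ord(ch) for ch in text]
        fits_in_bytes = all(code <= ByteString.max_code for code in codes)
        kind = ByteString if fits_in_bytes else String16
        return kind(codes)

    def __len__(self):
        return len(self._codes)

    def __getitem__(self, index):
        return self._codes[index]

    def __eq__(self, other):
        # Strings compare by content, regardless of storage width.
        return isinstance(other, StringThing) and list(self._codes) == list(other._codes)

    def __add__(self, other):
        # Coercion step: the result takes the wider of the two operand classes.
        kind = type(self) if self.max_code >= other.max_code else type(other)
        return kind(list(self._codes) + list(other._codes))

    def __str__(self):
        return "".join(chr(code) for code in self._codes)


class ByteString(StringThing):
    """Stand-in for the existing 8-bit String (assumed name)."""
    typecode = "B"
    max_code = 255


class String16(StringThing):
    """16-bit variant, character codes up to 65535, as proposed in the thread."""
    typecode = "H"
    max_code = 65535


if __name__ == "__main__":
    plain = StringThing.from_text("Squ")
    wide = StringThing.from_text("Squ" + chr(999) + "eak")
    print(type(plain).__name__, type(wide).__name__, type(plain + wide).__name__)
```

Concatenating two ByteStrings gives a ByteString, so the unmixed 8-bit case keeps its compact storage. Whether the 'Squ<999>eak' string then prints and displays is exactly the experiment Dan suggests, and this sketch says nothing about fonts or display.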