Unicode support

agree at carltonfields.com
Tue Sep 14 20:07:59 UTC 1999


Of course, a TwoByteArray class can easily be defined that will transparently mimic what you want using either the one-byte, four-byte or other solutions.  Check out, for example, class ShortIntegerArray.
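As a rough illustration of that idea, here is a minimal sketch in Python rather than Squeak code. It layers two-byte elements on top of an ordinary one-byte buffer, which is the byte-wise approach described in the quoted message below. The class body, the from_text helper, and the high-byte-first layout are all assumptions made for this example; none of them is taken from the Squeak image or from ShortIntegerArray.

```python
class TwoByteArray:
    """Fixed-size array of 16-bit unsigned values kept in a plain byte buffer."""

    def __init__(self, size):
        # Two backing bytes per element; every element starts at zero.
        self._bytes = bytearray(2 * size)

    def __len__(self):
        # All sizes are adjusted by a factor of 2.
        return len(self._bytes) // 2

    def _offset(self, index):
        if not 0 <= index < len(self):
            raise IndexError("TwoByteArray index out of range")
        return 2 * index

    def __getitem__(self, index):
        i = self._offset(index)
        # Assumption: high byte first (big-endian layout).
        return (self._bytes[i] << 8) | self._bytes[i + 1]

    def __setitem__(self, index, value):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("value does not fit in two bytes")
        i = self._offset(index)
        self._bytes[i] = value >> 8
        self._bytes[i + 1] = value & 0xFF


def from_text(text):
    """Hypothetical helper: store each character's code point as one element."""
    arr = TwoByteArray(len(text))
    for k, ch in enumerate(text):
        arr[k] = ord(ch)  # raises ValueError for code points above 0xFFFF
    return arr
```

Callers only ever see element indices and 16-bit values, so the backing store could later be swapped for a four-byte array or a real two-byte primitive without changing them.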

> -----Original Message-----
> From: bert at isgnw.CS.Uni-Magdeburg.De
> Sent: Tuesday, September 14, 1999 2:58 PM
> To: tblanchard at etranslate.com
> Cc: squeak at cs.uiuc.edu
> Subject: Re: Unicode support
>
> On Tue, 14 Sep 1999, Todd Blanchard wrote:
>
> > > On Mon, 13 Sep 1999, Todd Blanchard wrote:
> > >
> > > > I'm wanting to implement some  unicode support.  Who can tell me -
> > > > how big is a word?
> > > > Is it two bytes?
> > >
> > > No, it's four bytes. There is no two-byte primitive supported array in
> > > Squeak (yet).
> >
> > So whats it going to take to get one? Is this something that could
> > be put together by an experienced C programmer with some high-level
> > Smalltalk experience by cloning the variableByteArray class and
> > adjusting the data sizes?
>
> Currently there are only 1-byte arrays (ByteArray) and 4-byte arrays
> (object pointers and words). You would have to find all places that
> accesses the class format and change them to recognize the new 2-byte
> format. These are a lot. Look, for example, into
> Interpreter>>primitiveStringReplace which you certainly would want to use
> for fast Unicode string manipulations.
>
> But basically you could just start using the byte-wise stuff and adjusting
> all sizes by a factor of 2. In #at: you would construct a Unicode
> character from 2 bytes etc. I'd think this would be not even that slow,
> and you could still switch to primitives later.
>
> > Can you point me to info on low-level data formats in squeak?
>
> No ... except for that's all in the image ;-)
>
> I'll copy this back to the list, maybe someone else knows better.
>
>   /bert




