Unicode support
Bert Freudenberg
bert at isgnw.CS.Uni-Magdeburg.De
Tue Sep 14 18:40:50 UTC 1999
On Tue, 14 Sep 1999, Todd Blanchard wrote:
> > On Mon, 13 Sep 1999, Todd Blanchard wrote:
> >
> > > I'm wanting to implement some unicode support. Who can tell me -
> > > how big is a word?
> > > Is it two bytes?
> >
> > No, it's four bytes. There is no two-byte primitive supported array in
> > Squeak (yet).
>
> So what's it going to take to get one? Is this something that could
> be put together by an experienced C programmer with some high-level
> Smalltalk experience by cloning the variableByteArray class and
> adjusting the data sizes?
Currently there are only 1-byte arrays (ByteArray) and 4-byte arrays
(object pointers and words). You would have to find all the places that
access the class format and change them to recognize the new 2-byte
format. There are a lot of them. Look, for example, at
Interpreter>>primitiveStringReplace, which you would certainly want to use
for fast Unicode string manipulation.
But basically you could just start with the byte-wise stuff and adjust
all sizes by a factor of 2. In #at: you would construct a Unicode
character from 2 bytes, etc. I don't think this would even be that slow,
and you could still switch to primitives later.
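
A minimal sketch of the byte-wise approach described above, written in C
rather than Smalltalk. The function names (wide_size, wide_at, wide_at_put)
and the big-endian byte order are assumptions made for illustration; the
post does not specify either, and nothing here is part of Squeak itself.
Indices are 1-based to mirror Smalltalk's #at: and #at:put:.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 16-bit elements that fit in a byte buffer:
   every size is the byte size divided by 2. */
size_t wide_size(size_t byte_count) {
    return byte_count / 2;
}

/* Analogue of #at: build one 16-bit value from two adjacent bytes.
   Assumes the high byte is stored first (big-endian). */
uint16_t wide_at(const uint8_t *bytes, size_t byte_count, size_t index) {
    assert(index >= 1 && index <= wide_size(byte_count));
    size_t offset = (index - 1) * 2;
    return (uint16_t)((bytes[offset] << 8) | bytes[offset + 1]);
}

/* Analogue of #at:put: split a 16-bit value back into two bytes. */
void wide_at_put(uint8_t *bytes, size_t byte_count, size_t index,
                 uint16_t value) {
    assert(index >= 1 && index <= wide_size(byte_count));
    size_t offset = (index - 1) * 2;
    bytes[offset] = (uint8_t)(value >> 8);
    bytes[offset + 1] = (uint8_t)(value & 0xFF);
}
```

With a 6-byte buffer, wide_size reports 3 elements, and storing 0x20AC at
index 2 writes the bytes 0x20 and 0xAC at byte offsets 2 and 3. Only these
accessors know about the factor of 2, so they could later be swapped for
primitives without touching the callers.
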
> Can you point me to info on low-level data formats in squeak?
No ... except that it's all in the image ;-)
I'll copy this back to the list, maybe someone else knows better.
/bert
More information about the Squeak-dev mailing list