Unicode patch

stephane ducasse stephane.ducasse at free.fr
Thu Jun 14 20:18:33 UTC 2007

> Just from glancing at the code this cannot possibly be right.
> Like, in many places the isWideString test is simply replaced with  
> isFourByteString. But the distinction we need to make is whether we  
> have character values below 256 or above (for example to choose  
> between the old and the MultiByteScanner). So #isWideString needs  
> to be preserved and answer true for all Strings that have character  
> values >= 256.
> As for the internal representation of TwoByteStrings; I'm not sure  
> using big endian on all platforms is a good idea. Should certainly  
> be discussed - like, it might be valuable to hand that string to a  
> primitive and then platform order would be better.
> Also, the renaming of WideString without providing proper  
> conversion methods will most certainly break existing projects.
> Then there are a lot of nits to pick - like the class comments are  
> wrong, ByteString>>replaceFrom:... only creates 32 bit strings,  
> bitShift is used all over the place when Smalltalk code  
> traditionally uses * and //, what is TwoByteString>>printString  
> good for, why does TwoByteString>>asByteString do an unnecessary  
> copy etc.
> Before inclusion this still needs a lot of work and testing.

Sounds like it. Thanks for the feedback, Bert.
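
To make two of the points above concrete, here is a rough sketch in Python rather than Squeak, so none of this is the patch's code. The names is_wide_string, pack_two_byte and NATIVE are made up for illustration. It assumes "wide" means what Bert describes, any character value >= 256, and that a two-byte string stores one unsigned 16-bit unit per character.

```python
import struct
import sys

# Hypothetical helper: struct byte-order flag matching the current platform.
NATIVE = "<" if sys.byteorder == "little" else ">"


def is_wide_string(text):
    """Answer True if any character value is 256 or above."""
    return any(ord(ch) >= 256 for ch in text)


def pack_two_byte(text, byte_order=">"):
    """Pack each character as one unsigned 16-bit unit.

    byte_order is a struct flag: ">" big-endian, "<" little-endian.
    """
    code_points = [ord(ch) for ch in text]
    if any(cp > 0xFFFF for cp in code_points):
        raise ValueError("character does not fit in two bytes")
    return struct.pack(f"{byte_order}{len(code_points)}H", *code_points)
```

The wide test depends only on character values, not on how many bytes the storage class happens to use. Packing with NATIVE instead of ">" gives the layout a platform primitive would read directly, which is the trade-off raised about fixing big endian everywhere.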
