[squeak-dev] ASN1 encoding of UTF8

tim Rowledge tim at rowledge.org
Mon Sep 18 16:29:38 UTC 2017


We do have assorted string encoding stuff in the current image, but the actual UTF8 results of #squeakToUtf8 (for example) are just ByteStrings. That is rather confusing and annoying, because you then have no way to know which encoding applies other than by carefully keeping track manually. Normally, of course, within the image we have perfectly usable strings, because any time a Unicode character outside the 1-byte range is used the string becomes a WideString.
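
To make the problem concrete, here is a rough analogy in Python rather than Squeak code. It assumes a Python str can stand in for a Squeak String, and that decoding the bytes as Latin-1 is a fair stand-in for holding raw UTF8 bytes in a ByteString. The helper name utf8_as_byte_string is made up for illustration.

```python
# Analogy only: this is not Squeak code. A Python str stands in for a
# Squeak String, and Latin-1 decoding stands in for storing each UTF-8
# byte as one 1-byte character (roughly what a ByteString result holds).

def utf8_as_byte_string(text: str) -> str:
    """Encode text as UTF-8, then keep each byte as a single character."""
    return text.encode("utf-8").decode("latin-1")

original = "\u20ac5"                     # euro sign followed by '5'
encoded = utf8_as_byte_string(original)

# The encoded value is still "just a string": nothing in its type says
# that it holds UTF-8 rather than ordinary text.
print(type(encoded) is type(original))
print(len(original), len(encoded))

# Getting the text back only works if the caller remembers, out of band,
# that these characters are really UTF-8 bytes.
recovered = encoded.encode("latin-1").decode("utf-8")
print(recovered == original)
```

The point of the sketch is that the encoded and unencoded values have the same type, so the encoding has to be tracked by hand.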

We need to do better. Look at TextEncoder and its hierarchy for more info.

tim
--
tim Rowledge; tim at rowledge.org; http://www.rowledge.org/tim
Strange OpCodes: RLBM: Ruin Logic Board Multiple



