UTF8 Squeak
Alan Lovejoy
squeak-dev.sourcery at forum-mail.net
Fri Jun 8 03:24:47 UTC 2007
<Alan L>UTF-8 should be the default</Alan L>
<J J (Jason)>Wouldn't that be a pretty big speed impact given how much
strings are used?</J J (Jason)>
Now that I think about it, that could very well be the case. There might be
clever ways to make the impact much less than one might otherwise expect
(for example, RunArrays were a clever way to make Text objects reasonably
efficient)--but I haven't actually implemented it, so there's no guarantee.
So, perhaps the default internal String encoding should be UTF-32, instead
of UTF-8 or UTF-16, in order to avoid the performance issue. But that
raises a memory usage issue--which is the primary reason I don't think a
"one size fits all" approach is sufficient.
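To make the tradeoff concrete, here is a rough sketch in Python (not Squeak code, and not a proposal for how the String classes should actually work). The helper names encoded_sizes, utf8_char_at and utf32_char_at are made up for illustration. It shows that reaching the Nth character in UTF-8 needs a scan from the start, while in UTF-32 it is a fixed offset, at the price of four bytes per character.

```python
import struct


def encoded_sizes(text: str) -> dict:
    """Bytes needed to store `text` in each encoding (no byte-order mark)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


def _utf8_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def utf8_char_at(data: bytes, index: int) -> str:
    """Index-th code point of valid UTF-8 data: must walk every earlier character."""
    pos = 0
    for _ in range(index):
        pos += _utf8_len(data[pos])
    return data[pos:pos + _utf8_len(data[pos])].decode("utf-8")


def utf32_char_at(data: bytes, index: int) -> str:
    """Index-th code point of little-endian UTF-32 data: one fixed-offset read."""
    (code_point,) = struct.unpack_from("<I", data, index * 4)
    return chr(code_point)


if __name__ == "__main__":
    sample = "naïve €"
    print(encoded_sizes(sample))
    print(utf8_char_at(sample.encode("utf-8"), 6))
    print(utf32_char_at(sample.encode("utf-32-le"), 6))
```

For plain ASCII text the UTF-32 form is four times the size of the UTF-8 form, which is the memory issue; the loop in utf8_char_at is the speed issue.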
--Alan