UTF8 Squeak

Yoshiki Ohshima yoshiki at squeakland.org
Tue Jun 12 22:19:36 UTC 2007


> Both Bert and Yoshiki are right, but to be more precise, this is the 
> Latin-1 code page from Windows (CP1252); see 
> http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx
> So macToSqueak and squeakToMac should rather be named macRomanToCP1252 
> and cp1252toMacRoman...

  Yes.  That is more sensible.
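
  To make the proposed names concrete, here is a minimal sketch in Python
(not Squeak code, and not the actual macToSqueak/squeakToMac
implementation). It assumes Python's built-in "mac_roman" and "cp1252"
codecs are acceptable stand-ins for the two tables, and it goes through
Unicode in the middle. The function names simply mirror the renaming
suggested above; replacing unmappable characters with "?" is an
assumption of this sketch.

```python
def mac_roman_to_cp1252(data: bytes) -> bytes:
    """Re-encode Mac Roman bytes as CP1252, going through Unicode."""
    # Every Mac Roman byte decodes to some Unicode character, but not all
    # of those characters exist in CP1252; those become "?".
    return data.decode("mac_roman").encode("cp1252", errors="replace")


def cp1252_to_mac_roman(data: bytes) -> bytes:
    """Re-encode CP1252 bytes as Mac Roman, going through Unicode."""
    # A few CP1252 byte values are unassigned in Python's codec, so the
    # decode step also needs a replacement policy.
    text = data.decode("cp1252", errors="replace")
    return text.encode("mac_roman", errors="replace")


if __name__ == "__main__":
    # 0x8E is e-acute in Mac Roman; the same character is 0xE9 in CP1252.
    print(mac_roman_to_cp1252(b"\x8e"))
    print(cp1252_to_mac_roman(b"\xe9"))
```

  The point of the sketch is only that both directions are table
lookups between two 8-bit encodings, which is what the longer names
would say explicitly.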

> So yes, I vote like Michael, fix it!
> Otherwise, Squeak is not Unicode internally for some of the characters 
> from 128 to 255, and this contradicts the basic assumptions made in this 
> thread and by naive readers like me...

  Otherwise?  I'm not sure if this statement is true.  I would phrase
it this way: Squeak is Unicode internally, but some conversions that
happen at the image boundary have bugs.  If you create a character
within the range of 160 to 255, you get the right/acceptable glyph for
the character.
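
  As a rough illustration of why 160 to 255 is the safe range, the
following Python sketch (again not Squeak code; it only inspects
Python's own codec tables, and agrees_with_unicode is a hypothetical
helper name) checks which byte values decode to the Unicode code point
with the same number. Under those tables, CP1252 and Latin-1 agree with
Unicode for 160 to 255, while CP1252 departs from it in 128 to 159.

```python
def agrees_with_unicode(codec: str, lo: int, hi: int) -> bool:
    """True if every byte in [lo, hi] decodes to the code point of the same value."""
    return all(
        # Unassigned bytes decode to U+FFFD, which never equals chr(b) here.
        bytes([b]).decode(codec, errors="replace") == chr(b)
        for b in range(lo, hi + 1)
    )


if __name__ == "__main__":
    print(agrees_with_unicode("latin-1", 128, 255))
    print(agrees_with_unicode("cp1252", 160, 255))
    print(agrees_with_unicode("cp1252", 128, 159))
    print(agrees_with_unicode("mac_roman", 160, 255))
```

  So a conversion that targets CP1252 rather than Latin-1 can only
disagree with Unicode for values below 160, which fits the description
of a bug at the boundary rather than in the internal representation.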

-- Yoshiki

