Standard Squeak Font Encodings

Yoshiki Ohshima ohshima at is.titech.ac.jp
Fri Feb 4 13:01:32 UTC 2000


  Hi,

>> If the problem is that we just need several characters
>> aside from Latin-1, I think it makes sense to use the code
>> points in C0.

> Bad news:  the Unicode consortium noticed them too.
> There's a "standard scheme for Unicode compression" that uses
> 8 fixed windows of 128 characters and 8 dynamically settable
> of 128 characters together with a range of locking and
> non-locking shift codes to get text in most alphabetic scripts
> down to one byte per character.  If you're writing Māori
> or Greek or Russian or Hindi or ... you'll really appreciate
> that; it saves quite a bit of space compared with UTF-8.
>
> The scheme has the pleasant property that strings containing only
> Latin-1 printing characters, space, TAB, CR, and LF remain unchanged.
> 
> Guess where the special codes for the compression scheme go?

  Umm, I didn't know about this "Unicode compression".  Could
you give me a reference?  I just hope it is not going to be
popular:-)
 
  Seriously, the "if" I used in the last mail was intended
to be a big one.  I mean, basically I don't think it is a good
idea to allocate characters outside of GL or GR.

  The problems we have are 1) we need more code points, and
2) we want to be compatible with the rest of the world.
Allocating more characters outside of GL and GR satisfies 1,
but breaks 2.  It also makes those characters *invisible* when
text is exported to the outside.  Introducing invisible
characters seems bad.
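
  To make the regions concrete, here is a minimal sketch in
Python (not Squeak code) of how an 8-bit code point falls into
C0, GL, C1, or GR.  The function name `region` is made up for
illustration, and the boundaries follow the usual ISO 2022 /
ISO 8859 layout, with SP (0x20) and DEL (0x7F) counted as part
of GL for simplicity.

```python
def region(code_point: int) -> str:
    """Classify an 8-bit code point by its ISO 2022 / ISO 8859 region.

    Assumption for this sketch: SP (0x20) and DEL (0x7F) are lumped
    into GL, although strictly they sit just outside the 94-character
    graphic set.
    """
    if not 0 <= code_point <= 0xFF:
        raise ValueError("expected an 8-bit code point")
    if code_point <= 0x1F:
        return "C0"   # control codes: no glyph in other systems
    if code_point <= 0x7F:
        return "GL"   # left graphic area (ASCII printing characters)
    if code_point <= 0x9F:
        return "C1"   # second control area: no glyph in Latin-1
    return "GR"       # right graphic area (Latin-1 upper half)


def has_glyph_elsewhere(code_point: int) -> bool:
    """True if the code point lies in a graphic area (GL or GR)."""
    return region(code_point) in ("GL", "GR")
```

  For example, `region(ord("_"))` gives "GL", while a glyph
placed at 0x01 or 0x80 lands in C0 or C1, where systems outside
Squeak would not expect a printable character.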

  Another possible approach is that we use Latin-1 and
replace some characters in the GL and/or GR.  The
alternatives are 1) put the leftarrow at the underscore's
position and replace a character (guess which one:-) in GR
with underscore, or 2) put the leftarrow at a position in
GR.

  The best way I can imagine is that we keep the current
glyphs for the first 128 code points, but switch to using the
right half of the Latin-1 encoding as soon as possible.  Then
we wait for someone to (re-)engage the multilingual project and
come up with a more exhaustive solution:-)

  -- Yoshiki




