ISO 8859-1 compatibility

ohshima at is.titech.ac.jp
Sun May 9 11:31:08 UTC 1999


  Hi,

  Has anyone modified the 'NewYork' and 'Comic' fonts so
that they have ISO 8859-1 compatible characters at the
"right half" code points (the characters with the 8th bit
on)?  I'm working on m17n of Squeak and I want my image to
have such fonts.

		    *        *        *

  BTW, I think it would be nice if the fonts Squeak uses
conformed to the ISO 8859-1 standard.

  I think one reason Squeak doesn't conform to the standard
is that those fonts have a much longer history than the
standard :-)  Another reason is that Squeak's choice of
internal code has rarely caused any problems.

  But now Squeak has a mail reader, a web browser, and a
newsreader: programs that communicate with non-Squeak
environments.  If the right half were compatible with some
well-known standard, many more people could use Squeak
conveniently.  And in today's world the standard chosen for
the right half should be ISO 8859-1.  (It supports 14 major
languages of Western Europe.)

  Currently, Squeak uses several characters whose codes are
in the right half.  Fortunately, the number of such
characters is very small (around twenty) and most of them
have a valid code point in ISO 8859-1.  So I think replacing
them would not be very difficult.

  The following is a boring list.  Read it, with much
patience, if you want to.

-------------------------------

  The characters which have a valid code point in the
standard include:
    "MIDDLE DOT"                      (Character value: 16rA5)
    "DOUBLE ANGLE QUOTATION MARK"s    (Character value: 16rC7 and 16rC8)
    "PILCROW SIGN"                    (Character value: 16rA6)
    "SECTION SIGN"                    (Character value: 16rA4).
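
  As a rough illustration of the replacement, here is a
minimal Python sketch that looks up where a current code
would move to in ISO 8859-1.  It rests on an assumption that
is not stated in this post: that the current right-half
codes follow Mac OS Roman (Python's "mac_roman" codec).  The
function name remap_to_latin1 is made up for this example.
Note that this codec treats 16rA5 as BULLET rather than
MIDDLE DOT, so that entry would need to be handled by hand.

```python
# Assumption (not stated in the post): the current Squeak code
# points follow Mac OS Roman, i.e. Python's "mac_roman" codec.

def remap_to_latin1(code):
    """Return the ISO 8859-1 code for a Mac OS Roman code, or None."""
    char = bytes([code]).decode("mac_roman")
    try:
        return char.encode("latin-1")[0]
    except UnicodeEncodeError:
        return None  # the character has no code point in ISO 8859-1


# Code points taken from the list in this post.
for old in (0xA4, 0xA6, 0xA9, 0xC7, 0xC8, 0xD7):
    new = remap_to_latin1(old)
    target = "none" if new is None else "16r%02X" % new
    print("16r%02X -> %s" % (old, target))
```

  Under that assumption the sketch agrees with the list: the
section sign, pilcrow sign and angle quotation marks get new
codes, the copyright sign stays where it is, and 16rD7 has
no target at all.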

  One character (Character value: 16rBB, "MASCULINE ORDINAL
INDICATOR") is used just as a "sentinel" (to signal an
invalid character).

  (Character value: 16rD7) is used in two places.  It seems
to me that this character has no code point in ISO 8859-1.
This is virtually the only problem.

  (Character value: 16rD5) is used to avoid the double-quote
syntax in string constants.

  One comment is written in German.  Another comment is
written in German with a French word! (Though I think the
position of the acute accent is wrong :-)

  "COPYRIGHT SIGN" (Character value: 16rA9) is used.  It has
the same code point in the standard.

  Just one "Beta" (Character value: 16rA7) is used, in a
comment in a C source file.  It seems to me that it was an
accident.  (Or should it be replaced with eszett?)

  Several characters which have no valid glyph are used in
comments:
    (Character value: 16rC3)
    (Character value: 16rC4).
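
  A survey like the one above can be repeated mechanically.
The following Python sketch counts every byte with the 8th
bit on in a chunk of source text.  It is only an
illustration: the helper name right_half_bytes and the
sample bytes are invented here, and this post does not say
how the list was actually gathered.

```python
from collections import Counter


def right_half_bytes(data):
    """Count each byte value with the 8th bit on (>= 0x80)."""
    return Counter(b for b in data if b >= 0x80)


# Made-up sample; real input would be the bytes of a source file.
sample = b"abc \xa9 1999 \xc7quoted\xc8 \xa9"
for value, count in sorted(right_half_bytes(sample).items()):
    print("16r%02X used %d time(s)" % (value, count))
```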


                                             OHSHIMA Yoshiki
                Dept. of Mathematical and Computing Sciences
                               Tokyo Institute of Technology 




