Celeste encoding (was: Duplicate messages in Celeste)

Lex Spoon lex at cc.gatech.edu
Tue Mar 14 12:36:19 UTC 2000


> The suggestion to use Windows CP 1252 as a base instead of Latin1 or Latin9
> is a good one; it will make the transition from MacRoman _much_ more painless
> as it includes a lot of the worthwhile MacRoman characters, like
> 6...9 66...99 quotation marks.  UNIX users _can_ get their electronic hands
> on compatible fonts for the X Window system, so not only would the switch
> support all the characters UNIX users are used to, it _could_ support the
> others as well.


Well, my man page formatter has a flag for Latin-1 encodings, but
nothing else.  When Netscape works, it displays things in Latin-1.  So
at least this Unix user would find Latin-1 more familiar and convenient than
anything else.


>
> Someone else asked:
> 	Well, isn't Latin-1 the de facto standard for the Internet?
> 
> In a word, NO.  HTML 3.2 is defined in terms of Latin 1, but a lot of the
> Web pages I _have_ to deal with are actually CP 1252.  HTML 4.0 is
> defined in terms of ISO 10646, but it's still not really practical to
> actually _rely_ on that to any marked extent.

So it's not a de facto standard, but an actual specified standard.  All
the more reason to go with Latin-1, it would seem.

Sure there are systems that do it wrong, but how are we to detect which
ones do it wrong and which ones do it right?  Should we really display
correct pages wrong, just so that we can display shabby pages correctly?


Surely, we'd still do isoToSqueak by default when we download a web
page or an email.  If Squeak uses Latin-1 internally, then it's an
identity translation and we display correct pages/emails perfectly.  If
Squeak uses something else, then we are going to lose something.  And
if the page is incorrect, it's going to look bad no matter which way we
go.
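
To make the "identity translation" point concrete, here is a rough sketch
in Python rather than Squeak.  It assumes Python's built-in "latin-1" and
"cp1252" codecs as stand-ins for whatever isoToSqueak would actually do;
the function and variable names are made up for illustration.

```python
# Sketch only: Python codecs stand in for Squeak's translation tables.

def is_identity_mapping(data: bytes) -> bool:
    """True if decoding as Latin-1 gives code points equal to the byte values."""
    return [ord(ch) for ch in data.decode("latin-1")] == list(data)

# 'A', e-acute, then two bytes from the 0x80-0x9F range where the
# two encodings disagree.
sample = bytes([0x41, 0xE9, 0x93, 0x94])

as_latin1 = sample.decode("latin-1")  # 0x93 and 0x94 become C1 control codes
as_cp1252 = sample.decode("cp1252")   # 0x93 and 0x94 become curly double quotes

print(is_identity_mapping(bytes(range(256))))
print([hex(ord(ch)) for ch in as_latin1])
print([hex(ord(ch)) for ch in as_cp1252])
```

The two decodings agree on 0x41 and 0xE9 and differ only on the bytes in
0x80-0x9F, which is where the curly quotes live in CP 1252.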

Given all this, Latin-1 (or this ISO 10646 thing) seems like the way to go
for compatibility with the Internet.  Perhaps have *overrides* for bad
pages/emails, but surely we need to at least support correct ones.
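
One way the override idea might look, again sketched in Python and not
in Squeak.  Everything here is hypothetical: the OVERRIDES table, the
host name in it, and the decode_page function are invented for the
example, and the Latin-1 default is just the choice argued for above.

```python
from typing import Optional

# Hypothetical per-source overrides for pages known to mislabel their encoding.
OVERRIDES = {"example.invalid": "cp1252"}

def decode_page(data: bytes, host: str, declared: Optional[str] = None) -> str:
    """Decode with an override if one exists, else the declared charset, else Latin-1."""
    charset = OVERRIDES.get(host) or declared or "latin-1"
    # errors="replace" keeps a bad byte from aborting the whole page.
    return data.decode(charset, errors="replace")

# A correct Latin-1 page decodes untouched by default.
print(decode_page(b"caf\xe9", "other.invalid"))
# A page from an overridden host is decoded as CP 1252 instead.
print(decode_page(b"\x93hi\x94", "example.invalid"))
```

Correct pages take the default path and are never second-guessed; only
sources someone has explicitly listed get the special treatment.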


Lex




