[Bug][Fix] Unicode characters in HtmlParser
Bert Freudenberg
bert at isgnw.CS.Uni-Magdeburg.De
Thu Mar 2 16:32:20 UTC 2000
On Thu, 2 Mar 2000, Mark Guzdial wrote:
> My Sophomore Squeak-learners are developing personal newspapers drawn
> from other news web sites, and they found that ESPN's site has
> Unicode characters in it (via ampersand stuff) that break the
> HtmlParser and Scamper.
Well, there are a lot of specialEntities, such as umlauts (&auml; for ä), that
are not currently handled correctly. Also, ISO 8859-1 to Squeak charset
conversion is not done. I posted a changeset a while ago but it didn't
make it into the image - it's the third attachment in
http://swiki.gsug.org/SQFIXES/275
-Bert-
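
For readers who want to see the two issues concretely, here is a minimal sketch in Python. It is not Squeak code and not the changeset mentioned above. It only illustrates (1) decoding named and numeric character entities and (2) re-encoding ISO 8859-1 bytes into another 8-bit charset. The function names are hypothetical, and the choice of Mac OS Roman as the target charset is an assumption made purely for illustration.

```python
import html


def decode_entities(text: str) -> str:
    """Replace named (&auml;) and numeric (&#228;) character references
    with the characters they denote, using the standard library."""
    return html.unescape(text)


def latin1_to_target(data: bytes, target: str = "mac_roman") -> bytes:
    """Re-encode ISO 8859-1 bytes into a target 8-bit charset.

    The default target (Mac OS Roman) is an assumption for this sketch.
    Characters the target cannot represent are replaced with '?'.
    """
    return data.decode("latin-1").encode(target, errors="replace")


if __name__ == "__main__":
    print(decode_entities("M&auml;rz, M&#228;rz, AT&amp;T"))
    # The byte value for the same letter differs between the two charsets.
    print(latin1_to_target(b"\xe4").hex())
```

A parser that skips either step shows the symptoms described in this thread: entities left as raw ampersand sequences, or accented letters displayed as the wrong glyphs.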