[squeak-dev] XMLTokenizer problem with ampersand

Jakob Reschke jakob.reschke at student.hpi.de
Mon Jun 1 21:01:20 UTC 2015


I guess this will not help you, but a standalone ampersand is not
valid XML. It introduces an entity reference, so if you want a
literal ampersand in the text, the markup must be &amp;. Hence I
would not expect any XML tokenizer or parser implementation to accept
it.
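
To illustrate, here is a minimal sketch in Python rather than Squeak,
assuming the standard library's xml.etree.ElementTree behaves like any
conforming XML parser in this respect. The helper name parses_as_xml is
made up for this example.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text):
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # A bare '&' that does not start an entity reference ends up here.
        return False


bare = "<p>Tom & Jerry</p>"         # standalone ampersand: not well-formed
escaped = "<p>Tom &amp; Jerry</p>"  # ampersand written as an entity reference

print(parses_as_xml(bare))
print(parses_as_xml(escaped))
# The parser turns the entity back into a literal ampersand in the text.
print(ET.fromstring(escaped).text)
```

The same distinction should hold for XMLTokenizer: the input has to
contain &amp; for the text to contain a literal ampersand.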

HTML is more relaxed about this, so a standalone ampersand is valid,
but you would need some kind of HTMLTokenizer and I do not know
whether there is such a thing for Squeak. Does anyone else know of one?
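
For comparison, a lenient HTML parser takes the same input without
complaint. This is only a sketch using Python's html.parser module, not
a Squeak solution. TextCollector and html_text are hypothetical names
introduced for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers the text found between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Text may arrive in several pieces, so collect them all.
        self.chunks.append(data)


def html_text(markup):
    """Return the concatenated text content of an HTML fragment."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


# A standalone ampersand followed by a space is passed through as text.
text = html_text("<p>Tom & Jerry</p>")
print(text)
```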

Best regards
Jakob

2015-06-01 20:05 GMT+02:00 karl ramberg <karlramberg at gmail.com>:
> Hi,
> I'm parsing some HTML docs but the XMLTokenizer chokes on a '&' followed by
> a space in a string.
> I guess '&' is used for other things than an 'and' in HTML and it causes an
> error when used in plain text.
>
> Does anybody have a fix for this?
>
> Karl

