[squeak-dev] XMLTokenizer problem with ampersand

karl ramberg karlramberg at gmail.com
Mon Jun 1 22:20:49 UTC 2015


Hi,
thanks for the info.
I guess I need an HTMLTokenizer for what I'm doing. I also had issues with
&nbsp; when using the current XMLTokenizer.
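
Until an HTML tokenizer turns up, one possible workaround is to pre-escape the input before handing it to an XML parser. The sketch below is in Python rather than Squeak, purely to illustrate the idea; the names escape_for_xml, parses_as_xml, BARE_AMP and HTML_ONLY_ENTITIES are made up for this example, and the entity table is a tiny assumed subset, not a complete list.

```python
import re
import xml.etree.ElementTree as ET

# Matches an '&' that does NOT start a well-formed entity or character
# reference (name, decimal, or lowercase-x hexadecimal, ending in ';').
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

# HTML named entities that XML does not predefine.
# Assumption: illustrative subset only; real HTML defines many more.
HTML_ONLY_ENTITIES = {"&nbsp;": "&#160;"}


def escape_for_xml(text):
    """Escape bare ampersands and map HTML-only entities to numeric references."""
    text = BARE_AMP.sub("&amp;", text)
    for name, numeric in HTML_ONLY_ENTITIES.items():
        text = text.replace(name, numeric)
    return text


def parses_as_xml(text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


raw = "<p>Tom & Jerry&nbsp;&amp; friends &nbsp more</p>"
fixed = escape_for_xml(raw)

print(parses_as_xml(raw))    # the bare '&' makes the raw input ill-formed
print(parses_as_xml(fixed))
print(fixed)
```

This only patches up ampersands and the listed entities. Other HTML habits, such as unclosed tags or unquoted attributes, would still trip an XML parser, so a real HTML tokenizer remains the more robust option.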

Karl

On Mon, Jun 1, 2015 at 11:01 PM, Jakob Reschke <jakob.reschke at student.hpi.de> wrote:

> I guess this will not help you, but a standalone ampersand is not
> valid XML. It introduces an entity reference, so if you want a
> literal ampersand in the text, the markup must be &amp;. Hence I
> would not expect any XML tokenizer or parser implementation to accept
> it.
>
> HTML is more relaxed about this, so a standalone ampersand is valid,
> but you would need some kind of HTMLTokenizer and I do not know
> whether there is such a thing for Squeak. Does anyone else know of one?
>
> Best regards
> Jakob
>
> 2015-06-01 20:05 GMT+02:00 karl ramberg <karlramberg at gmail.com>:
> > Hi,
> > I'm parsing some HTML docs, but the XMLTokenizer chokes on a '&'
> > followed by a space in a string.
> > I guess '&' is used for other things than 'and' in HTML, and it causes
> > an error when used in plain text.
> >
> > Does anybody have a fix for this?
> >
> > Karl
>
>