[squeak-dev] XMLParser weirdness

Andreas Raab andreas.raab at gmx.de
Tue Aug 10 19:21:34 UTC 2010


Hi -

I just spent about two hours staring at code because of an oddity in the 
XML parser's printing of nodes. Here's an example:

node := (XMLElement new) name: 'foo';
	addContent: (XMLStringNode string: 'Hello World');
	setAttributes: (Dictionary new);
	yourself.

This prints '<foo>Hello World</foo>', which is fine. However, the 
following construction, which adds just a single attribute:

node := (XMLElement new) name: 'foo';
	addContent: (XMLStringNode string: 'Hello World');
	setAttributes: (Dictionary newFromPairs: {#id. 1});
	yourself.

now prints as '<foo id="1"/>' (i.e., it loses its content string). 
Looking at the code in XMLElement>>printXmlOn:, I see that it does 
something weird if the writer is considered "non-canonical", i.e.,

	"... snip ..."
	(writer canonical not
		and: [self isEmpty and: [self attributes isEmpty not]])
		ifTrue: [writer endEmptyTag: self name]
	"... snap ..."

Two questions about this: 1) What's the meaning of 'canonical' XML? Is 
this a well-defined (sub-)set of XML? If so, where can I read about it? 
2) Is the above a bug or a feature? I'm wondering in particular about 
XMLElement>>isEmpty, which considers only the child elements but not 
any string contents the element may have.
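
To make the second question concrete, here is the condition reduced to 
plain booleans, as a workspace sketch. The variable names are mine (they 
are not part of the XML package), and it assumes that isEmpty answers 
true whenever there are no child elements, which is my reading of the 
code rather than something I have confirmed. The second variant is only 
a hypothetical alternative, not a proposed patch:

```smalltalk
"Workspace sketch with plain booleans; none of these names exist in the XML package."
| canonical hasElements hasContents hasAttributes asWritten countingContents |
canonical := false.      "the writer is non-canonical"
hasElements := false.    "<foo> has no child elements"
hasContents := true.     "but it does hold the 'Hello World' string node"
hasAttributes := true.   "the single id attribute"

"The test as it stands, assuming isEmpty looks only at child elements."
asWritten := canonical not
	and: [hasElements not and: [hasAttributes]].

"A hypothetical variant that also treats string contents as non-empty."
countingContents := canonical not
	and: [(hasElements or: [hasContents]) not and: [hasAttributes]].

Transcript show: asWritten printString; cr.
Transcript show: countingContents printString; cr.
```

With these values asWritten is true, so the empty-tag branch is taken 
and the content is dropped, whereas countingContents is false and the 
element would be written with its content.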

Any help is greatly appreciated.

Cheers,
   - Andreas


