XML Parser choice (was Re: [ENH] ??? MD5 in Squeak.)

Duane Maxwell dmaxwell at san.rr.com
Tue Nov 27 07:20:31 UTC 2001


> Let me put in a word for the VisualWorks one...it's probably the most
> complete. It would let us share work with the VisualWorks and the Camp
> Smalltalk communities (and with Cincom itself). Cincom is definitely all
> the way behind it (they're even putting together some (hork) XML Schema
> stuff). I'm pretty sure they're open sourcing all bits, so we have an XSLT
> processor, XPath support, etc.

OK, then I get to advocate the exobox one :)

The exobox parser is a complete non-validating parser that checks
well-formedness, minus Unicode support - every obscure little syntax
weirdness is handled, even if the result is eventually dropped on the floor.
It is set up much like a
SAX-style parser, in that what actually happens to the result of the parse
is handled through overridden methods.  There is a subclass provided that
constructs a tree built from OrderedCollections (for nodes) and Dictionaries
(for attributes).  Also, the exobox parser handles some peculiar cases,
including the very tricky Jabber one, where an entire session is in fact one
XML stream - in other words, the parser can spit out subtrees immediately as
they close rather than the entire tree, without blocking to wait for the
next token.
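
To make the design concrete, here is a rough sketch of the same idea. It is
not the exobox code and not Smalltalk: it is Python on top of the standard
xml.parsers.expat module, and every class and method name in it (EventParser,
StreamingTreeBuilder, feed, on_subtree) is made up for illustration. The base
class drops every event on the floor, the subclass builds nodes from lists and
attributes from dictionaries, and each child of the still-open root element is
handed over the moment its end tag arrives, as in a Jabber-style stream.

```python
import xml.parsers.expat


class EventParser:
    """Event-driven parser; subclasses override the three hooks below."""

    def __init__(self):
        self._expat = xml.parsers.expat.ParserCreate()
        self._expat.buffer_text = True  # deliver adjacent text as one string
        self._expat.StartElementHandler = self.start_element
        self._expat.EndElementHandler = self.end_element
        self._expat.CharacterDataHandler = self.characters

    def feed(self, chunk):
        # Parse what has arrived so far; False means "more input may follow".
        self._expat.Parse(chunk, False)

    # Default hooks ignore every event.
    def start_element(self, name, attrs):
        pass

    def end_element(self, name):
        pass

    def characters(self, text):
        pass


class StreamingTreeBuilder(EventParser):
    """Builds [name, attributes, children] nodes and emits each direct child
    of the root element as soon as it closes."""

    def __init__(self, on_subtree):
        super().__init__()
        self._on_subtree = on_subtree
        self._stack = []  # currently open elements, outermost first

    def start_element(self, name, attrs):
        node = [name, dict(attrs), []]
        # Children of the root are not attached to it, so a long-lived
        # stream does not accumulate every subtree it has ever seen.
        if len(self._stack) > 1:
            self._stack[-1][2].append(node)
        self._stack.append(node)

    def characters(self, text):
        if len(self._stack) > 1:
            self._stack[-1][2].append(text)

    def end_element(self, name):
        node = self._stack.pop()
        if len(self._stack) == 1:
            # A direct child of the open root just closed: hand it over now.
            self._on_subtree(node)


received = []
parser = StreamingTreeBuilder(received.append)
parser.feed("<stream><message to='a'><body>hi</bo")
after_first_chunk = list(received)  # no subtree has closed yet
parser.feed("dy></message><presence/>")
```

After the second feed, received holds the message and presence subtrees even
though the enclosing stream element has never been closed.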

Plus it's released unambiguously under the Squeak license and doesn't
require convincing anyone to make Squeak-friendly changes.  It's not a
speed demon, but that's fixable.

-- Duane








