On Wed, 28 Nov 2001, Duane Maxwell wrote:
[snip]
I agree that a well-formedness parser is relatively easy, which is why there are so many of them - when I wrote it, however, there weren't any for Squeak. On the other hand, I think you can count on one hand all of the fully validating parsers in *any* language. It's very tough to implement everything correctly, and generally unnecessary.
Validation is a shifting goal, of course. There's DTD validation, XML Schema (run away!), RELAX NG, etc.
If we were to wait until such a parser existed under an appropriate Squeak-compatible license in Smalltalk, we'd never have anything. By putting something in now that has the potential of being extended, we at least open the door to handling XML data even if we let stuff through that might not otherwise survive validation.
Er...but it seems these considerations support the arguments I've been making. At least the VWXML stuff *attempts* DTD validation, the code owners regard failures in this realm to be bugs, and are committed to extending it (unto XML Schema validation!!!).
If we're going to go for the brass ring, I want to be standing on top of a tall horse. Or something. :)
I've not picked apart the problem Andreas had with the VWXML parser. It is true that it had neither a terrific interface nor documentation.
Plus, more than the Squeak community is working on it. Not just Cincom, but other folks. And not just the VisualWorks community.
The VWXML parser is partial. So if being "something incomplete but useful" is a measure of worthiness, it's worthy :)
The licence issue has to be investigated, yes. So too does the new code base (i.e., VW 5i.4). I'm willing to spearhead an effort to port everything we can pry loose from Cincom (which would result, at least for now, in a (somewhat) validating parser, a partial XSLT engine, some XPath stuff, and maybe SOAP support; these are the things that already exist).
But I'm willing to believe that it's premature to standardize on the parser/node set. It's not premature to get Unicode support, though :)
OTOH, I've mentioned that a very rational, super minimal core can rest under quite a variety of superstructures, including those for validation, etc. If such a core would sensibly replace the guts of the VWXML parser (and a number of others) I'm for it. Or something :)
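To make the layering idea concrete, here is a rough sketch in Python (not Smalltalk, and not a description of how VWXML or any Squeak parser is actually built). The core only checks well-formedness and hands back a flat list of events; validation is a separate layer that consumes those events. The names parse_events and check_children, and the tiny "allowed children" table, are made up for illustration and stand in for a real DTD or schema.

```python
import xml.parsers.expat


def parse_events(text):
    """Minimal core: well-formedness only, returning a flat list of events."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name, attrs))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.Parse(text, True)  # raises ExpatError if the text is not well-formed
    return events


def check_children(events, allowed):
    """Hypothetical validation layer sitting on top of the core.

    `allowed` maps a parent element name to the set of child names it may
    contain; a parent missing from the map may contain no children at all.
    This is a toy stand-in for DTD or schema rules, not a real validator.
    """
    stack = []
    errors = []
    for event in events:
        if event[0] == "start":
            name = event[1]
            if stack and name not in allowed.get(stack[-1], set()):
                errors.append(f"<{name}> not allowed inside <{stack[-1]}>")
            stack.append(name)
        else:
            stack.pop()
    return errors


events = parse_events("<a><b/><c/></a>")
print(check_children(events, {"a": {"b"}}))
```

The point of the split is that the same event stream could feed a DTD checker, a schema checker, or nothing at all, without the core needing to know which.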
Cheers, Bijan Parsia.