On Tue, 27 Nov 2001, Andreas Raab wrote: [snip]
> > From an outsider's perspective, this seems like a really strange strategy. Code size isn't a terribly big deal -- the thing the Squeak community is most constrained by is programmer time.
>
> Code size isn't but complexity is. Usually these two go hand in hand and therefore it's no strange argument at all. In fact I'd argue that programmer time is (in this particular case) mostly dictated by the complexity involved in the parser itself - most people will want to do pretty simple stuff.
Well, there's complexity and there's complexity. And there are various interfaces to manage that complexity. And dealing with missing functionality can be more complex than dealing with unneeded functionality.
It makes sense, in general, to have a SAX layer with a useful interface for generating application objects. One kind of application object is a DOM-like (in the sense of representing most of the Infoset) tree. As long as you support all the Infoset features, it shouldn't be that difficult to support whatever interface the application programmer wants to see. In other words, the parser isn't as interesting, generally speaking, as the output *except* that you might want the parser to take care of a bunch of standard tasks (validation against DTDs or Schemas is just one example) *or* you need certain programming or performance characteristics (e.g., the Jabber needs mentioned earlier).
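To make the layering concrete, here is a minimal sketch of a SAX-style handler that builds a very small DOM-like tree. It is in Python, using the standard xml.sax module, purely for illustration; it is not Squeak code and not the VW parser's API. The names Node, TreeBuilder and parse_to_tree are made up for this example, and the tree covers only elements, attributes and character data, nowhere near the full Infoset.

```python
import xml.sax


class Node:
    """Hypothetical application object: a bare-bones element node."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs      # plain dict of attribute name -> value
        self.children = []      # child Node objects, in document order
        self.text = ""          # character data directly inside this element


class TreeBuilder(xml.sax.ContentHandler):
    """SAX handler that assembles Node objects as parse events arrive."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []        # currently open elements, innermost last

    def startElement(self, name, attrs):
        node = Node(name, dict(attrs.items()))
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate it.
        if self._stack:
            self._stack[-1].text += content


def parse_to_tree(xml_text):
    """Parse a string of XML and return the root Node (illustrative helper)."""
    builder = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), builder)
    return builder.root
```

The point is only that the tree is one consumer of the event stream: swapping TreeBuilder for a handler that builds some other application object leaves the parser untouched.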
So, what do you put in the base image? What *are* we standardizing?
One reason to work with VWXML parse nodes, even given all their ugliness, is that you can easily port your application to VisualWorks or any Smalltalk that supports a parser that generates those nodes. And vice versa.
This seems like *some* sort of win to me :)
OTOH, what would *really* be nice is Unicode support. Why don't we get that first, and then argue about the other layers? ;)
Hmm. Looking at VW5i.4, most of the node names look reasonable. There are a few with underscores still, but I bet I can get Steve to change them....
OYTOH, I don't see any problem having multiple XML parsers/node sets, etc. Picking one to bless with bundling is a purely political matter at this point: What do we want to "force" folks to use (at least by default)? Anything that gets pulled in will be *very* hard to avoid if you're doing XML stuff.
This goes even if it's modularized. The psycho-social impact remains the same.
The VisualWorks parser/node set supports, overall, a larger community and isn't technically horrible. (To be precise, it's rather featureful, though not complete. It seems to be reasonably nippy. It's flexible. It's under active development. The variety of interfaces seems sane if not wholly exciting or beautiful.)
Cheers, Bijan Parsia.