[ANN] BhrXmlParser ported to Squeak
Helge Horch
Helge.Horch at munich.netsurf.de
Fri May 19 02:56:31 UTC 2000
Dave, Folks,
As promised, I have completed the port of the Burry Holms Research XML
Parser, written by Dave Harris, and it's now available at
http://home.munich.netsurf.de/Helge.Horch/SqueakSharesSoar.html#XML
It may not be the perfect "I grok your stoopid DTD" parser, but I'm happy
with it. It's small, very well written (all kudos go to Dave -- and it
wasn't as unportable as you thought!), and does all I need at the moment.
(Minnow seems down at the moment, or else I'd have updated a page or two.)
I can't post an example right now, but have a look at the code and the unit
tests; the usage patterns are simple. XmlBuilder is a good place to start,
especially for the SAX-inspired stuff. Have a look at XmlBuilderTest
and its #assertParse:yields: and #events methods, then follow the leads to
XmlBuilderTestRoot (a node class), which understands #onBody: etc.
To quote from the aforementioned page:
I have ported Dave's lightweight (partial and nonvalidating) XML Parser
from Dolphin to Squeak. I found its approach (3 classes + 1 exception
class) to be very appealing and the code to be very transparent.
Dave originally published the package for Dolphin 3.06 under the LGPL (just
before Camp Smalltalk 2000). He has remarked in private communication that
he didn't really want to encourage its widespread use because the Camp
Smalltalk crew was working on a more complete implementation. Alas, it was
useful and lightweight enough for me to carry out the port, and since it's
LGPLed, I think I'm supposed to share. ;-)
This (21KB) is the zipped distribution. It requires Squeak 2.8a with at
least update 2126 (Stefan Matthias Aust's Collection and assert: changes).
You'll need to file in the contained change sets in this order:
1.) EOS.6.cs -- enhances Squeak's ReadStream>>next to signal an Exception
at the stream end (Thanks to Bob Arning for guidance!)
2.) XmlParser.1.cs -- the XmlParser classes (all four of them)
3.) (optional, you need SUnit) XmlParserTests.1.cs -- a bunch of Unit Tests
for the XmlParser classes
Here are some quotes from the README (included):
[---]
This is a partial XML parser with a SAX-like event-driven interface. It
does not validate or handle entities (other than the standard ones) or
Unicode; it is probably not very fast. However, it is relatively small and
useful for setting up fixtures for unit tests etc.
[...] When I started, I couldn't find a reasonable standard lightweight
parser for Dolphin. That will probably change with Camp Smalltalk (14th
March 2000) and this code will probably not be supported after that date. [...]
I have included a class called XmlBuilder which wraps the SAX interface
with something a bit more Smalltalk-friendly. You provide it with a
dictionary which maps XML element names onto message selectors. The Builder
keeps a stack of element-objects. When it sees a #startElement:attributes:
event, it looks up the name, sends the corresponding message to the
top-of-stack, and pushes its result onto the stack so that it will receive
subsequent events. On #endElement the stack is popped. XML text is
forwarded to the top-of-stack and other SAX messages are quietly ignored.
The idea is to have domain data structures that build themselves. Objects
correspond to elements and know how to deal with the elements they contain:
typically either by creating a new object for the contents and returning
it; by returning self and dealing with its contents themselves; or
returning a DeafObject to ignore part of the tree. For XML output you would
have the objects write themselves with messages like #printXmlOn:.
This approach mixes XML knowledge into the domain objects. The alternative
is to use a SAX-like "application" to build the objects from the outside,
in which case you should also use the Visitor pattern to render them as XML
from the outside. Sometimes it's simpler to just accept XML as a core
format and read/write it directly, with the minimum of extra support
classes. I find the Builder's mix of context (in the form of the element
stack) and the element->message dictionary provides a good mix of
assistance and flexibility.
[---]
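For readers who want to see the Builder idea from the README in miniature,
here is a rough sketch in Python rather than Smalltalk. It is not the
BhrXmlParser code and does not mirror its API: Python's standard xml.sax
plays the part of the parser, and the names Builder, Catalog, Book,
on_catalog, on_book, on_title, on_notes, add_text and parse_catalog are all
made up for illustration. Pushing a DeafObject for unmapped element names is
likewise an assumption of this sketch, not something the README states.

```python
import xml.sax


class DeafObject:
    """Ignores every message; used to skip part of the tree."""

    def __getattr__(self, name):
        # Any message answers the deaf object itself, so elements nested
        # inside an ignored subtree are ignored as well.
        return lambda *args, **kwargs: self


class Builder(xml.sax.ContentHandler):
    """Keeps a stack of element-objects and forwards SAX events to the top."""

    def __init__(self, root, selectors):
        super().__init__()
        self.selectors = selectors  # element name -> method name
        self.stack = [root]

    def startElement(self, name, attrs):
        top = self.stack[-1]
        selector = self.selectors.get(name)
        if selector is None:
            # Assumption of this sketch: unmapped elements are skipped.
            self.stack.append(DeafObject())
        else:
            # Send the mapped message to the top-of-stack and push its result.
            self.stack.append(getattr(top, selector)(dict(attrs.items())))

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Text goes to whichever object is currently on top.
        self.stack[-1].add_text(content)


class Book:
    def __init__(self, book_id):
        self.book_id = book_id
        self.title = ""

    def on_title(self, attrs):
        return self  # deal with the title text ourselves

    def on_notes(self, attrs):
        return DeafObject()  # ignore this subtree

    def add_text(self, text):
        self.title += text


class Catalog:
    def __init__(self):
        self.books = []

    def on_catalog(self, attrs):
        return self  # the root element maps onto the catalog itself

    def on_book(self, attrs):
        book = Book(attrs.get("id"))
        self.books.append(book)
        return book  # a new object handles the element's contents

    def add_text(self, text):
        pass  # nothing to do with text directly inside <catalog>


SELECTORS = {
    "catalog": "on_catalog",
    "book": "on_book",
    "title": "on_title",
    "notes": "on_notes",
}


def parse_catalog(xml_text):
    """Parse xml_text and return the Catalog that built itself from it."""
    catalog = Catalog()
    xml.sax.parseString(xml_text.encode("utf-8"), Builder(catalog, SELECTORS))
    return catalog


if __name__ == "__main__":
    doc = ('<catalog><book id="b1"><title>Squeak</title>'
           '<notes>skip <b>this</b></notes></book></catalog>')
    for book in parse_catalog(doc).books:
        print(book.book_id, book.title)
```

The three ways of handling a contained element that the README lists each
show up once: on_book creates and returns a new object, on_title returns
self, and on_notes returns a DeafObject.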