Squeak as Metaverse reminds me of something concrete...

Richard A. O'Keefe ok at atlas.otago.ac.nz
Thu Jul 20 00:47:51 UTC 2000


Andreas Raab wrote:
	> The "largest" difference (read that literally) is
	> that XML is 90% overhead. So I think that rearranging groups of objects
	> mainly helps compressing the overhead better :)

"Martin B. Pomije" <pomije at inav.net> replied:
	There is a lot of outrageous hype about XML.
and pointed to Philip Wadler's stuff.

In the beginning there was SCRIPT, an IBM formatter not unlike roff.
Then there was GML, a semantics-oriented markup language layered atop SCRIPT.
Then there was SGML, the Standard Generalised Markup Language, which added a
bunch of nice stuff and yanked SCRIPT out from underneath so you had a common
*meta*language for defining markup languages.

XML was intended to be a simplified version of SGML, able to carry the same
semantics (a document "is" a "grove", basically a labelled attributed tree
with text at the leaves and some of the attributes serving as crosslinks),
but stripped down so as to be maximally convenient for machines (and it
sometimes seems, maximally inconvenient for people, unlike SGML, which is
quite habitable).

XML succeeded in its aim:  it is so simple to build an XML parser that a
whole lot of people have.  (As Wadler notes, it's not *really* all that
much more complicated than Lisp s-expressions.)
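Wadler's comparison can be made concrete. Here is an illustrative sketch
(in Python, using its standard-library XML parser, since no Squeak code
is to hand) that renders a small XML document as the equivalent nested,
s-expression-like structure; the sample document is made up:

```python
# Illustrative only: render a small XML document as the equivalent
# Lisp-style nested structure, using Python's standard library parser.
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Turn an Element into a nested [tag, attrs, text?, children...] list."""
    children = [to_sexpr(c) for c in elem]
    text = (elem.text or "").strip()
    parts = [elem.tag, dict(elem.attrib)]
    if text:
        parts.append(text)
    return parts + children

doc = ET.fromstring('<book id="42"><title>SGML</title></book>')
print(to_sexpr(doc))
# ['book', {'id': '42'}, ['title', {}, 'SGML']]
```

The round trip in the other direction is just as mechanical, which is
Wadler's point: the markup buys you little that parentheses did not.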

But that wasn't enough!  There seems to be a law of increasing complexity.
Make people's lives easier in one way, and they respond by making it much
harder in other ways.  XML has accreted a lot of stuff which makes it
_harder_ to work with than SGML:
	XSchema (first XML said "you don't really need a grammar (DTD)",
                 now it says "we've got something HARDER for you!")
	XInclude
	XNamespaces
	XPath ("there are lots of twisty little languages for specifying
		paths to nodes in a tree, let's have another one")
	XLink
	...

And then there is the Document Object Model.
What kind of *object* model is it that makes it impossible for you
to use the Interpreter pattern?  Where trying to extract text from a
node _may_ fail because it's too big, but there is no way to find out
in advance?  (You can find out "how big", but not "whether too big".)
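For contrast, the kind of recursive, Interpreter-style walk a document
tree naturally invites looks like this; a sketch in Python's standard
minidom (not the Squeak DOM attempt mentioned below), with a made-up
input document:

```python
# A sketch of the recursive, Interpreter-pattern-style walk that a tree
# of document nodes invites: one case per node kind, recursion for the rest.
from xml.dom.minidom import parseString

def text_of(node):
    """Concatenate all descendant text under a node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)

doc = parseString("<p>Hello, <b>world</b>!</p>")
print(text_of(doc.documentElement))  # Hello, world!
```

Nothing here can fail for being "too big"; the size question simply
never arises when each node answers for itself.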

I tried implementing the DOM in Squeak as an exercise, and then decided
that the design was such a mess that I didn't *want* to use it anyway.

As an interesting aside, a student here built an oo toolkit for our robot,
and the configuration files have the form of Windows INI files.  The
robot actually uses QNX, not Windows, but he _knew_ Windows INI files.
I immediately thought in terms of Lisp S-expressions, but then I _know_
Lisp S-expressions (and yeah, they ARE better for this kind of thing).
I suppose that if he had known XML, he'd have used XML for this task.
Sometimes I get the feeling that people use XML (especially for things
like RDF) because they've never heard of Lisp.  Smalltalk programmers
might think of some kind of FileOut or ChangeSet instead.
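To make the aside concrete: a hypothetical fragment of such a robot
configuration, first as a Windows-style INI file (readable with
Python's standard configparser), then the same data as an s-expression
for comparison. The section and key names are invented for illustration:

```python
# Hypothetical robot configuration as a Windows-style INI file,
# parsed with Python's standard configparser.
import configparser

INI = """
[motors]
left_port = /dev/ser1
max_speed = 100
"""

cfg = configparser.ConfigParser()
cfg.read_string(INI)
print(cfg["motors"]["max_speed"])  # 100

# The same information as a Lisp s-expression:
SEXPR = "(motors (left_port /dev/ser1) (max_speed 100))"
```

The INI form is flat (two levels at most: section, key); the
s-expression nests arbitrarily, which is why it is better for this
kind of thing.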

More information about the Squeak-dev mailing list