Question about YAXO-XML and possible bug

Boris.Gaertner Boris.Gaertner at gmx.net
Tue Oct 30 19:31:54 UTC 2007


I ran into a problem when I tried to process the
SVG example files that come with the SVG specification
from the W3C.

The following is valid XML and even a valid SVG
(Scalable Vector Graphics) file.

<?xml version="1.0" standalone="no"?>
<svg width="10cm" height="3cm" viewBox="0 0 1000 300"
     xmlns="http://www.w3.org/2000/svg">

    <text x="200" y="150" fill="blue"
            font-family="Verdana"
            font-size="45">
        Text with
        <tspan fill="red">red</tspan> and
        <tspan fill="green">green</tspan> text spans
    </text>
</svg>

(This code draws one line of blue text; the words 'red' and
'green' are displayed in red and in green. The tspan
elements are used to encode text adornment. Try a
recent release of Mozilla Firefox to render this SVG file.
Note also that some XML readers expect a DOCTYPE
declaration. A suitable DOCTYPE declaration for
SVG is

<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN"
  "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
)

The text element is a mixed-content element (in XML
terminology, see section 3.2.2 of the XML recommendation
from Oct 2000). It contains three #PCDATA items and
two child elements (the tspan elements).

When we parse this piece of XML with XMLDOMParser,
the text element is translated into an instance of
XMLElement with the following values of its more important
instance variables:

  name = 'text'
  elements
      an OrderedCollection with two XMLElements, one
      for each tspan item
  contents
      an OrderedCollection with three instances of
      XMLStringNode for the strings 'Text with ',
      'and', 'text spans'
  attributes
      a Dictionary with 5 elements

The problem here is that this XMLElement does not
record the order in which the strings and the text spans
appear. That order is, however, essential
when you want to render an SVG text element. I really think
that it is an error to put #PCDATA and child elements into
separate collections.
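
For comparison, here is a minimal sketch in Python (not Squeak) of what
a parser that keeps mixed content in one ordered child list returns for
the text element above. It uses the standard library's xml.dom.minidom;
the helper name ordered_children and the ('pcdata', ...) tuples are only
made up for this illustration.

```python
from xml.dom import minidom

SVG = """<?xml version="1.0" standalone="no"?>
<svg width="10cm" height="3cm" viewBox="0 0 1000 300"
     xmlns="http://www.w3.org/2000/svg">
    <text x="200" y="150" fill="blue"
            font-family="Verdana"
            font-size="45">
        Text with
        <tspan fill="red">red</tspan> and
        <tspan fill="green">green</tspan> text spans
    </text>
</svg>"""


def ordered_children(xml_source):
    """Return the children of the first <text> element in document order."""
    doc = minidom.parseString(xml_source)
    text = doc.getElementsByTagName("text")[0]
    result = []
    for node in text.childNodes:
        if node.nodeType == node.TEXT_NODE:
            # Collapse the indentation whitespace of the source file.
            result.append(("pcdata", " ".join(node.data.split())))
        elif node.nodeType == node.ELEMENT_NODE:
            # Gather the character data directly inside the tspan.
            content = "".join(
                c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE
            )
            result.append((node.tagName, node.getAttribute("fill"), content))
    return result


if __name__ == "__main__":
    for item in ordered_children(SVG):
        print(item)
```

Strings and tspan elements come back interleaved in the order in which
they appear in the file, which is exactly the information a renderer
needs.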

For now, I changed XMLDOMParser>>characters:
from

characters: aString
     | newElement |
   newElement _ XMLStringNode string: aString.
   self top addContent: newElement.

to

characters: aString
     | newElement |
   newElement _ XMLStringNode string: aString.
   self top addElement: newElement.

With that change, all substructures go into the
'elements' collection; the 'contents' collection becomes
obsolete.

This solves my problem with decorated SVG text,
but my change will certainly break a lot of other
applications that use the XMLDOMParser.
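
One conceivable way to limit that breakage, sketched here in Python
rather than Squeak and only as an assumption about how it could be done:
keep a single ordered child list and derive the two old collections from
it, so that existing callers of 'elements' and 'contents' still see what
they expect. The class names Element and StringNode and the method
add_child are hypothetical and are not the YAXO classes.

```python
class StringNode:
    """Hypothetical stand-in for a #PCDATA node."""

    def __init__(self, string):
        self.string = string


class Element:
    """Hypothetical element that stores all children in document order."""

    def __init__(self, name):
        self.name = name
        self.children = []  # strings and elements, interleaved

    def add_child(self, node):
        self.children.append(node)

    @property
    def elements(self):
        # Old-style view: child elements only, order preserved.
        return [c for c in self.children if isinstance(c, Element)]

    @property
    def contents(self):
        # Old-style view: string nodes only, order preserved.
        return [c for c in self.children if isinstance(c, StringNode)]


def build_example():
    """Build the text element of the SVG example by hand."""
    text = Element("text")
    text.add_child(StringNode("Text with "))
    text.add_child(Element("tspan"))
    text.add_child(StringNode("and"))
    text.add_child(Element("tspan"))
    text.add_child(StringNode("text spans"))
    return text


if __name__ == "__main__":
    text = build_example()
    print(len(text.children), len(text.elements), len(text.contents))
```

A renderer would walk 'children', while code written against the two
separate collections would keep working on the derived views.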
 
My questions:
* What is your experience with XML elements
   that have mixed content?
* What do you think should be done with SVG
   text like the one in my example?

Any comments are welcome.

Greetings
Boris 


