[ENH] XML namespace handling, take two

Simon Dobson simon.dobson at computer.org
Wed May 28 09:18:31 UTC 2003


> Do you know of any standard test cases for namespace handling? The XML
> tests I was using for unit testing Yaxo did not include namespaces.

I did all the testing with RDF files, since that's what I wanted
namespaces for. Your question prompted me to build some non-RDF test
files grabbed from the examples out of W3C's namespace spec -- and of
course my code promptly fell over the more esoteric samples :-) I've
attached (yes, remembered this time...) a corrected change set and the
set of test files. I'll try to find time to convert them into SUint
tests.

(The parser actually fails to pick the first illegal case in
xmlns/unique-attributes-bad.xml, but the failure's due to the main
parser not barfing on duplicated attributes -- I'll leave that one for
you :-)

Cheers,


-- 
-- Simon
--
-------------- next part --------------
'From Squeak3.4 of 1 March 2003 [latest update: #5170] on 28 May 2003 at 10:12:45 am'!
XMLTokenizer subclass: #SAXDriver
	instanceVariableNames: 'saxHandler namespaces defaultNamespace level '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'XML-Parser'!
XMLNodeWithElements subclass: #XMLElement
	instanceVariableNames: 'name namespace contents attributes uri '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'XML-Parser'!

!SAXDriver methodsFor: 'handling tokens' stamp: 'sd 5/25/2003 19:02'!
handleEndTag: elementName
	| ns qualifiedName uri localName |

	(self hasNamespace: elementName) ifTrue: [
		qualifiedName _ elementName.
		localName _ self localNameOf: elementName.
		ns _ self namespaceOf: elementName.

		"ensure our namespace is defined"
		(self namespaceURIOf: ns) ifNil: [ self error: 'End tag /' , elementName , ' refers to undefined namespace' ].
	] ifFalse: [
		"qualify the name"
		ns _ self defaultNamespace.
		localName _ elementName.
		qualifiedName _ self qualify: elementName withNamespace: ns.
	].
	uri _ self namespaceURIOf: ns.

	"call the handler"
	self saxHandler
		checkEOD; 
		endElement: localName namespaceURI: uri qualifiedName: qualifiedName.

	self outdent
! !

!SAXDriver methodsFor: 'handling tokens' stamp: 'sd 5/25/2003 17:35'!
handleStartDocument
	self saxHandler startDocument! !

!SAXDriver methodsFor: 'handling tokens' stamp: 'sd 5/28/2003 09:52'!
handleStartTag: elementName attributes: attributeList
	| ns uri localName qualifiedName |
	self indent.

	"declare any namespaces"	
	attributeList keysAndValuesDo: [ :k :v |
		"namespace introduction"
		((self namespaceOf: k) = 'xmlns') ifTrue: [
			| n |
			n _ self localNameOf: k.
			self declareNamespace: n URI: v.
		] ifFalse: [

			"namespace defaulting"
			(k = 'xmlns') ifTrue: [
				self globalNamespaceURI: v
			]
		]
	].

	(self hasNamespace: elementName) ifTrue: [
		qualifiedName _ elementName.
		ns _ self namespaceOf: elementName.
		localName _ self localNameOf: elementName.

		"ensure our namespace is defined"
		(self namespaceURIOf: ns) ifNil: [ self error: 'Start tag ' , elementName , ' refers to undefined namespace' ].
		self defaultNamespace: ns.
	] ifFalse: [
		"qualify the name"
		ns _ self defaultNamespace.
		qualifiedName _ self qualify: elementName withNamespace: ns.
		localName _ elementName.
	].
	uri _ self namespaceURIOf: ns.

	"check all attribute namespaces are defined and not duplicated by aliasing"
	attributeList keysAndValuesDo: [ :k :v |
		(self hasNamespace: k) ifTrue: [
			| kns kuri aliases |
			kns _ self namespaceOf: k.
			kuri _ self namespaceURIOf: kns ifAbsent: [ self error: 'Attribute ' , k , ' refers to undefined namespace' ].
			aliases _ self namespaceAliases: kns.
			aliases do: [ :a |
				| qk |
				qk _ self qualify: k withNamespace: a.
				(attributeList includesKey: qk) ifTrue: [ self error: 'Attributes ' , k , ' and ' , qk , ' are aliased to namespace ' , kuri ]
			]
		]
	].

	"call the handler"
	self saxHandler
		checkEOD; 
		startElement: localName namespaceURI: uri qualifiedName: qualifiedName attributeList: attributeList! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/26/2003 12:14'!
declareDefaultNamespaces
	"Declare any namespaces that should be visible automagically"

	self declareNamespace: 'xmlns' URI: 'http://www.w3.org/TR/REC-xml-names'! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 18:59'!
declareNamespace: ns URI: uri
	"Declare the given name space prefix with the given URL"

	| nss |
	nss _ namespaces at: level ifAbsent: [
		| dic |
		dic _ Dictionary new.
		namespaces at: level put: dic.
		dic
	].
	nss at: ns put: uri! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 18:39'!
defaultNamespace
	"Retrieve the default namespace at the current parse level"

	| nss ns |
	level to: 1 by: -1 do: [ :l |
		nss _ namespaces at: l ifAbsent: [ nil ].
		nss ifNotNil: [
			ns _ nss at: '__default__' ifAbsent: [ nil ].
			ns ifNotNil: [ ^ ns ].
		]
	].

	"if we get here we didn't find one"
	^ nil
! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 18:59'!
defaultNamespace: ns
	"Declare the default namespace. This is a it dangerous as it makes use of the
	fact that __defualt__ isn't a legal XML namespace prefix to store a value that
	isn't a URI....however, it lets us get away without using a parallel stack just to hold 
	the default, so it's probably worth it"

	self declareNamespace: '__default__' URI: ns! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 09:39'!
globalNamespaceURI
	"Return the global namespace being used in this document"

	| uri |
	uri _ self namespaceURIOf: '__global__'.
	uri ifNil: [ ^ nil ].
	^ (uri size = 0) ifTrue: [ nil ] ifFalse: [ uri ]   "an empty string is the same as a nil URI"! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 09:31'!
globalNamespaceURI: uri
	"Declare the global namespace being used at this point"

	self declareNamespace: '__global__' URI: uri! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 16:55'!
hasNamespace: elem
	"Test whether the element or tag name has a namespace part"

	^ ((elem findString: ':') > 0)
! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 16:59'!
indent
	"Enter a new parsing level"

	level _ level + 1
! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 17:33'!
localNameOf: elem
	"Extract the local name part of a tag or attribute name"

	| i |
	i _ elem findString: ':'.
	^ (i > 0) ifTrue: [ elem copyFrom: (i + 1) to: elem size ] ifFalse: [ elem ]! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 10:03'!
namespaceAliases: ns
	"Locate any other namespaces that are aliases of the given one, i.e. if they have
	the same defining URI"

	| uri aliases |
	aliases _ Set new.
	uri _ self namespaceURIOf: ns ifAbsent: [ ^ aliases ].
	self namespaces keysAndValuesDo: [ :n :u |
		((u = uri) and: [ n ~= ns ]) ifTrue: [ aliases add: n ]
	].
	^ aliases! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 17:33'!
namespaceOf: elem
	"Extract the name space part of a tag or attribute name, if any"

	| i |
	i _ elem findString: ':'.
	^ (i > 0) ifTrue: [ elem copyFrom: 1 to: (i - 1) ] ifFalse: [ nil ]! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 09:33'!
namespaceURIOf: ns
	"Retrieve the URI of the given namespace prefix, if it is defined. A nil namespace
	returns the global namespace"

	^ self namespaceURIOf: ns ifAbsent: [ nil ]! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 09:32'!
namespaceURIOf: ns ifAbsent: b
	"Retrieve the URI of the given namespace prefix, if it is defined. A nil namespace
	returns the global namespace. If no namespace can be found the value of the block is returned"

	| nss uri |
	ns ifNil: [ ^ self globalNamespaceURI ].
	level to: 0 by: -1 do: [ :l |
		nss _ namespaces at: l ifAbsent: [ nil ].
		nss ifNotNil: [
			uri _ nss at: ns ifAbsent: [ nil ].
			uri ifNotNil: [ ^ uri ].
		]
	].

	"if we get here we didn't find one"
	^ b value! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/28/2003 10:01'!
namespaces
	"Return a hash of the namespaces currently in scope"

	| t nss |
	t _ Dictionary new.
	level to: 0 by: -1 do: [ :l |
		nss _ namespaces at: l ifAbsent: [ nil ].
		nss ifNotNil: [ nss associationsDo: [ :a | t at: a key put: a value ] ]
	].
	^ t! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 16:59'!
outdent
	"Move out a parsing level"

	namespaces removeKey: level ifAbsent: [].
	level _ level - 1
! !

!SAXDriver methodsFor: 'name spaces' stamp: 'sd 5/25/2003 17:33'!
qualify: n withNamespace: ns
	"Qualify the name with the namespace (if present)"

	| ln |
	ln _ self localNameOf: n.
	^ ns ifNil: [ ln ] ifNotNil: [ ns , ':' , ln ]! !

!SAXDriver methodsFor: 'initialization' stamp: 'sd 5/25/2003 17:42'!
initialize
	super initialize.
	namespaces _ Dictionary new.
	level _ 0.
	self declareDefaultNamespaces! !


!SAXHandler methodsFor: 'accessing' stamp: 'sd 5/23/2003 15:58'!
namespaces
	^ self driver namespaces! !


!SAXHandler class methodsFor: 'instance creation' stamp: 'sd 5/23/2003 15:19'!
on: aStream
	| driver parser |
	driver _ SAXDriver on: aStream.
	driver validating: true.
	parser _ self new driver: driver.
	^parser! !


!XMLDOMParser methodsFor: 'content' stamp: 'sd 5/25/2003 18:31'!
endElement: localName namespaceURI: uri qualifiedName: qualifiedName
	| currentElement |
	currentElement _ self pop.
	currentElement qualifiedName = qualifiedName
		ifFalse: [self driver errorExpected: 'End tag "', qualifiedName , '" doesn''t match "' , currentElement qualifiedName  , '".']! !

!XMLDOMParser methodsFor: 'content' stamp: 'sd 5/25/2003 18:28'!
startElement: localName namespaceURI: namespaceUri qualifiedName: qualifiedName attributeList: attributeList
	| newElement |
	newElement _ XMLElement named: localName namespace: (self driver namespaceOf: qualifiedName) uri: namespaceUri attributes: attributeList.
	self incremental
		ifFalse: [self stack isEmpty
			ifFalse: [self top addElement: newElement]].
	self push: newElement! !


!XMLElement methodsFor: 'accessing' stamp: 'sd 5/25/2003 18:25'!
name
	^ self qualifiedName! !

!XMLElement methodsFor: 'accessing' stamp: 'sd 5/25/2003 18:25'!
tag
	^ self name asSymbol! !

!XMLElement methodsFor: 'name space' stamp: 'sd 5/25/2003 18:24'!
localName
	^ name! !

!XMLElement methodsFor: 'name space' stamp: 'sd 5/25/2003 18:24'!
namespace
	^ namespace! !

!XMLElement methodsFor: 'name space' stamp: 'sd 5/25/2003 18:59'!
namespace: ns URI: u
	namespace _ ns.
	uri _ u! !

!XMLElement methodsFor: 'name space' stamp: 'sd 5/25/2003 18:24'!
namespaceURI
	^ uri! !

!XMLElement methodsFor: 'name space' stamp: 'sd 5/25/2003 18:26'!
qualifiedName
	^ namespace ifNil: [ self localName ] ifNotNil: [ namespace , ':' , self localName ]! !


!XMLElement class methodsFor: 'instance creation' stamp: 'sd 5/25/2003 18:58'!
named: aString namespace: ns uri: uri attributes: attributeList
	^self new
		name: aString;
		namespace: ns URI: uri;
		setAttributes: attributeList! !

XMLElement removeSelector: #namespace:!
XMLElement removeSelector: #namespace:URL:!
XMLElement removeSelector: #namespace:uri:!
XMLNodeWithElements subclass: #XMLElement
	instanceVariableNames: 'name namespace uri contents attributes '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'XML-Parser'!
XMLDOMParser removeSelector: #endElement:!
SAXDriver removeSelector: #declareNamespace:uri:!
SAXDriver removeSelector: #namespaceURI:!
SAXDriver removeSelector: #namespaceURI:ifAbsent:!
SAXDriver removeSelector: #namespaceUri:!
SAXDriver removeSelector: #namespaceUri:ifAbsent:!
SAXDriver removeSelector: #namespaceUriOf:!
XMLTokenizer subclass: #SAXDriver
	instanceVariableNames: 'saxHandler namespaces level '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'XML-Parser'!

!SAXDriver reorganize!
('accessing' saxHandler saxHandler:)
('handling tokens' handleCData: handleEndDocument handleEndTag: handlePCData: handlePI:data: handleStartDocument handleStartTag:attributes: handleXMLDecl:)
('name spaces' declareDefaultNamespaces declareNamespace:URI: defaultNamespace defaultNamespace: globalNamespaceURI globalNamespaceURI: hasNamespace: indent localNameOf: namespace:uri: namespaceAliases: namespaceOf: namespaceURIOf: namespaceURIOf:ifAbsent: namespaces outdent qualify:withNamespace:)
('initialization' initialize)
!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: xmlns-test.tar.gz
Type: application/x-gzip
Size: 1189 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20030528/02b6e005/xmlns-test.tar.bin


More information about the Squeak-dev mailing list