[squeak-dev] The Trunk: Collections-pre.762.mcz

H. Hirzel hannes.hirzel at gmail.com
Tue Aug 29 15:22:04 UTC 2017


Thx for the comment. Very useful.

--Hannes

On Tue, 29 Aug 2017 14:50:19 +0000, commits at source.squeak.org
<commits at source.squeak.org> wrote:
> Patrick Rein uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-pre.762.mcz
>
> ==================== Summary ====================
>
> Name: Collections-pre.762
> Author: pre
> Time: 29 August 2017, 4:50:11.458834 pm
> UUID: d7838b91-7ce4-c34c-ac5a-c46cee281140
> Ancestors: Collections-bf.761
>
> Changes the HtmlReadWriter to deal correctly with nested tags and their
> mapping to text attributes. Also adds a comment to the class.
>
> =============== Diff against Collections-bf.761 ===============
>
> Item was changed:
>   TextReadWriter subclass: #HtmlReadWriter
>   	instanceVariableNames: 'count offset runStack runArray string breakLines'
>   	classVariableNames: ''
>   	poolDictionaries: ''
>   	category: 'Collections-Text'!
> +
> + !HtmlReadWriter commentStamp: 'pre 8/29/2017 16:14' prior: 0!
> + A HtmlReadWriter is used to read a Text object from a string containing
HTML or to write a Text object to a string with HTML tags representing the
text attributes.
> +
> + It does two things currently:
> + 1) Setting text attributes on the beginning of tags, e.g. setting a bold
> text attribute when seeing a <b> tag.
> + 2) Changing the resulting string, e.g. replacing a <br> with a Character
> cr.
> +
> + The implementation works by pushing attributes on a stack on every opening
tag. On the corresponding closing tag, the attribute is popped from the stack
> and stored in an array of attribute runs. From this array the final string
> is constructed.
> +
> + ## Notes on the implementation
> + - The final run array is completely constructed while parsing so it has to
> be correct with regard to the length of the runs. There is no consolidation
> except for merging neighboring runs which include the same attributes.
> + - The *count* variable is the position in the source string, the *offset*
> is the number of skipped characters, for example ones that denote a tag.
> + - The stack contains elements which are of the form: {text attributes.
> current start index. original start}!
>
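The stack-and-runs approach described in the class comment above can be illustrated with a small sketch. This is Python, not the Squeak code, and it is only an approximation under stated assumptions: the names `html_to_runs` and `TAG_ATTRS` are hypothetical, only simple `<b>`/`<i>` style tags are handled, and each stack entry keeps just the attributes and the current start index (the third "original start" element is left out).

```python
import re

# Hypothetical tag-to-attribute mapping, for illustration only.
TAG_ATTRS = {"b": "bold", "i": "italic"}


def html_to_runs(html):
    """Return (plain_text, runs); runs is a list of (attributes, length)."""
    text = []           # output characters, tags excluded
    runs = []           # built while parsing, so the lengths must add up
    stack = [([], 0)]   # entries: (attributes, start index of the open run)

    def flush():
        # Write the run that belongs to the top of the stack.
        attrs, start = stack[-1]
        length = len(text) - start
        if length == 0:
            return
        if runs and runs[-1][0] == tuple(attrs):
            # Merge neighbouring runs that carry the same attributes.
            runs[-1] = (runs[-1][0], runs[-1][1] + length)
        else:
            runs.append((tuple(attrs), length))

    for token in re.findall(r"</?\w+>|[^<]+", html):
        if token.startswith("</"):
            flush()
            if len(stack) > 1:
                stack.pop()
            # The enclosing run restarts after the closed tag.
            stack[-1] = (stack[-1][0], len(text))
        elif token.startswith("<"):
            flush()  # emit what the enclosing tag collected so far
            extra = TAG_ATTRS.get(token[1:-1])
            attrs = stack[-1][0] + ([extra] if extra else [])
            stack.append((attrs, len(text)))
        else:
            text.extend(token)
    flush()  # add the last run
    return "".join(text), runs


print(html_to_runs("a<b>b<i>c</i>d</b>e"))
```

The point of the sketch is the invariant from the notes: the run lengths always sum to the length of the resulting string, because every run is written when a tag opens or closes and the enclosing entry's start index is reset afterwards.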
> Item was added:
> + ----- Method: HtmlReadWriter>>addCharacter: (in category 'private') -----
> + addCharacter: aCharacter
> +
> + 	string add: aCharacter.
> + 	count := count + 1.!
>
> Item was added:
> + ----- Method: HtmlReadWriter>>addString: (in category 'private') -----
> + addString: aString
> +
> + 	string addAll: aString.
> + 	count := count + aString size.!
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>isTagIgnored: (in category 'testing') -----
>   isTagIgnored: aTag
>
>   	| space t |
> + 	t := aTag copyWithoutAll: '</>'.
> + 	space := t indexOf: Character space.
> - 	space := aTag indexOf: Character space.
>   	t := space > 0
> + 		ifTrue: [t copyFrom: 1 to: space - 1]
> + 		ifFalse: [t].
> - 		ifTrue: [aTag copyFrom: 2 to: space - 1]
> - 		ifFalse: [aTag copyFrom: 2 to: aTag size - 1].
>   	^ self ignoredTags includes: t!
>
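As I read the change above, the tag is now normalised before the lookup, so the same test works for start tags, end tags, empty tags and bare tag names. A rough Python equivalent follows; it is not the Squeak code, and `IGNORED_TAGS` is a made-up stand-in for whatever `ignoredTags` actually answers.

```python
# Hypothetical stand-in for HtmlReadWriter>>ignoredTags.
IGNORED_TAGS = {"body", "html", "head"}


def tag_name(tag):
    """Drop every '<', '/' and '>' and keep what precedes the first space."""
    stripped = "".join(ch for ch in tag if ch not in "</>")
    return stripped.split(" ", 1)[0]


def is_tag_ignored(tag):
    return tag_name(tag) in IGNORED_TAGS


for sample in ("<body>", "</body>", '<body class="x">', "body", "<b>"):
    print(sample, is_tag_ignored(sample))
```

Because the delimiters are stripped first, the caller no longer has to cut the name out of the tag with fixed indices, which is what the removed `copyFrom: 2 to: ...` expressions did.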
> Item was changed:
>   ----- Method: HtmlReadWriter>>mapCloseCodeTag (in category 'mapping')
> -----
>   mapCloseCodeTag
>
>   	| theDoIt |
>   	theDoIt := runStack top first
>   		detect: [:attribute | attribute isKindOf: TextDoIt]
>   		ifNone: [^ self "nothing found, ignore"].
> + 	theDoIt evalString: (String withAll: (string copyFrom: runStack top third
> to: string size)).!
> - 	theDoIt evalString: (String withAll: (string copyFrom: runStack top
> second to: string size)).!
>
> Item was changed:
> + ----- Method: HtmlReadWriter>>nextPutText: (in category 'private') -----
> - ----- Method: HtmlReadWriter>>nextPutText: (in category 'accessing') -----
>   nextPutText: aText
>
>   	| previous |
>   	previous := #().
>   	self activateAttributesEnding: #() starting: previous. "for consistency"
>   	aText runs
>   		withStartStopAndValueDo: [:start :stop :attributes |
>   			self
>   				deactivateAttributesEnding: previous starting: attributes;
>   				activateAttributesEnding: previous starting: attributes;
>   				writeContent: (aText string copyFrom: start to: stop).
>   			previous := attributes].
>   	self deactivateAttributesEnding: previous starting: #().!
>
> Item was changed:
> + ----- Method: HtmlReadWriter>>nextText (in category 'private') -----
> - ----- Method: HtmlReadWriter>>nextText (in category 'accessing') -----
>   nextText
>
>   	count := 0.
>   	offset := 0. "To ignore characters in the input string that are used by
> tags."
>   	
>   	runStack := Stack new.
>   	
>   	runArray := RunArray new.
>   	string := OrderedCollection new.
>   	
> + 	"{text attributes. current start index. original start}"
> + 	runStack push: {OrderedCollection new. 1. 1}.
> - 	"{text attributes. start index. end index. number of open tags}"
> - 	runStack push: {OrderedCollection new. 1. nil. 0}.
>
>   	[stream atEnd] whileFalse: [self processNextTag].
>   	self processRunStackTop. "Add last run."
>
>   	string := String withAll: string.
>   	runArray coalesce.
>   	
>   	^ Text
>   		string: string
>   		runs: runArray!
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>processEmptyTag: (in category 'reading')
> -----
>   processEmptyTag: aTag
>
>   	(aTag beginsWith: '<br') ifTrue: [
> + 		self addCharacter: Character cr.
> - 		string add: Character cr.
> - 		count := count + 1.
>   		^ self].
>   	
> + 	(self isTagIgnored: aTag)
> - 	(self ignoredTags includes: (aTag copyFrom: 2 to: aTag size - 3))
>   		ifTrue: [^ self].
>   		
> + 	"TODO... what?"!
> - 	"TODO..."!
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>processEndTag: (in category 'reading') -----
>   processEndTag: aTag
>
>   	| index tagName |
>   	index := count - offset.
>   	tagName := aTag copyFrom: 3 to: aTag size - 1.
>
> + 	(self isTagIgnored: tagName) ifTrue: [^ self].
> + 	
> - 	(self ignoredTags includes: tagName) ifTrue: [^ self].
>   	tagName = 'code' ifTrue: [self mapCloseCodeTag].
>   	tagName = 'pre' ifTrue: [self breakLines: true].
> -
> - 	"De-Accumulate adjacent tags."
> - 	runStack top at: 4 put: runStack top fourth - 1.
> - 	runStack top fourth > 0
> - 		ifTrue: [^ self "not yet"].
>   		
>   	self processRunStackTop.
>
>   	runStack pop.
>   	runStack top at: 2 put: index + 1.!
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>processHtmlEscape: (in category 'reading')
> -----
>   processHtmlEscape: aString
>   	| escapeSequence |
>   	escapeSequence := aString copyFrom: 2 to: aString size - 1.
>   	escapeSequence first = $# ifTrue: [^ self processHtmlEscapeNumber:
> escapeSequence allButFirst].
>   	(String htmlEntities at: (aString copyFrom: 2 to: aString size - 1)
> ifAbsent: [])
>   		ifNotNil: [:char |
> + 			self addCharacter: char].!
> - 			string add: char.
> - 			count := count + 1].!
>
> Item was changed:
> + ----- Method: HtmlReadWriter>>processHtmlEscapeNumber: (in category
> 'private') -----
> - ----- Method: HtmlReadWriter>>processHtmlEscapeNumber: (in category
> 'reading') -----
>   processHtmlEscapeNumber: aString
>   	| number |
>   	number := aString first = $x
>   		ifTrue: [ '16r', aString allButFirst ]
>   		ifFalse: [ aString ].
> + 	self addCharacter: number asNumber asCharacter.
> + 	!
> - 	string add: number asNumber asCharacter!
>
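For readers who do not use the `16r` radix notation every day, the numeric escape handling above can be sketched in Python as follows. This is not the Squeak code; `decode_numeric_escape` is a hypothetical name, and like the method in the diff it only recognises a lowercase `x` prefix.

```python
def decode_numeric_escape(body):
    """Decode the part between '&#' and ';', e.g. '65' or 'x41'."""
    if body.startswith("x"):
        # Hexadecimal form, the counterpart of prefixing '16r' in Squeak.
        return chr(int(body[1:], 16))
    # Decimal form.
    return chr(int(body))


print(decode_numeric_escape("65"), decode_numeric_escape("x41"))
```

Note that in the diff the decoded character now goes through `addCharacter:`, so `count` is incremented for numeric escapes as well, which the removed `string add:` line did not do.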
> Item was changed:
>   ----- Method: HtmlReadWriter>>processNextTag (in category 'reading') -----
>   processNextTag
>
>   	| tag htmlEscape lookForNewTag lookForHtmlEscape tagFound valid inComment
> inTagString |
>   	lookForNewTag := true.
>   	lookForHtmlEscape := false.
>   	tagFound := false.
>   	tag := OrderedCollection new.
>   	htmlEscape := OrderedCollection new.
>   	inComment := false.
>   	inTagString := false.
>   	
>   	[stream atEnd not and: [tagFound not]] whileTrue: [
>   		| character |
>   		character := stream next.
>   		valid := (#(10 13) includes: character asciiValue) not.
>   		count := count + 1.
>   	
>   		character = $< ifTrue: [lookForNewTag := false].
> + 		character = $& ifTrue: [inComment ifFalse: [lookForHtmlEscape := true]].
> - 		character = $& ifTrue: [
> - 			inComment ifFalse: [lookForHtmlEscape := true]].
>   		
>   		lookForNewTag
>   			ifTrue: [
>   				lookForHtmlEscape
>   					ifFalse: [
>   						(valid or: [self breakLines not])
>   							ifTrue: [string add: character]
>   							ifFalse: [offset := offset + 1]]
>   					ifTrue: [valid ifTrue: [htmlEscape add: character]. offset := offset
> + 1]]
>   			ifFalse: [valid ifTrue: [tag add: character]. offset := offset + 1].
>
>   		"Toggle within tag string/text."
>   		(character = $" and: [lookForNewTag not])
>   			ifTrue: [inTagString := inTagString not].
>   		
>   		inComment := ((lookForNewTag not and: [tag size >= 4])
>   			and: [tag beginsWith: '<!!--'])
>   			and: [(tag endsWith: '-->') not].
>
>   		(((character = $> and: [inComment not]) and: [lookForNewTag not]) and:
> [inTagString not]) ifTrue: [
>   			lookForNewTag := true.
>   			(tag beginsWith: '<!!--')
>   				ifTrue: [self processComment: (String withAll: tag)]
>   				ifFalse: [tag second ~= $/
>   					ifTrue: [
>   						(tag atLast: 2) == $/
>   							ifTrue: [self processEmptyTag: (String withAll: tag)]
>   							ifFalse: [self processStartTag: (String withAll: tag)]]
>   					ifFalse: [self processEndTag: (String withAll: tag)]].			
>   			tagFound := true].
>
>   		(((character = $; and: [lookForNewTag])
>   			and: [htmlEscape notEmpty]) and: [htmlEscape first = $&]) ifTrue: [
>   				lookForHtmlEscape := false.
>   				self processHtmlEscape: (String withAll: htmlEscape).
>   				htmlEscape := OrderedCollection new]].
>   !
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>processRunStackTop (in category 'reading')
> -----
>   processRunStackTop
>   	"Write accumulated attributes to run array."
>   	
> + 	| currentIndex start attrs |
> + 	currentIndex := count - offset.
> - 	| index start end attrs |
> - 	index := count - offset.
> - 	
> - 	"Set end index."
> - 	runStack top at: 3 put: index.
> - 	"Write to run array."
>   	start := runStack top second.
> - 	end := runStack top third.
>   	attrs := runStack top first.
>   	runArray
>   		addLast: attrs asArray
> + 		times: currentIndex - start + 1.!
> - 		times: end - start + 1.!
>
> Item was changed:
>   ----- Method: HtmlReadWriter>>processStartTag: (in category 'reading')
> -----
>   processStartTag: aTag
>
>   	| index |
>   	(self isTagIgnored: aTag) ifTrue: [^ self].
>
>   	index := count - offset.
>
>   	aTag = '<br>' ifTrue: [
> + 		self addCharacter: Character cr.
> - 		string add: Character cr.
> - 		count := count + 1.
>   		^ self].
>   	(aTag beginsWith: '<img') ifTrue: [
> + 		self addString: '[image]'.
> - 		string addAll: '[image]'.
> - 		count := count + 7.
>   		^ self].
>   	
> + 	self processRunStackTop. "To add all attributes before the next tag adds
> some."
> - 	"Accumulate adjacent tags."
> - 	(runStack size > 1 and: [runStack top second = (index + 1) "= adjacent
> start tags"])
> - 		ifTrue: [
> - 			runStack top at: 1 put: (runStack top first copy addAll: (self
> mapTagToAttribute: aTag); yourself).
> - 			runStack top at: 4 put: (runStack top fourth + 1). "increase number of
> open tags"
> - 			^self].
> - 	
> - 	self processRunStackTop.
>
> - 	"Remove start/end info to reuse attributes later."
> - 	runStack top at: 2 put: nil.
> - 	runStack top at: 3 put: nil.
>   	"Copy attr list and add new attr."
> + 	runStack push: ({runStack top first copy addAll: (self mapTagToAttribute:
> aTag); yourself. index + 1 . index + 1}).
> + 	!
> - 	runStack push: ({runStack top first copy addAll: (self mapTagToAttribute:
> aTag); yourself. index + 1. nil. 1}).!
>
>
>

