[squeak-dev] The Trunk: Regex-Core-ul.40.mcz

commits at source.squeak.org commits at source.squeak.org
Sat Aug 22 20:49:23 UTC 2015


Levente Uzonyi uploaded a new version of Regex-Core to project The Trunk:
http://source.squeak.org/trunk/Regex-Core-ul.40.mcz

==================== Summary ====================

Name: Regex-Core-ul.40
Author: ul
Time: 22 August 2015, 10:43:08.836 pm
UUID: ea54abd3-fd5c-44b5-90a4-f129cd1848eb
Ancestors: Regex-Core-ul.39

RxsPredicate:
- added \t \n escapes for tab and lf
- unified all single character predicates as #beCharacter:
- postscript reinitializes the dictionary

RxMatchOptimizer:
- removed lookarounds, because they were never used, just collected
- all collections get initialized lazily
- CharacterSet are used for storing characters
- IdentitySets are used where unique objects are stored (e.g. symbols)
- use #== and #~~ for character comparison (Spur-specific change)

RxMatcher:
- reuse previous markerPositions
- use String >> #new:streamContents: where possible
- use #== and #~~ for character comparison (Spur-specific change)
- quick returns in #tryMatch

General:
- renamed RxmLookahaed to RxmLookahead
- in RxmPredicate >> #matchAgainst: use the fact that  aMatcher is providing a ReadStream of characters, so nil means end of stream
- reimplemented RxmSubstring

=============== Diff against Regex-Core-ul.39 ===============

Item was changed:
  Object subclass: #RxMatchOptimizer
+ 	instanceVariableNames: 'ignoreCase prefixes nonPrefixes conditions testBlock methodPredicates nonMethodPredicates predicates nonPredicates'
- 	instanceVariableNames: 'ignoreCase prefixes nonPrefixes conditions testBlock methodPredicates nonMethodPredicates predicates nonPredicates lookarounds'
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Regex-Core'!
  
  !RxMatchOptimizer commentStamp: 'Tbn 11/12/2010 23:13' prior: 0!
  -- Regular Expression Matcher v 1.1 (C) 1996, 1999 Vassili Bykov
  --
  A match start optimizer, handy for searching a string. Takes a regex syntax tree and sets itself up so that prefix characters or matcher states that cannot start a match are later recognized with #canStartMatch:in: method.
  
  Used by RxMatcher, but can be used by other matchers (if implemented) as well.!

Item was added:
+ ----- Method: RxMatchOptimizer>>addCondition: (in category 'private') -----
+ addCondition: aSymbol
+ 
+ 	^(conditions ifNil: [ conditions := IdentitySet new: 1 ]) add: aSymbol!

Item was added:
+ ----- Method: RxMatchOptimizer>>addMethodPredicate: (in category 'private') -----
+ addMethodPredicate: aSelector
+ 
+ 	^(methodPredicates ifNil: [ methodPredicates := IdentitySet new: 1 ]) add: aSelector!

Item was added:
+ ----- Method: RxMatchOptimizer>>addNonMethodPredicate: (in category 'private') -----
+ addNonMethodPredicate: aSelector
+ 
+ 	^(nonMethodPredicates ifNil: [ nonMethodPredicates := IdentitySet new: 1 ]) add: aSelector!

Item was added:
+ ----- Method: RxMatchOptimizer>>addNonPredicate: (in category 'private') -----
+ addNonPredicate: nonPredicate
+ 
+ 	^(nonPredicates ifNil: [ nonPredicates := Set new: 1 ]) add: nonPredicate!

Item was added:
+ ----- Method: RxMatchOptimizer>>addNonPrefix: (in category 'private') -----
+ addNonPrefix: aCharacter
+ 
+ 	^(nonPrefixes ifNil: [ nonPrefixes := CharacterSet new ]) add: aCharacter!

Item was added:
+ ----- Method: RxMatchOptimizer>>addPredicate: (in category 'private') -----
+ addPredicate: predicate
+ 
+ 	^(predicates ifNil: [ predicates := Set new: 1 ]) add: predicate!

Item was added:
+ ----- Method: RxMatchOptimizer>>addPrefix: (in category 'private') -----
+ addPrefix: aCharacter
+ 
+ 	^(prefixes ifNil: [ prefixes := CharacterSet new ]) add: aCharacter!

Item was changed:
  ----- Method: RxMatchOptimizer>>conditionTester (in category 'accessing') -----
  conditionTester
  	"#any condition is filtered at the higher level;
  	it cannot appear among the conditions here."
  
  	| matchConditions size |
+ 	conditions ifNil: [ ^nil ].
+ 	(size := conditions size) = 0 ifTrue: [ ^nil ].
- 	(size := conditions size) = 0ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| matchCondition |
  		matchCondition := conditions anyOne.
  		"Special case all of the possible conditions."
  		#atBeginningOfLine == matchCondition ifTrue: [^[:c :matcher | matcher atBeginningOfLine]].
  		#atEndOfLine == matchCondition ifTrue: [^[:c :matcher | matcher atEndOfLine]].
  		#atBeginningOfWord == matchCondition ifTrue: [^[:c :matcher | matcher atBeginningOfWord]].
  		#atEndOfWord == matchCondition ifTrue: [^[:c :matcher | matcher atEndOfWord]].
  		#atWordBoundary == matchCondition ifTrue: [^[:c :matcher | matcher atWordBoundary]].
  		#notAtWordBoundary == matchCondition ifTrue: [^[:c :matcher | matcher notAtWordBoundary]].
  		RxParser signalCompilationException: 'invalid match condition'].
  	"More than one condition. Capture them as an array in scope."
  	matchConditions := conditions asArray.
  	^[ :c :matcher |
  		matchConditions anySatisfy: [ :conditionSelector |
  			matcher perform: conditionSelector ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>determineTestMethod (in category 'private') -----
  determineTestMethod
  	"Answer a block closure that will work as a can-match predicate.
  	Answer nil if no viable optimization is possible (too many chars would
  	be able to start a match)."
  
  	| testers size |
+ 	conditions ifNotNil: [
+ 		(conditions includes: #any) ifTrue: [ ^nil ] ].
- 	(conditions includes: #any) ifTrue: [^nil].
  	testers := {
  		self prefixTester.
  		self nonPrefixTester.
  		self conditionTester.
  		self methodPredicateTester.
  		self nonMethodPredicateTester.
  		self predicateTester.
  		self nonPredicateTester } reject: [ :each | each isNil ].
  	(size := testers size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [ ^testers first ].
  	^[ :char :matcher | testers anySatisfy: [ :t | t value: char value: matcher ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>initialize:ignoreCase: (in category 'initialize-release') -----
  initialize: aRegex ignoreCase: aBoolean 
  	"Set `testMethod' variable to a can-match predicate block:
  	two-argument block which accepts a lookahead character
  	and a matcher (presumably built from aRegex) and answers 
  	a boolean indicating whether a match could start at the given
  	lookahead. "
  
  	ignoreCase := aBoolean.
+ 	aRegex dispatchTo: self.
+ 	"If the whole expression is nullable, end-of-line is an implicit can-match condition!!"
+ 	aRegex isNullable ifTrue: [ self addCondition: #atEndOfLine ].
- 	prefixes := IdentitySet new: 10.
- 	nonPrefixes := IdentitySet new: 10.
- 	conditions := IdentitySet new: 3.
- 	methodPredicates := Set new: 3.
- 	nonMethodPredicates := Set new: 3.
- 	predicates := Set new: 3.
- 	nonPredicates := Set new: 3.
- 	lookarounds := Set new: 3.
- 	aRegex dispatchTo: self.	"If the whole expression is nullable, 
- 		end-of-line is an implicit can-match condition!!"
- 	aRegex isNullable ifTrue: [conditions add: #atEndOfLine].
  	testBlock := self determineTestMethod!

Item was changed:
  ----- Method: RxMatchOptimizer>>methodPredicateTester (in category 'accessing') -----
  methodPredicateTester
  
  	| p size |
+ 	methodPredicates ifNil: [ ^nil ].
  	(size := methodPredicates size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| selector |
  		"might be a pretty common case"
  		selector := methodPredicates anyOne.
  		^[ :char :matcher | 
  			RxParser doHandlingMessageNotUnderstood: [
  				char perform: selector ] ] ].
  	p := methodPredicates asArray.
  	^[ :char :matcher | 
  		RxParser doHandlingMessageNotUnderstood: [
  			p anySatisfy: [ :sel | char perform: sel ] ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>nonMethodPredicateTester (in category 'accessing') -----
  nonMethodPredicateTester
  
  	| p size |
+ 	nonMethodPredicates ifNil: [ ^nil ].
  	(size := nonMethodPredicates size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| selector |
  		selector := nonMethodPredicates anyOne.
  		^[ :char :matcher | 
  			RxParser doHandlingMessageNotUnderstood: [
  				(char perform: selector) not ] ] ].
  	p := nonMethodPredicates asArray.
  	^[:char :m | 
  		RxParser doHandlingMessageNotUnderstood: [
  			(p allSatisfy: [:sel | char perform: sel ]) not ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>nonPredicateTester (in category 'private') -----
  nonPredicateTester
  
  	| p size |
+ 	nonPredicates ifNil: [ ^nil ].
  	(size := nonPredicates size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue:  [
  		| predicate |
  		predicate := nonPredicates anyOne.
  		^[ :char :matcher | (predicate value: char) not] ].
  	p := nonPredicates asArray.
  	^[ :char :m | (p allSatisfy: [:some | some value: char ]) not ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>nonPrefixTester (in category 'private') -----
  nonPrefixTester
  
  	| size |
+ 	nonPrefixes ifNil: [ ^nil ].
  	(size := nonPrefixes size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| nonPrefixChar |
  		nonPrefixChar := nonPrefixes anyOne.
+ 		^[ :char :matcher | char ~~ nonPrefixChar ] ].
- 		^[ :char :matcher | char ~= nonPrefixChar ] ].
  	^[ :char : matcher | (nonPrefixes includes: char) not ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>predicateTester (in category 'private') -----
  predicateTester
  
  	| p size |
+ 	predicates ifNil: [ ^nil ].
  	(size := predicates size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| pred |
  		pred := predicates anyOne.
  		^[ :char :matcher | pred value: char ] ].
  	p := predicates asArray. 
  	^[ :char :matcher | p anySatisfy: [:some | some value: char ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>prefixTester (in category 'private') -----
  prefixTester
  
  	| p size |
+ 	prefixes ifNil: [ ^nil ].
  	(size := prefixes size) = 0 ifTrue: [ ^nil ].
  	size = 1 ifTrue: [
  		| prefixChar |
  		prefixChar := prefixes anyOne.
  		ignoreCase ifTrue: [ ^[ :char :matcher | char sameAs: prefixChar ] ].
+ 		^[ :char :matcher | char == prefixChar ] ].
- 		^[ :char :matcher | char = prefixChar ] ].
  	ignoreCase ifFalse: [ ^[ :char :matcher | prefixes includes: char ] ].
  	p := prefixes collect: [ :each | each asUppercase ].
  	^[ :char :matcher | p includes: char asUppercase ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxAny (in category 'double dispatch') -----
  syntaxAny
  	"Any special char is among the prefixes."
  
+ 	self addCondition: #any!
- 	conditions add: #any!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxBeginningOfLine (in category 'double dispatch') -----
  syntaxBeginningOfLine
  	"Beginning of line is among the prefixes."
  
+ 	self addCondition: #atBeginningOfLine!
- 	conditions add: #atBeginningOfLine!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxBeginningOfWord (in category 'double dispatch') -----
  syntaxBeginningOfWord
  	"Beginning of line is among the prefixes."
  
+ 	self addCondition: #atBeginningOfWord!
- 	conditions add: #atBeginningOfWord!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxCharSet: (in category 'double dispatch') -----
  syntaxCharSet: charSetNode 
  	"All these (or none of these) characters is the prefix."
  
  	(charSetNode enumerableSetIgnoringCase: ignoreCase) ifNotNil: [ :enumerableSet |
  		charSetNode isNegated
+ 			ifTrue: [ enumerableSet do: [ :each | self addNonPrefix: each ] ]
+ 			ifFalse: [ enumerableSet do: [ :each | self addPrefix: each ] ] ].
+ 
- 			ifTrue: [ nonPrefixes addAll: enumerableSet ]
- 			ifFalse: [ prefixes addAll: enumerableSet ] ].
  	charSetNode predicates ifNotNil: [ :charsetPredicates |
  		charSetNode isNegated
+ 			ifTrue: [ 
+ 				charsetPredicates do: [ :each | self addNonPredicate: each ] ]
+ 			ifFalse: [ 
+ 				charsetPredicates do: [ :each | self addPredicate: each ] ] ]!
- 			ifTrue: [ nonPredicates addAll: charsetPredicates ]
- 			ifFalse: [ predicates addAll: charsetPredicates ] ]!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxCharacter: (in category 'double dispatch') -----
  syntaxCharacter: charNode
  	"This character is the prefix, of one of them."
  
+ 	self addPrefix: charNode character!
- 	prefixes add: charNode character!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxEndOfLine (in category 'double dispatch') -----
  syntaxEndOfLine
  	"Beginning of line is among the prefixes."
  
+ 	self addCondition: #atEndOfLine!
- 	conditions add: #atEndOfLine!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxEndOfWord (in category 'double dispatch') -----
  syntaxEndOfWord
  
+ 	self addCondition: #atEndOfWord!
- 	conditions add: #atEndOfWord!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxLookaround: (in category 'double dispatch') -----
  syntaxLookaround: lookaroundNode 
+ 	"Do nothing."!
- 
- 	
- 	lookarounds add: lookaroundNode!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxMessagePredicate: (in category 'double dispatch') -----
  syntaxMessagePredicate: messagePredicateNode 
+ 
+ 	messagePredicateNode negated ifTrue: [
+ 		^self addNonMethodPredicate: messagePredicateNode selector ].
+ 	self addMethodPredicate: messagePredicateNode selector!
- 	messagePredicateNode negated
- 		ifTrue: [nonMethodPredicates add: messagePredicateNode selector]
- 		ifFalse: [methodPredicates add: messagePredicateNode selector]!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxNonWordBoundary (in category 'double dispatch') -----
  syntaxNonWordBoundary
  
+ 	self addCondition: #notAtWordBoundary!
- 	conditions add: #notAtWordBoundary!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxPredicate: (in category 'double dispatch') -----
  syntaxPredicate: predicateNode 
  
+ 	self addPredicate: predicateNode predicate!
- 	predicates add: predicateNode predicate!

Item was changed:
  ----- Method: RxMatchOptimizer>>syntaxWordBoundary (in category 'double dispatch') -----
  syntaxWordBoundary
  
+ 	self addCondition: #atWordBoundary!
- 	conditions add: #atWordBoundary!

Item was changed:
  Object subclass: #RxMatcher
+ 	instanceVariableNames: 'matcher ignoreCase startOptimizer stream markerPositions markerCount lastResult oldMarkerPositions'
- 	instanceVariableNames: 'matcher ignoreCase startOptimizer stream markerPositions markerCount lastResult'
  	classVariableNames: 'Cr Lf'
  	poolDictionaries: ''
  	category: 'Regex-Core'!
  
  !RxMatcher commentStamp: 'Tbn 11/12/2010 23:13' prior: 0!
  -- Regular Expression Matcher v 1.1 (C) 1996, 1999 Vassili Bykov
  --
  This is a recursive regex matcher. Not strikingly efficient, but simple. Also, keeps track of matched subexpressions.  The life cycle goes as follows:
  
  1. Initialization. Accepts a syntax tree (presumably produced by RxParser) and compiles it into a matcher built of other classes in this category.
  
  2. Matching. Accepts a stream or a string and returns a boolean indicating whether the whole stream or its prefix -- depending on the message sent -- matches the regex.
  
  3. Subexpression query. After a successful match, and before any other match, the matcher may be queried about the range of specific stream (string) positions that matched to certain parenthesized subexpressions of the original expression.
  
  Any number of queries may follow a successful match, and any number or matches may follow a successful initialization.
  
  Note that `matcher' is actually a sort of a misnomer. The actual matcher is a web of Rxm* instances built by RxMatcher during initialization. RxMatcher is just the interface facade of this network.  It is also a builder of it, and also provides a stream-like protocol to easily access the stream being matched.
  
  Instance variables:
  	matcher				<RxmLink> The entry point into the actual matcher.
  	stream				<Stream> The stream currently being matched against.
  	markerPositions		<Array of: Integer> Positions of markers' matches.
  	markerCount		<Integer> Number of markers.
  	lastResult 			<Boolean> Whether the latest match attempt succeeded or not.
  	lastChar			<Character | nil> character last seen in the matcher stream!

Item was changed:
  ----- Method: RxMatcher>>copy:replacingMatchesWith: (in category 'match enumeration') -----
  copy: aString replacingMatchesWith: replacementString
  	"Copy <aString>, except for the matches. Replace each match with <aString>."
  
+ 	^String new: (aString size min: 1000) streamContents: [ :stream |
+ 		self
+ 			copyStream: aString readStream
+ 			to: stream
+ 			replacingMatchesWith: replacementString ]!
- 	| answer |
- 	answer := (String new: 40) writeStream.
- 	self
- 		copyStream: aString readStream
- 		to: answer
- 		replacingMatchesWith: replacementString.
- 	^answer contents!

Item was changed:
  ----- Method: RxMatcher>>copyStream:to:replacingMatchesWith: (in category 'match enumeration') -----
  copyStream: aStream to: writeStream replacingMatchesWith: aString
  	"Copy the contents of <aStream> on the <writeStream>, except for the matches. Replace each match with <aString>."
  
  	| searchStart matchStart matchEnd |
  	stream := aStream.
+ 	oldMarkerPositions := markerPositions := nil.
- 	markerPositions := nil.
  	[searchStart := aStream position.
  	self proceedSearchingStream: aStream] whileTrue:
  		[matchStart := (self subBeginning: 1) first.
  		matchEnd := (self subEnd: 1) first.
  		aStream position: searchStart.
  		searchStart to: matchStart - 1 do:
  			[:ignoredPos | writeStream nextPut: aStream next].
  		writeStream nextPutAll: aString.
  		aStream position: matchEnd.
  		"Be extra careful about successful matches which consume no input.
  		After those, make sure to advance or finish if already at end."
  		matchEnd = searchStart ifTrue: 
  			[aStream atEnd
  				ifTrue:	[^self "rest after end of whileTrue: block is a no-op if atEnd"]
  				ifFalse:	[writeStream nextPut: aStream next]]].
  	aStream position: searchStart.
  	[aStream atEnd] whileFalse: [writeStream nextPut: aStream next]!

Item was changed:
  ----- Method: RxMatcher>>copyStream:to:translatingMatchesUsing: (in category 'match enumeration') -----
  copyStream: aStream to: writeStream translatingMatchesUsing: aBlock
  	"Copy the contents of <aStream> on the <writeStream>, except for the matches. For each match, evaluate <aBlock> passing the matched substring as the argument.  Expect the block to answer a String, and write the answer to <writeStream> in place of the match."
  
  	| searchStart matchStart matchEnd match |
  	stream := aStream.	
+ 	oldMarkerPositions := markerPositions := nil.
- 	markerPositions := nil.
  	[searchStart := aStream position.
  	self proceedSearchingStream: aStream] whileTrue:
  		[matchStart := (self subBeginning: 1) first.
  		matchEnd := (self subEnd: 1) first.
  		aStream position: searchStart.
  		searchStart to: matchStart - 1 do:
  			[:ignoredPos | writeStream nextPut: aStream next].
  		match := (String new: matchEnd - matchStart + 1) writeStream.
  		matchStart to: matchEnd - 1 do:
  			[:ignoredPos | match nextPut: aStream next].
  		writeStream nextPutAll: (aBlock value: match contents).
  		"Be extra careful about successful matches which consume no input.
  		After those, make sure to advance or finish if already at end."
  		matchEnd = searchStart ifTrue: 
  			[aStream atEnd
  				ifTrue:	[^self "rest after end of whileTrue: block is a no-op if atEnd"]
  				ifFalse:	[writeStream nextPut: aStream next]]].
  	aStream position: searchStart.
  	[aStream atEnd] whileFalse: [writeStream nextPut: aStream next]!

Item was changed:
  ----- Method: RxMatcher>>matchesStreamPrefix: (in category 'accessing') -----
  matchesStreamPrefix: theStream
  	"Match thyself against a positionable stream."
  
  	stream := theStream.
+ 	oldMarkerPositions := markerPositions := nil.
- 	markerPositions := nil.
  	^self tryMatch!

Item was changed:
  ----- Method: RxMatcher>>searchStream: (in category 'accessing') -----
  searchStream: aStream
  	"Search the stream for occurrence of something matching myself.
  	After the search has occurred, stop positioned after the end of the
  	matched substring. Answer a Boolean indicating success."
  
  	| position |
  	stream := aStream.
  	position := aStream position.
+ 	oldMarkerPositions := markerPositions := nil.
- 	markerPositions := nil.
  	[aStream atEnd] whileFalse:
  		[self tryMatch ifTrue: [^true].
  		aStream position: position; next.
  		position := aStream position].
  	"Try match at the very stream end too!!"
+ 	^self tryMatch!
- 	self tryMatch ifTrue: [^true]. 
- 	^false!

Item was changed:
  ----- Method: RxMatcher>>syntaxCharacter: (in category 'double dispatch') -----
  syntaxCharacter: charNode
  	"Double dispatch from the syntax tree. 
  	We get here when no merging characters into strings was possible."
  
  	| wanted |
  	wanted := charNode character.
  	^RxmPredicate new predicate: 
  		(ignoreCase
  			ifTrue: [[:char | char sameAs: wanted]]
+ 			ifFalse: [[:char | char == wanted]])!
- 			ifFalse: [[:char | char = wanted]])!

Item was changed:
  ----- Method: RxMatcher>>syntaxLookaround: (in category 'double dispatch') -----
  syntaxLookaround: lookaroundNode
  	"Double dispatch from the syntax tree. 
  	Special link can handle lookarounds (look ahead, positive and negative)."
  	| piece |
  	piece := lookaroundNode piece dispatchTo: self.
+ 	^ RxmLookahead with: piece!
- 	^ RxmLookahaed with: piece!

Item was changed:
  ----- Method: RxMatcher>>tryMatch (in category 'private') -----
  tryMatch
  	"Match thyself against the current stream."
  
+ 	| newMarkerPositions |
+ 	newMarkerPositions := oldMarkerPositions.
- 	| oldMarkerPositions |
  	oldMarkerPositions := markerPositions.
+ 	markerPositions := newMarkerPositions.
+ 	markerPositions
+ 		ifNil: [
+ 			markerPositions := Array new: markerCount.
+ 			1 to: markerCount do: [ :i |
+ 				| collection |
+ 				collection := OrderedCollection new: 2. "There are usually 0 or 1 objects to store."
+ 				collection resetTo: 3. "We'll add elements to the beginning, so make room there."
+ 				markerPositions at: i put: collection ] ]
+ 		ifNotNil: [
+ 			1 to: markerCount do: [ :i |
+ 				| collection |
+ 				collection := markerPositions at: i.
+ 				collection resetTo: collection capacity + 1 ] ].
+ 	lastResult := startOptimizer
+ 		ifNil: [ matcher matchAgainst: self]
+ 		ifNotNil: [ (startOptimizer canStartMatch: stream peek in: self) and: [ matcher matchAgainst: self ] ].
- 	markerPositions := Array new: markerCount.
- 	1 to: markerCount do: [ :i |
- 		| collection |
- 		collection := OrderedCollection new.
- 		collection resetTo: collection capacity + 1. "We'll add element to the beginning, so make room there."
- 		markerPositions at: i put: collection ].
- 	lastResult := startOptimizer isNil
- 		ifTrue: [ matcher matchAgainst: self]
- 		ifFalse: [ (startOptimizer canStartMatch: stream peek in: self) and: [ matcher matchAgainst: self ] ].
  	"check for duplicates"
+ 	lastResult ifFalse: [ ^false ].
+ 	oldMarkerPositions ifNil: [ ^true ].
+ 	(oldMarkerPositions hasEqualElements: markerPositions) ifFalse: [ ^true ].
- 	(lastResult not or: [ oldMarkerPositions isNil or: [ oldMarkerPositions size ~= markerPositions size ] ])
- 		ifTrue: [ ^ lastResult ].
- 	oldMarkerPositions with: markerPositions do: [ :oldPos :newPos |
- 		oldPos size = newPos size 
- 			ifFalse: [ ^ lastResult ].
- 		oldPos with: newPos do: [ :old :new |
- 			old = new
- 				ifFalse: [ ^ lastResult ] ] ].
  	"this is a duplicate"
  	^ lastResult := false!

Item was removed:
- RxmLink subclass: #RxmLookahaed
- 	instanceVariableNames: 'lookahead positive'
- 	classVariableNames: ''
- 	poolDictionaries: ''
- 	category: 'Regex-Core'!
- 
- !RxmLookahaed commentStamp: '<historical>' prior: 0!
- Instance holds onto a lookead which matches but does not consume anything.
- 
- Instance variables:
- 	predicate		<RxmLink>!

Item was removed:
- ----- Method: RxmLookahaed class>>with: (in category 'instance creation') -----
- with: aPiece
- 
- 	^self new lookahead: aPiece!

Item was removed:
- ----- Method: RxmLookahaed>>initialize (in category 'initialization') -----
- initialize
- 	super initialize.
- 	positive := true.!

Item was removed:
- ----- Method: RxmLookahaed>>lookahead (in category 'accessing') -----
- lookahead
- 	^ lookahead!

Item was removed:
- ----- Method: RxmLookahaed>>lookahead: (in category 'accessing') -----
- lookahead: anRxmLink
- 	lookahead := anRxmLink!

Item was removed:
- ----- Method: RxmLookahaed>>matchAgainst: (in category 'matching') -----
- matchAgainst: aMatcher
- 	"Match if the predicate block evaluates to true when given the
- 	current stream character as the argument."
- 
- 	| original result |
- 	original := aMatcher currentState.
- 	result := lookahead matchAgainst: aMatcher.
- 	aMatcher restoreState: original.
- 	^ result not 
- 		and: [ next matchAgainst: aMatcher ]!

Item was removed:
- ----- Method: RxmLookahaed>>terminateWith: (in category 'building') -----
- terminateWith: aNode
- 	lookahead terminateWith: aNode.
- 	super terminateWith: aNode.!

Item was added:
+ RxmLink subclass: #RxmLookahead
+ 	instanceVariableNames: 'lookahead positive'
+ 	classVariableNames: ''
+ 	poolDictionaries: ''
+ 	category: 'Regex-Core'!
+ 
+ !RxmLookahead commentStamp: '<historical>' prior: 0!
+ Instance holds onto a lookead which matches but does not consume anything.
+ 
+ Instance variables:
+ 	predicate		<RxmLink>!

Item was added:
+ ----- Method: RxmLookahead class>>with: (in category 'instance creation') -----
+ with: aPiece
+ 
+ 	^self new lookahead: aPiece!

Item was added:
+ ----- Method: RxmLookahead>>initialize (in category 'initialization') -----
+ initialize
+ 	super initialize.
+ 	positive := true.!

Item was added:
+ ----- Method: RxmLookahead>>lookahead (in category 'accessing') -----
+ lookahead
+ 	^ lookahead!

Item was added:
+ ----- Method: RxmLookahead>>lookahead: (in category 'accessing') -----
+ lookahead: anRxmLink
+ 	lookahead := anRxmLink!

Item was added:
+ ----- Method: RxmLookahead>>matchAgainst: (in category 'matching') -----
+ matchAgainst: aMatcher
+ 	"Match if the predicate block evaluates to true when given the
+ 	current stream character as the argument."
+ 
+ 	| original result |
+ 	original := aMatcher currentState.
+ 	result := lookahead matchAgainst: aMatcher.
+ 	aMatcher restoreState: original.
+ 	^ result not 
+ 		and: [ next matchAgainst: aMatcher ]!

Item was added:
+ ----- Method: RxmLookahead>>terminateWith: (in category 'building') -----
+ terminateWith: aNode
+ 	lookahead terminateWith: aNode.
+ 	super terminateWith: aNode.!

Item was changed:
  ----- Method: RxmPredicate>>matchAgainst: (in category 'matching') -----
  matchAgainst: aMatcher
  	"Match if the predicate block evaluates to true when given the
  	current stream character as the argument."
  
+ 	| nextCharacter originalState |
+ 	originalState := aMatcher currentState.
+ 	nextCharacter := aMatcher next ifNil: [
+ 		aMatcher restoreState: originalState.
- 	| original |
- 	aMatcher atEnd ifTrue: [ ^false ].
- 	original := aMatcher currentState.
- 	(predicate value: aMatcher next) ifFalse: [
- 		aMatcher restoreState: original.
  		^false ].
+ 	(predicate value: nextCharacter) ifTrue: [
+ 		(next matchAgainst: aMatcher) ifTrue: [ ^true ] ].
+ 	aMatcher restoreState: originalState.
- 	(next matchAgainst: aMatcher) ifTrue: [ ^true ].
- 	aMatcher restoreState: original.
  	^false
  !

Item was changed:
  RxmLink subclass: #RxmSubstring
+ 	instanceVariableNames: 'sampleStream caseSensitive ignoreCase'
- 	instanceVariableNames: 'sample compare'
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Regex-Core'!
  
  !RxmSubstring commentStamp: 'Tbn 11/12/2010 23:14' prior: 0!
  -- Regular Expression Matcher v 1.1 (C) 1996, 1999 Vassili Bykov
  --
  Instance holds onto a string and matches exactly this string, and exactly once.
  
  Instance variables:
  	string 	<String>!

Item was removed:
- ----- Method: RxmSubstring>>beCaseInsensitive (in category 'initialize-release') -----
- beCaseInsensitive
- 
- 	compare := [:char1 :char2 | char1 sameAs: char2]!

Item was removed:
- ----- Method: RxmSubstring>>beCaseSensitive (in category 'initialize-release') -----
- beCaseSensitive
- 
- 	compare := [:char1 :char2 | char1 = char2]!

Item was changed:
  ----- Method: RxmSubstring>>character:ignoreCase: (in category 'initialize-release') -----
  character: aCharacter ignoreCase: aBoolean
  	"Match exactly this character."
  
+ 	sampleStream := (String with: aCharacter) readStream.
+ 	ignoreCase := aBoolean!
- 	sample := String with: aCharacter.
- 	aBoolean ifTrue: [self beCaseInsensitive]!

Item was changed:
  ----- Method: RxmSubstring>>initialize (in category 'initialization') -----
  initialize
+ 
  	super initialize.
+ 	ignoreCase := false!
- 	self beCaseSensitive!

Item was changed:
  ----- Method: RxmSubstring>>matchAgainst: (in category 'matching') -----
  matchAgainst: aMatcher
  	"Match if my sample stream is exactly the current prefix
  	of the matcher stream's contents."
  
+ 	| nextSample nextFromMatcher originalState |
- 	| originalState sampleStream nextSample |
  	originalState := aMatcher currentState.
+ 	sampleStream reset.
+ 	ignoreCase 
+ 		ifFalse: [
+ 			[ (nextSample := sampleStream next) == nil or: [ (nextFromMatcher := aMatcher next) == nil ] ] whileFalse: [
+ 				nextSample == nextFromMatcher ifFalse: [
+ 					aMatcher restoreState: originalState.
+ 					^false ] ] ]
+ 		ifTrue: [ 
+ 			[ (nextSample := sampleStream next) == nil or: [ (nextFromMatcher := aMatcher next) == nil ] ] whileFalse: [
+ 				(nextSample sameAs: nextFromMatcher) ifFalse: [
+ 					aMatcher restoreState: originalState.
+ 					^false ] ] ].
- 	sampleStream := self sampleStream.
- 	[ (nextSample := sampleStream next) == nil or: [ aMatcher atEnd ] ] whileFalse: [
- 		(compare value: nextSample value: aMatcher next) ifFalse: [
- 			aMatcher restoreState: originalState.
- 			^false ] ].
  	(nextSample == nil and: [ next matchAgainst: aMatcher ]) ifTrue: [ ^true ].
  	aMatcher restoreState: originalState.
  	^false!

Item was removed:
- ----- Method: RxmSubstring>>sampleStream (in category 'private') -----
- sampleStream
- 
- 	^sample readStream!

Item was changed:
  ----- Method: RxmSubstring>>substring:ignoreCase: (in category 'initialize-release') -----
  substring: aString ignoreCase: aBoolean
  	"Match exactly this string."
  
+ 	sampleStream := aString readStream.
+ 	ignoreCase := aBoolean!
- 	sample := aString.
- 	aBoolean ifTrue: [self beCaseInsensitive]!

Item was changed:
  ----- Method: RxsPredicate class>>initializeEscapedLetterSelectors (in category 'class initialization') -----
  initializeEscapedLetterSelectors
  	"self initializeEscapedLetterSelectors"
  
  	| newEscapedLetterSelectors |
  	newEscapedLetterSelectors := Dictionary new
  		at: $w put: #beWordConstituent;
  		at: $W put: #beNotWordConstituent;
  		at: $d put: #beDigit;
  		at: $D put: #beNotDigit;
  		at: $s put: #beSpace;
  		at: $S put: #beNotSpace;
  		at: $\ put: #beBackslash;
  		at: $r put: #beCarriageReturn;
+ 		at: $n put: #beLineFeed;
+ 		at: $t put: #beTab;
  		yourself.
  	EscapedLetterSelectors := newEscapedLetterSelectors!

Item was changed:
  ----- Method: RxsPredicate>>beBackslash (in category 'initialize-release') -----
  beBackslash
  
+ 	self beCharacter: $\!
- 	predicate := [:char | char == $\].
- 	negation := [:char | char ~~ $\]!

Item was changed:
  ----- Method: RxsPredicate>>beCarriageReturn (in category 'initialize-release') -----
  beCarriageReturn
  
+ 	self beCharacter: Character cr!
- 	| cr |
- 	cr := Character cr.
- 	predicate := [ :char | char == cr ].
- 	negation := [ :char | char ~~ cr  ]!

Item was added:
+ ----- Method: RxsPredicate>>beCharacter: (in category 'initialize-release') -----
+ beCharacter: aCharacter
+ 
+ 	predicate := [ :char | char == aCharacter ].
+ 	negation := [ :char | char ~~ aCharacter  ]!

Item was added:
+ ----- Method: RxsPredicate>>beLineFeed (in category 'initialize-release') -----
+ beLineFeed
+ 
+ 	self beCharacter: Character lf!

Item was added:
+ ----- Method: RxsPredicate>>beTab (in category 'initialize-release') -----
+ beTab
+ 
+ 	self beCharacter: Character tab!

Item was added:
+ (PackageInfo named: 'Regex-Core') postscript: 'RxsPredicate initializeEscapedLetterSelectors'!



More information about the Squeak-dev mailing list