[squeak-dev] The Inbox: VB-Regex-nice.19.mcz
commits at source.squeak.org
commits at source.squeak.org
Thu May 12 20:32:41 UTC 2011
Nicolas Cellier uploaded a new version of VB-Regex to project The Inbox:
http://source.squeak.org/inbox/VB-Regex-nice.19.mcz
==================== Summary ====================
Name: VB-Regex-nice.19
Author: nice
Time: 12 May 2011, 10:32:30.073 pm
UUID: 0ed3c7d1-63ec-49a5-a4c0-fc20f0904826
Ancestors: VB-Regex-nice.18
Upgrade to pharo/VB-Regex-StephaneDucasse.49
which is a sync with version of lukas. http://source.lukas-renggli.ch/unsorted/
now up to 40
Not all Pharoism where moved in, only Lukas fixes.
Though we could use the '...' readStream/writeStream and a few others.
=============== Diff against VB-Regex-nice.18 ===============
Item was changed:
SystemOrganization addCategory: #'VB-Regex'!
SystemOrganization addCategory: #'VB-Regex-Exceptions'!
SystemOrganization addCategory: #'VB-Regex-Test'!
+ SystemOrganization addCategory: #'VB-Regex-Tests'!
Item was changed:
----- Method: RxMatchOptimizer>>syntaxCharSet: (in category 'double dispatch') -----
syntaxCharSet: charSetNode
"All these (or none of these) characters is the prefix."
+
charSetNode isNegated
+ ifTrue: [nonPrefixes addAll: (charSetNode enumerableSetIgnoringCase: ignoreCase)]
+ ifFalse: [prefixes addAll: (charSetNode enumerableSetIgnoringCase: ignoreCase)].
- ifTrue: [nonPrefixes addAll: charSetNode enumerableSet]
- ifFalse: [prefixes addAll: charSetNode enumerableSet].
charSetNode hasPredicates ifTrue:
[charSetNode isNegated
ifTrue: [nonPredicates addAll: charSetNode predicates]
ifFalse: [predicates addAll: charSetNode predicates]]!
Item was changed:
----- Method: RxMatcher>>atBeginningOfLine (in category 'testing') -----
atBeginningOfLine
+
+ ^self position = 0 or: [self lastChar = Cr]!
- ^self position = 0 or: [lastChar = Cr]!
Item was changed:
----- Method: RxMatcher>>atBeginningOfWord (in category 'testing') -----
atBeginningOfWord
+
+ ^(self isWordChar: self lastChar) not
- ^(self isWordChar: lastChar) not
and: [self isWordChar: stream peek]!
Item was changed:
----- Method: RxMatcher>>atEndOfWord (in category 'testing') -----
atEndOfWord
+
+ ^(self isWordChar: self lastChar)
- ^(self isWordChar: lastChar)
and: [(self isWordChar: stream peek) not]!
Item was changed:
----- Method: RxMatcher>>atWordBoundary (in category 'testing') -----
atWordBoundary
+
+ ^(self isWordChar: self lastChar)
- ^(self isWordChar: lastChar)
xor: (self isWordChar: stream peek)!
Item was changed:
----- Method: RxMatcher>>copyStream:to:replacingMatchesWith: (in category 'match enumeration') -----
copyStream: aStream to: writeStream replacingMatchesWith: aString
"Copy the contents of <aStream> on the <writeStream>, except for the matches. Replace each match with <aString>."
+
| searchStart matchStart matchEnd |
stream := aStream.
+ markerPositions := nil.
- lastChar := nil.
[searchStart := aStream position.
self proceedSearchingStream: aStream] whileTrue:
+ [matchStart := (self subBeginning: 1) first.
+ matchEnd := (self subEnd: 1) first.
- [matchStart := self subBeginning: 1.
- matchEnd := self subEnd: 1.
aStream position: searchStart.
searchStart to: matchStart - 1 do:
[:ignoredPos | writeStream nextPut: aStream next].
writeStream nextPutAll: aString.
+ aStream position: matchEnd.
+ "Be extra careful about successful matches which consume no input.
+ After those, make sure to advance or finish if already at end."
+ matchEnd = searchStart ifTrue:
+ [aStream atEnd
+ ifTrue: [^self "rest after end of whileTrue: block is a no-op if atEnd"]
+ ifFalse: [writeStream nextPut: aStream next]]].
- aStream position: matchEnd].
aStream position: searchStart.
[aStream atEnd] whileFalse: [writeStream nextPut: aStream next]!
Item was changed:
----- Method: RxMatcher>>copyStream:to:translatingMatchesUsing: (in category 'match enumeration') -----
copyStream: aStream to: writeStream translatingMatchesUsing: aBlock
"Copy the contents of <aStream> on the <writeStream>, except for the matches. For each match, evaluate <aBlock> passing the matched substring as the argument. Expect the block to answer a String, and write the answer to <writeStream> in place of the match."
+
| searchStart matchStart matchEnd match |
+ stream := aStream.
+ markerPositions := nil.
- stream := aStream.
- lastChar := nil.
[searchStart := aStream position.
self proceedSearchingStream: aStream] whileTrue:
+ [matchStart := (self subBeginning: 1) first.
+ matchEnd := (self subEnd: 1) first.
- [matchStart := self subBeginning: 1.
- matchEnd := self subEnd: 1.
aStream position: searchStart.
searchStart to: matchStart - 1 do:
[:ignoredPos | writeStream nextPut: aStream next].
match := (String new: matchEnd - matchStart + 1) writeStream.
matchStart to: matchEnd - 1 do:
[:ignoredPos | match nextPut: aStream next].
+ writeStream nextPutAll: (aBlock value: match contents).
+ "Be extra careful about successful matches which consume no input.
+ After those, make sure to advance or finish if already at end."
+ matchEnd = searchStart ifTrue:
+ [aStream atEnd
+ ifTrue: [^self "rest after end of whileTrue: block is a no-op if atEnd"]
+ ifFalse: [writeStream nextPut: aStream next]]].
- writeStream nextPutAll: (aBlock value: match contents)].
aStream position: searchStart.
[aStream atEnd] whileFalse: [writeStream nextPut: aStream next]!
Item was added:
+ ----- Method: RxMatcher>>lastChar (in category 'accessing') -----
+ lastChar
+ ^ stream position = 0
+ ifFalse: [ stream skip: -1; next ]!
Item was removed:
- ----- Method: RxMatcher>>markerPositionAt: (in category 'privileged') -----
- markerPositionAt: anIndex
- ^markerPositions at: anIndex!
Item was added:
+ ----- Method: RxMatcher>>markerPositionAt:add: (in category 'privileged') -----
+ markerPositionAt: anIndex add: position
+ "Remember position of another instance of the given marker."
+
+ (markerPositions at: anIndex) addFirst: position!
Item was removed:
- ----- Method: RxMatcher>>markerPositionAt:maybePut: (in category 'privileged') -----
- markerPositionAt: anIndex maybePut: position
- "Set position of the given marker, if not already set."
- (markerPositions at: anIndex) == nil
- ifTrue: [markerPositions at: anIndex put: position]!
Item was changed:
----- Method: RxMatcher>>matchesOnStream:do: (in category 'match enumeration') -----
matchesOnStream: aStream do: aBlock
+ "Be extra careful about successful matches which consume no input.
+ After those, make sure to advance or finish if already at end."
+
+ | position |
+ [position := aStream position.
+ self searchStream: aStream] whileTrue:
+ [aBlock value: (self subexpression: 1).
+ position = aStream position ifTrue:
+ [aStream atEnd
+ ifTrue: [^self]
+ ifFalse: [aStream next]]]!
- [self searchStream: aStream] whileTrue:
- [aBlock value: (self subexpression: 1)]!
Item was changed:
----- Method: RxMatcher>>matchesStreamPrefix: (in category 'accessing') -----
matchesStreamPrefix: theStream
"Match thyself against a positionable stream."
+
stream := theStream.
+ markerPositions := nil.
- lastChar := nil.
^self tryMatch!
Item was added:
+ ----- Method: RxMatcher>>matchingRangesIn: (in category 'match enumeration') -----
+ matchingRangesIn: aString
+ "Search aString repeatedly for the matches of the receiver. Answer an OrderedCollection of ranges of each match (index of first character to: index of last character)."
+
+ | result |
+ result := OrderedCollection new.
+ self
+ matchesIn: aString
+ do: [:match | result add: (self position - match size + 1 to: self position)].
+ ^result!
Item was changed:
----- Method: RxMatcher>>next (in category 'streaming') -----
next
+ ^ stream next!
- lastChar := stream next.
- ^lastChar!
Item was changed:
----- Method: RxMatcher>>proceedSearchingStream: (in category 'private') -----
proceedSearchingStream: aStream
+
| position |
position := aStream position.
[aStream atEnd] whileFalse:
[self tryMatch ifTrue: [^true].
+ aStream position: position; next.
- aStream position: position.
- lastChar := aStream next.
position := aStream position].
"Try match at the very stream end too!!"
self tryMatch ifTrue: [^true].
^false!
Item was changed:
----- Method: RxMatcher>>searchStream: (in category 'accessing') -----
searchStream: aStream
"Search the stream for occurrence of something matching myself.
After the search has occurred, stop positioned after the end of the
matched substring. Answer a Boolean indicating success."
+
| position |
stream := aStream.
- lastChar := nil.
position := aStream position.
+ markerPositions := nil.
[aStream atEnd] whileFalse:
[self tryMatch ifTrue: [^true].
+ aStream position: position; next.
- aStream position: position.
- lastChar := aStream next.
position := aStream position].
"Try match at the very stream end too!!"
self tryMatch ifTrue: [^true].
^false!
Item was changed:
----- Method: RxMatcher>>subexpression: (in category 'accessing') -----
subexpression: subIndex
+ "Answer a string that matched the subexpression at the given index.
+ If there are multiple matches, answer the last one.
+ If there are no matches, answer nil.
+ (NB: it used to answer an empty string but I think nil makes more sense)."
+
+ | matches |
+ matches := self subexpressions: subIndex.
+ ^matches isEmpty ifTrue: [nil] ifFalse: [matches last]!
- | originalPosition start end reply |
- originalPosition := stream position.
- start := self subBeginning: subIndex.
- end := self subEnd: subIndex.
- (start isNil or: [end isNil]) ifTrue: [^String new].
- reply := (String new: end - start) writeStream.
- stream position: start.
- start to: end - 1 do: [:ignored | reply nextPut: stream next].
- stream position: originalPosition.
- ^reply contents!
Item was removed:
- ----- Method: RxMatcher>>subexpressions (in category 'accessing') -----
- subexpressions
- | result |
- result := Array new: self subexpressionCount.
- 1 to: self subexpressionCount do: [:index |
- result
- at: index
- put: (self subexpression: index) ].
- ^ result!
Item was added:
+ ----- Method: RxMatcher>>subexpressions: (in category 'accessing') -----
+ subexpressions: subIndex
+ "Answer an array of all matches of the subexpression at the given index.
+ The answer is always an array; it is empty if there are no matches."
+
+ | originalPosition startPositions stopPositions reply |
+ originalPosition := stream position.
+ startPositions := self subBeginning: subIndex.
+ stopPositions := self subEnd: subIndex.
+ (startPositions isEmpty or: [stopPositions isEmpty]) ifTrue: [^Array new].
+ reply := OrderedCollection new.
+ startPositions with: stopPositions do:
+ [:start :stop |
+ stream position: start.
+ reply add: (stream next: stop - start)].
+ stream position: originalPosition.
+ ^reply asArray!
Item was removed:
- ----- Method: RxMatcher>>submatchesIn: (in category 'match enumeration') -----
- submatchesIn: aString
- "Search aString repeatedly for the matches of the receiver. Answer an OrderedCollection with an array of subexpressions per match."
- | result |
- result := OrderedCollection new.
- self
- submatchesOnStream: aString readStream
- do: [:subexprs | result add: subexprs].
- ^result!
Item was removed:
- ----- Method: RxMatcher>>submatchesIn:collect: (in category 'match enumeration') -----
- submatchesIn: aString collect: aBlock
- "Search aString repeatedly for the matches of the receiver. Evaluate aBlock for each match passing the collection of matched subexpressions as the argument, collecting evaluation results in an OrderedCollection."
- | result |
- result := OrderedCollection new.
- self
- submatchesOnStream: aString readStream
- do: [:subexprs | result add: (aBlock value: subexprs)].
- ^result!
Item was removed:
- ----- Method: RxMatcher>>submatchesIn:do: (in category 'match enumeration') -----
- submatchesIn: aString do: aBlock
- "Search aString repeatedly for the matches of the receiver.
- Evaluate aBlock for each match passing the collection of matched subexpressions
- as the argument."
- self
- submatchesOnStream: aString readStream
- do: aBlock!
Item was removed:
- ----- Method: RxMatcher>>submatchesOnStream:do: (in category 'match enumeration') -----
- submatchesOnStream: aStream do: aBlock
- [self searchStream: aStream] whileTrue:
- [aBlock value: self subexpressions]!
Item was changed:
----- Method: RxMatcher>>syntaxAny (in category 'double dispatch') -----
syntaxAny
"Double dispatch from the syntax tree.
+ Create a matcher for any non-null character."
+
- Create a matcher for any non-whitespace character."
^RxmPredicate new
+ predicate: [:char | char asInteger ~= 0]!
- predicate: [:char | (Cr = char or: [Lf = char]) not]!
Item was changed:
----- Method: RxMatcher>>syntaxCharSet: (in category 'double dispatch') -----
syntaxCharSet: charSetNode
"Double dispatch from the syntax tree.
A character set is a few characters, and we either match any of them,
or match any that is not one of them."
+
+ ^RxmPredicate with: (charSetNode predicateIgnoringCase: ignoreCase)!
- ^RxmPredicate with: charSetNode predicate!
Item was changed:
----- Method: RxMatcher>>tryMatch (in category 'private') -----
tryMatch
"Match thyself against the current stream."
+
+ | oldMarkerPositions |
+ oldMarkerPositions := markerPositions.
markerPositions := Array new: markerCount.
+ 1 to: markerCount do: [ :i | markerPositions at: i put: OrderedCollection new ].
+ lastResult := startOptimizer isNil
+ ifTrue: [ matcher matchAgainst: self]
+ ifFalse: [ (startOptimizer canStartMatch: stream peek in: self) and: [ matcher matchAgainst: self ] ].
+ "check for duplicates"
+ (lastResult not or: [ oldMarkerPositions isNil or: [ oldMarkerPositions size ~= markerPositions size ] ])
+ ifTrue: [ ^ lastResult ].
+ oldMarkerPositions with: markerPositions do: [ :oldPos :newPos |
+ oldPos size = newPos size
+ ifFalse: [ ^ lastResult ].
+ oldPos with: newPos do: [ :old :new |
+ old = new
+ ifFalse: [ ^ lastResult ] ] ].
+ "this is a duplicate"
+ ^ lastResult := false!
- startOptimizer == nil
- ifTrue: [lastResult := matcher matchAgainst: self]
- ifFalse: [lastResult := (startOptimizer canStartMatch: stream peek in: self)
- and: [matcher matchAgainst: self]].
- ^lastResult!
Item was added:
+ TestCase subclass: #RxMatcherTest
+ instanceVariableNames: ''
+ classVariableNames: ''
+ poolDictionaries: ''
+ category: 'VB-Regex-Tests'!
Item was added:
+ ----- Method: RxMatcherTest class>>packageNamesUnderTest (in category 'accessing') -----
+ packageNamesUnderTest
+ ^ #('VB-Regex')!
Item was added:
+ ----- Method: RxMatcherTest>>compileRegex: (in category 'utilties') -----
+ compileRegex: aString
+ "Compile the regex and answer the matcher, or answer nil if compilation fails."
+
+ | syntaxTree |
+ syntaxTree := self parserClass safelyParse: aString.
+ ^ syntaxTree isNil ifFalse: [ self matcherClass for: syntaxTree ]!
Item was added:
+ ----- Method: RxMatcherTest>>henryReadme (in category 'testing-henry') -----
+ henryReadme
+ self error: 'The tests in this category are based on the ones in Henry Spencer''s regexp.c package.'!
Item was added:
+ ----- Method: RxMatcherTest>>matcherClass (in category 'accessing') -----
+ matcherClass
+ ^ RxMatcher!
Item was added:
+ ----- Method: RxMatcherTest>>parserClass (in category 'accessing') -----
+ parserClass
+ ^ RxParser!
Item was added:
+ ----- Method: RxMatcherTest>>runMatcher:with:expect:withSubexpressions: (in category 'utilties') -----
+ runMatcher: aMatcher with: aString expect: aBoolean withSubexpressions: anArray
+ | copy got |
+ copy := aMatcher
+ copy: aString
+ translatingMatchesUsing: [ :each | each ].
+ self
+ assert: copy = aString
+ description: 'Copying: expected ' , aString printString , ', but got ' , copy printString.
+ got := aMatcher search: aString.
+ self
+ assert: got = aBoolean
+ description: 'Searching: expected ' , aBoolean printString , ', but got ' , got printString.
+ (anArray isNil or: [ aMatcher supportsSubexpressions not ])
+ ifTrue: [ ^ self ].
+ 1 to: anArray size by: 2 do: [ :index |
+ | sub subExpect subGot |
+ sub := anArray at: index.
+ subExpect := anArray at: index + 1.
+ subGot := aMatcher subexpression: sub.
+ self
+ assert: subExpect = subGot
+ description: 'Subexpression ' , sub printString , ': expected ' , subExpect printString , ', but got ' , subGot printString ]!
Item was added:
+ ----- Method: RxMatcherTest>>runRegex: (in category 'utilties') -----
+ runRegex: anArray
+ "Run a clause anArray against a set of tests. Each clause is an array with a regex source string followed by sequence of 3-tuples. Each three-element group is one test to try against the regex, and includes: 1) test string; 2) expected result; 3) expected subexpression as an array of (index, substring), or nil."
+
+ | source matcher |
+ source := anArray first.
+ matcher := self compileRegex: source.
+ matcher isNil
+ ifTrue: [
+ (anArray at: 2) isNil
+ ifFalse: [ self signalFailure: 'Compilation failed, should have succeeded: ' , source printString ] ]
+ ifFalse: [
+ (anArray at: 2) isNil
+ ifTrue: [ self signalFailure: 'Compilation succeeded, should have failed: ' , source printString ]
+ ifFalse: [
+ 2 to: anArray size by: 3 do: [ :index |
+ self
+ runMatcher: matcher
+ with: (anArray at: index)
+ expect: (anArray at: index + 1)
+ withSubexpressions: (anArray at: index + 2) ] ] ]!
Item was added:
+ ----- Method: RxMatcherTest>>testCaseInsensitive (in category 'testing-protocol') -----
+ testCaseInsensitive
+ | matcher |
+ matcher := self matcherClass forString: 'the quick brown fox' ignoreCase: true.
+ self assert: (matcher search: 'the quick brown fox').
+ self assert: (matcher search: 'The quick brown FOX').
+ self assert: (matcher search: 'What do you know about the quick brown fox?').
+ self assert: (matcher search: 'What do you know about THE QUICK BROWN FOX?')!
Item was added:
+ ----- Method: RxMatcherTest>>testCaseSensitive (in category 'testing-protocol') -----
+ testCaseSensitive
+ | matcher |
+ matcher := self matcherClass forString: 'the quick brown fox' ignoreCase: false.
+ self assert: (matcher search: 'the quick brown fox').
+ self deny: (matcher search: 'The quick brown FOX').
+ self assert: (matcher search: 'What do you know about the quick brown fox?').
+ self deny: (matcher search: 'What do you know about THE QUICK BROWN FOX?')!
Item was added:
+ ----- Method: RxMatcherTest>>testCopyReplacingMatches (in category 'testing-protocol') -----
+ testCopyReplacingMatches
+ "See that the match context is preserved while copying stuff between matches:"
+
+ | matcher |
+ matcher := self matcherClass forString: '\<\d\D+'.
+ self assert: (matcher copy: '9aaa1bbb 8ccc' replacingMatchesWith: 'foo')
+ = 'foo1bbb foo'!
Item was added:
+ ----- Method: RxMatcherTest>>testCopyTranslatingMatches (in category 'testing-protocol') -----
+ testCopyTranslatingMatches
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher copy: 'now is the time ' translatingMatchesUsing: [ :each | each reversed ])
+ = 'won si eht emit '!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringAtBeginningOfLine (in category 'testing-empty') -----
+ testEmptyStringAtBeginningOfLine
+ | matcher |
+ matcher := self matcherClass forString: '^'.
+ self
+ assert: (matcher copy: 'foo1 bar1' , String cr , 'foo2 bar2' replacingMatchesWith: '*')
+ = ('*foo1 bar1' , String cr , '*foo2 bar2')
+ description: 'An empty string at the beginning of a line'!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringAtBeginningOfWord (in category 'testing-empty') -----
+ testEmptyStringAtBeginningOfWord
+ | matcher |
+ matcher := self matcherClass forString: '\<'.
+ self
+ assert: (matcher copy: 'foo bar' replacingMatchesWith: '*')
+ = '*foo *bar'
+ description: 'An empty string at the beginning of a word'!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringAtEndOfLine (in category 'testing-empty') -----
+ testEmptyStringAtEndOfLine
+ | matcher |
+ matcher := self matcherClass forString: '$'.
+ self
+ assert: (matcher copy: 'foo1 bar1' , String cr , 'foo2 bar2' replacingMatchesWith: '*')
+ = ('foo1 bar1*', String cr , 'foo2 bar2*')
+ description: 'An empty string at the end of a line'!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringAtEndOfWord (in category 'testing-empty') -----
+ testEmptyStringAtEndOfWord
+ | matcher |
+ matcher := self matcherClass forString: '\>'.
+ self
+ assert: (matcher copy: 'foo bar' replacingMatchesWith: '*')
+ = 'foo* bar*'
+ description: 'An empty string at the end of a word'!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringAtWordBoundary (in category 'testing-empty') -----
+ testEmptyStringAtWordBoundary
+ | matcher |
+ matcher := self matcherClass forString: '\b'.
+ self
+ assert: (matcher copy: 'foo bar' replacingMatchesWith: '*')
+ = '*foo* *bar*'
+ description: 'An empty string at a word boundary'!
Item was added:
+ ----- Method: RxMatcherTest>>testEmptyStringNotAtWordBoundary (in category 'testing-empty') -----
+ testEmptyStringNotAtWordBoundary
+ | matcher |
+ matcher := self matcherClass forString: '\B'.
+ self
+ assert: (matcher copy: 'foo bar' replacingMatchesWith: '*')
+ = 'f*o*o b*a*r'
+ description: 'An empty string not at a word boundary'!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry001 (in category 'testing-henry') -----
+ testHenry001
+ self runRegex: #('abc'
+ 'abc' true (1 'abc')
+ 'xbc' false nil
+ 'axc' false nil
+ 'abx' false nil
+ 'xabcy' true (1 'abc')
+ 'ababc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry002 (in category 'testing-henry') -----
+ testHenry002
+ self runRegex: #('ab*c'
+ 'abc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry003 (in category 'testing-henry') -----
+ testHenry003
+ self runRegex: #('ab*bc'
+ 'abc' true (1 'abc')
+ 'abbc' true (1 'abbc')
+ 'abbbbc' true (1 'abbbbc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry004 (in category 'testing-henry') -----
+ testHenry004
+ self runRegex: #('ab+bc'
+ 'abbc' true (1 'abbc')
+ 'abc' false nil
+ 'abq' false nil
+ 'abbbbc' true (1 'abbbbc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry005 (in category 'testing-henry') -----
+ testHenry005
+ self runRegex: #('ab?bc'
+ 'abbc' true (1 'abbc')
+ 'abc' true (1 'abc')
+ 'abbbbc' false nil
+ 'abc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry006 (in category 'testing-henry') -----
+ testHenry006
+ self runRegex: #('^abc$'
+ 'abc' true (1 'abc')
+ 'abcc' false nil
+ 'aabc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry007 (in category 'testing-henry') -----
+ testHenry007
+ self runRegex: #('^abc'
+ 'abcc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry008 (in category 'testing-henry') -----
+ testHenry008
+ self runRegex: #('abc$'
+ 'aabc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry009 (in category 'testing-henry') -----
+ testHenry009
+ self runRegex: #('^'
+ 'abc' true nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry010 (in category 'testing-henry') -----
+ testHenry010
+ self runRegex: #('$'
+ 'abc' true nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry011 (in category 'testing-henry') -----
+ testHenry011
+ self runRegex: #('a.c'
+ 'abc' true (1 'abc')
+ 'axc' true (1 'axc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry012 (in category 'testing-henry') -----
+ testHenry012
+ "Need to get creative to include the null character..."
+ self runRegex: #('a.*c'
+ 'axyzc' true (1 'axyzc')
+ 'axy zc' true (1 'axy zc') "testing that a dot matches a space"
+ ), (Array with: 'axy', (String with: 0 asCharacter), 'zc'), #(false nil "testing that a dot does not match a null"
+ 'axyzd' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry013 (in category 'testing-henry') -----
+ testHenry013
+ self runRegex: #('.a.*'
+ '1234abc' true (1 '4abc')
+ 'abcd' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry014 (in category 'testing-henry') -----
+ testHenry014
+ self runRegex: #('a\w+c'
+ ' abbbbc ' true (1 'abbbbc')
+ 'abb bc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry015 (in category 'testing-henry') -----
+ testHenry015
+ self runRegex: #('\w+'
+ ' foobar quux' true (1 'foobar')
+ ' ~!!@#$%^&*()-+=\|/?.>,<' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry016 (in category 'testing-henry') -----
+ testHenry016
+ self runRegex: #('a\W+c'
+ 'a c' true (1 'a c')
+ 'a bc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry017 (in category 'testing-henry') -----
+ testHenry017
+ self runRegex: #('\W+'
+ 'foo!!@#$bar' true (1 '!!@#$')
+ 'foobar' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry018 (in category 'testing-henry') -----
+ testHenry018
+ self runRegex: #('a\s*c'
+ 'a c' true (1 'a c')
+ 'a bc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry019 (in category 'testing-henry') -----
+ testHenry019
+ self runRegex: #('\s+'
+ 'abc3457 sd' true (1 ' ')
+ '1234$^*^&asdfb' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry020 (in category 'testing-henry') -----
+ testHenry020
+ self runRegex: #('a\S*c'
+ 'aqwertyc' true (1 'aqwertyc')
+ 'ab c' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry021 (in category 'testing-henry') -----
+ testHenry021
+ self runRegex: #('\S+'
+ ' asdf ' true (1 'asdf')
+ '
+ ' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry022 (in category 'testing-henry') -----
+ testHenry022
+ self runRegex: #('a\d+c'
+ 'a0123456789c' true (1 'a0123456789c')
+ 'a12b34c' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry023 (in category 'testing-henry') -----
+ testHenry023
+ self runRegex: #('\d+'
+ 'foo@#$%123ASD #$$%^&' true (1 '123')
+ 'foo!!@#$asdfl;' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry024 (in category 'testing-henry') -----
+ testHenry024
+ self runRegex: #('a\D+c'
+ 'aqwertyc' true (1 'aqwertyc')
+ 'aqw6ertc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry025 (in category 'testing-henry') -----
+ testHenry025
+ self runRegex: #('\D+'
+ '1234 abc 456' true (1 ' abc ')
+ '1234567890' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry026 (in category 'testing-henry') -----
+ testHenry026
+ self runRegex: #('(f|o)+\b'
+ 'foo' true (1 'foo')
+ ' foo ' true (1 'foo'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry027 (in category 'testing-henry') -----
+ testHenry027
+ self runRegex: #('\ba\w+' "a word beginning with an A"
+ 'land ancient' true (1 'ancient')
+ 'antique vase' true (1 'antique')
+ 'goofy foobar' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry028 (in category 'testing-henry') -----
+ testHenry028
+ self runRegex: #('(f|o)+\B'
+ 'quuxfoobar' true (1 'foo')
+ 'quuxfoo ' true (1 'fo'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry029 (in category 'testing-henry') -----
+ testHenry029
+ self runRegex: #('\Ba\w+' "a word with an A in the middle, match at A and further"
+ 'land ancient' true (1 'and')
+ 'antique vase' true (1 'ase')
+ 'smalltalk shall overcome' true (1 'alltalk')
+ 'foonix is better' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry030 (in category 'testing-henry') -----
+ testHenry030
+ self runRegex: #('fooa\>.*'
+ 'fooa ' true nil
+ 'fooa123' false nil
+ 'fooa bar' true nil
+ 'fooa' true nil
+ 'fooargh' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry031 (in category 'testing-henry') -----
+ testHenry031
+ self runRegex: #('\>.+abc'
+ ' abcde fg' false nil
+ 'foo abcde' true (1 ' abc')
+ 'abcde' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry032 (in category 'testing-henry') -----
+ testHenry032
+ self runRegex: #('\<foo.*'
+ 'foo' true nil
+ 'foobar' true nil
+ 'qfoobarq foonix' true (1 'foonix')
+ ' foo' true nil
+ ' 12foo' false nil
+ 'barfoo' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry033 (in category 'testing-henry') -----
+ testHenry033
+ self runRegex: #('.+\<foo'
+ 'foo' false nil
+ 'ab foo' true (1 'ab foo')
+ 'abfoo' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry034 (in category 'testing-henry') -----
+ testHenry034
+ self runRegex: #('a[bc]d'
+ 'abc' false nil
+ 'abd' true (1 'abd'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry035 (in category 'testing-henry') -----
+ testHenry035
+ self runRegex: #('a[b-d]e'
+ 'abd' false nil
+ 'ace' true (1 'ace'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry036 (in category 'testing-henry') -----
+ testHenry036
+ self runRegex: #('a[b-d]'
+ 'aac' true (1 'ac'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry037 (in category 'testing-henry') -----
+ testHenry037
+ self runRegex: #('a[-b]'
+ 'a-' true (1 'a-'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry038 (in category 'testing-henry') -----
+ testHenry038
+ self runRegex: #('a[b-]'
+ 'a-' true (1 'a-'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry039 (in category 'testing-henry') -----
+ testHenry039
+ self runRegex: #('a[a-b-c]' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry040 (in category 'testing-henry') -----
+ testHenry040
+ self runRegex: #('[k]'
+ 'ab' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry041 (in category 'testing-henry') -----
+ testHenry041
+ self runRegex: #('a[b-a]' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry042 (in category 'testing-henry') -----
+ testHenry042
+ self runRegex: #('a[]b' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry043 (in category 'testing-henry') -----
+ testHenry043
+ self runRegex: #('a[' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry044 (in category 'testing-henry') -----
+ testHenry044
+ self runRegex: #('a]'
+ 'a]' true (1 'a]'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry045 (in category 'testing-henry') -----
+ testHenry045
+ self runRegex: #('a[]]b'
+ 'a]b' true (1 'a]b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry046 (in category 'testing-henry') -----
+ testHenry046
+ self runRegex: #('a[^bc]d'
+ 'aed' true (1 'aed')
+ 'abd' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry047 (in category 'testing-henry') -----
+ testHenry047
+ self runRegex: #('a[^-b]c'
+ 'adc' true (1 'adc')
+ 'a-c' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry048 (in category 'testing-henry') -----
+ testHenry048
+ self runRegex: #('a[^]b]c'
+ 'a]c' false nil
+ 'adc' true (1 'adc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry049 (in category 'testing-henry') -----
+ testHenry049
+ self runRegex: #('[\de]+'
+ '01234' true (1 '01234')
+ '0123e456' true (1 '0123e456')
+ '0123e45g78' true (1 '0123e45'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry050 (in category 'testing-henry') -----
+ testHenry050
+ self runRegex: #('[e\d]+' "reversal of the above, should be the same"
+ '01234' true (1 '01234')
+ '0123e456' true (1 '0123e456')
+ '0123e45g78' true (1 '0123e45'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry051 (in category 'testing-henry') -----
+ testHenry051
+ self runRegex: #('[\D]+'
+ '123abc45def78' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry052 (in category 'testing-henry') -----
+ testHenry052
+ self runRegex: #('[[:digit:]e]+'
+ '01234' true (1 '01234')
+ '0123e456' true (1 '0123e456')
+ '0123e45g78' true (1 '0123e45'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry053 (in category 'testing-henry') -----
+ testHenry053
+ self runRegex: #('[\s]+'
+ '2 spaces' true (1 ' '))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry054 (in category 'testing-henry') -----
+ testHenry054
+ self runRegex: #('[\S]+'
+ ' word12!!@#$ ' true (1 'word12!!@#$'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry055 (in category 'testing-henry') -----
+ testHenry055
+ self runRegex: #('[\w]+'
+ ' foo123bar 45' true (1 'foo123bar'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry056 (in category 'testing-henry') -----
+ testHenry056
+ self runRegex: #('[\W]+'
+ 'fii234!!@#$34f' true (1 '!!@#$'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry057 (in category 'testing-henry') -----
+ testHenry057
+ self runRegex: #('[^[:alnum:]]+'
+ 'fii234!!@#$34f' true (1 '!!@#$'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry058 (in category 'testing-henry') -----
+ testHenry058
+ self runRegex: #('[%&[:alnum:]]+'
+ 'foo%3' true (1 'foo%3')
+ 'foo34&rt4$57a' true (1 'foo34&rt4')
+ '!!@#$' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry059 (in category 'testing-henry') -----
+ testHenry059
+ self runRegex: #('[[:alpha:]]+'
+ ' 123foo3 ' true (1 'foo')
+ '123foo' true (1 'foo')
+ 'foo1b' true (1 'foo'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry060 (in category 'testing-henry') -----
+ testHenry060
+ self runRegex: #('[[:cntrl:]]+'
+ ' a 1234asdf' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry061 (in category 'testing-henry') -----
+ testHenry061
+ self runRegex: #('[[:lower:]]+'
+ 'UPPERlower1234' true (1 'lower')
+ 'lowerUPPER' true (1 'lower'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry062 (in category 'testing-henry') -----
+ testHenry062
+ self runRegex: #('[[:upper:]]+'
+ 'UPPERlower1234' true (1 'UPPER')
+ 'lowerUPPER ' true (1 'UPPER'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry063 (in category 'testing-henry') -----
+ testHenry063
+ self runRegex: #('[[:space:]]+'
+ '2 spaces' true (1 ' '))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry064 (in category 'testing-henry') -----
+ testHenry064
+ self runRegex: #('[^[:space:]]+'
+ ' word12!!@#$ ' true (1 'word12!!@#$'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry065 (in category 'testing-henry') -----
+ testHenry065
+ self runRegex: #('[[:graph:]]+'
+ 'abc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry066 (in category 'testing-henry') -----
+ testHenry066
+ self runRegex: #('[[:print:]]+'
+ 'abc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry067 (in category 'testing-henry') -----
+ testHenry067
+ self runRegex: #('[^[:punct:]]+'
+ '!!hello,world!!' true (1 'hello'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry068 (in category 'testing-henry') -----
+ testHenry068
+ self runRegex: #('[[:xdigit:]]+'
+ ' x10FCD ' true (1 '10FCD')
+ ' hgfedcba0123456789ABCDEFGH '
+ true (1 'fedcba0123456789ABCDEF'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry069 (in category 'testing-henry') -----
+ testHenry069
+ self runRegex: #('ab|cd'
+ 'abc' true (1 'ab')
+ 'abcd' true (1 'ab'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry070 (in category 'testing-henry') -----
+ testHenry070
+ self runRegex: #('()ef'
+ 'def' true (1 'ef' 2 ''))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry071 (in category 'testing-henry') -----
+ testHenry071
+ self runRegex: #('()*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry072 (in category 'testing-henry') -----
+ testHenry072
+ self runRegex: #('*a' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry073 (in category 'testing-henry') -----
+ testHenry073
+ self runRegex: #('^*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry074 (in category 'testing-henry') -----
+ testHenry074
+ self runRegex: #('$*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry075 (in category 'testing-henry') -----
+ testHenry075
+ self runRegex: #('(*)b' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry076 (in category 'testing-henry') -----
+ testHenry076
+ self runRegex: #('$b' 'b' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry077 (in category 'testing-henry') -----
+ testHenry077
+ self runRegex: #('a\' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry078 (in category 'testing-henry') -----
+ testHenry078
+ self runRegex: #('a\(b'
+ 'a(b' true (1 'a(b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry079 (in category 'testing-henry') -----
+ testHenry079
+ self runRegex: #('a\(*b'
+ 'ab' true (1 'ab')
+ 'a((b' true (1 'a((b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry080 (in category 'testing-henry') -----
+ testHenry080
+ self runRegex: #('a\\b'
+ 'a\b' true (1 'a\b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry081 (in category 'testing-henry') -----
+ testHenry081
+ self runRegex: #('abc)' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry082 (in category 'testing-henry') -----
+ testHenry082
+ self runRegex: #('(abc' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry083 (in category 'testing-henry') -----
+ testHenry083
+ self runRegex: #('((a))'
+ 'abc' true (1 'a' 2 'a' 3 'a'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry084 (in category 'testing-henry') -----
+ testHenry084
+ self runRegex: #('(a)b(c)'
+ 'abc' true (1 'abc' 2 'a' 3 'c'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry085 (in category 'testing-henry') -----
+ testHenry085
+ self runRegex: #('a+b+c'
+ 'aabbabc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry086 (in category 'testing-henry') -----
+ testHenry086
+ self runRegex: #('a**' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry087 (in category 'testing-henry') -----
+ testHenry087
+ self runRegex: #('a*?' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry088 (in category 'testing-henry') -----
+ testHenry088
+ self runRegex: #('(a*)*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry089 (in category 'testing-henry') -----
+ testHenry089
+ self runRegex: #('(a*)+' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry090 (in category 'testing-henry') -----
+ testHenry090
+ self runRegex: #('(a|)*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry091 (in category 'testing-henry') -----
+ testHenry091
+ self runRegex: #('(a*|b)*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry092 (in category 'testing-henry') -----
+ testHenry092
+ self runRegex: #('(a+|b)*'
+ 'ab' true (1 'ab' 2 'b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry093 (in category 'testing-henry') -----
+ testHenry093
+ self runRegex: #('(a+|b)+'
+ 'ab' true (1 'ab' 2 'b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry094 (in category 'testing-henry') -----
+ testHenry094
+ self runRegex: #('(a+|b)?'
+ 'ab' true (1 'a' 2 'a'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry095 (in category 'testing-henry') -----
+ testHenry095
+ self runRegex: #('[^ab]*'
+ 'cde' true (1 'cde'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry096 (in category 'testing-henry') -----
+ testHenry096
+ self runRegex: #('(^)*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry097 (in category 'testing-henry') -----
+ testHenry097
+ self runRegex: #('(ab|)*' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry098 (in category 'testing-henry') -----
+ testHenry098
+ self runRegex: #(')(' nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry099 (in category 'testing-henry') -----
+ testHenry099
+ self runRegex: #('' 'abc' true (1 ''))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry100 (in category 'testing-henry') -----
+ testHenry100
+ self runRegex: #('abc' '' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry101 (in category 'testing-henry') -----
+ testHenry101
+ self runRegex: #('a*'
+ '' true '')!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry102 (in category 'testing-henry') -----
+ testHenry102
+ self runRegex: #('abcd'
+ 'abcd' true (1 'abcd'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry103 (in category 'testing-henry') -----
+ testHenry103
+ self runRegex: #('a(bc)d'
+ 'abcd' true (1 'abcd' 2 'bc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry104 (in category 'testing-henry') -----
+ testHenry104
+ self runRegex: #('([abc])*d'
+ 'abbbcd' true (1 'abbbcd' 2 'c'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry105 (in category 'testing-henry') -----
+ testHenry105
+ self runRegex: #('([abc])*bcd'
+ 'abcd' true (1 'abcd' 2 'a'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry106 (in category 'testing-henry') -----
+ testHenry106
+ self runRegex: #('a|b|c|d|e' 'e' true (1 'e'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry107 (in category 'testing-henry') -----
+ testHenry107
+ self runRegex: #('(a|b|c|d|e)f'
+ 'ef' true (1 'ef' 2 'e'))
+ " ((a*|b))* - c - -"!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry108 (in category 'testing-henry') -----
+ testHenry108
+ self runRegex: #('abcd*efg'
+ 'abcdefg' true (1 'abcdefg'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry109 (in category 'testing-henry') -----
+ testHenry109
+ self runRegex: #('ab*'
+ 'xabyabbbz' true (1 'ab')
+ 'xayabbbz' true (1 'a'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry110 (in category 'testing-henry') -----
+ testHenry110
+ self runRegex: #('(ab|cd)e' 'abcde' true (1 'cde' 2 'cd'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry111 (in category 'testing-henry') -----
+ testHenry111
+ self runRegex: #('[abhgefdc]ij' 'hij' true (1 'hij'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry112 (in category 'testing-henry') -----
+ testHenry112
+ self runRegex: #('^(ab|cd)e' 'abcde' false nil)
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry113 (in category 'testing-henry') -----
+ testHenry113
+ self runRegex: #('(abc|)def' 'abcdef' true nil)
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry114 (in category 'testing-henry') -----
+ testHenry114
+ self runRegex: #('(a|b)c*d' 'abcd' true (1 'bcd' 2 'b'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry115 (in category 'testing-henry') -----
+ testHenry115
+ self runRegex: #('(ab|ab*)bc' 'abc' true (1 'abc' 2 'a'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry116 (in category 'testing-henry') -----
+ testHenry116
+ self runRegex: #('a([bc]*)c*' 'abc' true (1 'abc' 2 'bc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry117 (in category 'testing-henry') -----
+ testHenry117
+ self runRegex: #('a([bc]*)(c*d)' 'abcd' true (1 'abcd' 2 'bc' 3 'd'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry118 (in category 'testing-henry') -----
+ testHenry118
+ self runRegex: #('a([bc]+)(c*d)' 'abcd' true (1 'abcd' 2 'bc' 3 'd'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry119 (in category 'testing-henry') -----
+ testHenry119
+ self runRegex: #('a([bc]*)(c+d)' 'abcd' true (1 'abcd' 2 'b' 3 'cd'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry120 (in category 'testing-henry') -----
+ testHenry120
+ self runRegex: #('a[bcd]*dcdcde' 'adcdcde' true (1 'adcdcde'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry121 (in category 'testing-henry') -----
+ testHenry121
+ self runRegex: #('a[bcd]+dcdcde' 'adcdcde' false nil)
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry122 (in category 'testing-henry') -----
+ testHenry122
+ self runRegex: #('(ab|a)b*c' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry123 (in category 'testing-henry') -----
+ testHenry123
+ self runRegex: #('((a)(b)c)(d)' 'abcd' true (1 'abcd' 3 'a' 4 'b' 5 'd'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry124 (in category 'testing-henry') -----
+ testHenry124
+ self runRegex: #('[ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry125 (in category 'testing-henry') -----
+ testHenry125
+ self runRegex: #('[ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry126 (in category 'testing-henry') -----
+ testHenry126
+ self runRegex: #('[ -~ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry127 (in category 'testing-henry') -----
+ testHenry127
+ self runRegex: #('[ -~ -~ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry128 (in category 'testing-henry') -----
+ testHenry128
+ self runRegex: #('[ -~ -~ -~ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry129 (in category 'testing-henry') -----
+ testHenry129
+ self runRegex: #('[ -~ -~ -~ -~ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry130 (in category 'testing-henry') -----
+ testHenry130
+ self runRegex: #('[ -~ -~ -~ -~ -~ -~ -~]*' 'abc' true (1 'abc'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry131 (in category 'testing-henry') -----
+ testHenry131
+ self runRegex: #('[a-zA-Z_][a-zA-Z0-9_]*' 'alpha' true (1 'alpha'))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry132 (in category 'testing-henry') -----
+ testHenry132
+ self runRegex: #('^a(bc+|b[eh])g|.h$' 'abh' true (1 'bh' 2 nil))
+ !
Item was added:
+ ----- Method: RxMatcherTest>>testHenry133 (in category 'testing-henry') -----
+ testHenry133
+ self runRegex: #('(bc+d$|ef*g.|h?i(j|k))'
+ 'effgz' true (1 'effgz' 2 'effgz' 3 nil)
+ 'ij' true (1 'ij' 2 'ij' 3 'j')
+ 'effg' false nil
+ 'bcdd' false nil
+ 'reffgz' true (1 'effgz' 2 'effgz' 3 nil))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry134 (in category 'testing-henry') -----
+ testHenry134
+ self runRegex: #('(((((((((a)))))))))' 'a' true (1 'a'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry135 (in category 'testing-henry') -----
+ testHenry135
+ self runRegex: #('multiple words of text'
+ 'uh-uh' false nil
+ 'multiple words of text, yeah' true (1 'multiple words of text'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry136 (in category 'testing-henry') -----
+ testHenry136
+ self runRegex: #('(.*)c(.*)' 'abcde' true (1 'abcde' 2 'ab' 3 'de'))!
Item was added:
+ ----- Method: RxMatcherTest>>testHenry137 (in category 'testing-henry') -----
+ testHenry137
+ self runRegex: #('\((.*), (.*)\)' '(a, b)' true (2 'a' 3 'b'))!
Item was added:
+ ----- Method: RxMatcherTest>>testMatches (in category 'testing-protocol') -----
+ testMatches
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher matches: 'now').
+ self deny: (matcher matches: 'now is')!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesIn (in category 'testing-protocol') -----
+ testMatchesIn
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher matchesIn: 'now is the time') asArray
+ = #('now' 'is' 'the' 'time')!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesInCollect (in category 'testing-protocol') -----
+ testMatchesInCollect
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher
+ matchesIn: 'now is the time'
+ collect: [ :each | each reversed ]) asArray
+ = #('won' 'si' 'eht' 'emit')!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesInDo (in category 'testing-protocol') -----
+ testMatchesInDo
+ | matcher expected |
+ matcher := self matcherClass forString: '\w+'.
+ expected := #('now' 'is' 'the' 'time') asOrderedCollection.
+ matcher
+ matchesIn: 'now is the time'
+ do: [ :each | self assert: each = expected removeFirst ].
+ self assert: expected isEmpty!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesOnStream (in category 'testing-protocol') -----
+ testMatchesOnStream
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher matchesOnStream: 'now is the time' readStream) asArray
+ = #('now' 'is' 'the' 'time')!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesOnStreamCollect (in category 'testing-protocol') -----
+ testMatchesOnStreamCollect
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher
+ matchesOnStream: 'now is the time' readStream
+ collect: [ :each | each reversed ]) asArray
+ = #('won' 'si' 'eht' 'emit')!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesOnStreamDo (in category 'testing-protocol') -----
+ testMatchesOnStreamDo
+ | matcher expected |
+ matcher := self matcherClass forString: '\w+'.
+ expected := #('now' 'is' 'the' 'time') asOrderedCollection.
+ matcher
+ matchesOnStream: 'now is the time' readStream
+ do: [ :each | self assert: each = expected removeFirst ].
+ self assert: expected isEmpty!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchesStream (in category 'testing-protocol') -----
+ testMatchesStream
+ | matcher |
+ matcher := self matcherClass forString: '\w+'.
+ self assert: (matcher matchesStream: 'now' readStream).
+ self deny: (matcher matchesStream: 'now is' readStream)!
Item was added:
+ ----- Method: RxMatcherTest>>testMatchingRangesIn (in category 'testing-protocol') -----
+ testMatchingRangesIn
+ | matcher expected |
+ matcher := self matcherClass forString: '\w+'.
+ expected := #(1 3 5 6 8 10 12 15) asOrderedCollection.
+ (matcher matchingRangesIn: 'now is the time') do: [ :range |
+ self assert: range first = expected removeFirst.
+ self assert: range last = expected removeFirst ].
+ self assert: expected isEmpty!
Item was added:
+ ----- Method: RxMatcherTest>>testRegex001 (in category 'testing') -----
+ testRegex001
+ self runRegex: #('^.*$'
+ '' true (1 '')
+ 'a' true (1 'a')
+ 'abc' true (1 'abc'))!
Item was added:
+ ----- Method: RxMatcherTest>>testRegex002 (in category 'testing') -----
+ testRegex002
+ self runRegex: #('a\w+c'
+ ' abb_bbc ' true (1 'abb_bbc')
+ 'abb-bc' false nil)!
Item was added:
+ ----- Method: RxMatcherTest>>testRegex003 (in category 'testing') -----
+ testRegex003
+ self runRegex: #('a\W+c'
+ ' abb_bbc ' false nil
+ 'abb-bc' false nil
+ 'a.,:;-&!!"#%/()={[]}+?\~*''c' true (1 'a.,:;-&!!"#%/()={[]}+?\~*''c'))!
Item was added:
+ ----- Method: RxMatcherTest>>testSubexpressionCount (in category 'testing-protocol') -----
+ testSubexpressionCount
+ | matcher |
+ #(('a' 1) ('a(b)' 2) ('a(b(c))' 3) ('(a)(b)' 3) ('(a(b))*' 3)) do: [ :pair |
+ matcher := self matcherClass forString: pair first.
+ matcher supportsSubexpressions
+ ifTrue: [ self assert: matcher subexpressionCount = pair last ] ]!
Item was changed:
----- Method: RxParser class>>a:introduction: (in category 'DOCUMENTATION') -----
+ a: x introduction: xx
- a:x introduction:xx
"
A regular expression is a template specifying a class of strings. A
regular expression matcher is an tool that determines whether a string
belongs to a class specified by a regular expression. This is a
common task of a user input validation code, and the use of regular
expressions can GREATLY simplify and speed up development of such
code. As an example, here is how to verify that a string is a valid
hexadecimal number in Smalltalk notation, using this matcher package:
+
aString matchesRegex: '16r[[:xdigit:]]+'
+
(Coding the same ``the hard way'' is an exercise to a curious reader).
+
This matcher is offered to the Smalltalk community in hope it will be
useful. It is free in terms of money, and to a large extent--in terms
of rights of use. Refer to `Boring Stuff' section for legalese.
+
The 'What's new in this release' section describes the functionality
introduced in 1.1 release.
+
The `Syntax' section explains the recognized syntax of regular
expressions.
+
The `Usage' section explains matcher capabilities that go beyond what
String>>matchesRegex: method offers.
+
The `Implementation notes' sections says a few words about what is
under the hood.
+
Happy hacking,
+
--Vassili Bykov
<vassili at objectpeople.com> <vassili at magma.ca>
+
August 6, 1996
April 4, 1999
"
+
self error: 'comment only'!
Item was changed:
----- Method: RxParser class>>b:whatsNewInThisRelease: (in category 'DOCUMENTATION') -----
+ b: x whatsNewInThisRelease: xx
- b:x whatsNewInThisRelease: xx
"
+ VERSION 1.3.1 (September 2008)
+ 1. Updated documentation of character classes, making clear the problems of locale - an area for future improvement
+
+ VERSION 1.3 (September 2008)
+ 1. \w now matches underscore as well as alphanumerics, in line with most other regex libraries (and our documentation!!).
+ 2. \W rejects underscore as well as alphanumerics
+ 3. added tests for this at end of testSuite
+ 4. updated documentation and added note to old incorrect comments in version 1.1 below
+
+ VERSION 1.2.3 (November 2007)
+
+ 1. Regexs with ^ or $ applied to copy empty strings caused infinite loops, e.g. ('' copyWithRegex: '^.*$' matchesReplacedWith: 'foo'). Applied a similar correction to that from version 1.1c, to #copyStream:to:(replacingMatchesWith:|translatingMatchesUsing:).
+ 2. Extended RxParser testing to run each test for #copy:translatingMatchesUsing: as well as #search:.
+ 3. Corrected #testSuite test that a dot does not match a null, which was passing by luck with Smalltalk code in a literal array.
+ 4. Added test to end of test suite for fix 1 above.
+
+ VERSION 1.2.2 (November 2006)
+
+ There was no way to specify a backslash in a character set. Now [\\] is accepted.
+
+ VERSION 1.2.1 (August 2006)
+
+ 1. Support for returning all ranges (startIndex to: stopIndex) matching a regex - #allRangesOfRegexMatches:, #matchingRangesIn:
+ 2. Added hint to usage documentation on how to get more information about matches when enumerating
+ 3. Syntax description of dot corrected: matches anything but NUL since 1.1a
+
+ VERSION 1.2 (May 2006)
+
+ Fixed case-insensitive search for character sets.
+
+ VERSION 1.1c (December 2004)
+
+ Fixed the issue with #matchesOnStream:do: which caused infinite loops for matches
+ that matched empty strings.
+
+ VERSION 1.1b (November 2001)
+
+ Changes valueNowOrOnUnwindDo: to ensure:, plus incorporates some earlier fixes.
+
+ VERSION 1.1a (May 2001)
+
+ 1. Support for keeping track of multiple subexpressions.
+ 2. Dot (.) matches anything but NUL character, as it should per POSIX spec.
+ 3. Some bug fixes.
+
VERSION 1.1 (October 1999)
+
Regular expression syntax corrections and enhancements:
+
1. Backslash escapes similar to those in Perl are allowed in patterns:
+
+ \w any word constituent character (equivalent to [a-zA-Z0-9_]) *** underscore only since 1.3 ***
+ \W any character but a word constituent (equivalent to [^a-xA-Z0-9_] *** underscore only since 1.3 ***
- \w any word constituent character (equivalent to [a-zA-Z0-9_])
- \W any character but a word constituent (equivalent to [^a-xA-Z0-9_]
\d a digit (same as [0-9])
\D anything but a digit
\s a whitespace character
\S anything but a whitespace character
\b an empty string at a word boundary
\B an empty string not at a word boundary
\< an empty string at the beginning of a word
\> an empty string at the end of a word
+
For example, '\w+' is now a valid expression matching any word.
+
2. The following backslash escapes are also allowed in character sets
(between square brackets):
+
\w, \W, \d, \D, \s, and \S.
+
3. The following grep(1)-compatible named character classes are
recognized in character sets as well:
+
[:alnum:]
[:alpha:]
[:cntrl:]
[:digit:]
[:graph:]
[:lower:]
[:print:]
[:punct:]
[:space:]
[:upper:]
[:xdigit:]
+
For example, the following patterns are equivalent:
+
+ '[[:alnum:]_]+' '\w+' '[\w]+' '[a-zA-Z0-9_]+' *** underscore only since 1.3 ***
+
- '[[:alnum:]]+' '\w+' '[\w]+' '[a-zA-Z0-9_]+'
4. Some non-printable characters can be represented in regular
expressions using a common backslash notation:
+
\t tab (Character tab)
\n newline (Character lf)
\r carriage return (Character cr)
\f form feed (Character newPage)
\e escape (Character esc)
+
5. A dot is corectly interpreted as 'any character but a newline'
instead of 'anything but whitespace'.
+
6. Case-insensitive matching. The easiest access to it are new
messages CharacterArray understands: #asRegexIgnoringCase,
#matchesRegexIgnoringCase:, #prefixMatchesRegexIgnoringCase:.
+
7. The matcher (an instance of RxMatcher, the result of
String>>asRegex) now provides a collection-like interface to matches
in a particular string or on a particular stream, as well as
substitution protocol. The interface includes the following messages:
+
matchesIn: aString
matchesIn: aString collect: aBlock
matchesIn: aString do: aBlock
+
matchesOnStream: aStream
matchesOnStream: aStream collect: aBlock
matchesOnStream: aStream do: aBlock
+
copy: aString translatingMatchesUsing: aBlock
copy: aString replacingMatchesWith: replacementString
+
copyStream: aStream to: writeStream translatingMatchesUsing: aBlock
copyStream: aStream to: writeStream replacingMatchesWith: aString
+
Examples:
+
'\w+' asRegex matchesIn: 'now is the time'
+
returns an OrderedCollection containing four strings: 'now', 'is',
'the', and 'time'.
+
'\<t\w+' asRegexIgnoringCase
copy: 'now is the Time'
translatingMatchesUsing: [:match | match asUppercase]
+
returns 'now is THE TIME' (the regular expression matches words
beginning with either an uppercase or a lowercase T).
+
ACKNOWLEDGEMENTS
+
Since the first release of the matcher, thanks to the input from
several fellow Smalltalkers, I became convinced a native Smalltalk
regular expression matcher was worth the effort to keep it alive. For
+ the contributions, suggestions, and bug reports that made this release
+ possible, I want to thank:
+
- the advice and encouragement that made this release possible, I want
- to thank:
Felix Hack
+ Peter Hatch
+ Alan Knight
Eliot Miranda
+ Thomas Muhr
Robb Shecter
David N. Smith
Francis Wolinski
+
and anyone whom I haven't yet met or heard from, but who agrees this
has not been a complete waste of time.
+
--Vassili Bykov
October 3, 1999
"
+
self error: 'comment only'!
Item was changed:
----- Method: RxParser class>>c:syntax: (in category 'DOCUMENTATION') -----
+ c: x syntax: xx
- c:x syntax:xx
"
+
[You can select and `print it' examples in this method. Just don't
forget to cancel the changes.]
+
The simplest regular expression is a single character. It matches
exactly that character. A sequence of characters matches a string with
exactly the same sequence of characters:
+
'a' matchesRegex: 'a' -- true
'foobar' matchesRegex: 'foobar' -- true
'blorple' matchesRegex: 'foobar' -- false
+
The above paragraph introduced a primitive regular expression (a
character), and an operator (sequencing). Operators are applied to
regular expressions to produce more complex regular expressions.
Sequencing (placing expressions one after another) as an operator is,
in a certain sense, `invisible'--yet it is arguably the most common.
+
A more `visible' operator is Kleene closure, more often simply
referred to as `a star'. A regular expression followed by an asterisk
matches any number (including 0) of matches of the original
expression. For example:
+
'ab' matchesRegex: 'a*b' -- true
'aaaaab' matchesRegex: 'a*b' -- true
'b' matchesRegex: 'a*b' -- true
'aac' matchesRegex: 'a*b' -- false: b does not match
+
A star's precedence is higher than that of sequencing. A star applies
to the shortest possible subexpression that precedes it. For example,
'ab*' means `a followed by zero or more occurrences of b', not `zero
or more occurrences of ab':
+
'abbb' matchesRegex: 'ab*' -- true
'abab' matchesRegex: 'ab*' -- false
+
To actually make a regex matching `zero or more occurrences of ab',
`ab' is enclosed in parentheses:
+
'abab' matchesRegex: '(ab)*' -- true
'abcab' matchesRegex: '(ab)*' -- false: c spoils the fun
+
Two other operators similar to `*' are `+' and `?'. `+' (positive
closure, or simply `plus') matches one or more occurrences of the
original expression. `?' (`optional') matches zero or one, but never
more, occurrences.
+
'ac' matchesRegex: 'ab*c' -- true
'ac' matchesRegex: 'ab+c' -- false: need at least one b
'abbc' matchesRegex: 'ab+c' -- true
'abbc' matchesRegex: 'ab?c' -- false: too many b's
+
As we have seen, characters `*', `+', `?', `(', and `)' have special
meaning in regular expressions. If one of them is to be used
literally, it should be quoted: preceded with a backslash. (Thus,
backslash is also special character, and needs to be quoted for a
literal match--as well as any other special character described
further).
+
'ab*' matchesRegex: 'ab*' -- false: star in the right string is special
'ab*' matchesRegex: 'ab\*' -- true
'a\c' matchesRegex: 'a\\c' -- true
+
The last operator is `|' meaning `or'. It is placed between two
regular expressions, and the resulting expression matches if one of
the expressions matches. It has the lowest possible precedence (lower
than sequencing). For example, `ab*|ba*' means `a followed by any
number of b's, or b followed by any number of a's':
+
'abb' matchesRegex: 'ab*|ba*' -- true
'baa' matchesRegex: 'ab*|ba*' -- true
'baab' matchesRegex: 'ab*|ba*' -- false
+
A bit more complex example is the following expression, matching the
name of any of the Lisp-style `car', `cdr', `caar', `cadr',
... functions:
+
c(a|d)+r
+
It is possible to write an expression matching an empty string, for
example: `a|'. However, it is an error to apply `*', `+', or `?' to
such expression: `(a|)*' is an invalid expression.
+
So far, we have used only characters as the 'smallest' components of
regular expressions. There are other, more `interesting', components.
+
A character set is a string of characters enclosed in square
brackets. It matches any single character if it appears between the
brackets. For example, `[01]' matches either `0' or `1':
+
'0' matchesRegex: '[01]' -- true
'3' matchesRegex: '[01]' -- false
'11' matchesRegex: '[01]' -- false: a set matches only one character
+
Using plus operator, we can build the following binary number
recognizer:
+
'10010100' matchesRegex: '[01]+' -- true
'10001210' matchesRegex: '[01]+' -- false
+
If the first character after the opening bracket is `^', the set is
inverted: it matches any single character *not* appearing between the
brackets:
+
'0' matchesRegex: '[^01]' -- false
'3' matchesRegex: '[^01]' -- true
+
For convenience, a set may include ranges: pairs of characters
separated with `-'. This is equivalent to listing all characters
between them: `[0-9]' is the same as `[0123456789]'.
+
Special characters within a set are `^', `-', and `]' that closes the
set. Below are the examples of how to literally use them in a set:
+
[01^] -- put the caret anywhere except the beginning
[01-] -- put the dash as the last character
[]01] -- put the closing bracket as the first character
[^]01] (thus, empty and universal sets cannot be specified)
+
Regular expressions can also include the following backquote escapes
to refer to popular classes of characters:
+
\w any word constituent character (same as [a-zA-Z0-9_])
\W any character but a word constituent
\d a digit (same as [0-9])
\D anything but a digit
+ \s a whitespace character (same as [:space:] below)
- \s a whitespace character
\S anything but a whitespace character
+
These escapes are also allowed in character classes: '[\w+-]' means
'any character that is either a word constituent, or a plus, or a
minus'.
+
Character classes can also include the following grep(1)-compatible
elements to refer to:
+
+ [:alnum:] any alphanumeric character (same as [a-zA-Z0-9])
+ [:alpha:] any alphabetic character (same as [a-zA-Z])
+ [:cntrl:] any control character. (any character with code < 32)
+ [:digit:] any decimal digit (same as [0-9])
+ [:graph:] any graphical character. (any character with code >= 32).
+ [:lower:] any lowercase character (including non-ASCII lowercase characters)
+ [:print:] any printable character. In this version, this is the same as [:graph:]
+ [:punct:] any punctuation character: . , !! ? ; : ' - ( ) ` and double quotes
+ [:space:] any whitespace character (space, tab, CR, LF, null, form feed, Ctrl-Z, 16r2000-16r200B, 16r3000)
+ [:upper:] any uppercase character (including non-ASCII uppercase characters)
+ [:xdigit:] any hexadecimal character (same as [a-fA-F0-9]).
+
+ Note that many of these are only as consistent or inconsistent on issues
+ of locale as the underlying Smalltalk implementation. Values shown here
+ are for VisualWorks 7.6.
+
- [:alnum:] any alphanumeric, i.e., a word constituent, character
- [:alpha:] any alphabetic character
- [:cntrl:] any control character. In this version, it means any character which code is < 32.
- [:digit:] any decimal digit.
- [:graph:] any graphical character. In this version, this mean any character with the code >= 32.
- [:lower:] any lowercase character
- [:print:] any printable character. In this version, this is the same as [:cntrl:]
- [:punct:] any punctuation character.
- [:space:] any whitespace character.
- [:upper:] any uppercase character.
- [:xdigit:] any hexadecimal character.
Note that these elements are components of the character classes,
i.e. they have to be enclosed in an extra set of square brackets to
form a valid regular expression. For example, a non-empty string of
digits would be represented as '[[:digit:]]+'.
+
The above primitive expressions and operators are common to many
implementations of regular expressions. The next primitive expression
is unique to this Smalltalk implementation.
+
A sequence of characters between colons is treated as a unary selector
which is supposed to be understood by Characters. A character matches
such an expression if it answers true to a message with that
selector. This allows a more readable and efficient way of specifying
character classes. For example, `[0-9]' is equivalent to `:isDigit:',
but the latter is more efficient. Analogously to character sets,
character classes can be negated: `:^isDigit:' matches a Character
that answers false to #isDigit, and is therefore equivalent to
`[^0-9]'.
+
As an example, so far we have seen the following equivalent ways to
write a regular expression that matches a non-empty string of digits:
+
'[0-9]+'
'\d+'
'[\d]+'
+ '[[:digit:]]+'
- '[[:digit::]+'
:isDigit:+'
+
The last group of special primitive expressions includes:
+
+ . matching any character except a NULL;
- . matching any character except a newline;
^ matching an empty string at the beginning of a line;
$ matching an empty string at the end of a line.
\b an empty string at a word boundary
\B an empty string not at a word boundary
\< an empty string at the beginning of a word
\> an empty string at the end of a word
+
'axyzb' matchesRegex: 'a.+b' -- true
+ 'ax zb' matchesRegex: 'a.+b' -- true (space is matched by `.')
+ 'ax
+ zb' matchesRegex: 'a.+b' -- true (carriage return is matched by `.')
+
+ Again, the dot ., caret ^ and dollar $ characters are special and should be quoted
- 'ax zb' matchesRegex: 'a.+b' -- false (space is not matched by `.')
- Again, all the above three characters are special and should be quoted
to be matched literally.
+
EXAMPLES
+
As the introductions said, a great use for regular expressions is user
input validation. Following are a few examples of regular expressions
that might be handy in checking input entered by the user in an input
field. Try them out by entering something between the quotes and
print-iting. (Also, try to imagine Smalltalk code that each validation
would require if coded by hand). Most example expressions could have
been written in alternative ways.
+
Checking if aString may represent a nonnegative integer number:
+
'' matchesRegex: ':isDigit:+'
or
'' matchesRegex: '[0-9]+'
or
'' matchesRegex: '\d+'
+
Checking if aString may represent an integer number with an optional
sign in front:
+
'' matchesRegex: '(\+|-)?\d+'
+
Checking if aString is a fixed-point number, with at least one digit
is required after a dot:
+
'' matchesRegex: '(\+|-)?\d+(\.\d+)?'
+
The same, but allow notation like `123.':
+
'' matchesRegex: '(\+|-)?\d+(\.\d*)?'
+
Recognizer for a string that might be a name: one word with first
capital letter, no blanks, no digits. More traditional:
+
'' matchesRegex: '[A-Z][A-Za-z]*'
+
more Smalltalkish:
+
'' matchesRegex: ':isUppercase::isAlphabetic:*'
+
A date in format MMM DD, YYYY with any number of spaces in between, in
XX century:
+
'' matchesRegex: '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[ ]+(\d\d?)[ ]*,[ ]*19(\d\d)'
+
Note parentheses around some components of the expression above. As
`Usage' section shows, they will allow us to obtain the actual strings
that have matched them (i.e. month name, day number, and year number).
+
For dessert, coming back to numbers: here is a recognizer for a
general number format: anything like 999, or 999.999, or -999.999e+21.
+
'' matchesRegex: '(\+|-)?\d+(\.\d*)?((e|E)(\+|-)?\d+)?'
+
"
+
self error: 'comment only'!
Item was changed:
----- Method: RxParser class>>d:usage: (in category 'DOCUMENTATION') -----
+ d: x usage: xx
- d:x usage:xx
"
The preceding section covered the syntax of regular expressions. It
used the simplest possible interface to the matcher: sending
#matchesRegex: message to the sample string, with regular expression
string as the argument. This section explains hairier ways of using
the matcher.
+
PREFIX MATCHING AND CASE-INSENSITIVE MATCHING
+
A CharacterArray (an EsString in VA) also understands these messages:
+
#prefixMatchesRegex: regexString
#matchesRegexIgnoringCase: regexString
#prefixMatchesRegexIgnoringCase: regexString
+
#prefixMatchesRegex: is just like #matchesRegex, except that the whole
receiver is not expected to match the regular expression passed as the
argument; matching just a prefix of it is enough. For example:
+
'abcde' matchesRegex: '(a|b)+' -- false
'abcde' prefixMatchesRegex: '(a|b)+' -- true
+
The last two messages are case-insensitive versions of matching.
+
ENUMERATION INTERFACE
+
An application can be interested in all matches of a certain regular
expression within a String. The matches are accessible using a
protocol modelled after the familiar Collection-like enumeration
protocol:
+
#regex: regexString matchesDo: aBlock
+
Evaluates a one-argument <aBlock> for every match of the regular
expression within the receiver string.
+
#regex: regexString matchesCollect: aBlock
+
Evaluates a one-argument <aBlock> for every match of the regular
expression within the receiver string. Collects results of evaluations
and anwers them as a SequenceableCollection.
+
#allRegexMatches: regexString
+
Returns a collection of all matches (substrings of the receiver
string) of the regular expression. It is an equivalent of <aString
regex: regexString matchesCollect: [:each | each]>.
+
+ #allRangesOfRegexMatches: regexString
+
+ Returns a collection of all character ranges (startIndex to: stopIndex)
+ that match the regular expression.
+
REPLACEMENT AND TRANSLATION
+
It is possible to replace all matches of a regular expression with a
certain string using the message:
+
#copyWithRegex: regexString matchesReplacedWith: aString
+
For example:
+
'ab cd ab' copyWithRegex: '(a|b)+' matchesReplacedWith: 'foo'
+
A more general substitution is match translation:
+
#copyWithRegex: regexString matchesTranslatedUsing: aBlock
+
This message evaluates a block passing it each match of the regular
expression in the receiver string and answers a copy of the receiver
with the block results spliced into it in place of the respective
matches. For example:
+
'ab cd ab' copyWithRegex: '(a|b)+' matchesTranslatedUsing: [:each | each asUppercase]
+
All messages of enumeration and replacement protocols perform a
case-sensitive match. Case-insensitive versions are not provided as
part of a CharacterArray protocol. Instead, they are accessible using
the lower-level matching interface.
+
LOWER-LEVEL INTERFACE
+
Internally, #matchesRegex: works as follows:
+
1. A fresh instance of RxParser is created, and the regular expression
string is passed to it, yielding the expression's syntax tree.
+
2. The syntax tree is passed as an initialization parameter to an
instance of RxMatcher. The instance sets up some data structure that
will work as a recognizer for the regular expression described by the
tree.
+
3. The original string is passed to the matcher, and the matcher
checks for a match.
+
THE MATCHER
+
If you repeatedly match a number of strings against the same regular
expression using one of the messages defined in CharacterArray, the
regular expression string is parsed and a matcher is created anew for
every match. You can avoid this overhead by building a matcher for
the regular expression, and then reusing the matcher over and over
again. You can, for example, create a matcher at a class or instance
initialization stage, and store it in a variable for future use.
+
You can create a matcher using one of the following methods:
+
- Sending #forString:ignoreCase: message to RxMatcher class, with
the regular expression string and a Boolean indicating whether case is
ignored as arguments.
+
- Sending #forString: message. It is equivalent to <... forString:
regexString ignoreCase: false>.
+
A more convenient way is using one of the two matcher-created messages
understood by CharacterArray.
+
- <regexString asRegex> is equivalent to <RxMatcher forString:
regexString>.
+
- <regexString asRegexIgnoringCase> is equivalent to <RxMatcher
forString: regexString ignoreCase: true>.
+
Here are four examples of creating a matcher:
+
hexRecognizer := RxMatcher forString: '16r[0-9A-Fa-f]+'
hexRecognizer := RxMatcher forString: '16r[0-9A-Fa-f]+' ignoreCase: false
hexRecognizer := '16r[0-9A-Fa-f]+' asRegex
hexRecognizer := '16r[0-9A-F]+' asRegexIgnoringCase
+
MATCHING
+
The matcher understands these messages (all of them return true to
indicate successful match or search, and false otherwise):
+
matches: aString
+
True if the whole target string (aString) matches.
+
matchesPrefix: aString
+
True if some prefix of the string (not necessarily the whole
string) matches.
+
search: aString
+
Search the string for the first occurrence of a matching
substring. (Note that the first two methods only try matching from
the very beginning of the string). Using the above example with a
matcher for `a+', this method would answer success given a string
`baaa', while the previous two would fail.
+
matchesStream: aStream
matchesStreamPrefix: aStream
searchStream: aStream
+
Respective analogs of the first three methods, taking input from a
stream instead of a string. The stream must be positionable and
peekable.
+
All these methods answer a boolean indicating success. The matcher
also stores the outcome of the last match attempt and can report it:
+
lastResult
+
Answers a Boolean -- the outcome of the most recent match
attempt. If no matches were attempted, the answer is unspecified.
+
SUBEXPRESSION MATCHES
+
After a successful match attempt, you can query the specifics of which
part of the original string has matched which part of the whole
expression.
+
A subexpression is a parenthesized part of a regular expression, or
the whole expression. When a regular expression is compiled, its
subexpressions are assigned indices starting from 1, depth-first,
left-to-right. For example, `((ab)+(c|d))?ef' includes the following
subexpressions with these indices:
+
1: ((ab)+(c|d))?ef
2: (ab)+(c|d)
3: ab
4: c|d
+
After a successful match, the matcher can report what part of the
original string matched what subexpression. It understandards these
messages:
+
subexpressionCount
+
Answers the total number of subexpressions: the highest value that
can be used as a subexpression index with this matcher. This value
is available immediately after initialization and never changes.
+
subexpression: anIndex
+
An index must be a valid subexpression index, and this message
must be sent only after a successful match attempt. The method
answers a substring of the original string the corresponding
subexpression has matched to.
+
subBeginning: anIndex
subEnd: anIndex
+
Answer positions within the original string or stream where the
match of a subexpression with the given index has started and
ended, respectively.
+
This facility provides a convenient way of extracting parts of input
strings of complex format. For example, the following piece of code
uses the 'MMM DD, YYYY' date format recognizer example from the
`Syntax' section to convert a date to a three-element array with year,
month, and day strings (you can select and evaluate it right here):
+
| matcher |
+ matcher := RxMatcher forString: '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[ ]+(:isDigit::isDigit:?)[ ]*,[ ]*(19|20)(:isDigit::isDigit:)'.
- matcher := RxMatcher forString: '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[ ]+(:isDigit::isDigit:?)[ ]*,[ ]*19(:isDigit::isDigit:)'.
(matcher matches: 'Aug 6, 1996')
ifTrue:
[Array
+ with: (matcher subexpression: 5)
- with: (matcher subexpression: 4)
with: (matcher subexpression: 2)
with: (matcher subexpression: 3)]
ifFalse: ['no match']
+
(should answer ` #('96' 'Aug' '6')').
+
ENUMERATION AND REPLACEMENT
+
The enumeration and replacement protocols exposed in CharacterArray
+ are actually implemented by the matcher. The following messages are
- are actually implemented by the mather. The following messages are
understood:
+
#matchesIn: aString
#matchesIn: aString do: aBlock
#matchesIn: aString collect: aBlock
#copy: aString replacingMatchesWith: replacementString
#copy: aString translatingMatchesUsing: aBlock
+ #matchingRangesIn: aString
+
#matchesOnStream: aStream
#matchesOnStream: aStream do: aBlock
#matchesOnStream: aStream collect: aBlock
#copy: sourceStream to: targetStream replacingMatchesWith: replacementString
#copy: sourceStream to: targetStream translatingMatchesWith: aBlock
+
+ Note that in those methods that take a block, the block may refer to the rxMatcher itself,
+ e.g. to collect information about the position the match occurred at, or the
+ subexpressions of the match. An example can be seen in #matchingRangesIn:
+
ERROR HANDLING
+
Exception signaling objects (Signals in VisualWorks, Exceptions in VisualAge) are
accessible through RxParser class protocol. To handle possible errors, use
the protocol described below to obtain the exception objects and use the
protocol of the native Smalltalk implementation to handle them.
+
If a syntax error is detected while parsing expression,
RxParser>>syntaxErrorSignal is raised/signaled.
+
If an error is detected while building a matcher,
RxParser>>compilationErrorSignal is raised/signaled.
+
If an error is detected while matching (for example, if a bad selector
was specified using `:<selector>:' syntax, or because of the matcher's
internal error), RxParser>>matchErrorSignal is raised
+
RxParser>>regexErrorSignal is the parent of all three. Since any of
the three signals can be raised within a call to #matchesRegex:, it is
handy if you want to catch them all. For example:
+
VisualWorks:
+
RxParser regexErrorSignal
handle: [:ex | ex returnWith: nil]
do: ['abc' matchesRegex: '))garbage[']
+
VisualAge:
+
['abc' matchesRegex: '))garbage[']
when: RxParser regexErrorSignal
do: [:signal | signal exitWith: nil]
+
"
+
self error: 'comment only'!
Item was changed:
----- Method: RxParser class>>doHandlingMessageNotUnderstood: (in category 'exception signaling') -----
doHandlingMessageNotUnderstood: aBlock
"MNU should be trapped and resignaled as a match error in a few places in the matcher.
This method factors out this dialect-dependent code to make porting easier."
^ aBlock
on: MessageNotUnderstood
+ do: [:ex | RxParser signalMatchException: 'invalid predicate selector']!
- do: [:ex | RxMatcher signalMatchException: 'invalid predicate selector']!
Item was changed:
----- Method: RxParser class>>e:implementationNotes: (in category 'DOCUMENTATION') -----
+ e: x implementationNotes: xx
- e:x implementationNotes:xx
"
Version: 1.1
Released: October 1999
+ Mail to: Vassili Bykov <vassili at parcplace.com>, <v_bykov at yahoo.com>
- Mail to: Vassili Bykov <vassili at magma.ca>, <vassili at objectpeople.com>
Flames to: /dev/null
+
WHAT IS ADDED
+
The matcher includes classes in two categories:
VB-Regex-Syntax
VB-Regex-Matcher
and a few CharacterArray methods in `VB-regex' protocol. No system
classes or methods are modified.
+
WHAT TO LOOK AT FIRST
+
String>>matchesRegex: -- in 90% cases this method is all you need to
access the package.
+
RxParser -- accepts a string or a stream of characters with a regular
expression, and produces a syntax tree corresponding to the
expression. The tree is made of instances of Rxs<whatever> classes.
+
RxMatcher -- accepts a syntax tree of a regular expression built by
the parser and compiles it into a matcher: a structure made of
instances of Rxm<whatever> classes. The RxMatcher instance can test
whether a string or a positionable stream of characters matches the
original regular expression, or search a string or a stream for
substrings matching the expression. After a match is found, the
matcher can report a specific string that matched the whole
expression, or any parenthesized subexpression of it.
+
All other classes support the above functionality and are used by
RxParser, RxMatcher, or both.
+
CAVEATS
+
The matcher is similar in spirit, but NOT in the design--let alone the
code--to the original Henry Spencer's regular expression
implementation in C. The focus is on simplicity, not on efficiency.
I didn't optimize or profile anything. I may in future--or I may not:
I do this in my spare time and I don't promise anything.
+
The matcher passes H. Spencer's test suite (see 'test suite'
protocol), with quite a few extra tests added, so chances are good
there are not too many bugs. But watch out anyway.
+
EXTENSIONS, FUTURE, ETC.
+
With the existing separation between the parser, the syntax tree, and
the matcher, it is easy to extend the system with other matchers based
on other algorithms. In fact, I have a DFA-based matcher right now,
but I don't feel it is good enough to include it here. I might add
automata-based matchers later, but again I don't promise anything.
+
HOW TO REACH ME
+
+ As of today (December 20, 2000), you can contact me at
+ <vassili at parcplace.com>. If this doesn't work, look around
+ comp.lang.smalltalk or comp.lang.lisp.
- As of today (October 3, 1999), you can contact me at
- <vassili at objectpeople.com>. If this doesn't work, look around
- comp.lang.smalltalk and comp.lang.lisp.
"
+
self error: 'comment only'!
Item was changed:
----- Method: RxParser class>>f:boringStuff: (in category 'DOCUMENTATION') -----
+ f: x boringStuff: xx
- f:x boringStuff: xx
"
The Regular Expression Matcher (``The Software'')
is Copyright (C) 1996, 1999 Vassili Bykov.
It is provided to the Smalltalk community in hope it will be useful.
+
1. This license applies to the package as a whole, as well as to any
component of it. By performing any of the activities described
below, you accept the terms of this agreement.
+
2. The software is provided free of charge, and ``as is'', in hope
that it will be useful, with ABSOLUTELY NO WARRANTY. The entire
risk and all responsibility for the use of the software is with
you. Under no circumstances the author may be held responsible for
loss of data, loss of profit, or any other damage resulting
directly or indirectly from the use of the software, even if the
damage is caused by defects in the software.
+
3. You may use this software in any applications you build.
+
4. You may distribute this software provided that the software
documentation and copyright notices are included and intact.
+
5. You may create and distribute modified versions of the software,
such as ports to other Smalltalk dialects or derived work, provided
that:
+
a. any modified version is expressly marked as such and is not
misrepresented as the original software;
+
b. credit is given to the original software in the source code and
documentation of the derived work;
+
c. the copyright notice at the top of this document accompanies
copyright notices of any modified version. "
+
self error: 'comment only'!
Item was removed:
- ----- Method: RxmLink class>>new (in category 'instance creation') -----
- new
- ^super new initialize!
Item was changed:
----- Method: RxmMarker>>matchAgainst: (in category 'matching') -----
matchAgainst: aMatcher
"If the rest of the link chain matches successfully, report the
position of the stream *before* the match started to the matcher."
+
| startPosition |
startPosition := aMatcher position.
(next matchAgainst: aMatcher)
ifTrue:
+ [aMatcher markerPositionAt: index add: startPosition.
- [aMatcher markerPositionAt: index maybePut: startPosition.
^true].
^false!
Item was removed:
- ----- Method: RxsCharSet>>enumerablePartPredicate (in category 'privileged') -----
- enumerablePartPredicate
- | enumeration |
- enumeration := self optimalSet.
- ^negated
- ifTrue: [[:char | (enumeration includes: char) not]]
- ifFalse: [[:char | enumeration includes: char]]!
Item was added:
+ ----- Method: RxsCharSet>>enumerablePartPredicateIgnoringCase: (in category 'privileged') -----
+ enumerablePartPredicateIgnoringCase: aBoolean
+
+ | enumeration |
+ enumeration := self optimalSetIgnoringCase: aBoolean.
+ ^negated
+ ifTrue: [[:char | (enumeration includes: char) not]]
+ ifFalse: [[:char | enumeration includes: char]]!
Item was removed:
- ----- Method: RxsCharSet>>enumerableSet (in category 'privileged') -----
- enumerableSet
- "Answer a collection of characters that make up the portion of me
- that can be enumerated."
- | set |
- set := Set new.
- elements do:
- [:each |
- each isEnumerable ifTrue: [each enumerateTo: set]].
- ^set!
Item was added:
+ ----- Method: RxsCharSet>>enumerableSetIgnoringCase: (in category 'privileged') -----
+ enumerableSetIgnoringCase: aBoolean
+ "Answer a collection of characters that make up the portion of me
+ that can be enumerated."
+
+ | set |
+ set := Set new.
+ elements do:
+ [:each |
+ each isEnumerable ifTrue:
+ [each enumerateTo: set ignoringCase: aBoolean]].
+ ^set!
Item was removed:
- ----- Method: RxsCharSet>>optimalSet (in category 'privileged') -----
- optimalSet
- "Assuming the client with search the `set' using #includes:,
- answer a collection with the contents of `set', of the class
- that will provide the fastest lookup. Strings are faster than
- Sets for short strings."
- | set |
- set := self enumerableSet.
- ^set size < 10
- ifTrue: [String withAll: set asArray]
- ifFalse: [set]!
Item was added:
+ ----- Method: RxsCharSet>>optimalSetIgnoringCase: (in category 'privileged') -----
+ optimalSetIgnoringCase: aBoolean
+ "Assuming the client with search the `set' using #includes:,
+ answer a collection with the contents of `set', of the class
+ that will provide the fastest lookup. Strings are faster than
+ Sets for short strings."
+
+ | set |
+ set := self enumerableSetIgnoringCase: aBoolean.
+ ^set size < 10
+ ifTrue: [set asArray]
+ ifFalse: [set]!
Item was removed:
- ----- Method: RxsCharSet>>predicate (in category 'accessing') -----
- predicate
- | predicate enumerable |
- enumerable := self enumerablePartPredicate.
- ^self hasPredicates
- ifFalse: [enumerable]
- ifTrue:
- [predicate := self predicatePartPredicate.
- negated
- ifTrue: [[:char | (enumerable value: char) and: [predicate value: char]]]
- ifFalse: [[:char | (enumerable value: char) or: [predicate value: char]]]]!
Item was added:
+ ----- Method: RxsCharSet>>predicateIgnoringCase: (in category 'accessing') -----
+ predicateIgnoringCase: aBoolean
+
+ | predicate enumerable |
+ enumerable := self enumerablePartPredicateIgnoringCase: aBoolean.
+ ^self hasPredicates
+ ifFalse: [enumerable]
+ ifTrue:
+ [predicate := self predicatePartPredicate.
+ negated
+ ifTrue: [[:char | (enumerable value: char) and: [predicate value: char]]]
+ ifFalse: [[:char | (enumerable value: char) or: [predicate value: char]]]]!
Item was removed:
- ----- Method: RxsCharacter>>enumerateTo: (in category 'accessing') -----
- enumerateTo: aCollection
- aCollection add: character!
Item was added:
+ ----- Method: RxsCharacter>>enumerateTo:ignoringCase: (in category 'accessing') -----
+ enumerateTo: aSet ignoringCase: aBoolean
+
+ aBoolean
+ ifTrue:
+ [aSet
+ add: character asUppercase;
+ add: character asLowercase]
+ ifFalse: [aSet add: character]!
Item was removed:
- ----- Method: RxsMessagePredicate>>initializeSelector: (in category 'initialize-release') -----
- initializeSelector: aSelector
- "The selector must be a one-argument message understood by Character."
- selector := aSelector!
Item was changed:
----- Method: RxsPredicate class>>initializeEscapedLetterSelectors (in category 'class initialization') -----
initializeEscapedLetterSelectors
"self initializeEscapedLetterSelectors"
+
(EscapedLetterSelectors := Dictionary new)
at: $w put: #beWordConstituent;
at: $W put: #beNotWordConstituent;
at: $d put: #beDigit;
at: $D put: #beNotDigit;
at: $s put: #beSpace;
+ at: $S put: #beNotSpace;
+ at: $\ put: #beBackslash!
- at: $S put: #beNotSpace!
Item was removed:
- ----- Method: RxsPredicate>>beAny (in category 'initialize-release') -----
- beAny
- | cr lf |
- cr := Character cr.
- lf := Character lf.
- predicate := [:char | char ~= lf and: [char ~= cr]].
- negation := [:char | char = lf or: [char = cr]]!
Item was added:
+ ----- Method: RxsPredicate>>beBackslash (in category 'initialize-release') -----
+ beBackslash
+
+ predicate := [:char | char == $\].
+ negation := [:char | char ~~ $\]!
Item was changed:
----- Method: RxsPredicate>>bePunctuation (in category 'initialize-release') -----
bePunctuation
+
| punctuationChars |
+ punctuationChars := #($. $, $!! $? $; $: $" $' $- $( $) $`).
- punctuationChars := #($. $, $!! $; $: $" $' $- $( $) $`).
predicate := [:char | punctuationChars includes: char].
negation := [:char | (punctuationChars includes: char) not]!
Item was changed:
----- Method: RxsPredicate>>beWordConstituent (in category 'initialize-release') -----
beWordConstituent
+
+ predicate := [:char | char isAlphaNumeric or: [char == $_]].
+ negation := [:char | char isAlphaNumeric not and: [char ~~ $_]]!
- predicate := [:char | char isAlphaNumeric].
- negation := [:char | char isAlphaNumeric not]!
Item was removed:
- ----- Method: RxsRange>>enumerateTo: (in category 'accessing') -----
- enumerateTo: aCollection
- "Add all of the elements I represent to the collection."
- first asInteger to: last asInteger do:
- [:charCode |
- aCollection add: charCode asCharacter]!
Item was added:
+ ----- Method: RxsRange>>enumerateTo:ignoringCase: (in category 'accessing') -----
+ enumerateTo: aSet ignoringCase: aBoolean
+ "Add all of the elements I represent to the collection."
+
+ first asInteger to: last asInteger do:
+ [:charCode | | character |
+ character := charCode asCharacter.
+ aBoolean
+ ifTrue:
+ [aSet
+ add: character asUppercase;
+ add: character asLowercase]
+ ifFalse: [aSet add: character]]!
Item was added:
+ ----- Method: String>>allRangesOfRegexMatches: (in category '*VB-regex') -----
+ allRangesOfRegexMatches: rxString
+
+ ^rxString asRegex matchingRangesIn: self!
More information about the Squeak-dev
mailing list
|