[squeak-dev] The Trunk: Regex-Tests-Core-ct.30

christoph.thiede at student.hpi.uni-potsdam.de
Wed Oct 5 18:18:30 UTC 2022


Manual diff again ...

Best,
Christoph

==================== Summary ====================

Name: Regex-Tests-Core-ct.30
Author: ct
Time: 5 October 2022, 8:04:59.150966 pm
UUID: 39d75aee-88a4-a647-9485-84d2ac56cb99
Ancestors: Regex-Tests-Core-mt.17, Regex-Tests-Core-ct.15, Regex-Tests-Core-ct.17, Regex-Tests-Core-ct.18, Regex-Tests-Core-ct.19, Regex-Tests-Core-ct.20, Regex-Tests-Core-ct.21, Regex-Tests-Core-ct.22, Regex-Tests-Core-ct.23, Regex-Tests-Core-ct.26, Regex-Tests-Core-ct.25, Regex-Tests-Core-ct.27, Regex-Tests-Core-ct.28, Regex-Tests-Core-tobe.17

Merge commit.

Regex-Tests-Core-ct.15:
	Tests #escapeRegex from Regex-Core-ct.61.
	
	Revision: Rename to #escapeForRegex (complements Regex-Core-ct.78).

Regex-Tests-Core-ct.17:
	Adds a regression test for parsing regular expressions with nested quantifiers. Thanks to Conrad Halle for reporting this bug!

Regex-Tests-Core-ct.18:
	Adds regression test for a bug while parsing lookaround-like regexes.

Regex-Tests-Core-ct.19:
	Tests non-capturing groups. Complements Regex-Core-ct.63.

Regex-Tests-Core-ct.20:
	Adds regression test for captured lookaround expressions.

Regex-Tests-Core-ct.21:
	Adds regression tests for quantifier sequences.
	
	Revision: Extends tests with all possible minimal/possessive quantifiers and explanation.

Regex-Tests-Core-ct.22:
	Complements Regex-Core-ct.66 (convenience selectors).

Regex-Tests-Core-ct.23:
	Complements Regex-Core-ct.67 (named capturing groups).
	
	Revision: Add updated tests for #keyedSubexpressionRanges: and #subexpressionRanges:.

Regex-Tests-Core-ct.25:
	Fixes #testOptionalLookbehind2. It still fails, but this time for the real bug in the matcher instead of a lowercase slip in the test.
	
	Revision for Regex-Core-ct.70 from Regex-Core-ct.78: Fixes #testOptionalLookbehind2 really, resolves the expected failure, and merges it back into #testOptionalLookbehind. Adds #testMatchNullable.

Regex-Tests-Core-ct.26:
	Tests nullable closures that are introduced in Regex-Core-ct.70.
	
	Revision: Don't deny nullable lookarounds any longer in #testLookaroundParser.

Regex-Tests-Core-ct.27:
	Merges two tests that are no longer to-do.

Regex-Tests-Core-ct.28:
	Complements Regex-Core-ct.71 (Unicode backslash atoms). ... Merges Regex-Tests-Core-tobe.17.
	
	Revision: Add test for non-existing Unicode category.

Regex-Tests-Core-ct.30:
	Complements Regex-Core-ct.66 (convenience selectors).
	
	Revision: Adds regression fixture on order of capture groups within branches (complements revision in Regex-Core-ct.78).
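
For orientation, here is a minimal workspace sketch of the features summarized above. It uses only selectors that appear in the tests of this commit, and the expected values in the comments are copied from those tests, not from a fresh run. The string in the last example is a made-up illustration and is not part of the commit.

```smalltalk
"Workspace sketch, not part of the commit. Evaluate statement by statement."
| matcher escaped |

"Named capturing groups (Regex-Core-ct.67): captures are keyed by group name."
matcher := '(?<foo>\w)+' asRegex.
matcher matches: 'abc'.
matcher allKeyedSubexpressions.
	"testKeyedSubexpressions expects #foo mapped to ('a' 'b' 'c')"
(matcher keyedSubexpressionRanges: #foo) asArray.
	"testKeyedSubexpressions expects the intervals 1 to: 1, 2 to: 2, 3 to: 3"

"Non-capturing groups (Regex-Core-ct.63): (?:a) does not get its own index."
matcher := '(?:a)(b)c' asRegex.
matcher search: 'eabcd'.
matcher subexpression: 2.
	"testNonCapturingGroup expects 'b'"

"Nullable closures (Regex-Core-ct.70): quantifying an expression that may match the empty string."
'aa' matchesRegex: '(a|)*'.
	"testOrOperator expects true"

"Escaping (Regex-Core-ct.78): a literal string should match its own escaped form."
escaped := 'how are you? (*very much*)' escapeForRegex.
'how are you? (*very much*)' matchesRegex: escaped.
	"assumed true by analogy with testEscapeString; this string is hypothetical"
```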

=============== Diff against Regex-Tests-Core-mt.17 ===============

RxMatcherTest>>runMatcher:with:expect:withSubexpressions: {utilties} · ct 10/20/2021 20:24 (changed)
runMatcher: aMatcher with: aString expect: aBoolean withSubexpressions: anArray
	| copy got |
	copy := aMatcher
		copy: aString
		translatingMatchesUsing: [ :each | each ].
	self 
		assert: copy = aString
		description: 'Copying: expected ' , aString printString , ', but got ' , copy printString.
	got := aMatcher search: aString.
	self
		assert: got = aBoolean 
		description: 'Searching: expected ' , aBoolean printString , ', but got ' , got printString.
	(anArray isNil or: [ aMatcher supportsSubexpressions not ])
		ifTrue: [ ^ self ].
	1 to: anArray size by: 2 do: [ :index |
		| sub subExpect subGot |
		sub := anArray at: index.
		subExpect := anArray at: index + 1.
- 		subGot := aMatcher subexpression: sub.
+ 		subGot := (subExpect isNil or: [subExpect isString])
+ 			ifTrue: [aMatcher subexpression: sub]
+ 			ifFalse: [aMatcher subexpressions: sub].
		self
			assert: subExpect = subGot
			description: 'Subexpression ' , sub printString , ': expected ' , subExpect printString , ', but got ' , subGot printString ]

RxMatcherTest>>testCapturingGroup {testing} · ct 8/23/2021 18:02
+ testCapturingGroup
+ 
+ 	self runRegex: #('(a)(b)c'
+ 		'c' false nil
+ 		'abc' true (1 'abc' 2 'a' 3 'b')
+ 		'eabcd' true (1 'abc' 2 'a' 3 'b')).
+ 	self flag: #tests. "ct: It might be helpful to test subexpressionCount, too"

RxMatcherTest>>testHenry039 {testing-henry} · ct 10/28/2021 02:44 (changed)
testHenry039
- 	self runRegex: #('a[a-b-c]' nil)
+ 	self runRegex: #('a[a-c-d]'
+ 		'aa' true nil
+ 		'ab' true nil
+ 		'ac' true nil
+ 		'ad' true nil
+ 		'a-' true nil
+ 		'ae' false nil)

RxMatcherTest>>testHenry071 {testing-henry} · ct 10/21/2021 00:00 (changed)
testHenry071
- 	self runRegex: #('()*' nil)
+ 	self runRegex: #('^()*$' '' true (1 '' 2 ('')))

RxMatcherTest>>testHenry073 {testing-henry} · ct 10/21/2021 00:27 (changed)
testHenry073
- 	self runRegex: #('^*' nil)
+ 	self runRegex: #('^*' '' true nil)

RxMatcherTest>>testHenry074 {testing-henry} · ct 10/21/2021 00:28 (changed)
testHenry074
- 	self runRegex: #('$*' nil)
+ 	self runRegex: #('$*' '' true nil)

RxMatcherTest>>testHenry088 {testing-henry} · ct 10/21/2021 00:01 (changed)
testHenry088
- 	self runRegex: #('(a*)*' nil)
+ 	self runRegex: #('^(a*)*a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry089 {testing-henry} · ct 10/21/2021 00:01 (changed)
testHenry089
- 	self runRegex: #('(a*)+' nil)
+ 	self runRegex: #('^(a*)+a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry090 {testing-henry} · ct 10/20/2021 21:47 (changed)
testHenry090
- 	self runRegex: #('(a|)*' nil)
+ 	self runRegex: #('(a|)*'
+ 		'' true (1 '')
+ 		'a' true (1 'a' 2 ('a'))
+ 		'aa' true (1 'aa' 2 ('a' 'a')))

RxMatcherTest>>testHenry091 {testing-henry} · ct 10/20/2021 21:53 (changed)
testHenry091
- 	self runRegex: #('(a*|b)*' nil)
+ 	self runRegex: #('(a*|b)*$'
+ 		'' true (1 '')
+ 		'a' true (1 'a' 2 ('a'))
+ 		'aa' true (1 'aa' 2 ('aa'))
+ 		'bb' true (1 'bb' 2 ('b' 'b'))
+ 		'aabba' true (1 'aabba' 2 ('aa' 'b' 'b' 'a')))

RxMatcherTest>>testHenry096 {testing-henry} · ct 10/20/2021 20:32 (changed)
testHenry096
- 	self runRegex: #('(^)*' nil)
+ 	self runRegex: #('(^)*'
+ 		'' true (1 '' 2 '')
+ 		'a' true (1 '' 2 ''))

RxMatcherTest>>testHenry097 {testing-henry} · ct 10/20/2021 21:53 (changed)
testHenry097
- 	self runRegex: #('(ab|)*' nil)
+ 	self runRegex: #('(ab|)*'
+ 		'' true (1 '' 2 '')
+ 		'ab' true (1 'ab' 2 ('ab'))
+ 		'abab' true (1 'abab' 2 ('ab' 'ab')))

RxMatcherTest>>testHenry138 {testing-henry} · ct 10/20/2021 23:21
+ testHenry138
+ 	self runRegex: #('(a|b?)?'
+ 		'' true (1 '' 2 (''))
+ 		'a' true (1 'a' 2 ('a'))
+ 		'b' true (1 'b' 2 ('b'))
+ 		'ab' false)

RxMatcherTest>>testHenry141 {testing-henry} · ct 10/20/2021 23:37
+ testHenry141
+ 	self runRegex: #('^(a*)?a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry142 {testing-henry} · ct 10/20/2021 23:37
+ testHenry142
+ 	self runRegex: #('^(a*){1,2}a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 ('' ''))
+ 		'aa' true (1 'aa' 2 ('a' ''))
+ 		'aaa' true (1 'aaa' 2 ('aa' '')))

RxMatcherTest>>testHenry143 {testing-henry} · ct 10/20/2021 23:37
+ testHenry143
+ 	self runRegex: #('^(a+)*a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 ())
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry144 {testing-henry} · ct 10/20/2021 23:37
+ testHenry144
+ 	self runRegex: #('^(a+)+a$'
+ 		'' false nil
+ 		'a' false nil
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry145 {testing-henry} · ct 10/20/2021 23:37
+ testHenry145
+ 	self runRegex: #('^(a+)?a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 ())
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry146 {testing-henry} · ct 10/20/2021 23:37
+ testHenry146
+ 	self runRegex: #('^(a+){1,2}a$'
+ 		'' false nil
+ 		'a' false nil
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa')))

RxMatcherTest>>testHenry147 {testing-henry} · ct 10/20/2021 23:37
+ testHenry147
+ 	self runRegex: #('^(a?)*a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('a' 'a')))

RxMatcherTest>>testHenry148 {testing-henry} · ct 10/20/2021 23:37
+ testHenry148
+ 	self runRegex: #('^(a?)+a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('a' 'a')))

RxMatcherTest>>testHenry149 {testing-henry} · ct 10/20/2021 23:36
+ testHenry149
+ 	self runRegex: #('^(a?)?a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 (''))
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' false nil)

RxMatcherTest>>testHenry150 {testing-henry} · ct 10/20/2021 23:38
+ testHenry150
+ 	self runRegex: #('^(a?){1,2}a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 ('' ''))
+ 		'aa' true (1 'aa' 2 ('a' ''))
+ 		'aaa' true (1 'aaa' 2 ('a' 'a')))

RxMatcherTest>>testHenry151 {testing-henry} · ct 10/20/2021 23:41
+ testHenry151
+ 	self runRegex: #('^(a{1,2})*a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 ())
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa'))
+ 		'aaaa' true (1 'aaaa' 2 ('aa' 'a')))

RxMatcherTest>>testHenry152 {testing-henry} · ct 10/20/2021 23:41
+ testHenry152
+ 	self runRegex: #('^(a{1,2})+a$'
+ 		'' false nil
+ 		'a' false nil
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa'))
+ 		'aaaa' true (1 'aaaa' 2 ('aa' 'a')))

RxMatcherTest>>testHenry153 {testing-henry} · ct 10/20/2021 23:42
+ testHenry153
+ 	self runRegex: #('^(a{1,2})?a$'
+ 		'' false nil
+ 		'a' true (1 'a' 2 nil)
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa'))
+ 		'aaaa' false nil)

RxMatcherTest>>testHenry154 {testing-henry} · ct 10/20/2021 23:40
+ testHenry154
+ 	self runRegex: #('^(a{1,2}){1,2}a$'
+ 		'' false nil
+ 		'a' false nil
+ 		'aa' true (1 'aa' 2 ('a'))
+ 		'aaa' true (1 'aaa' 2 ('aa'))
+ 		'aaaa' true (1 'aaaa' 2 ('aa' 'a')))

RxMatcherTest>>testKeyedSubexpressions {testing-protocol} · ct 10/5/2022 13:02
+ testKeyedSubexpressions
+ 
+ 	{
+ 		#('abc' () ()).
+ 		#('(\w)+' () ()).
+ 		{'(?<foo>a)b(c)'. #(foo ('a')). {#foo. {1 to: 1}}}.
+ 		{'(?<foo>\w+)'. #(foo ('abc')). {#foo. {1 to: 3}}}.
+ 		{'(?<foo>\w)+'. #(foo ('a' 'b' 'c')). {#foo. {1 to: 1. 2 to: 2. 3 to: 3}}}.
+ 		#('abc(?<foo>\w)?' (foo ()) (foo ()))
+ 	} do: [:spec |
+ 		| matcher |
+ 		matcher := spec first asRegex.
+ 		self assert: (matcher matches: 'abc').
+ 		self
+ 			assert: (Dictionary newFromPairs: spec second)
+ 			equals: matcher allKeyedSubexpressions.
+ 		self
+ 			assert: (Dictionary newFromPairs: spec third) associations
+ 			equals:
+ 				(matcher keyedMarkers collect:
+ 					[:key | key -> (matcher keyedSubexpressionRanges: key) asArray])].

RxMatcherTest>>testMatchNullable {testing} · ct 10/5/2022 14:48
+ testMatchNullable
+ 
+ 	self assert: #('A' '' '') equals: ('AB' allRegexMatches: 'A?') asArray.

RxMatcherTest>>testNonCapturingGroup {testing} · ct 8/23/2021 18:02
+ testNonCapturingGroup
+ 
+ 	self runRegex: #('(?:a)(b)c'
+ 		'bc' false nil
+ 		'abc' true (1 'abc' 2 'b')
+ 		'eabcd' true (1 'abc' 2 'b')).
+ 	self flag: #tests. "ct: It might be helpful to test subexpressionCount, too"

RxMatcherTest>>testSubexpressions {testing-protocol} · ct 10/5/2022 12:53
+ testSubexpressions
+ 
+ 	{
+ 		{'abc'. #(('abc')). {{1 to: 3}}}.
+ 		{'(a)b(c)'. #(('abc') ('a') ('c')). {{1 to: 3}. {1 to: 1}. {3 to: 3}}}. 
+ 		{'(\w+)'. #(('abc') ('abc')). {{1 to: 3}. {1 to: 3}}}.
+ 		{'(\w)+'. #(('abc') ('a' 'b' 'c')). {{1 to: 3}. {1 to: 1. 2 to: 2. 3 to: 3}}}.
+ 		{'abc(\w)?'. #(('abc') ()). {{1 to: 3}. {}}}.
+ 		{'((a)|(b))((a)|(b))c'. #(('abc') ('a') ('a') () ('b') () ('b')). {{1 to: 3}. {1 to: 1}. {1 to: 1}. {}. {2 to: 2}. {}. {2 to: 2}}}
+ 	} do: [:spec |
+ 		| matcher |
+ 		matcher := spec first asRegex.
+ 		self assert: (matcher matches: 'abc').
+ 		self
+ 			assert: spec second
+ 			equals: (matcher allSubexpressions collect: #asArray).
+ 		self
+ 			assert: spec third
+ 			equals:
+ 				((1 to: matcher subexpressionCount) collect:
+ 					[:index | (matcher subexpressionRanges: index) asArray])].

RxParserTest>>expectedFailures {failures} · mt 6/8/2022 15:03 (removed)
- expectedFailures
- 
- 	^ #(testOptionalLookbehind2)

RxParserTest>>testCharacterSetWithEscapedCharacters {tests} · ct 10/27/2021 23:17 (changed)
testCharacterSetWithEscapedCharacters
	"self debug: #testCharacterSetRange"
	
	{
		'[\r]'. String cr. String space.
		'[\n]'. String lf. String space.
		'[\t]'. String tab. String space.
		'[\e]'. Character escape asString. String space.
		'[\f]'. Character newPage asString. String space.
		'[\]]+'. ']]]'. '[[['.
		'[\S]+[\s]+=[\s]+#[^\[(]'. 'foo = #bar'. 'foo = #[1 2 3]'.
		'[\d]+'. '123'. 'abc'.
		'[\D]+'. 'abc'. '123'.
		'[\w]+'. 'a1_b2'. '...'.
		'[\W]+'. '...'. 'a1_b2'.
+ 		'[\b]'. 'b'. ' '.
+ 		'[\p{L}\d]+'. 'tschüß123'. ':-)'.
+ 		'[\P{L}a]'. 'a'. 'b'.
	} groupsDo: [ :regexString :inputToAccept :inputToReject |
		| regex |
		regex := regexString asRegex.
		self
			assert: (regex search: inputToAccept);
			deny: (regex search: inputToReject) ]

RxParserTest>>testCodePointu {tests} · ct 10/28/2021 04:46
+ testCodePointu
+ 
+ 	| string |
+ 	string := String value: 16r1f388.
+ 	self assert: [string matchesRegex: '\u{1f388}'].
+ 	self assert: ['A' matchesRegex: '\u0041'].
+ 	self assert: ['Aa' matchesRegex: '\u0041a'].
+ 	self assert: ['m' matchesRegex: '\u006D'].
+ 	self assert: ['m' matchesRegex: '\u006d'].
+ 	self should: ['\u004' asRegex] raise: RegexSyntaxError.
+ 	self should: ['\u0g41' asRegex] raise: RegexSyntaxError.
+ 	
+ 	self assert: ['e' matchesRegex: '\u{ar101}'].
+ 	self deny: [string matchesRegex: '\u{1f387}'].
+ 	self deny: ['\u{1f388}' matchesRegex: '\u{1f388}'].
+ 	self deny: ['1f388' matchesRegex: '\u{1f388}'].
+ 	self deny: ['u' matchesRegex: '\u{1}'].
+ 	self deny: [(String value: 16r1f389) matchesRegex: '\u{1f388}'].
+ 	self deny: [(WideString fromByteArray: #(16r17f3 16r88)) matchesRegex: '\u{1f388}'].
+ 	self deny: [(WideString fromByteArray: #(16r17f3 88)) matchesRegex: '\u{1f388}'].
+ 	
+ 	self assert: ['m' matchesRegex: '[\u006d]'].
+ 	self assert: ['3' matchesRegex: '[\u0032-4]'].
+ 	self deny: ['0' matchesRegex: '[\u0032-4]'].
+ 	self assert: ['3' matchesRegex: '[2-\u0034]'].
+ 	self deny: ['0' matchesRegex: '[2-\u0034]'].
+ 	self should: ['[\u006d-\d]' asRegex] raise: RegexSyntaxError.
+ 	self should: ['[\d-\u006d]' asRegex] raise: RegexSyntaxError.
+ 	self assert: ['A' matchesRegex: '[\u006d-\u006fA]'].

RxParserTest>>testCodePointx {tests} · ct 10/28/2021 04:47
+ testCodePointx
+ 
+ 	self assert: ['8' matchesRegex: '\x38'].
+ 	self deny: ['8' matchesRegex: '\x39'].
+ 	self deny: ['9' matchesRegex: '\x38'].
+ 	self deny: ['&' matchesRegex: '\x38'].
+ 	self deny: ['\x38' matchesRegex: '\x38'].
+ 	self deny: ['38' matchesRegex: '\x38'].
+ 	self assert: ['8a' matchesRegex: '\x38a'].
+ 	self should: ['\x3' asRegex] raise: RegexSyntaxError.
+ 	self deny: [(WideString fromByteArray: {3. 8}) matchesRegex: '\x38'].
+ 	self deny: [(WideString fromByteArray: {3. 38}) matchesRegex: '\x38'].
+ 	self deny: [(String new: 20 withAll: $x) matchesRegex: '\x20'].
+ 	
+ 	self assert: ['8' matchesRegex: '\x{38}'].
+ 	self assert: ['?' matchesRegex: '\x{38a}'].
+ 	self assert: ['8' matchesRegex: '\x{2r111000}'].
+ 	self deny: ['8' matchesRegex: '\x{39}'].
+ 	self deny: ['9' matchesRegex: '\x{38}'].
+ 	self deny: ['\x{38}' matchesRegex: '\x{38}'].
+ 	
+ 	self assert: ['8a' matchesRegex: '[\x38a]+'].

RxParserTest>>testEscapeString {tests} · ct 10/5/2022 11:18
+ testEscapeString
+ 
+ 	| string |
+ 	string := 'Hello world, how are you? (This is a test - special characters *very much* intended \-.-/ )'.
+ 	self assert: (string matchesRegex: string escapeForRegex).
+ 	self assert: (string includesSubstring: 'Hello world, how are you?') "no all-out escaping".

RxParserTest>>testLookaroundNullable {tests} · ct 10/21/2021 00:04 (changed)
testLookaroundNullable

- 	self should: ['(?<=a)?b' asRegex] raise: RegexSyntaxError.
+ 	self assert: ('b' matchesRegex: '(?<=a)?b').
+ 	self assert: ('(?<=a)?b' asRegex search: 'ab').

RxParserTest>>testLookaroundParser {tests} · ct 10/5/2022 13:42
+ testLookaroundParser
+ 
+ 	self should: ['(?<a)b' asRegex] raise: RegexSyntaxError.

RxParserTest>>testNestedQuantifiers {tests} · ct 8/23/2021 17:23
+ testNestedQuantifiers
+ 
+ 	self deny: ('' matchesRegex: '(ab+){2,}').
+ 	self deny: ('ab' matchesRegex: '(ab+){2,}').
+ 	self deny: ('aba' matchesRegex: '(ab+){2,}').
+ 	self assert: ('abab' matchesRegex: '(ab+){2,}').
+ 	self assert: ('abbabbb' matchesRegex: '(ab+){2,}').
+ 	self assert: ('abbabbbab' matchesRegex: '(ab+){2,}').

RxParserTest>>testNoCapturingOfLookarounds {tests} · ct 8/23/2021 18:43
+ testNoCapturingOfLookarounds
+ 
+ 	| matcher |
+ 	matcher := '(?<=a)(?<!c)(b)(?=c)(?!b)' asRegex.
+ 	self assert: (matcher search: 'abc').
+ 	self assert: 2 equals: matcher subexpressionCount.
+ 	self assert: #('b') equals: (matcher subexpressions: 2)

RxParserTest>>testOptionalLookbehind {tests} · ct 10/5/2022 14:51 (changed)
testOptionalLookbehind

- 	self assert: ['A' matchesRegex: '((?<=^)A)+'].
+ 	self assert: ['A' matchesRegex: '((?<=^)A)+'].
+ 	self assert: [('AB' allRegexMatches: '((?<=A)B)?') asArray = #('' 'B' '')].

RxParserTest>>testOptionalLookbehind2 {tests} · mt 7/8/2021 08:22 (removed)
- testOptionalLookbehind2
- 
- 	self assert: [('AB' allRegexMatches: '((?<=a)b)?') asArray = #('A')].

RxParserTest>>testOrOperator {tests} · ct 10/20/2021 16:40 (changed)
testOrOperator
	"self debug: #testOrOperator"
	
	"The last operator is `|' meaning `or'. It is placed between two
regular expressions, and the resulting expression matches if one of
the expressions matches. It has the lowest possible precedence (lower
than sequencing). For example, `ab*|ba*' means `a followed by any
number of b's, or b followed by any number of a's':"

	self assert: ('abb' matchesRegex: 'ab*|ba*').  	
	self assert: ('baa' matchesRegex: 'ab*|ba*').	 	
	self deny: ('baab' matchesRegex: 'ab*|ba*').
	
- 
- 	"It is possible to write an expression matching an empty string, for
- example: `a|'.  However, it is an error to apply `*', `+', or `?' to
- such expression: `(a|)*' is an invalid expression."
- 
- 	self should: ['(a|)*' asRegex] raise: Error.
+ 	
+ 	self assert: ('' matchesRegex: '(a|)*').
+ 	self assert: ('a' matchesRegex: '(a|)*').
+ 	self assert: ('aa' matchesRegex: '(a|)*').


RxParserTest>>testQuantifierSequence {tests} · ct 10/5/2022 11:48
+ testQuantifierSequence
+ 
+ 	"Unless we add support for minimal quantifiers, the following should raise a syntax error."
+ 	self
+ 		should: ['a??' asRegex] raise: RegexSyntaxError;
+ 		should: ['+?' asRegex] raise: RegexSyntaxError;
+ 		should: ['*?' asRegex] raise: RegexSyntaxError;
+ 		should: ['a{1,2}?' asRegex] raise: RegexSyntaxError.
+ 	
+ 	"Unless we add support for possessive quantifiers, the following should raise a syntax error."
+ 	self
+ 		should: ['a?+' asRegex] raise: RegexSyntaxError;
+ 		should: ['a++' asRegex] raise: RegexSyntaxError;
+ 		should: ['a*+' asRegex] raise: RegexSyntaxError;
+ 		should: ['a{1,2}+' asRegex] raise: RegexSyntaxError.
+ 	
+ 	"The following does not make sense under any circumstances."
+ 	self
+ 		should: ['a?*' asRegex] raise: RegexSyntaxError;
+ 		should: ['a+*' asRegex] raise: RegexSyntaxError;
+ 		should: ['a**' asRegex] raise: RegexSyntaxError.
+ 	self
+ 		should: ['a{1,2}{3,4}' asRegex] raise: RegexSyntaxError;
+ 		should: ['a?{1,2}' asRegex] raise: RegexSyntaxError;
+ 		should: ['a+{1,2}' asRegex] raise: RegexSyntaxError;
+ 		should: ['a*{1,2}' asRegex] raise: RegexSyntaxError.

RxParserTest>>testRegexSyntaxErrorPosition {tests} · ct 10/28/2021 03:14
+ testRegexSyntaxErrorPosition
+ 
+ 	| position |
+ 	['a::z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 3 equals: position.
+ 	['a[b[:space:_]y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 12 equals: position.
+ 	['a[^][::]]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 8 equals: position.
+ 	
+ 	"During nested parsing, the global position must be provided"
+ 	['a\x{}z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 5 equals: position.
+ 	['a[b\x{}y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 7 equals: position.
+ 	['a[^b\x{}y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ 	self assert: 8 equals: position.

RxParserTest>>testSpecialCharacterInSetRange {tests} · ct 10/21/2021 01:08 (changed)
testSpecialCharacterInSetRange
	"self debug: #testSpecialCharacterInSetRange"
	
	"Special characters within a set are `^', `-', and `]' that closes the
set. Below are the examples of how to literally use them in a set:
	[01^]		-- put the caret anywhere except the beginning
	[01-]		-- put the dash as the last character
	[]01]		-- put the closing bracket as the first character 
	[^]01]			(thus, empty and universal sets cannot be specified)"

	self assert: ('0' matchesRegex: '[01^]').
	self assert: ('1' matchesRegex: '[01^]').
	self assert: ('^' matchesRegex: '[01^]').
	
	self deny: ('0' matchesRegex: '[^01]').
	self deny: ('1' matchesRegex: '[^01]').
	
	"[^abc] means that everything except abc is matched"
	self assert: ('^' matchesRegex: '[^01]').
- 	
+ 	
+ 	"[1-7] is the range of all digits between 1 and 7"
+ 	self assert: ('3' matchesRegex: '[1-7]').

RxParserTest>>testUnicodeCategory {tests} · ct 10/5/2022 19:38
+ testUnicodeCategory
+ 
+ 	self assert: ['X' matchesRegex: '\p{Lu}'].
+ 	self assert: ['X' matchesRegex: '\p{L}'].
+ 	self deny: ['X' matchesRegex: '\p{Ll}'].
+ 	self assert: ['x' matchesRegex: '\p{Ll}'].
+ 	self assert: ['x' matchesRegex: '\p{L}'].
+ 	self deny: ['x' matchesRegex: '\p{Lu}'].
+ 	
+ 	self deny: ['X' matchesRegex: '\P{Lu}'].
+ 	self deny: ['X' matchesRegex: '\P{L}'].
+ 	self assert: ['X' matchesRegex: '\P{Ll}'].
+ 	self deny: ['x' matchesRegex: '\P{Ll}'].
+ 	self deny: ['x' matchesRegex: '\P{L}'].
+ 	self assert: ['x' matchesRegex: '\P{Lu}'].
+ 	
+ 	self assert: ['x' matchesRegex: '[\p{L}]'].
+ 	self deny: ['x' matchesRegex: '[\P{L}]'].
+ 	self assert: ['x' matchesRegex: '[^\P{L}]'].
+ 	
+ 	self should: ['x' matchesRegex: '[\p{LoremIpsum}]'] raise: RegexSyntaxError.

RxParserTest>>toDotestSpecialCharacterInSetRange {tests} · sd 9/4/2006 23:29 (removed)
- toDotestSpecialCharacterInSetRange
- 	"self debug: #testSpecialCharacterInSetRange"
- 	
- 	"Special characters within a set are `^', `-', and `]' that closes the
- set. Below are the examples of how to literally use them in a set:
- 	[01^]		-- put the caret anywhere except the beginning
- 	[01-]		-- put the dash as the last character
- 	[]01]		-- put the closing bracket as the first character 
- 	[^]01]			(thus, empty and universal sets cannot be specified)"
- 
- 	self assert: ('0' matchesRegex: '[01^]').
- 	
- 	self assert: ('0' matchesRegex: '[0-9]').	
- 	


---
Sent from Squeak Inbox Talk