[squeak-dev] The Trunk: Regex-Tests-Core-ct.30
christoph.thiede at student.hpi.uni-potsdam.de
Wed Oct 5 18:18:30 UTC 2022
Manual diff again ...
Best,
Christoph
==================== Summary ====================
Name: Regex-Tests-Core-ct.30
Author: ct
Time: 5 October 2022, 8:04:59.150966 pm
UUID: 39d75aee-88a4-a647-9485-84d2ac56cb99
Ancestors: Regex-Tests-Core-mt.17, Regex-Tests-Core-ct.15, Regex-Tests-Core-ct.17, Regex-Tests-Core-ct.18, Regex-Tests-Core-ct.19, Regex-Tests-Core-ct.20, Regex-Tests-Core-ct.21, Regex-Tests-Core-ct.22, Regex-Tests-Core-ct.23, Regex-Tests-Core-ct.26, Regex-Tests-Core-ct.25, Regex-Tests-Core-ct.27, Regex-Tests-Core-ct.28, Regex-Tests-Core-tobe.17
Merge commit.
Regex-Tests-Core-ct.15:
Tests #escapeRegex from Regex-Core-ct.61.
Revision: Rename to #escapeForRegex (complements Regex-Core-ct.78).
Regex-Tests-Core-ct.17:
Adds a regression test for parsing regular expressions with nested quantifiers. Thanks to Conrad Halle for reporting this bug!
Regex-Tests-Core-ct.18:
Adds a regression test for a bug in parsing lookaround-like regexes.
Regex-Tests-Core-ct.19:
Tests non-capturing groups. Complements Regex-Core-ct.63.
Regex-Tests-Core-ct.20:
Adds regression test for captured lookaround expressions.
Regex-Tests-Core-ct.21:
Adds regression tests for quantifier sequences.
Revision: Extends tests with all possible minimal/possessive quantifiers and explanation.
Regex-Tests-Core-ct.22:
Complements Regex-Core-ct.66 (convenience selectors).
Regex-Tests-Core-ct.23:
Complements Regex-Core-ct.67 (named capturing groups).
Revision: Add updated tests for #keyedSubexpressionRanges: and #subexpressionRanges:.
Regex-Tests-Core-ct.25:
Fixes #testOptionalLookbehind2. It still fails, but this time for the real bug in the matcher instead of a lowercase slip in the test.
Revision for Regex-Core-ct.70 from Regex-Core-ct.78: Really fixes #testOptionalLookbehind2, resolves the expected failure, and merges it back into #testOptionalLookbehind. Adds #testMatchNullable.
Regex-Tests-Core-ct.26:
Tests nullable closures that are introduced in Regex-Core-ct.70.
Revision: Don't deny nullable lookarounds any longer in #testLookaroundParser.
Regex-Tests-Core-ct.27:
Merges two tests that are no longer to-dos.
Regex-Tests-Core-ct.28:
Complements Regex-Core-ct.71 (Unicode backslash atoms). ... Merges Regex-Tests-Core-tobe.17.
Revision: Add test for non-existing Unicode category.
Regex-Tests-Core-ct.30:
Complements Regex-Core-ct.66 (convenience selectors).
Revision: Adds a regression fixture for the order of capture groups within branches (complements revision in Regex-Core-ct.78).
=============== Diff against Regex-Tests-Core-mt.17 ===============
RxMatcherTest>>runMatcher:with:expect:withSubexpressions: {utilties} · ct 10/20/2021 20:24 (changed)
runMatcher: aMatcher with: aString expect: aBoolean withSubexpressions: anArray
| copy got |
copy := aMatcher
copy: aString
translatingMatchesUsing: [ :each | each ].
self
assert: copy = aString
description: 'Copying: expected ' , aString printString , ', but got ' , copy printString.
got := aMatcher search: aString.
self
assert: got = aBoolean
description: 'Searching: expected ' , aBoolean printString , ', but got ' , got printString.
(anArray isNil or: [ aMatcher supportsSubexpressions not ])
ifTrue: [ ^ self ].
1 to: anArray size by: 2 do: [ :index |
| sub subExpect subGot |
sub := anArray at: index.
subExpect := anArray at: index + 1.
- subGot := aMatcher subexpression: sub.
+ subGot := (subExpect isNil or: [subExpect isString])
+ ifTrue: [aMatcher subexpression: sub]
+ ifFalse: [aMatcher subexpressions: sub].
self
assert: subExpect = subGot
description: 'Subexpression ' , sub printString , ': expected ' , subExpect printString , ', but got ' , subGot printString ]
RxMatcherTest>>testCapturingGroup {testing} · ct 8/23/2021 18:02
+ testCapturingGroup
+
+ self runRegex: #('(a)(b)c'
+ 'c' false nil
+ 'abc' true (1 'abc' 2 'a' 3 'b')
+ 'eabcd' true (1 'abc' 2 'a' 3 'b')).
+ self flag: #tests. "ct: It might be helpful to test subexpressionCount, too"
RxMatcherTest>>testHenry039 {testing-henry} · ct 10/28/2021 02:44 (changed)
testHenry039
- self runRegex: #('a[a-b-c]' nil)
+ self runRegex: #('a[a-c-d]'
+ 'aa' true nil
+ 'ab' true nil
+ 'ac' true nil
+ 'ad' true nil
+ 'a-' true nil
+ 'ae' false nil)
RxMatcherTest>>testHenry071 {testing-henry} · ct 10/21/2021 00:00 (changed)
testHenry071
- self runRegex: #('()*' nil)
+ self runRegex: #('^()*$' '' true (1 '' 2 ('')))
RxMatcherTest>>testHenry073 {testing-henry} · ct 10/21/2021 00:27 (changed)
testHenry073
- self runRegex: #('^*' nil)
+ self runRegex: #('^*' '' true nil)
RxMatcherTest>>testHenry074 {testing-henry} · ct 10/21/2021 00:28 (changed)
testHenry074
- self runRegex: #('$*' nil)
+ self runRegex: #('$*' '' true nil)
RxMatcherTest>>testHenry088 {testing-henry} · ct 10/21/2021 00:01 (changed)
testHenry088
- self runRegex: #('(a*)*' nil)
+ self runRegex: #('^(a*)*a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry089 {testing-henry} · ct 10/21/2021 00:01 (changed)
testHenry089
- self runRegex: #('(a*)+' nil)
+ self runRegex: #('^(a*)+a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry090 {testing-henry} · ct 10/20/2021 21:47 (changed)
testHenry090
- self runRegex: #('(a|)*' nil)
+ self runRegex: #('(a|)*'
+ '' true (1 '')
+ 'a' true (1 'a' 2 ('a'))
+ 'aa' true (1 'aa' 2 ('a' 'a')))
RxMatcherTest>>testHenry091 {testing-henry} · ct 10/20/2021 21:53 (changed)
testHenry091
- self runRegex: #('(a*|b)*' nil)
+ self runRegex: #('(a*|b)*$'
+ '' true (1 '')
+ 'a' true (1 'a' 2 ('a'))
+ 'aa' true (1 'aa' 2 ('aa'))
+ 'bb' true (1 'bb' 2 ('b' 'b'))
+ 'aabba' true (1 'aabba' 2 ('aa' 'b' 'b' 'a')))
RxMatcherTest>>testHenry096 {testing-henry} · ct 10/20/2021 20:32 (changed)
testHenry096
- self runRegex: #('(^)*' nil)
+ self runRegex: #('(^)*'
+ '' true (1 '' 2 '')
+ 'a' true (1 '' 2 ''))
RxMatcherTest>>testHenry097 {testing-henry} · ct 10/20/2021 21:53 (changed)
testHenry097
- self runRegex: #('(ab|)*' nil)
+ self runRegex: #('(ab|)*'
+ '' true (1 '' 2 '')
+ 'ab' true (1 'ab' 2 ('ab'))
+ 'abab' true (1 'abab' 2 ('ab' 'ab')))
RxMatcherTest>>testHenry138 {testing-henry} · ct 10/20/2021 23:21
+ testHenry138
+ self runRegex: #('(a|b?)?'
+ '' true (1 '' 2 (''))
+ 'a' true (1 'a' 2 ('a'))
+ 'b' true (1 'b' 2 ('b'))
+ 'ab' false)
RxMatcherTest>>testHenry141 {testing-henry} · ct 10/20/2021 23:37
+ testHenry141
+ self runRegex: #('^(a*)?a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry142 {testing-henry} · ct 10/20/2021 23:37
+ testHenry142
+ self runRegex: #('^(a*){1,2}a$'
+ '' false nil
+ 'a' true (1 'a' 2 ('' ''))
+ 'aa' true (1 'aa' 2 ('a' ''))
+ 'aaa' true (1 'aaa' 2 ('aa' '')))
RxMatcherTest>>testHenry143 {testing-henry} · ct 10/20/2021 23:37
+ testHenry143
+ self runRegex: #('^(a+)*a$'
+ '' false nil
+ 'a' true (1 'a' 2 ())
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry144 {testing-henry} · ct 10/20/2021 23:37
+ testHenry144
+ self runRegex: #('^(a+)+a$'
+ '' false nil
+ 'a' false nil
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry145 {testing-henry} · ct 10/20/2021 23:37
+ testHenry145
+ self runRegex: #('^(a+)?a$'
+ '' false nil
+ 'a' true (1 'a' 2 ())
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry146 {testing-henry} · ct 10/20/2021 23:37
+ testHenry146
+ self runRegex: #('^(a+){1,2}a$'
+ '' false nil
+ 'a' false nil
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa')))
RxMatcherTest>>testHenry147 {testing-henry} · ct 10/20/2021 23:37
+ testHenry147
+ self runRegex: #('^(a?)*a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('a' 'a')))
RxMatcherTest>>testHenry148 {testing-henry} · ct 10/20/2021 23:37
+ testHenry148
+ self runRegex: #('^(a?)+a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('a' 'a')))
RxMatcherTest>>testHenry149 {testing-henry} · ct 10/20/2021 23:36
+ testHenry149
+ self runRegex: #('^(a?)?a$'
+ '' false nil
+ 'a' true (1 'a' 2 (''))
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' false nil)
RxMatcherTest>>testHenry150 {testing-henry} · ct 10/20/2021 23:38
+ testHenry150
+ self runRegex: #('^(a?){1,2}a$'
+ '' false nil
+ 'a' true (1 'a' 2 ('' ''))
+ 'aa' true (1 'aa' 2 ('a' ''))
+ 'aaa' true (1 'aaa' 2 ('a' 'a')))
RxMatcherTest>>testHenry151 {testing-henry} · ct 10/20/2021 23:41
+ testHenry151
+ self runRegex: #('^(a{1,2})*a$'
+ '' false nil
+ 'a' true (1 'a' 2 ())
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa'))
+ 'aaaa' true (1 'aaaa' 2 ('aa' 'a')))
RxMatcherTest>>testHenry152 {testing-henry} · ct 10/20/2021 23:41
+ testHenry152
+ self runRegex: #('^(a{1,2})+a$'
+ '' false nil
+ 'a' false nil
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa'))
+ 'aaaa' true (1 'aaaa' 2 ('aa' 'a')))
RxMatcherTest>>testHenry153 {testing-henry} · ct 10/20/2021 23:42
+ testHenry153
+ self runRegex: #('^(a{1,2})?a$'
+ '' false nil
+ 'a' true (1 'a' 2 nil)
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa'))
+ 'aaaa' false nil)
RxMatcherTest>>testHenry154 {testing-henry} · ct 10/20/2021 23:40
+ testHenry154
+ self runRegex: #('^(a{1,2}){1,2}a$'
+ '' false nil
+ 'a' false nil
+ 'aa' true (1 'aa' 2 ('a'))
+ 'aaa' true (1 'aaa' 2 ('aa'))
+ 'aaaa' true (1 'aaaa' 2 ('aa' 'a')))
RxMatcherTest>>testKeyedSubexpressions {testing-protocol} · ct 10/5/2022 13:02
+ testKeyedSubexpressions
+
+ {
+ #('abc' () ()).
+ #('(\w)+' () ()).
+ {'(?<foo>a)b(c)'. #(foo ('a')). {#foo. {1 to: 1}}}.
+ {'(?<foo>\w+)'. #(foo ('abc')). {#foo. {1 to: 3}}}.
+ {'(?<foo>\w)+'. #(foo ('a' 'b' 'c')). {#foo. {1 to: 1. 2 to: 2. 3 to: 3}}}.
+ #('abc(?<foo>\w)?' (foo ()) (foo ()))
+ } do: [:spec |
+ | matcher |
+ matcher := spec first asRegex.
+ self assert: (matcher matches: 'abc').
+ self
+ assert: (Dictionary newFromPairs: spec second)
+ equals: matcher allKeyedSubexpressions.
+ self
+ assert: (Dictionary newFromPairs: spec third) associations
+ equals:
+ (matcher keyedMarkers collect:
+ [:key | key -> (matcher keyedSubexpressionRanges: key) asArray])].
RxMatcherTest>>testMatchNullable {testing} · ct 10/5/2022 14:48
+ testMatchNullable
+
+ self assert: #('A' '' '') equals: ('AB' allRegexMatches: 'A?') asArray.
RxMatcherTest>>testNonCapturingGroup {testing} · ct 8/23/2021 18:02
+ testNonCapturingGroup
+
+ self runRegex: #('(?:a)(b)c'
+ 'bc' false nil
+ 'abc' true (1 'abc' 2 'b')
+ 'eabcd' true (1 'abc' 2 'b')).
+ self flag: #tests. "ct: It might be helpful to test subexpressionCount, too"
RxMatcherTest>>testSubexpressions {testing-protocol} · ct 10/5/2022 12:53
+ testSubexpressions
+
+ {
+ {'abc'. #(('abc')). {{1 to: 3}}}.
+ {'(a)b(c)'. #(('abc') ('a') ('c')). {{1 to: 3}. {1 to: 1}. {3 to: 3}}}.
+ {'(\w+)'. #(('abc') ('abc')). {{1 to: 3}. {1 to: 3}}}.
+ {'(\w)+'. #(('abc') ('a' 'b' 'c')). {{1 to: 3}. {1 to: 1. 2 to: 2. 3 to: 3}}}.
+ {'abc(\w)?'. #(('abc') ()). {{1 to: 3}. {}}}.
+ {'((a)|(b))((a)|(b))c'. #(('abc') ('a') ('a') () ('b') () ('b')). {{1 to: 3}. {1 to: 1}. {1 to: 1}. {}. {2 to: 2}. {}. {2 to: 2}}}
+ } do: [:spec |
+ | matcher |
+ matcher := spec first asRegex.
+ self assert: (matcher matches: 'abc').
+ self
+ assert: spec second
+ equals: (matcher allSubexpressions collect: #asArray).
+ self
+ assert: spec third
+ equals:
+ ((1 to: matcher subexpressionCount) collect:
+ [:index | (matcher subexpressionRanges: index) asArray])].
RxParserTest>>expectedFailures {failures} · mt 6/8/2022 15:03 (removed)
- expectedFailures
-
- ^ #(testOptionalLookbehind2)
RxParserTest>>testCharacterSetWithEscapedCharacters {tests} · ct 10/27/2021 23:17 (changed)
testCharacterSetWithEscapedCharacters
"self debug: #testCharacterSetRange"
{
'[\r]'. String cr. String space.
'[\n]'. String lf. String space.
'[\t]'. String tab. String space.
'[\e]'. Character escape asString. String space.
'[\f]'. Character newPage asString. String space.
'[\]]+'. ']]]'. '[[['.
'[\S]+[\s]+=[\s]+#[^\[(]'. 'foo = #bar'. 'foo = #[1 2 3]'.
'[\d]+'. '123'. 'abc'.
'[\D]+'. 'abc'. '123'.
'[\w]+'. 'a1_b2'. '...'.
'[\W]+'. '...'. 'a1_b2'.
+ '[\b]'. 'b'. ' '.
+ '[\p{L}\d]+'. 'tschüß123'. ':-)'.
+ '[\P{L}a]'. 'a'. 'b'.
} groupsDo: [ :regexString :inputToAccept :inputToReject |
| regex |
regex := regexString asRegex.
self
assert: (regex search: inputToAccept);
deny: (regex search: inputToReject) ]
RxParserTest>>testCodePointu {tests} · ct 10/28/2021 04:46
+ testCodePointu
+
+ | string |
+ string := String value: 16r1f388.
+ self assert: [string matchesRegex: '\u{1f388}'].
+ self assert: ['A' matchesRegex: '\u0041'].
+ self assert: ['Aa' matchesRegex: '\u0041a'].
+ self assert: ['m' matchesRegex: '\u006D'].
+ self assert: ['m' matchesRegex: '\u006d'].
+ self should: ['\u004' asRegex] raise: RegexSyntaxError.
+ self should: ['\u0g41' asRegex] raise: RegexSyntaxError.
+
+ self assert: ['e' matchesRegex: '\u{ar101}'].
+ self deny: [string matchesRegex: '\u{1f387}'].
+ self deny: ['\u{1f388}' matchesRegex: '\u{1f388}'].
+ self deny: ['1f388' matchesRegex: '\u{1f388}'].
+ self deny: ['u' matchesRegex: '\u{1}'].
+ self deny: [(String value: 16r1f389) matchesRegex: '\u{1f388}'].
+ self deny: [(WideString fromByteArray: #(16r17f3 16r88)) matchesRegex: '\u{1f388}'].
+ self deny: [(WideString fromByteArray: #(16r17f3 88)) matchesRegex: '\u{1f388}'].
+
+ self assert: ['m' matchesRegex: '[\u006d]'].
+ self assert: ['3' matchesRegex: '[\u0032-4]'].
+ self deny: ['0' matchesRegex: '[\u0032-4]'].
+ self assert: ['3' matchesRegex: '[2-\u0034]'].
+ self deny: ['0' matchesRegex: '[2-\u0034]'].
+ self should: ['[\u006d-\d]' asRegex] raise: RegexSyntaxError.
+ self should: ['[\d-\u006d]' asRegex] raise: RegexSyntaxError.
+ self assert: ['A' matchesRegex: '[\u006d-\u006fA]'].
RxParserTest>>testCodePointx {tests} · ct 10/28/2021 04:47
+ testCodePointx
+
+ self assert: ['8' matchesRegex: '\x38'].
+ self deny: ['8' matchesRegex: '\x39'].
+ self deny: ['9' matchesRegex: '\x38'].
+ self deny: ['&' matchesRegex: '\x38'].
+ self deny: ['\x38' matchesRegex: '\x38'].
+ self deny: ['38' matchesRegex: '\x38'].
+ self assert: ['8a' matchesRegex: '\x38a'].
+ self should: ['\x3' asRegex] raise: RegexSyntaxError.
+ self deny: [(WideString fromByteArray: {3. 8}) matchesRegex: '\x38'].
+ self deny: [(WideString fromByteArray: {3. 38}) matchesRegex: '\x38'].
+ self deny: [(String new: 20 withAll: $x) matchesRegex: '\x20'].
+
+ self assert: ['8' matchesRegex: '\x{38}'].
+ self assert: ['Ί' matchesRegex: '\x{38a}'].
+ self assert: ['8' matchesRegex: '\x{2r111000}'].
+ self deny: ['8' matchesRegex: '\x{39}'].
+ self deny: ['9' matchesRegex: '\x{38}'].
+ self deny: ['\x{38}' matchesRegex: '\x{38}'].
+
+ self assert: ['8a' matchesRegex: '[\x38a]+'].
RxParserTest>>testEscapeString {tests} · ct 10/5/2022 11:18
+ testEscapeString
+
+ | string |
+ string := 'Hello world, how are you? (This is a test - special characters *very much* intended \-.-/ )'.
+ self assert: (string matchesRegex: string escapeForRegex).
+ self assert: (string includesSubstring: 'Hello world, how are you?') "no all-out escaping".
RxParserTest>>testLookaroundNullable {tests} · ct 10/21/2021 00:04 (changed)
testLookaroundNullable
- self should: ['(?<=a)?b' asRegex] raise: RegexSyntaxError.
+ self assert: ('b' matchesRegex: '(?<=a)?b').
+ self assert: ('(?<=a)?b' asRegex search: 'ab').
RxParserTest>>testLookaroundParser {tests} · ct 10/5/2022 13:42
+ testLookaroundParser
+
+ self should: ['(?<a)b' asRegex] raise: RegexSyntaxError.
RxParserTest>>testNestedQuantifiers {tests} · ct 8/23/2021 17:23
+ testNestedQuantifiers
+
+ self deny: ('' matchesRegex: '(ab+){2,}').
+ self deny: ('ab' matchesRegex: '(ab+){2,}').
+ self deny: ('aba' matchesRegex: '(ab+){2,}').
+ self assert: ('abab' matchesRegex: '(ab+){2,}').
+ self assert: ('abbabbb' matchesRegex: '(ab+){2,}').
+ self assert: ('abbabbbab' matchesRegex: '(ab+){2,}').
RxParserTest>>testNoCapturingOfLookarounds {tests} · ct 8/23/2021 18:43
+ testNoCapturingOfLookarounds
+
+ | matcher |
+ matcher := '(?<=a)(?<!c)(b)(?=c)(?!b)' asRegex.
+ self assert: (matcher search: 'abc').
+ self assert: 2 equals: matcher subexpressionCount.
+ self assert: #('b') equals: (matcher subexpressions: 2)
RxParserTest>>testOptionalLookbehind {tests} · ct 10/5/2022 14:51 (changed)
testOptionalLookbehind
- self assert: ['A' matchesRegex: '((?<=^)A)+'].
+ self assert: ['A' matchesRegex: '((?<=^)A)+'].
+ self assert: [('AB' allRegexMatches: '((?<=A)B)?') asArray = #('' 'B' '')].
RxParserTest>>testOptionalLookbehind2 {tests} · mt 7/8/2021 08:22 (removed)
- testOptionalLookbehind2
-
- self assert: [('AB' allRegexMatches: '((?<=a)b)?') asArray = #('A')].
RxParserTest>>testOrOperator {tests} · ct 10/20/2021 16:40 (changed)
testOrOperator
"self debug: #testOrOperator"
"The last operator is `|' meaning `or'. It is placed between two
regular expressions, and the resulting expression matches if one of
the expressions matches. It has the lowest possible precedence (lower
than sequencing). For example, `ab*|ba*' means `a followed by any
number of b's, or b followed by any number of a's':"
self assert: ('abb' matchesRegex: 'ab*|ba*').
self assert: ('baa' matchesRegex: 'ab*|ba*').
self deny: ('baab' matchesRegex: 'ab*|ba*').
-
- "It is possible to write an expression matching an empty string, for
- example: `a|'. However, it is an error to apply `*', `+', or `?' to
- such expression: `(a|)*' is an invalid expression."
-
- self should: ['(a|)*' asRegex] raise: Error.
+
+ self assert: ('' matchesRegex: '(a|)*').
+ self assert: ('a' matchesRegex: '(a|)*').
+ self assert: ('aa' matchesRegex: '(a|)*').
RxParserTest>>testQuantifierSequence {tests} · ct 10/5/2022 11:48
+ testQuantifierSequence
+
+ "Unless we add support for minimal quantifiers, the following should raise a syntax error."
+ self
+ should: ['a??' asRegex] raise: RegexSyntaxError;
+ should: ['+?' asRegex] raise: RegexSyntaxError;
+ should: ['*?' asRegex] raise: RegexSyntaxError;
+ should: ['a{1,2}?' asRegex] raise: RegexSyntaxError.
+
+ "Unless we add support for possessive quantifiers, the following should raise a syntax error."
+ self
+ should: ['a?+' asRegex] raise: RegexSyntaxError;
+ should: ['a++' asRegex] raise: RegexSyntaxError;
+ should: ['a*+' asRegex] raise: RegexSyntaxError;
+ should: ['a{1,2}+' asRegex] raise: RegexSyntaxError.
+
+ "The following does not make sense under any circumstances."
+ self
+ should: ['a?*' asRegex] raise: RegexSyntaxError;
+ should: ['a+*' asRegex] raise: RegexSyntaxError;
+ should: ['a**' asRegex] raise: RegexSyntaxError.
+ self
+ should: ['a{1,2}{3,4}' asRegex] raise: RegexSyntaxError;
+ should: ['a?{1,2}' asRegex] raise: RegexSyntaxError;
+ should: ['a+{1,2}' asRegex] raise: RegexSyntaxError;
+ should: ['a*{1,2}' asRegex] raise: RegexSyntaxError.
RxParserTest>>testRegexSyntaxErrorPosition {tests} · ct 10/28/2021 03:14
+ testRegexSyntaxErrorPosition
+
+ | position |
+ ['a::z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 3 equals: position.
+ ['a[b[:space:_]y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 12 equals: position.
+ ['a[^][::]]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 8 equals: position.
+
+ "During nested parsing, the global position must be provided"
+ ['a\x{}z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 5 equals: position.
+ ['a[b\x{}y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 7 equals: position.
+ ['a[^b\x{}y]z' asRegex] on: RegexSyntaxError do: [:ex | position := ex position].
+ self assert: 8 equals: position.
RxParserTest>>testSpecialCharacterInSetRange {tests} · ct 10/21/2021 01:08 (changed)
testSpecialCharacterInSetRange
"self debug: #testSpecialCharacterInSetRange"
"Special characters within a set are `^', `-', and `]' that closes the
set. Below are the examples of how to literally use them in a set:
[01^] -- put the caret anywhere except the beginning
[01-] -- put the dash as the last character
[]01] -- put the closing bracket as the first character
[^]01] (thus, empty and universal sets cannot be specified)"
self assert: ('0' matchesRegex: '[01^]').
self assert: ('1' matchesRegex: '[01^]').
self assert: ('^' matchesRegex: '[01^]').
self deny: ('0' matchesRegex: '[^01]').
self deny: ('1' matchesRegex: '[^01]').
"[^abc] means that everything except abc is matche"
self assert: ('^' matchesRegex: '[^01]').
-
+
+ "[1-7] is the range of all digits between 1 and 7"
+ self assert: ('3' matchesRegex: '[1-7]').
RxParserTest>>testUnicodeCategory {tests} · ct 10/5/2022 19:38
+ testUnicodeCategory
+
+ self assert: ['X' matchesRegex: '\p{Lu}'].
+ self assert: ['X' matchesRegex: '\p{L}'].
+ self deny: ['X' matchesRegex: '\p{Ll}'].
+ self assert: ['x' matchesRegex: '\p{Ll}'].
+ self assert: ['x' matchesRegex: '\p{L}'].
+ self deny: ['x' matchesRegex: '\p{Lu}'].
+
+ self deny: ['X' matchesRegex: '\P{Lu}'].
+ self deny: ['X' matchesRegex: '\P{L}'].
+ self assert: ['X' matchesRegex: '\P{Ll}'].
+ self deny: ['x' matchesRegex: '\P{Ll}'].
+ self deny: ['x' matchesRegex: '\P{L}'].
+ self assert: ['x' matchesRegex: '\P{Lu}'].
+
+ self assert: ['x' matchesRegex: '[\p{L}]'].
+ self deny: ['x' matchesRegex: '[\P{L}]'].
+ self assert: ['x' matchesRegex: '[^\P{L}]'].
+
+ self should: ['x' matchesRegex: '[\p{LoremIpsum}]'] raise: RegexSyntaxError.
RxParserTest>>toDotestSpecialCharacterInSetRange {tests} · sd 9/4/2006 23:29 (removed)
- toDotestSpecialCharacterInSetRange
- "self debug: #testSpecialCharacterInSetRange"
-
- "Special characters within a set are `^', `-', and `]' that closes the
- set. Below are the examples of how to literally use them in a set:
- [01^] -- put the caret anywhere except the beginning
- [01-] -- put the dash as the last character
- []01] -- put the closing bracket as the first character
- [^]01] (thus, empty and universal sets cannot be specified)"
-
- self assert: ('0' matchesRegex: '[01^]').
-
- self assert: ('0' matchesRegex: '[0-9]').
-
---
Sent from Squeak Inbox Talk