[squeak-dev] The Inbox: Regex-Core-ct.77.mcz

commits at source.squeak.org
Wed Oct 5 09:08:58 UTC 2022


Christoph Thiede uploaded a new version of Regex-Core to project The Inbox:
http://source.squeak.org/inbox/Regex-Core-ct.77.mcz

==================== Summary ====================

Name: Regex-Core-ct.77
Author: ct
Time: 5 October 2022, 11:08:58.426956 am
UUID: 526b6f56-d854-b74e-9a67-fc0f136ecfcd
Ancestors: Regex-Core-ct.76

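Implements match reset (\K) as an alternative to positive lookbehinds: \K removes all previously matched characters from the reported match without affecting any capture groups. This is useful for compatibility with Perl, PCRE, Boost, and other engines. For more information, see: https://www.regular-expressions.info/keep.html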

Examples:

	'a\Kb' asRegex matches: 'ab'; subexpression: 1. "'b'"
	'([a-z]\K[A-Z])+' asRegex matches: 'aBcDeF'; allSubexpressions. "#(#('F') #('aB' 'cD' 'eF'))"
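
The first example can also be spelled out step by step and set beside a lookbehind. This is only a sketch: it assumes that Regex-Core-ct.77 or later is loaded and that the image's regex engine accepts the (?<=...) lookbehind syntax. The variable names are illustrative.

```smalltalk
"Sketch only. Assumes Regex-Core-ct.77 (or later) is loaded so that \K is parsed."
| reset behind |
reset := 'a\Kb' asRegex.
(reset matches: 'ab')
	ifTrue: [Transcript showln: (reset subexpression: 1)]. "'b', as in the first example"

"Assumption: positive lookbehinds, written (?<=...), are available in this image.
 #search: is used because the lookbehind consumes no input, so the pattern
 cannot match the whole string 'ab' from its start."
behind := '(?<=a)b' asRegex.
(behind search: 'ab')
	ifTrue: [Transcript showln: (behind subexpression: 1)].
```

With \K the leading 'a' is consumed and then dropped from the reported match, so the pattern can still be used with #matches: on the whole string. The lookbehind variant never consumes the 'a'.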

=============== Diff against Regex-Core-ct.76 ===============

Item was changed:
  ----- Method: RxMatcher>>matchAgainstMarkerAt:nextLink: (in category 'matching') -----
  matchAgainstMarkerAt: index nextLink: anRmxLink
  
  	| position |
  	position := stream position.
  	(anRmxLink matchAgainst: self) ifFalse: [ ^false ].
  	index <= 2 
+ 		ifTrue:
+ 			[ (markerPositions at: index)
+ 				ifNotNil: [ "position was eagerly set in #matchResetNextLink:" ]
+ 				ifNil: [ markerPositions at: index put: position ] ]
- 		ifTrue: [ markerPositions at: index put: position ]
  		ifFalse: [ (markerPositions at: index) addLast: position ].
  	^true!

Item was added:
+ ----- Method: RxMatcher>>matchResetNextLink: (in category 'matching') -----
+ matchResetNextLink: anRmxLink
+ 
+ 	| position |
+ 	position := stream position.
+ 	(anRmxLink matchAgainst: self) ifFalse: [^ false].
+ 	(markerPositions at: 1)
+ 		ifNotNil: [ "position was already set in a nested (subsequent) reset" ]
+ 		ifNil: [ markerPositions at: 1 put: position ].
+ 	^ true!

Item was added:
+ ----- Method: RxMatcher>>syntaxMatchReset (in category 'double dispatch') -----
+ syntaxMatchReset
+ 
+ 	^ RxmMatchReset new!

Item was changed:
  ----- Method: RxParser class>>initializeBackslashSpecials (in category 'class initialization') -----
  initializeBackslashSpecials
  	"Keys are characters that normally follow a \, the values are
  	associations of classes and initialization selectors on the instance side
  	of the classes."
  	"self initializeBackslashSpecials"
  
  	(BackslashSpecials := Dictionary new)
  		at: $w put: (Association key: RxsPredicate value: #beWordConstituent);
  		at: $W put: (Association key: RxsPredicate value: #beNotWordConstituent);
  		at: $s put: (Association key: RxsPredicate value: #beSpace);
  		at: $S put: (Association key: RxsPredicate value: #beNotSpace);
  		at: $d put: (Association key: RxsPredicate value: #beDigit);
  		at: $D put: (Association key: RxsPredicate value: #beNotDigit);
  		at: $b put: (Association key: RxsContextCondition value: #beWordBoundary);
  		at: $B put: (Association key: RxsContextCondition value: #beNonWordBoundary);
  		at: $< put: (Association key: RxsContextCondition value: #beBeginningOfWord);
+ 		at: $> put: (Association key: RxsContextCondition value: #beEndOfWord);
+ 		at: $K put: (Association key: RxsMatchReset value: #yourself)!
- 		at: $> put: (Association key: RxsContextCondition value: #beEndOfWord)!

Item was added:
+ RxmLink subclass: #RxmMatchReset
+ 	instanceVariableNames: ''
+ 	classVariableNames: ''
+ 	poolDictionaries: ''
+ 	category: 'Regex-Core'!
+ 
+ !RxmMatchReset commentStamp: 'ct 10/5/2022 10:59' prior: 0!
+ Instance matches always and trims the matched string to start from the current position.!

Item was added:
+ ----- Method: RxmMatchReset>>matchAgainst: (in category 'matching') -----
+ matchAgainst: aMatcher
+ 
+ 	^ aMatcher matchResetNextLink: next!

Item was added:
+ RxsNode subclass: #RxsMatchReset
+ 	instanceVariableNames: ''
+ 	classVariableNames: ''
+ 	poolDictionaries: ''
+ 	category: 'Regex-Core'!
+ 
+ !RxsMatchReset commentStamp: 'ct 10/5/2022 10:58' prior: 0!
+ I reset the matcher to the current position and remove all previously matched characters from the match. I do not affect any capture groups.!

Item was added:
+ ----- Method: RxsMatchReset>>dispatchTo: (in category 'building') -----
+ dispatchTo: aBuilder
+ 
+ 	^ aBuilder syntaxMatchReset!

Item was added:
+ ----- Method: RxsMatchReset>>isNullable (in category 'testing') -----
+ isNullable
+ 
+ 	^ true!

Item was changed:
+ (PackageInfo named: 'Regex-Core') postscript: 'RxParser initializeBackslashSpecials. "Regex-Core-ct.77 (match reset)"'!
- (PackageInfo named: 'Regex-Core') postscript: 'RxsPredicate initializeEscapedLetterSelectors.'!


