[squeak-dev] The Inbox: Regex-Core-tobe.62.mcz

commits at source.squeak.org
Mon Oct 18 10:55:02 UTC 2021


A new version of Regex-Core was added to project The Inbox:
http://source.squeak.org/inbox/Regex-Core-tobe.62.mcz

==================== Summary ====================

Name: Regex-Core-tobe.62
Author: tobe
Time: 18 October 2021, 12:55:00.939602 pm
UUID: 02286bc7-4450-4843-9988-40b1dc9bfa70
Ancestors: Regex-Core-mt.61

Add support for \uXXXX escapes (four hex digits) to specify Unicode code points
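
A workspace sketch of how the new escape is presumably meant to be used, both at the top level of a pattern and inside a character set. It assumes this Inbox version is loaded; the variable names are illustrative, and no results are shown because they were not verified against the patch.

```smalltalk
"Assumes Regex-Core-tobe.62 is loaded. Variable names are illustrative only."
| single range |
single := '\u0041' asRegex.           "intended: the character with code point 16r41"
range := '[\u0041-\u005A]+' asRegex.  "intended: a run of characters in 16r41..16r5A"
single matches: 'A'.
range matches: 'HELLO'.
```

Since the escape takes exactly four hex digits, only code points up to 16rFFFF can be written this way.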

=============== Diff against Regex-Core-mt.61 ===============

Item was changed:
  ----- Method: RxCharSetParser>>parseEscapeChar (in category 'parsing') -----
  parseEscapeChar
  
+ 	| first last |
- 	| first |
  	self match: $\.
  	first := (RxsPredicate forEscapedLetter: lookahead)
+ 		ifNil: [
+ 			 (lookahead = $u and: [RxsPredicate matchesUnicodeSymbol: (source peek: 4)])
+ 				ifTrue: [RxsCharacter with: (RxsPredicate unicodeCharacterFrom: self)]
+ 				ifFalse: [RxsCharacter with: lookahead]].
- 		ifNil: [ RxsCharacter with: lookahead ].
  	self next == $- ifFalse: [^ elements add: first].
  	self next ifNil: [
  		elements add: first.
  		^ self addChar: $-].
+ 	last := lookahead = $\
+ 		ifTrue: [
+ 			self next.
+ 			(RxsPredicate forEscapedLetter: lookahead)
+ 				ifNil: [
+ 					 (lookahead = $u and: [RxsPredicate matchesUnicodeSymbol: (source peek: 4)])
+ 						ifTrue: [RxsCharacter with: (RxsPredicate unicodeCharacterFrom: self)]
+ 						ifFalse: [RxsCharacter with: lookahead]]]
+ 		ifFalse: [ | char |
+ 			char := RxsCharacter with: lookahead.
+ 			self next.
+ 			char].
+ 	self addRangeFrom: first character to: last character!
- 	self addRangeFrom: first character to: lookahead.
- 	self next!

Item was changed:
  ----- Method: RxParser>>ifSpecial:then: (in category 'private') -----
  ifSpecial: aCharacter then: aBlock
  	"If the character is such that it defines a special node when follows a $\,
  	then create that node and evaluate aBlock with the node as the parameter.
  	Otherwise just return."
  
  	| classAndSelector |
+ 	classAndSelector := BackslashSpecials at: aCharacter ifAbsent: [
+ 		" check if we have four hex digits for a unicode code symbol following "
+ 		(aCharacter = $u and: [RxsPredicate matchesUnicodeSymbol: (input peek: 4)]) ifTrue: [
+ 			^ aBlock value: (RxsPredicate forUnicodeFrom: self)].
+ 		^self].
- 	classAndSelector := BackslashSpecials at: aCharacter ifAbsent: [^self].
  	^aBlock value: (classAndSelector key new perform: classAndSelector value)!

Item was added:
+ ----- Method: RxsPredicate class>>forUnicodeFrom: (in category 'instance creation') -----
+ forUnicodeFrom: aParser
+ 
+ 	^RxsPredicate new beCharacter: (self unicodeCharacterFrom: aParser)!

Item was added:
+ ----- Method: RxsPredicate class>>matchesUnicodeSymbol: (in category 'helper') -----
+ matchesUnicodeSymbol: aString
+ 
+ 	^aString size = 4 and: [aString allSatisfy: [:c | c asLowercase between: $0 and: $f]]!
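
A note on the check above: `between: $0 and: $f` covers every character from value 48 to 102, so it also accepts a few characters that are not hex digits, such as $: or $_. The sketch below shows that and one stricter alternative. The block name isHexQuad is hypothetical and not part of this commit.

```smalltalk
"Illustration only; isHexQuad is a hypothetical name, not part of the commit."
| isHexQuad |
$: between: $0 and: $f.   "true: $: has value 58, which lies inside 48..102"
isHexQuad := [:aString |
	aString size = 4 and: [
		aString allSatisfy: [:c | '0123456789abcdefABCDEF' includes: c]]].
isHexQuad value: '00E9'.  "true"
isHexQuad value: '00:9'.  "false"
```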

Item was added:
+ ----- Method: RxsPredicate class>>unicodeCharacterFrom: (in category 'helper') -----
+ unicodeCharacterFrom: aParser
+ 
+ 	^Character value: (Integer readFrom: aParser next asString, aParser next, aParser next, aParser next base: 16)!
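
The conversion step in the helper above can be tried in isolation. This is a minimal sketch using a literal string in place of the four characters read from the parser; the variable name is illustrative.

```smalltalk
"Stand-alone sketch of the hex-to-character step; not part of the commit."
| codePoint |
codePoint := Integer readFrom: '00E9' base: 16.  "16rE9 = 233"
Character value: codePoint.                      "the character with code point 233"
```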


