[squeak-dev] The Inbox: Regex-Core-tobe.62.mcz
Thiede, Christoph
Christoph.Thiede at student.hpi.uni-potsdam.de
Mon Oct 18 12:27:30 UTC 2021
Hi Tom,
this looks similar to Regex-Core-ct.68. :-) I will compare both patches in depth later; maybe we can merge the best of both approaches.
(For the future, maybe someone should build a tiny tool that automatically warns you when you start editing a class/protocol/method for which there are already open patches in The Inbox ... :D)
Best,
Christoph
________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org>
Sent: Monday, 18 October 2021 12:55:02
To: squeak-dev at lists.squeakfoundation.org
Subject: [squeak-dev] The Inbox: Regex-Core-tobe.62.mcz
A new version of Regex-Core was added to project The Inbox:
http://source.squeak.org/inbox/Regex-Core-tobe.62.mcz
==================== Summary ====================
Name: Regex-Core-tobe.62
Author: tobe
Time: 18 October 2021, 12:55:00.939602 pm
UUID: 02286bc7-4450-4843-9988-40b1dc9bfa70
Ancestors: Regex-Core-mt.61
Add support for \uXXXX for specifying unicode code points
=============== Diff against Regex-Core-mt.61 ===============
Item was changed:
----- Method: RxCharSetParser>>parseEscapeChar (in category 'parsing') -----
  parseEscapeChar
+     | first last |
-     | first |
      self match: $\.
      first := (RxsPredicate forEscapedLetter: lookahead)
+         ifNil: [
+             (lookahead = $u and: [RxsPredicate matchesUnicodeSymbol: (source peek: 4)])
+                 ifTrue: [RxsCharacter with: (RxsPredicate unicodeCharacterFrom: self)]
+                 ifFalse: [RxsCharacter with: lookahead]].
-         ifNil: [ RxsCharacter with: lookahead ].
      self next == $- ifFalse: [^ elements add: first].
      self next ifNil: [
          elements add: first.
          ^ self addChar: $-].
+     last := lookahead = $\
+         ifTrue: [
+             self next.
+             (RxsPredicate forEscapedLetter: lookahead)
+                 ifNil: [
+                     (lookahead = $u and: [RxsPredicate matchesUnicodeSymbol: (source peek: 4)])
+                         ifTrue: [RxsCharacter with: (RxsPredicate unicodeCharacterFrom: self)]
+                         ifFalse: [RxsCharacter with: lookahead]]]
+         ifFalse: [ | char |
+             char := RxsCharacter with: lookahead.
+             self next.
+             char].
+     self addRangeFrom: first character to: last character!
-     self addRangeFrom: first character to: lookahead.
-     self next!
Item was changed:
----- Method: RxParser>>ifSpecial:then: (in category 'private') -----
  ifSpecial: aCharacter then: aBlock
      "If the character is such that it defines a special node when follows a $\,
      then create that node and evaluate aBlock with the node as the parameter.
      Otherwise just return."
      | classAndSelector |
+     classAndSelector := BackslashSpecials at: aCharacter ifAbsent: [
+         " check if we have four hex digits for a unicode code symbol following "
+         (aCharacter = $u and: [RxsPredicate matchesUnicodeSymbol: (input peek: 4)]) ifTrue: [
+             ^ aBlock value: (RxsPredicate forUnicodeFrom: self)].
+         ^self].
-     classAndSelector := BackslashSpecials at: aCharacter ifAbsent: [^self].
      ^aBlock value: (classAndSelector key new perform: classAndSelector value)!
Item was added:
+ ----- Method: RxsPredicate class>>forUnicodeFrom: (in category 'instance creation') -----
+ forUnicodeFrom: aParser
+
+     ^RxsPredicate new beCharacter: (self unicodeCharacterFrom: aParser)!
Item was added:
+ ----- Method: RxsPredicate class>>matchesUnicodeSymbol: (in category 'helper') -----
+ matchesUnicodeSymbol: aString
+
+     ^aString size = 4 and: [aString allSatisfy: [:c | c asLowercase between: $0 and: $f]]!
Item was added:
+ ----- Method: RxsPredicate class>>unicodeCharacterFrom: (in category 'helper') -----
+ unicodeCharacterFrom: aParser
+
+     ^Character value: (Integer readFrom: aParser next asString, aParser next, aParser next, aParser next base: 16)!
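
A small workspace sketch for trying the two helper ideas from the diff in isolation; it does not need the patch loaded. The block names rangeCheck and strictCheck are made up for illustration and are not part of Regex-Core. The sketch compares the character-range test used in matchesUnicodeSymbol: with an explicit hex-digit lookup, since the range from $0 to $f also spans characters such as $: and $;. It then shows the kind of four-digit conversion that unicodeCharacterFrom: performs. The stricter check is only one possible alternative, not something the patch does.

```smalltalk
"Workspace sketch; rangeCheck and strictCheck are hypothetical names, not part of Regex-Core."
| rangeCheck strictCheck |

"Same test as in matchesUnicodeSymbol: every character must lie between $0 and $f."
rangeCheck := [:aString |
    aString size = 4 and: [
        aString allSatisfy: [:c | c asLowercase between: $0 and: $f]]].

"Alternative (an assumption, not what the patch does): accept only real hex digits."
strictCheck := [:aString |
    aString size = 4 and: [
        aString allSatisfy: [:c | '0123456789abcdef' includes: c asLowercase]]].

rangeCheck value: '00E4'.    "true"
strictCheck value: '00E4'.   "true"
rangeCheck value: '00:;'.    "true, because $: and $; lie between $9 and $A"
strictCheck value: '00:;'.   "false"

"Conversion of four hex digits to a character, in the style of unicodeCharacterFrom:."
Character value: (Integer readFrom: '0041' base: 16).    "$A"
```

Evaluating the lines one by one with "print it" shows where the two checks disagree.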