[squeak-dev] The Trunk: Regex-Core-ct.55.mcz

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Sun May 10 11:33:32 UTC 2020


Thank you for reviewing and merging this all, Nicolas!


Actually, these versions were still WIP, as noted in the inbox thread <http://forum.world.st/The-Inbox-Regex-Core-ct-56-mcz-td5113011.html>.

Didn't you notice this or did you rate it as non-critical? :)

However, I guess it's not a big problem because this is not a regression.


Will fix the open bugs ASAP (but unfortunately, it may take me some weeks to find the time ...)!


Best,

Christoph

________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of commits at source.squeak.org <commits at source.squeak.org>
Sent: Friday, 8 May 2020 22:24:45
To: squeak-dev at lists.squeakfoundation.org; packages at lists.squeakfoundation.org
Subject: [squeak-dev] The Trunk: Regex-Core-ct.55.mcz

Nicolas Cellier uploaded a new version of Regex-Core to project The Trunk:
http://source.squeak.org/trunk/Regex-Core-ct.55.mcz

==================== Summary ====================

Name: Regex-Core-ct.55
Author: ct
Time: 6 March 2020, 7:08:55.997601 pm
UUID: 4f76095b-f67f-4c41-afec-d936b7dfeecb
Ancestors: Regex-Core-eem.54

Implements positive lookaheads in Regular Expressions for Squeak

There were already some stubs and a bit of documentation, but while negative lookaheads (such as 'q(?!u)' asRegex) already worked, positive lookaheads (such as 'q(?=u)' asRegex) never did.
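
As a rough workspace illustration of the difference (a sketch, not part of the commit): the snippet assumes the usual Regex-Core protocol String >> #asRegex and RxMatcher >> #search:, and the sample words are made up. The comments describe the common lookahead semantics this change aims for, not verified output.

```smalltalk
"Positive lookahead: a q counts only when a u follows it; the u itself is not consumed."
'q(?=u)' asRegex search: 'quit'.    "intended to succeed: the q is followed by u"
'q(?=u)' asRegex search: 'Iraq'.    "intended to fail: nothing follows the q"

"Negative lookahead: a q counts only when no u follows it."
'q(?!u)' asRegex search: 'Iraq'.    "intended to succeed"
'q(?!u)' asRegex search: 'quit'.    "intended to fail"
```

Evaluate each line with "print it" to compare the actual answers against the intent.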

- Fix erroneous parsing of positive lookahead syntax (the previous implementation missed a side effect of #regex)
- Add #positive argument to construction messages for lookahead nodes/links (see RxsLookaround >> #dispatchTo: and others; these steps had simply been forgotten)*
- In RxMatcher >> #matchAgainstLookahead:positive:nextLink:, actually respect the #positive argument
- Fix typos in documentation and category names
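
The decision made in RxMatcher >> #matchAgainstLookahead:positive:nextLink: can be tried in isolation. The following is a minimal sketch: gate is a hypothetical stand-in block, not part of Regex-Core, that takes the lookahead result, the #positive flag, and a block standing in for the rest of the match.

```smalltalk
| gate |
"Hypothetical stand-in for the new matcher logic. Binary = binds tighter than
 the keyword message and:, so this reads (lookaheadMatched = positive) and: [...]."
gate := [:lookaheadMatched :positive :rest |
	lookaheadMatched = positive and: [rest value]].

{ gate value: true  value: true  value: [true].    "positive lookahead, inner pattern matched"
  gate value: false value: true  value: [true].    "positive lookahead, inner pattern failed"
  gate value: false value: false value: [true].    "negative lookahead, inner pattern failed"
  gate value: true  value: false value: [true] }   "negative lookahead, inner pattern matched"
"print it: #(true false true false)"
```

The rest block is evaluated only when the lookahead result agrees with the flag, which mirrors how the next link is tried only after the lookahead condition holds.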

*Note: I decided to remove, but not deprecate, the original construction messages for lookahead nodes/links. The reason is that, IMO, the default value should never be the negative setting, which you would not expect at first glance. Also, all the link and node classes are more of an implementation detail of Regex-Core, so I think we do not need to move these methods into the Deprecated package. Please let me know if you agree with this.

Please review! Further information about lookaheads can be found here: https://www.regular-expressions.info/lookaround.html

=============== Diff against Regex-Core-eem.54 ===============

Item was removed:
- ----- Method: RxMatchOptimizer>>syntaxLookaround: (in category 'double dispatch') -----
- syntaxLookaround: lookaroundNode
-        "Do nothing."!

Item was added:
+ ----- Method: RxMatchOptimizer>>syntaxLookaround:positive: (in category 'double dispatch') -----
+ syntaxLookaround: lookaroundNode positive: positive
+        "Do nothing."!

Item was removed:
- ----- Method: RxMatcher>>matchAgainstLookahead:nextLink: (in category 'matching') -----
- matchAgainstLookahead: lookahead nextLink: anRmxLink
-
-        | position result |
-        position := stream position.
-        result := lookahead matchAgainst: self.
-        stream position: position.
-        result ifTrue: [ ^false ].
-        ^anRmxLink matchAgainst: self!

Item was added:
+ ----- Method: RxMatcher>>matchAgainstLookahead:positive:nextLink: (in category 'matching') -----
+ matchAgainstLookahead: lookahead positive: positive nextLink: anRmxLink
+
+        | position result |
+        position := stream position.
+        result := lookahead matchAgainst: self.
+        stream position: position.
+        ^ result = positive and: [
+                anRmxLink matchAgainst: self]!

Item was removed:
- ----- Method: RxMatcher>>syntaxLookaround: (in category 'double dispatch') -----
- syntaxLookaround: lookaroundNode
-        "Double dispatch from the syntax tree.
-        Special link can handle lookarounds (look ahead, positive and negative)."
-        | piece |
-        piece := lookaroundNode piece dispatchTo: self.
-        ^ RxmLookahead with: piece!

Item was added:
+ ----- Method: RxMatcher>>syntaxLookaround:positive: (in category 'double dispatch') -----
+ syntaxLookaround: lookaroundNode positive: positiveBoolean
+        "Double dispatch from the syntax tree.
+        Special link can handle lookarounds (look ahead, positive and negative)."
+        | piece |
+        piece := lookaroundNode piece dispatchTo: self.
+        ^ RxmLookahead with: piece positive: positiveBoolean!

Item was changed:
  ----- Method: RxParser>>lookAround (in category 'recursive descent') -----
  lookAround
+        "Parse a lookaround expression after: (?<lookaround>)
+        <lookaround> ::= !!<regex> | =<regex>"
+        | positive |
+        ('!!=' includes: lookahead) ifFalse: [
+                ^ self signalParseError: 'Invalid lookaround expression ?', lookahead asString].
+        positive := lookahead == $=.
-        "Parse a lookaround expression after: (?<lookround>)
-        <lookround> ::= !!<regex> | =<regex>"
-        | lookaround |
-        (lookahead == $!!
-        or: [ lookahead == $=])
-                ifFalse: [ ^ self signalParseError: 'Invalid lookaround expression ?', lookahead asString ].
         self next.
+        ^ RxsLookaround
+                with: self regex
+                positive: positive!
-        lookaround := RxsLookaround with: self regex.
-        lookahead == $!!
-                ifTrue: [ lookaround beNegative ].
-        ^ lookaround
-        !

Item was changed:
  RxmLink subclass: #RxmLookahead
+        instanceVariableNames: 'lookahead positive'
-        instanceVariableNames: 'lookahead'
         classVariableNames: ''
         poolDictionaries: ''
         category: 'Regex-Core'!

+ !RxmLookahead commentStamp: 'ct 3/6/2020 18:29' prior: 0!
+ Instance holds onto a lookahead which matches but does not consume anything.
- !RxmLookahead commentStamp: '<historical>' prior: 0!
- Instance holds onto a lookead which matches but does not consume anything.

+ Instance Variables
+        lookahead:              <RxmLink>
+        positive:               <Boolean>
+ !
- Instance variables:
-        predicate               <RxmLink>!

Item was removed:
- ----- Method: RxmLookahead class>>with: (in category 'instance creation') -----
- with: aPiece
-
-        ^self new lookahead: aPiece!

Item was added:
+ ----- Method: RxmLookahead class>>with:positive: (in category 'instance creation') -----
+ with: aPiece positive: aBoolean
+
+        ^self new lookahead: aPiece positive: aBoolean!

Item was removed:
- ----- Method: RxmLookahead>>lookahead: (in category 'accessing') -----
- lookahead: anRxmLink
-        lookahead := anRxmLink!

Item was added:
+ ----- Method: RxmLookahead>>lookahead:positive: (in category 'accessing') -----
+ lookahead: anRxmLink positive: aBoolean
+        lookahead := anRxmLink.
+        positive := aBoolean.!

Item was changed:
  ----- Method: RxmLookahead>>matchAgainst: (in category 'matching') -----
  matchAgainst: aMatcher
         "Match if the predicate block evaluates to true when given the
         current stream character as the argument."

+        ^aMatcher matchAgainstLookahead: lookahead positive: positive nextLink: next!
-        ^aMatcher matchAgainstLookahead: lookahead nextLink: next!

Item was removed:
- ----- Method: RxsLookaround class>>with: (in category 'instance creation') -----
- with: anRsxPiece
-        ^ self new
-                initializePiece: anRsxPiece!

Item was added:
+ ----- Method: RxsLookaround class>>with:positive: (in category 'instance creation') -----
+ with: aRxsRegex positive: positiveBoolean
+        ^ self new
+                initializePiece: aRxsRegex
+                positive: positiveBoolean!

Item was changed:
+ ----- Method: RxsLookaround>>beNegative (in category 'initialize-release') -----
- ----- Method: RxsLookaround>>beNegative (in category 'initailize-release') -----
  beNegative
         positive := false!

Item was changed:
+ ----- Method: RxsLookaround>>bePositive (in category 'initialize-release') -----
- ----- Method: RxsLookaround>>bePositive (in category 'initailize-release') -----
  bePositive
         positive := true!

Item was changed:
  ----- Method: RxsLookaround>>dispatchTo: (in category 'accessing') -----
  dispatchTo: aBuilder
+        "Inform the matcher of the kind of the node, and it will do whatever it has to."
+        ^aBuilder syntaxLookaround: self positive: self positive!
-        "Inform the matcher of the kind of the node, and it
-        will do whatever it has to."
-        ^aBuilder syntaxLookaround: self!

Item was added:
+ ----- Method: RxsLookaround>>initialize (in category 'initialize-release') -----
+ initialize
+
+        super initialize.
+        self bePositive.!

Item was removed:
- ----- Method: RxsLookaround>>initializePiece: (in category 'initailize-release') -----
- initializePiece: anRsxPiece
-        super initialize.
-        piece := anRsxPiece.!

Item was added:
+ ----- Method: RxsLookaround>>initializePiece:positive: (in category 'initialize-release') -----
+ initializePiece: anRsxPiece positive: positiveBoolean
+
+        piece := anRsxPiece.
+        positive := positiveBoolean.!

Item was added:
+ ----- Method: RxsLookaround>>positive (in category 'accessing') -----
+ positive
+
+        ^ positive!

