[Newbies] Re: ByteString>>match: greedyness of * ??
Ch Lamprecht
ch.l.ngre at online.de
Tue Jan 8 22:52:20 UTC 2008
Hello,
I browsed String>>startingAt:match:startingAt:
and changed two lines to make the error handling work as the author
probably intended.
startingAt: keyStart match: text startingAt: textStart
    "Answer whether text matches the pattern in this string.
    Matching ignores upper/lower case differences.
    Where this string contains #, text may contain any character.
    Where this string contains *, text may contain any sequence of characters."
    | anyMatch matchStart matchEnd i matchStr j ii jj |
    i := keyStart.
    j := textStart.
    "Check for any #'s"
    [i > self size ifTrue: [^ j > text size "Empty key matches only empty string"].
    (self at: i) = $#] whileTrue:
        ["# consumes one char of key and one char of text"
        j > text size ifTrue: [^ false "no more text"].
        i := i+1. j := j+1].
    "Then check for *"
    (self at: i) = $*
        ifTrue: [i = self size ifTrue:
                [^ true "Terminal * matches all"].
            "* means next match string can occur anywhere"
            anyMatch := true.
            matchStart := i + 1]
        ifFalse: ["Otherwise match string must occur immediately"
            anyMatch := false.
            matchStart := i].
    "Now determine the match string"
    matchEnd := self size.
    (ii := self indexOf: $* startingAt: matchStart) > 0 ifTrue:
        "changed the following line to:"
        [ii = matchStart ifTrue: [self error: '** not valid -- use * instead'].
        matchEnd := ii-1].
    (ii := self indexOf: $# startingAt: matchStart) > 0 ifTrue:
        "changed the following line to:"
        [ii = matchStart ifTrue: [self error: '*# not valid -- use #* instead'].
        matchEnd := matchEnd min: ii-1].
    matchStr := self copyFrom: matchStart to: matchEnd.
    "Now look for the match string"
    [jj := text findString: matchStr startingAt: j caseSensitive: false.
    anyMatch ifTrue: [jj > 0] ifFalse: [jj = j]]
        whileTrue:
            ["Found matchStr at jj. See if the rest matches..."
            (self startingAt: matchEnd+1 match: text startingAt: jj + matchStr size) ifTrue:
                [^ true "the rest matches -- success"].
            "The rest did not match."
            anyMatch ifFalse: [^ false].
            "Preceded by * -- try for a later match"
            j := j+1].
    ^ false "Failed to find the match string"
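
To check the change, something like the following can be evaluated in a
workspace. It is only a sketch: it assumes the patched method above is
installed in String and that #match: simply calls it with both start
indices set to 1. The variable name results is just for illustration.
With the change in place, both patterns should signal the error instead
of quietly answering false.

```smalltalk
"Workspace sketch, assuming the patched String>>startingAt:match:startingAt:
 is installed and that #match: delegates to it."
| results |
results := #('**' '*#') collect: [:pattern |
    "Answer the error text if the pattern is rejected,
     otherwise the Boolean answered by #match:"
    [pattern match: 'e']
        on: Error
        do: [:ex | ex return: ex messageText]].
results do: [:each | Transcript show: each printString; cr].
```

Trapping Error and answering its messageText keeps the check in a single
expression; without the handler each pattern would open a debugger.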
Kent Loobey wrote:
> On Tuesday 08 January 2008 13:03:22 Ch Lamprecht wrote:
>
>>nicolas cellier wrote:
>>
>>>This behavior is squeakish, other Smalltalk match differently:
>>>
>>>VW: '**' match: 'e'. "true"
>>>gst: '**' match: 'e'. "true"
>>>
>>>Anyway, this pattern matching is limited. How do you match a '*' itself?
>>>I thought your example might be interpreted as an escape sequence, but
>>>no, there is no escape in this simple matching.
>>>
>>>'**' match: '*'. "false"
>>>'\*' match: '*'. "false"
>>>
>>>Try VBregex or another regex package.
>>>
>>>Nicolas
>>
>>Hi,
>>thank you.
>>In addition to the expressions below, I found that #match: does not behave
>>as stated by the comment given in the method definition itself:
>>
>> From ByteString>>match:
>>
>>"
>> [snip]
>> 'foo*baz' match: 'foo23baz' true
>> 'foo*baz' match: 'foobaz' true <----
>> 'foo*baz' match: 'foo23bazo' false
>> 'foo' match: 'Foo' true
>> 'foo*baz*zort' match: 'foobazort' false
>> 'foo*baz*zort' match: 'foobazzort' false <----
>> [snip]
>>"
>
>
> In general * means any sequence of characters, including none.
>
> So the first one is foo any-character baz.
>
> The third is false because of the "o" on the end. If you wanted it to work
> you could put 'foo*baz*'.
>
> I don't know why the last one wasn't reported as true.
>
>
>>confused, Christoph
>>
>>
>>>Ch Lamprecht wrote:
>>>
>>>>Hello,
>>>>
>>>>I found the following results for some expressions using #match:
>>>>
>>>>'e' match: 'e'. "true"
>>>>'*' match: 'e'. "true"
>>>>'#' match: 'e'. "true"
>>>>
>>>>'*e' match: 'e'. "true"
>>>>'*#' match: 'e'. "false"
>>>>'**' match: 'e'. "false"
>>>>
>>>>'*' match: ''. "true"
>>>>'**' match: ''. "false"
>>>>
>>>>
>>>>Is this expected behavior?
>>>>Looks like * is sometimes 'greedy', sometimes not. (Comparing 4 and 5)
>>>>Thank you for any hints.