Searching magma string indices
Chris Muller
chris at funkyobjects.org
Tue Mar 6 03:59:29 UTC 2007
Hi Brent,
> > Now, the expression '%foo%bar%' goes beyond the first requirement
> > of just searching on a prefix and/or a suffix. For this a different
> > type of index would be needed that adds every possible value for
> > "Fredfoon Tobart", the following values:
> >
> > fredfoon tobart
> > redfoon tobart
> > edfoon tobart
> > dfoon tobart
> > foon tobart
> > oon tobart
> > on tobart
> > n tobart
> > tobart
> > tobart
> > obart
> > bart
> > art
> > rt
> > t
> >
> > Your '%foo%bar%' expression would then need to translate to
> >
> > (familyName from: 'foo' to: 'foo' maAlphabeticalNext)
> > & (familyName from: 'bar' to: 'bar' maAlphabeticalNext)
> >
> > to find the object(s) that had "Fredfoon Tobart".
>
> I must admit I am not following you here. Is this one of those Magma
> keyword indices which I have never managed to grok ?
Yes. As you know, even though many indexes only map one value per
indexed attribute, any index can just as easily map multiple values per
object, per indexed attribute.
Have a look at MaClause>>#shouldInclude:using:, you can see very simply
how it treats multiple values; if any one of the (multiple) indexed
values for a particular object is "in range", then the entire object
qualifies for that clause.
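
As a rough illustration of that "any one value in range" rule, here is a minimal sketch in Python rather than Smalltalk. It is not the code of MaClause>>#shouldInclude:using:; the function name clause_includes and the inclusive range bounds are assumptions made only for this example.

```python
def clause_includes(indexed_values, low, high):
    """Return True if at least one indexed value lies within [low, high].

    Hypothetical model of a range clause over a multi-valued index:
    the object qualifies as soon as any one of its values is in range.
    """
    return any(low <= value <= high for value in indexed_values)


# An object indexed under three keywords qualifies for a clause
# whose range covers just one of them.
keywords = ["magma", "skiing", "squeak"]
print(clause_includes(keywords, "ski", "skj"))   # 'skiing' is in range
print(clause_includes(keywords, "a", "b"))       # none of the values is in range
```

A single-valued index is then just the special case where the list holds one value.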
It's more generic than a "keyword" index, but a keyword index is an
easy way to grok it, because you know when you search for a keyword,
any of the keywords assigned to an object will cause that object to
match.
So in the above example, searching the first clause
> > (familyName from: 'foo' to: 'foo' maAlphabeticalNext)
is qualified on the 5th element
> > foon tobart
and the second clause
> > & (familyName from: 'bar' to: 'bar' maAlphabeticalNext)
is qualified by the 12th element:
> > bart
therefore the entire expression is satisfied and the object qualifies.
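
The walk-through above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: alphabetical_next is a hypothetical stand-in for maAlphabeticalNext (here it simply bumps the last character, so 'foo' becomes 'fop'), the range is treated as inclusive, and the index is a plain list of suffixes instead of a real Magma index.

```python
def suffixes(text):
    """All suffixes of the lower-cased text, longest first."""
    s = text.lower()
    return [s[i:] for i in range(len(s))]


def alphabetical_next(s):
    # Assumption: stand-in for maAlphabeticalNext, bumps the last character.
    return s[:-1] + chr(ord(s[-1]) + 1)


def matching_elements(values, prefix):
    """1-based positions of the values within [prefix, alphabetical_next(prefix)]."""
    low, high = prefix, alphabetical_next(prefix)
    return [i for i, v in enumerate(values, start=1) if low <= v <= high]


def qualifies(text, *fragments):
    """Conjunction of one range clause per fragment, as in '%foo%bar%'."""
    values = suffixes(text)
    return all(matching_elements(values, f) for f in fragments)


values = suffixes("Fredfoon Tobart")
print(matching_elements(values, "foo"))   # position of 'foon tobart'
print(matching_elements(values, "bar"))   # position of 'bart'
print(qualifies("Fredfoon Tobart", "foo", "bar"))
```

One thing the sketch makes visible: the conjunction only checks that both fragments occur somewhere, so in this model a name containing "bar" before "foo" would qualify as well.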
> Also, whilst you are hunting elephants, SQL has both % and ?
> wildcards: % is any sequence of characters including the empty string
> and ? is precisely one character.
> So foo??bar would match fooABbar but not fooCDEbar and not fooFbar.
>
> Any chance you could bend Magma's indices into managing expressions
> with a fixed number of ?s (e.g. ?foo????bar??baz???)
Yes, there are several ways to do this but here's one. Create a new
index type that simply indexes a unique value for each character at a
particular position. Returning to the previous example, for an object
whose #familyName attribute is "fredfoon tobart", this index type would
index the object at the following values:
(256 * 1) + $f asciiValue
(256 * 2) + $r asciiValue
(256 * 3) + $e asciiValue
(256 * 4) + $d asciiValue
(256 * 5) + $f asciiValue
(256 * 6) + $o asciiValue
(256 * 7) + $o asciiValue
(256 * 8) + $n asciiValue
(256 * 9) + Character space asciiValue
(256 * 10) + $t asciiValue
(256 * 11) + $o asciiValue
(256 * 12) + $b asciiValue
(256 * 13) + $a asciiValue
(256 * 14) + $r asciiValue
(256 * 15) + $t asciiValue
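
The list above follows a single rule, (256 * position) + character code, with positions counted from 1. A minimal Python sketch of that rule follows; the function name positional_keys is made up for this example, and lower-casing the input is an assumption carried over from the lower-case example string.

```python
def positional_keys(text):
    """One integer key per character: (256 * position) + character code.

    Positions are 1-based, matching the list in the message. Using 256 as
    the multiplier keeps keys for different positions from colliding as
    long as every character code is below 256.
    """
    return [256 * pos + ord(ch) for pos, ch in enumerate(text.lower(), start=1)]


keys = positional_keys("fredfoon tobart")
print(len(keys))   # one key per character
print(keys[0])     # (256 * 1) + code of 'f'
print(keys[8])     # (256 * 9) + code of the space
```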
Then, whenever a single-character-matching string is specified, every
character other than the question-marks becomes a conjunction in the
query. If the user is searching:
where: [ : p | p familyName like: '????foon toba??' ]
then it would have to be parsed into the following regular clause:
(familyName = ((256*5) + $f asciiValue))
& (familyName = ((256*6) + $o asciiValue))
& (familyName = ((256*7) + $o asciiValue))
& (familyName = ((256*8) + $n asciiValue))
& (familyName = ((256*9) + Character space asciiValue))
& (familyName = ((256*10) + $t asciiValue))
& (familyName = ((256*11) + $o asciiValue))
& (familyName = ((256*12) + $b asciiValue))
& (familyName = ((256*13) + $a asciiValue))
Notice the question marks are the positions on which we are not
qualifying, so any value can be placed in positions 1-4 and 14-15.
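
To make the translation concrete, here is a small Python model of it. It is a sketch only: pattern_keys and like are hypothetical helper names, the index is a plain set of integers instead of a Magma index, and input is lower-cased by assumption. For the pattern '????foon toba??' the non-wildcard characters sit at positions 5 through 13, so nine keys are produced.

```python
def positional_keys(text):
    """(256 * position) + character code for every character, 1-based."""
    return {256 * pos + ord(ch) for pos, ch in enumerate(text.lower(), start=1)}


def pattern_keys(pattern, wildcard="?"):
    """Keys for the non-wildcard characters only; wildcards add no clause."""
    return [
        256 * pos + ord(ch)
        for pos, ch in enumerate(pattern.lower(), start=1)
        if ch != wildcard
    ]


def like(text, pattern):
    """Conjunction: every key required by the pattern must be indexed."""
    indexed = positional_keys(text)
    return all(key in indexed for key in pattern_keys(pattern))


print(len(pattern_keys("????foon toba??")))          # number of clauses
print(like("fredfoon tobart", "????foon toba??"))    # every clause satisfied
print(like("fredfoom tobart", "????foon toba??"))    # position 8 differs
```

Because the trailing question marks generate no clause, this model does not constrain the total length of the string; a stricter version would need an extra clause or index entry for the length.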
Does this all make sense?
> It is too darn hot here to think.
Yeah, you're probably just already missing skiing down fresh powder..
:)