MagmaCollectionReader behavior

Miguel Enrique Cobá Martínez miguel.coba at gmail.com
Wed Apr 1 00:04:32 UTC 2009


It is not clear from the magmaseaside tutorial, but the code from 
http://wiki.squeak.org/squeak/6021:

initialize
	| users |
	users := MagmaCollection new.
	users addIndex: (MaSearchStringIndex attribute: #email) beAscii.
	self at: #users put: users

findUserByEmail: anEmail
	^ (self users where: [ :each | each email equals: anEmail ] ) firstOrNil

strongly suggests that the where: method, combined with the clause

each email equals: anEmail

gives an *exact* or *equal* match, but that is not the case.
In fact, the where: send returns a MagmaCollectionReader that stands for 
the *set* or *collection* of objects that matched the equals: clause as 
resolved through the index created for the MagmaCollection.

In this example, the index is created with the default key size (no 
keySize: specified) of 32 bits, which gives you only 4 meaningful 
characters when searching for a string. For example, if you have users 
with emails like:

user   email
1    'miguel at domain1.com'
2    'miguel at domain2.com'
3    'miguel.coba at domain3.com'

a message send like:

findUserByEmail: 'miguel at domain1.com'

will give you a MagmaCollectionReader that represents all 3 users in the 
database, because they all share the same 4 initial characters. After 
that, the firstOrNil message ensures that user #1 will *always* be 
returned, no matter which of these emails you pass to findUserByEmail:. 
So, the answer from

findUserByEmail: 'miguel at domain1.com'
findUserByEmail: 'miguel at domain2.com'
findUserByEmail: 'miguel.coba at domain3.com'

will always be user #1.
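
As a rough illustration of why this happens, here is a workspace sketch 
that uses only plain Squeak collections and no Magma at all. The 
4-character prefix is an assumption standing in for the 32-bit index key 
described above, and keyOf is a hypothetical helper block, not part of 
Magma:

```smalltalk
"Workspace sketch: plain Squeak collections only, no Magma involved.
 keyOf is a hypothetical stand-in for the truncated 32-bit index key."
| emails keyOf wanted candidates exact |
emails := #('miguel at domain1.com'
            'miguel at domain2.com'
            'miguel.coba at domain3.com').
wanted := 'miguel at domain2.com'.

"Keep only the first 4 characters, like the default index key."
keyOf := [:aString | aString first: (4 min: aString size)].

"Stage 1: coarse match on the truncated key."
candidates := emails select: [:each |
	(keyOf value: each) = (keyOf value: wanted)].

"Stage 2: exact comparison over the coarse candidates."
exact := candidates detect: [:each | each = wanted] ifNone: [nil].
```

All three emails share the key 'migu', so candidates holds all of them 
and candidates first is the domain1 address, whatever wanted is. Only 
the second stage picks out the domain2 address.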

In summary, the method does not behave correctly, because it can't be 
used for finding a specific user, which is its intended purpose.

After reading the Index documentation from the magma site, it was clear 
that this MagmaCollectionReader can't give accurate and exact results 
and by itself it can't be used for finding objects. You *always* have to 
apply some kind of searching over the already reduced collection 
represented by MagmaCollectionReader in order to find the *exact* match 
you are trying to locate.

So the code should be something like:

findUserByEmail: anEmail
	| user |
	"Here you are working over the entire magma repo"
	user := (self users where: [:each | each email equals: anEmail])
		"Here you are working over the reduced set
		 returned by the where: and represented by a
		 MagmaCollectionReader"
		detect: [:each |
			"Here you are working on a plain Collection"
			each email = anEmail]
		ifNone: [nil].
	^ user


After changing the code this way, the example correctly finds each of 
the users with emails 'miguel at domain1.com', 'miguel at domain2.com' 
and 'miguel.coba at domain3.com'.

Can someone confirm that this is the correct way to use a 
MagmaCollectionReader?

P.S. I tried a larger keySize: at index creation (I even tried 400 
bits), but this only postponed the point where the string matching 
stopped working. It is also inefficient, and with 400 bits Squeak 
throws an error. So that was not the way to go.

Thanks for your comments,
Miguel Cobá

