Finding and indexing 'similar' string
Dean_Swan at Mitel.COM
Tue Aug 26 18:00:28 UTC 2003
Chris,
There is also an algorithm called 'Metaphone' that was originally
published in Computer Language in 1990. It does a somewhat better job
of matching similar-sounding words (at least in English). The principal
weakness of Soundex is that it always keeps the first letter of the word
as-is, and the same sound can often be spelled with a different first
letter (for example, 'Catherine' and 'Katherine').
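To make the first-letter issue concrete, here is a minimal Python sketch of the common "American" Soundex variant, plus a tiny sounds-like index built on top of it. The names soundex and build_index are my own for illustration and are not taken from Squeak, Magma, or any library; other Soundex variants differ in small details (such as how H and W are treated).

```python
from collections import defaultdict

# Consonant groups for the common "American" Soundex variant.
_CODES = {}
for _digit, _letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(word):
    """Return a 4-character Soundex code, or "" if word has no ASCII letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]                    # the first letter is kept verbatim
    prev = _CODES.get(letters[0], "")    # its digit still suppresses a repeat
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit:
            if digit != prev:            # collapse runs of the same digit
                code += digit
            prev = digit
        elif ch not in "HW":
            prev = ""                    # vowels (and Y) separate repeated digits
        # H and W are skipped and do not separate repeated digits
        if len(code) == 4:
            break
    return code.ljust(4, "0")            # pad short codes with zeros


def build_index(words):
    """Hypothetical helper: group words by Soundex code (a 'sounds-like' index)."""
    index = defaultdict(list)
    for w in words:
        index[soundex(w)].append(w)
    return index
```

With this sketch, soundex("Robert") and soundex("Rupert") both give "R163", so they land in the same index bucket, while soundex("Catherine") gives "C365" and soundex("Katherine") gives "K365", so a lookup on one will miss the other.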
You might also try searching for 'agrep' ("approximate grep"),
'string similarity', 'approximate string matching', or
'approximate pattern matching' for other references.
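As a rough idea of what 'approximate string matching' usually means, here is a small Python sketch of edit (Levenshtein) distance using plain dynamic programming. This is only an illustration: it is not how agrep works internally (agrep uses a faster bit-parallel technique), and the names edit_distance and similar_words are made up for this example.

```python
def edit_distance(a, b):
    """Number of single-character inserts, deletes, or substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))       # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def similar_words(query, candidates, max_distance=1):
    """Hypothetical helper: candidates within max_distance edits of query."""
    return [w for w in candidates if edit_distance(query, w) <= max_distance]
```

For example, edit_distance("kitten", "sitting") is 3, and similar_words("Katherine", ["Catherine", "Kathleen"]) returns ["Catherine"], which catches the first-letter variation that Soundex misses. The trade-off is that this compares the query against every candidate, so it does not give you an index key the way a phonetic code does.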
Here are a few fairly good references:
http://www.bitmechanic.com/mail-archives/mysql/Jan1998/0666.html
http://aspell.net/metaphone/metaphone-kuhn.txt
http://www.dcc.ufmg.br/~ghuiban/paa/tp3/node18.html
-Dean
Chris Muller <afunkyobject at yahoo.com>
Sent by: squeak-dev-bounces at lists.squeakfoundation.org
08/26/03 12:01 PM
Please respond to chris; Please respond to The general-purpose Squeak
developers list
To: Squeak List <squeak-dev at lists.squeakfoundation.org>
cc:
Subject: Re: Finding and indexing 'similar' string
Jim Menard wrote:
> How about using the Soundex algorithm? A quick Google search found this
> brief explanation <http://www.frontiernet.net/~rjacob/soundex.htm>
Ohhh! Thank you Jim! What a simple, well-explained method for a
sounds-like index. This would be a great new index type for
MagmaCollections.
Do you know whether it works for other keywords, or just surnames? I
would think it would, since some people's surnames are regular words
anyway.
- Chris