Finding and indexing 'similar' string

Jim Menard jimm at io.com
Mon Aug 25 15:22:22 UTC 2003


On Monday, August 25, 2003, at 03:14  AM, Joshua 'Schwa' Gargus wrote:

> On Mon, Aug 25, 2003 at 08:19:22AM +0200, Martin Drautzburg wrote:
>> does anybody know of a way of finding strings that match a given
>> pattern closely, but not necessarily exactly (like the Levenshtein
>> distance) available in Smalltalk ?
>
> You might take a look at 
> String>>correctAgainstDictionary:continuedFrom:,
> which is called when you type a variable/method name that doesn't
> exist in the system, and computes a list of likely spellings.
>
>>
>> And does anybody know a way to index strings so the strings that are
>> close to a pattern can be found quickly ?

How about using the Soundex algorithm? A quick Google search found this 
brief explanation <http://www.frontiernet.net/~rjacob/soundex.htm>, 
Soundex in Ruby <http://raa.ruby-lang.org/list.rhtml?name=Soundex>, and 
this C code <http://physics.nist.gov/cuu/Reference/soundex.html>.
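
To make the indexing idea concrete, here is a minimal sketch of American
Soundex plus a dictionary keyed by Soundex code. It is written in Python
rather than Smalltalk, and it is not taken from any of the pages linked
above; the names soundex, build_index and lookup are made up for
illustration. Keep in mind that Soundex groups words by rough English
pronunciation, so it acts as a coarse filter and is not a substitute for
an edit-distance measure.

```python
from collections import defaultdict

# Letter-to-digit table for American Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of word ("" if it has no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def build_index(words):
    """Hypothetical helper: group words by their Soundex code."""
    index = defaultdict(list)
    for w in words:
        index[soundex(w)].append(w)
    return index


def lookup(index, pattern):
    """Hypothetical helper: return indexed words that sound like pattern."""
    return index.get(soundex(pattern), [])


index = build_index(["Robert", "Rupert", "Rubin", "Martin"])
candidates = lookup(index, "Robart")
```

Because lookup is a single dictionary access, the candidates come back
quickly; a slower measure such as Levenshtein distance could then be
applied to that short list only.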

Jim
-- 
Jim Menard, jimm at io.com, http://www.io.com/~jimm/
"All those who believe in psychokinesis raise my hand." -- Anon.


