Finding and indexing 'similar' string

Yoshiki Ohshima Yoshiki.Ohshima at acm.org
Mon Aug 25 15:51:36 UTC 2003


  Martin,

  I happened to have written something like this a while ago.

  According to the class comment, it is based on:
----------
Baeza-Yates, R. A., and Gonnet, G.H., A new approach to text
searching, {\it Communications of the ACM}, 35, 10 (October
1992), pp 74-82.

Wu, S., and Manber, U., Fast text searching allowing errors,
{\it Communications of the ACM}, 35, 10 (October 1992), pp 83-91.
----------

  I don't remember how well it worked, but I had an application that
used this, so it is probably not too bad if you know its limitations ^^;
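
  For anyone reading without the attachment, here is a minimal sketch of
the bit-parallel idea described in the Wu and Manber paper, written in
Python rather than Smalltalk. It is not the code in Text-FuzzyMatching.st;
the function name fuzzy_find, its arguments, and the choice to return the
end index of the first match are assumptions made for illustration.

```python
def fuzzy_find(pattern, text, max_errors):
    """Return the index of the last text character of the first substring
    that matches `pattern` with at most `max_errors` edits (insertions,
    deletions, substitutions), or -1 if there is none.

    Illustrative sketch of bit-parallel approximate matching (shift-and
    form); the name and return convention are hypothetical.
    """
    m = len(pattern)
    if m == 0 or not 0 <= max_errors < m:
        raise ValueError("need a non-empty pattern and 0 <= max_errors < len(pattern)")

    # masks[ch] has bit i set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1      # keep only the m state bits
    accept = 1 << (m - 1)    # bit set when the whole pattern has matched

    # state[d] bit i set: pattern[0..i] matches a suffix of the text read
    # so far with at most d errors. Before any text is read, d errors can
    # account for the first d pattern characters being missing.
    state = [(1 << d) - 1 for d in range(max_errors + 1)]

    for j, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = state[0]                      # old state[d - 1]
        state[0] = ((state[0] << 1) | 1) & mask  # exact matching step
        for d in range(1, max_errors + 1):
            old = state[d]
            state[d] = (
                (((old << 1) | 1) & mask)   # current character matches
                | prev_old                  # extra character in the text
                | (prev_old << 1)           # substituted character
                | (state[d - 1] << 1)       # pattern character missing from text
                | 1
            ) & full
            prev_old = old
        if state[max_errors] & accept:
            return j
    return -1
```

  For example, fuzzy_find("hello", "say helo world", 1) reports a match
ending at the "o" of "helo", while the same call with max_errors = 0 finds
nothing. Note that this only scans one text; it does not index a set of
strings.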

-- Yoshiki

At 25 Aug 2003 08:19:22 +0200,
Martin Drautzburg wrote:
> 
> Does anybody know of a way of finding strings that match a given
> pattern closely, but not necessarily exactly (like the Levenshtein
> distance), available in Smalltalk?
> 
> And does anybody know a way to index strings so the strings that are
> close to a pattern can be found quickly ?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Text-FuzzyMatching.st
Type: application/octet-stream
Size: 6472 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20030825/80bc8c1f/Text-FuzzyMatching.obj

