el*******@hotmail.com wrote:

This algorithm is called soundex. Here is one implementation example.

http://aspn.activestate.com/ASPN/Coo...n/Recipe/52213

here is another:

http://effbot.org/librarybook/soundex.htm

Soundex is *one* particular algorithm for approximate

string matching. It is optimised for matching

Anglo-American names (like Smith/Smythe), and is

considered to be quite old and obsolete for all but the

most trivial applications -- or so I'm told.

Soundex will not match arbitrary changes -- it will

match both cat and cet, but it won't match cat and mat.

A more sophisticated approximate string matching

algorithm will use the Levenshtein distance. You can

find a Useless implementation here:

http://www.uselesspython.com/download.php?script_id=108
Given a function levenshtein(s1, s2) that returns the

distance between two strings, you could use it for

approximate matching like this:

def approx_matching(strlist, target, dist=1):

"""Matches approximately strings in strlist to

a target string.

Returns a list of strings, where each string

matched is no further than an edit distance of

dist from the target.

"""

found = []

for s in strlist:

if levenshtein(s, target) <= dist:

found.append(s)

return s

--

Steven.