string similarity comparison

I'm looking for a method to compare two strings and grade them for
similarity.

My idea is to strip out common words and punctuation and create a checksum
of each remaining string. I would then compare the checksums and if they are
close then there's a potential match (to be judged by user interaction.)

Can someone suggest an existing function or class to perform such a task?

Thanks!
Jul 17 '05 #1
11 Replies


Bosconian wrote:
I'm looking for a method to compare two strings and grade them for
similarity.


http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html
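
A minimal sketch of how these two built-in functions could be called, assuming two made-up sample strings. levenshtein() grades by edit distance (lower is closer), while similar_text() counts matching characters and can also report a percentage (higher is closer).

```php
<?php
// Hypothetical sample strings, chosen only for illustration.
$a = 'kitten';
$b = 'sitting';

// levenshtein() returns the minimum number of single-character
// insertions, deletions and substitutions needed to turn $a into $b.
// 0 means the strings are identical.
$distance = levenshtein($a, $b);

// similar_text() returns the number of matching characters and,
// through the optional third argument, a similarity percentage.
$matching = similar_text($a, $b, $percent);

printf("distance=%d matching=%d percent=%.1f\n", $distance, $matching, $percent);
```

Which of the two fits better depends on the data: an edit distance is easy to threshold, while the percentage is easier to compare across strings of different lengths.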

Cheers,
Nicholas Sherlock
Jul 17 '05 #2

"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:
I'm looking for a method to compare two strings and grade them for
similarity.


http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html

Cheers,
Nicholas Sherlock


Nicholas, I'm embarrassed to say that I didn't check php.net before posting
my message. Shame on me, but thanks for the tip!
Jul 17 '05 #3

BTW, would you happen to know if this can be done at the query level?

"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:
I'm looking for a method to compare two strings and grade them for
similarity.


http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html

Cheers,
Nicholas Sherlock

Jul 17 '05 #4

"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:
I'm looking for a method to compare two strings and grade them for
similarity.


http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html

Cheers,
Nicholas Sherlock


BTW, would you happen to know if this can be done at the query level?
Jul 17 '05 #5

Bosconian wrote:
"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:
I'm looking for a method to compare two strings and grade them for
similarity.


http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html

BTW, would you happen to know if this can be done at the query level?


Ah, do you want to do something like this made-up query:

SELECT * FROM mytable WHERE text IS SORT OF SIMILAR TO $mysearch ?

If so, you can't do this with PHP functions. You may be able to find an
add-on for your database server which will add functionality like this,
but I don't think that it comes standard with any databases.

Cheers,
Nicholas Sherlock
Jul 17 '05 #6

Nicholas Sherlock wrote:
Bosconian wrote: <snip>
http://docs.php.net/en/function.similar-text.html

BTW, would you happen to know if this can be done at the query level?

Ah, do you want to do something like this made-up query:

SELECT * FROM mytable WHERE text IS SORT OF SIMILAR TO $mysearch ?

If so, you can't do this with PHP functions. You may be able to find an add-on for your database server which will add functionality like this, but I don't think that it comes standard with any databases.


MySQL supports user-defined functions, but in SQLite it is easier
AFAIK (never tried): you can mix PHP user functions with queries.
<http://in2.php.net/sqlite>
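
A rough sketch of that idea, assuming the sqlite3 extension is enabled and using an in-memory database. The table and column names follow the mock query above; the sample rows, the search string and the cutoff of 3 are hypothetical. SQLite3::createFunction() registers a PHP callback so that SQL can call it by name.

```php
<?php
// In-memory database with a hypothetical table and a few sample rows.
$db = new SQLite3(':memory:');
$db->exec('CREATE TABLE mytable (id INTEGER PRIMARY KEY, text TEXT)');

$insert = $db->prepare('INSERT INTO mytable (text) VALUES (:text)');
foreach (['string similarity', 'string similarty', 'unrelated row'] as $value) {
    $insert->bindValue(':text', $value, SQLITE3_TEXT);
    $insert->execute();
    $insert->reset();
}

// Register a two-argument SQL function backed by PHP's levenshtein().
$db->createFunction('levenshtein', function ($a, $b) {
    return levenshtein((string) $a, (string) $b);
}, 2);

// Select rows within a small edit distance of the search string.
$mysearch = 'string similarity';
$stmt = $db->prepare(
    'SELECT text, levenshtein(text, :search) AS distance
       FROM mytable
      WHERE levenshtein(text, :search) <= 3
      ORDER BY distance'
);
$stmt->bindValue(':search', $mysearch, SQLITE3_TEXT);
$result = $stmt->execute();

$matches = [];
while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
    $matches[] = $row;
    echo $row['distance'], "\t", $row['text'], "\n";
}
```

The callback runs once per row, so this is a full table scan; that should be fine for small tables but is unlikely to scale to large ones.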
--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Jul 17 '05 #7

R. Rajesh Jeba Anbiah wrote:
<http://in2.php.net/sqlite>


To avoid having everyone over for a party at _your_ local server, you
should trim your url: <http://php.net/sqlite> (it then jumps to a local
server appropriate for the visitor).
--
Firefox Web Browser - Rediscover the web - http://getffox.com/
Thunderbird E-mail and Newsgroups - http://gettbird.com/
Jul 17 '05 #8

Ewoud Dronkert wrote:
R. Rajesh Jeba Anbiah wrote:
<http://in2.php.net/sqlite>

To avoid having everyone over for a party at _your_ local server, you
should trim your url: <http://php.net/sqlite> (it then jumps to a local
server appropriate for the visitor).


Oh, yes.

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Jul 17 '05 #9

"Ewoud Dronkert" <fi*******@lastname.net.invalid> wrote in message
news:42*********************@dreader4.news.xs4all.nl...
R. Rajesh Jeba Anbiah wrote:
<http://in2.php.net/sqlite>


To avoid having everyone over for a party at _your_ local server, you
should trim your url: <http://php.net/sqlite> (it then jumps to a local
server appropriate for the visitor).
--
Firefox Web Browser - Rediscover the web - http://getffox.com/
Thunderbird E-mail and Newsgroups - http://gettbird.com/


I usually use the server in Finland. Less traffic than the stateside
servers.
Jul 17 '05 #10

Chung Leong wrote:
"Ewoud Dronkert" <fi*******@lastname.net.invalid> wrote in message
news:42*********************@dreader4.news.xs4all.nl...
R. Rajesh Jeba Anbiah wrote:
<http://in2.php.net/sqlite>


To avoid having everyone over for a party at _your_ local server, you should trim your url: <http://php.net/sqlite> (it then jumps to a local server appropriate for the visitor).


I usually use the server in Finland. Less traffic than the stateside
servers.


I think the mirror redirection is random and buggy. For me, it
often redirects to a heavy-traffic mirror.

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Jul 17 '05 #11

"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:
"Nicholas Sherlock" <n_********@hotmail.com> wrote in message
news:d4**********@lust.ihug.co.nz...
Bosconian wrote:

I'm looking for a method to compare two strings and grade them for
similarity.

http://docs.php.net/en/function.levenshtein.html

or

http://docs.php.net/en/function.similar-text.html

BTW, would you happen to know if this can be done at the query level?


Ah, do you want to do something like this made-up query:

SELECT * FROM mytable WHERE text IS SORT OF SIMILAR TO $mysearch ?

If so, you can't do this with PHP functions. You may be able to find an
add-on for your database server which will add functionality like this,
but I don't think that it comes standard with any databases.

Cheers,
Nicholas Sherlock


Something like your mock query makes sense... kind of a LIKE clause on
steroids. I'm surprised MySQL doesn't support it.

In my case it's not a big deal. I'm only dealing with a couple hundred
records at most. I can simply loop through the recordset (like the
levenshtein example on php.net) and flag any record whose distance is
less than 10 or whatever.
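
That loop might look roughly like the sketch below. The records, the search string and the helper normalize_for_compare() are hypothetical stand-ins for whatever the recordset actually holds; the cutoff of 10 is the figure mentioned above and would need tuning. The helper lower-cases each string and strips punctuation before comparing, in the spirit of the original question.

```php
<?php
// Hypothetical records standing in for rows fetched from the database.
$records = ['Acme Widgets Inc.', 'ACME Widget, Inc', 'Bolt & Nut Supply', 'Acme Gadgets'];
$needle = 'Acme Widgets Inc';
$maxDistance = 10; // cutoff from the post; tune for real data

// Hypothetical helper: lower-case, drop punctuation, collapse whitespace.
function normalize_for_compare($s)
{
    $s = strtolower($s);
    $s = preg_replace('/[^a-z0-9\s]+/', '', $s);
    return trim(preg_replace('/\s+/', ' ', $s));
}

// Keep every record whose edit distance to the needle is under the cutoff.
$candidates = [];
$normalizedNeedle = normalize_for_compare($needle);
foreach ($records as $record) {
    $distance = levenshtein($normalizedNeedle, normalize_for_compare($record));
    if ($distance < $maxDistance) {
        $candidates[$record] = $distance;
    }
}

// Closest matches first, ready to show to the user for a final decision.
asort($candidates);
print_r($candidates);
```

A fixed cutoff is lenient for short strings, so scaling it by string length may work better on mixed data.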
Jul 17 '05 #12
