
dotlucene question: determine count of matched tokens?

I am trying to compare "titles" from two different lists and determine
the most likely match. When a similar entry appears in both lists, the
result works well, but I need to refine my search to filter out results
that do not appear in both.

I am currently getting a very high score for items that clearly do not
match. For instance, the item "Essence Of A Miracle : Elizabeth Neal"
matches 100% with the item "power of a flower : wilkinson neal", which is
clearly wrong: only three keywords match (I am not using stop words, so
"of" and "a" also match).

What I would like to do is also compare the number of matching keywords with
the total number of keywords. In the case above, since three keywords match
and there are six keywords total, only 50% of the keywords match. If I were
able to calculate that 50%, I could then adjust the score so that matches
that do not share a certain percentage of keywords (say 80%) would be
discarded. Is this possible? Can I get access to a keyword count (both
total and matching)?
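
To make the calculation concrete, here is a minimal sketch in plain Python of the ratio I have in mind. It does not use any DotLucene API. The function names (tokenize, keyword_overlap, passes_threshold) and the simple regex tokenizer are assumptions for illustration only, and I am assuming "total" means the number of keywords in the title being searched for. A real analyzer would likely tokenize differently.

```python
import re


def tokenize(title):
    """Lowercase a title and split it into alphanumeric keywords.

    Assumption: punctuation such as ':' is not a keyword and no stop
    words are removed, so "of" and "a" still count.
    """
    return re.findall(r"[a-z0-9]+", title.lower())


def keyword_overlap(query_title, candidate_title):
    """Return (matched, total, ratio) for the query title's keywords.

    matched: query keywords that also occur in the candidate title
    total:   number of keywords in the query title
    ratio:   matched / total, or 0.0 when the query has no keywords
    """
    query_tokens = tokenize(query_title)
    candidate_tokens = set(tokenize(candidate_title))
    matched = sum(1 for token in query_tokens if token in candidate_tokens)
    total = len(query_tokens)
    ratio = matched / total if total else 0.0
    return matched, total, ratio


def passes_threshold(query_title, candidate_title, min_ratio=0.8):
    """Keep a candidate only if enough of the query keywords matched."""
    _, _, ratio = keyword_overlap(query_title, candidate_title)
    return ratio >= min_ratio


matched, total, ratio = keyword_overlap(
    "Essence Of A Miracle : Elizabeth Neal",
    "power of a flower : wilkinson neal",
)
print(matched, total, ratio)
```

For the two titles above this gives 3 matching keywords out of 6, a ratio of 0.5, so an 80% cutoff would discard the hit no matter what score the search engine reported.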
Feb 1 '06 #1


