If anyone has experience with DotLucene, then this question might be right
up your alley.
I have two lists of music titles, each from a different source. I am
trying to determine possible matches so that I can associate them. I know
that any association based on text will not be perfect, but I am interested
in the probability of two titles being the same.
Using DotLucene, I have created an index from one set, and I am enumerating
the second set while performing a search for each title. I get back a set of
hits, but the results are not very precise. Maybe it is because titles do
not provide much content to search on. I am looking for a way to make the
results stricter, or to take word position and proximity into account when
calculating the score.
For example, the title "Give In To Me" is matching 100% with the title
"Heaven Give Me World". Likewise, "You Rock My World (Dance Mix)" is a 100%
match for "How's My World Treatin' You".
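To make concrete what I mean by taking word order into account, here is a
rough sketch in Python of the kind of score I would expect. This is not
DotLucene code and not how DotLucene scores hits; the function names
(tokens, lcs_length, strict_score) are made up for illustration, and the
formula (twice the longest common in-order word sequence divided by the
total word count) is just one assumption about what "strict" could mean.

```python
import re

def tokens(title):
    # Lowercase the title and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", title.lower())

def lcs_length(a, b):
    # Length of the longest common subsequence of two word lists,
    # i.e. the most words the titles share in the same relative order.
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def strict_score(t1, t2):
    # Hypothetical order-aware score in [0, 1]; 1.0 only when both titles
    # contain exactly the same words in the same order.
    a, b = tokens(t1), tokens(t2)
    if not a or not b:
        return 0.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))

print(round(strict_score("Give In To Me", "Heaven Give Me World"), 2))  # 0.5
print(round(strict_score("You Rock My World (Dance Mix)",
                         "How's My World Treatin' You"), 2))            # 0.36
```

With a score like this, the two pairs above would come out at roughly 50%
and 36% instead of 100%, which is much closer to what I am after.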
Can I tighten the search system so that it is more strict?