"Russell" wrote ...
what is the best method to remove a lot of words such as "a", "and", "I",
"so", "that", "this" ...etc ... from the search string, leaving only
the keywords? Essentially each page/field will be searched for the
occurrence of the user's input text, entered through an input field.
The idea being to only return the suitable records without a lot of
rubbish...
You want to create yourself an IGNORE WORDS list. I put one of these
together before; it contains about 390 words now, I think. You can search
Google and find lists like this pretty easily; the pain is sometimes having
to get them from the web into a format you can use, perhaps from a web page
to Excel to XML or something...
With these, you then just have a little function that is called passing in
your search criteria. Iterate through each word in the search criteria and
compare it against the ignore words: if you find a match, don't keep it as
good criteria; if there's no match, then keep it...
Example:
Dim aIgnoreWords(3)
aIgnoreWords(0) = "a"
aIgnoreWords(1) = "i"
aIgnoreWords(2) = "them"
aIgnoreWords(3) = "must"
bMatchFound = False
sNewSearchCriteria = ""
sSearchCriteria = "I must find them and a donkey"
aSearchCriteriaWords = Split(sSearchCriteria, " ")
' iterate through search criteria words (UBound is the last valid index)
For x = 0 To UBound(aSearchCriteriaWords)
    ' reset flag ready for this search criteria word
    bMatchFound = False
    ' iterate through our ignore words
    For y = 0 To UBound(aIgnoreWords)
        ' do we have a match?
        If LCase(aSearchCriteriaWords(x)) = LCase(aIgnoreWords(y)) Then
            ' match found!
            bMatchFound = True
            Exit For
        End If
    Next
    ' if we didn't match our criteria word to any ignore words then it's a
    ' good criteria word, add it to our new search criteria
    If bMatchFound = False Then
        ' add a space to separate the words
        If Len(sNewSearchCriteria) > 0 Then
            sNewSearchCriteria = sNewSearchCriteria & " "
        End If
        sNewSearchCriteria = sNewSearchCriteria & aSearchCriteriaWords(x)
    End If
Next
By the end of this, your sNewSearchCriteria string should contain: "find and
donkey" ("and" survives only because it isn't in this cut-down list of four
ignore words)
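The same idea can be wrapped up as the "little function" mentioned earlier. This is only a rough sketch: the name RemoveIgnoreWords is made up for illustration, and it swaps the inner loop for a Scripting.Dictionary lookup, which is my assumption about what suits a list of several hundred words rather than anything from the original example.

```vbscript
' Hypothetical helper: returns sSearchCriteria with every ignore word removed.
' aIgnoreWords is an array of ignore words, as in the example.
Function RemoveIgnoreWords(sSearchCriteria, aIgnoreWords)
    Dim oIgnore, aWords, sWord, sResult, i

    ' Load the ignore words into a dictionary keyed on the lower-cased word,
    ' so each criteria word needs one lookup instead of a full inner loop
    Set oIgnore = CreateObject("Scripting.Dictionary")
    For i = 0 To UBound(aIgnoreWords)
        oIgnore(LCase(aIgnoreWords(i))) = True
    Next

    sResult = ""
    aWords = Split(Trim(sSearchCriteria), " ")
    For i = 0 To UBound(aWords)
        sWord = aWords(i)
        ' Skip empty entries (caused by double spaces) and ignore words
        If Len(sWord) > 0 Then
            If Not oIgnore.Exists(LCase(sWord)) Then
                If Len(sResult) > 0 Then sResult = sResult & " "
                sResult = sResult & sWord
            End If
        End If
    Next

    RemoveIgnoreWords = sResult
End Function
```

Called as RemoveIgnoreWords("I must find them and a donkey", Array("a", "i", "them", "must")), it should hand back "find and donkey". An empty search string simply comes back as an empty string.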
Little example only and untested so it might error. You should also
consider doing some checks:
is Len(sSearchCriteria) greater than 0 (do we have anything to search for at
all?!)
have you loaded in your ignore words successfully (perhaps from a
database/XML)?
do you want to store the words that were omitted so you can do what Google
used to do... "The following common words were excluded: a, i, them, must"
etc.
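Those three checks could be folded into one routine along these lines. Treat it as a sketch under assumptions: the name SplitCriteria and the two ByRef output parameters are invented for illustration, and "ignore words not loaded" is assumed to mean the caller passed something that isn't an array.

```vbscript
' Hypothetical helper: fills sKept and sExcluded (both passed ByRef) and
' returns False when there is nothing usable to search for.
Function SplitCriteria(sSearchCriteria, aIgnoreWords, ByRef sKept, ByRef sExcluded)
    Dim aWords, x, y, bMatchFound

    sKept = ""
    sExcluded = ""
    SplitCriteria = False

    ' Check 1: do we have anything to search for at all?
    If Len(Trim(sSearchCriteria)) = 0 Then Exit Function

    ' Check 2: did the ignore words load? (assumption: an unloaded list
    ' arrives as something other than an array)
    If Not IsArray(aIgnoreWords) Then Exit Function

    aWords = Split(Trim(sSearchCriteria), " ")
    For x = 0 To UBound(aWords)
        bMatchFound = False
        For y = 0 To UBound(aIgnoreWords)
            If LCase(aWords(x)) = LCase(aIgnoreWords(y)) Then
                bMatchFound = True
                Exit For
            End If
        Next
        If bMatchFound Then
            ' Check 3: remember what was dropped so the user can be told
            If Len(sExcluded) > 0 Then sExcluded = sExcluded & ", "
            sExcluded = sExcluded & LCase(aWords(x))
        ElseIf Len(aWords(x)) > 0 Then
            If Len(sKept) > 0 Then sKept = sKept & " "
            sKept = sKept & aWords(x)
        End If
    Next

    ' True only if at least one good criteria word survived
    SplitCriteria = (Len(sKept) > 0)
End Function
```

For "I must find them and a donkey" with the four example ignore words, sKept ends up as "find and donkey" and sExcluded as "i, must, them, a", ready to append to a "The following common words were excluded:" message. A word typed twice would be listed twice; dedupe it if that matters to you.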
My list of ignore words is below. It's in XML format, but you could easily do
a REPLACE ALL in Word or something to remove the tags... hope it's of use.
Regards
Rob
<?xml version="1.0" encoding="utf-8" ?>
<IgnoreWords>
<Word>a</Word>
<Word>about</Word>
<Word>above</Word>
<Word>according</Word>
<Word>across</Word>
<Word>actually</Word>
<Word>adj</Word>
<Word>after</Word>
<Word>afterwards</Word>
<Word>again</Word>
<Word>against</Word>
<Word>all</Word>
<Word>almost</Word>
<Word>alone</Word>
<Word>along</Word>
<Word>already</Word>
<Word>also</Word>
<Word>although</Word>
<Word>always</Word>
<Word>among</Word>
<Word>amongst</Word>
<Word>an</Word>
<Word>and</Word>
<Word>another</Word>
<Word>any</Word>
<Word>anyhow</Word>
<Word>anyone</Word>
<Word>anything</Word>
<Word>anywhere</Word>
<Word>are</Word>
<Word>aren't</Word>
<Word>around</Word>
<Word>as</Word>
<Word>at</Word>
<Word>b</Word>
<Word>be</Word>
<Word>became</Word>
<Word>because</Word>
<Word>become</Word>
<Word>becomes</Word>
<Word>becoming</Word>
<Word>been</Word>
<Word>before</Word>
<Word>beforehand</Word>
<Word>begin</Word>
<Word>beginning</Word>
<Word>behind</Word>
<Word>being</Word>
<Word>below</Word>
<Word>beside</Word>
<Word>besides</Word>
<Word>between</Word>
<Word>beyond</Word>
<Word>billion</Word>
<Word>both</Word>
<Word>but</Word>
<Word>by</Word>
<Word>c</Word>
<Word>can</Word>
<Word>can't</Word>
<Word>cannot</Word>
<Word>caption</Word>
<Word>co</Word>
<Word>co.</Word>
<Word>could</Word>
<Word>couldn't</Word>
<Word>d</Word>
<Word>did</Word>
<Word>didn't</Word>
<Word>do</Word>
<Word>does</Word>
<Word>doesn't</Word>
<Word>don't</Word>
<Word>down</Word>
<Word>during</Word>
<Word>e</Word>
<Word>each</Word>
<Word>eg</Word>
<Word>eight</Word>
<Word>eighty</Word>
<Word>either</Word>
<Word>else</Word>
<Word>elsewhere</Word>
<Word>end</Word>
<Word>ending</Word>
<Word>enough</Word>
<Word>etc</Word>
<Word>even</Word>
<Word>ever</Word>
<Word>every</Word>
<Word>everyone</Word>
<Word>everything</Word>
<Word>everywhere</Word>
<Word>except</Word>
<Word>f</Word>
<Word>few</Word>
<Word>fifty</Word>
<Word>first</Word>
<Word>five</Word>
<Word>for</Word>
<Word>former</Word>
<Word>formerly</Word>
<Word>forty</Word>
<Word>found</Word>
<Word>four</Word>
<Word>from</Word>
<Word>further</Word>
<Word>g</Word>
<Word>h</Word>
<Word>had</Word>
<Word>has</Word>
<Word>hasn't</Word>
<Word>have</Word>
<Word>haven't</Word>
<Word>he</Word>
<Word>he'd</Word>
<Word>he'll</Word>
<Word>he's</Word>
<Word>hence</Word>
<Word>her</Word>
<Word>here</Word>
<Word>here's</Word>
<Word>hereafter</Word>
<Word>hereby</Word>
<Word>herein</Word>
<Word>hereupon</Word>
<Word>hers</Word>
<Word>herself</Word>
<Word>him</Word>
<Word>himself</Word>
<Word>his</Word>
<Word>how</Word>
<Word>however</Word>
<Word>hundred</Word>
<Word>i</Word>
<Word>i'd</Word>
<Word>i'll</Word>
<Word>i'm</Word>
<Word>i've</Word>
<Word>ie</Word>
<Word>if</Word>
<Word>in</Word>
<Word>inc.</Word>
<Word>indeed</Word>
<Word>instead</Word>
<Word>into</Word>
<Word>is</Word>
<Word>isn't</Word>
<Word>it</Word>
<Word>it's</Word>
<Word>its</Word>
<Word>itself</Word>
<Word>j</Word>
<Word>k</Word>
<Word>l</Word>
<Word>last</Word>
<Word>later</Word>
<Word>latter</Word>
<Word>latterly</Word>
<Word>least</Word>
<Word>less</Word>
<Word>let</Word>
<Word>let's</Word>
<Word>like</Word>
<Word>likely</Word>
<Word>ltd</Word>
<Word>m</Word>
<Word>made</Word>
<Word>make</Word>
<Word>makes</Word>
<Word>many</Word>
<Word>maybe</Word>
<Word>me</Word>
<Word>meantime</Word>
<Word>meanwhile</Word>
<Word>might</Word>
<Word>million</Word>
<Word>miss</Word>
<Word>more</Word>
<Word>moreover</Word>
<Word>most</Word>
<Word>mostly</Word>
<Word>mr</Word>
<Word>mrs</Word>
<Word>much</Word>
<Word>must</Word>
<Word>my</Word>
<Word>myself</Word>
<Word>n</Word>
<Word>namely</Word>
<Word>neither</Word>
<Word>never</Word>
<Word>nevertheless</Word>
<Word>next</Word>
<Word>nine</Word>
<Word>ninety</Word>
<Word>no</Word>
<Word>nobody</Word>
<Word>none</Word>
<Word>nonetheless</Word>
<Word>noone</Word>
<Word>nor</Word>
<Word>not</Word>
<Word>nothing</Word>
<Word>now</Word>
<Word>nowhere</Word>
<Word>o</Word>
<Word>of</Word>
<Word>off</Word>
<Word>often</Word>
<Word>on</Word>
<Word>once</Word>
<Word>one</Word>
<Word>one's</Word>
<Word>only</Word>
<Word>onto</Word>
<Word>or</Word>
<Word>other</Word>
<Word>others</Word>
<Word>otherwise</Word>
<Word>our</Word>
<Word>ours</Word>
<Word>ourselves</Word>
<Word>out</Word>
<Word>over</Word>
<Word>overall</Word>
<Word>own</Word>
<Word>p</Word>
<Word>per</Word>
<Word>perhaps</Word>
<Word>q</Word>
<Word>r</Word>
<Word>rather</Word>
<Word>recent</Word>
<Word>recently</Word>
<Word>s</Word>
<Word>same</Word>
<Word>seem</Word>
<Word>seemed</Word>
<Word>seeming</Word>
<Word>seems</Word>
<Word>seven</Word>
<Word>seventy</Word>
<Word>several</Word>
<Word>she</Word>
<Word>she'd</Word>
<Word>she'll</Word>
<Word>she's</Word>
<Word>should</Word>
<Word>shouldn't</Word>
<Word>since</Word>
<Word>six</Word>
<Word>sixty</Word>
<Word>so</Word>
<Word>some</Word>
<Word>somehow</Word>
<Word>someone</Word>
<Word>something</Word>
<Word>sometime</Word>
<Word>sometimes</Word>
<Word>somewhere</Word>
<Word>still</Word>
<Word>stop</Word>
<Word>stoplist</Word>
<Word>such</Word>
<Word>t</Word>
<Word>taking</Word>
<Word>ten</Word>
<Word>than</Word>
<Word>that</Word>
<Word>that'll</Word>
<Word>that's</Word>
<Word>that've</Word>
<Word>the</Word>
<Word>their</Word>
<Word>them</Word>
<Word>themselves</Word>
<Word>then</Word>
<Word>thence</Word>
<Word>there</Word>
<Word>there'd</Word>
<Word>there'll</Word>
<Word>there're</Word>
<Word>there's</Word>
<Word>there've</Word>
<Word>thereafter</Word>
<Word>thereby</Word>
<Word>therefore</Word>
<Word>therein</Word>
<Word>thereupon</Word>
<Word>these</Word>
<Word>they</Word>
<Word>they'd</Word>
<Word>they'll</Word>
<Word>they're</Word>
<Word>they've</Word>
<Word>thirty</Word>
<Word>this</Word>
<Word>those</Word>
<Word>though</Word>
<Word>thousand</Word>
<Word>three</Word>
<Word>through</Word>
<Word>throughout</Word>
<Word>thru</Word>
<Word>thus</Word>
<Word>to</Word>
<Word>together</Word>
<Word>too</Word>
<Word>toward</Word>
<Word>towards</Word>
<Word>trillion</Word>
<Word>twenty</Word>
<Word>two</Word>
<Word>u</Word>
<Word>under</Word>
<Word>unless</Word>
<Word>unlike</Word>
<Word>unlikely</Word>
<Word>until</Word>
<Word>up</Word>
<Word>upon</Word>
<Word>us</Word>
<Word>used</Word>
<Word>using</Word>
<Word>v</Word>
<Word>very</Word>
<Word>via</Word>
<Word>w</Word>
<Word>was</Word>
<Word>wasn't</Word>
<Word>we</Word>
<Word>we'd</Word>
<Word>we'll</Word>
<Word>we're</Word>
<Word>we've</Word>
<Word>well</Word>
<Word>were</Word>
<Word>weren't</Word>
<Word>what</Word>
<Word>what'll</Word>
<Word>what's</Word>
<Word>what've</Word>
<Word>whatever</Word>
<Word>when</Word>
<Word>whence</Word>
<Word>whenever</Word>
<Word>where</Word>
<Word>where's</Word>
<Word>whereafter</Word>
<Word>whereas</Word>
<Word>whereby</Word>
<Word>wherein</Word>
<Word>whereupon</Word>
<Word>wherever</Word>
<Word>whether</Word>
<Word>which</Word>
<Word>while</Word>
<Word>whither</Word>
<Word>who</Word>
<Word>who'd</Word>
<Word>who'll</Word>
<Word>who's</Word>
<Word>whoever</Word>
<Word>whole</Word>
<Word>whom</Word>
<Word>whomever</Word>
<Word>whose</Word>
<Word>why</Word>
<Word>will</Word>
<Word>with</Word>
<Word>within</Word>
<Word>without</Word>
<Word>won't</Word>
<Word>would</Word>
<Word>wouldn't</Word>
<Word>x</Word>
<Word>y</Word>
<Word>yes</Word>
<Word>yet</Word>
<Word>you</Word>
<Word>you'd</Word>
<Word>you'll</Word>
<Word>you're</Word>
<Word>you've</Word>
<Word>your</Word>
<Word>yours</Word>
<Word>yourself</Word>
<Word>yourselves</Word>
<Word>z</Word>
</IgnoreWords>
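If you keep the list as XML rather than stripping the tags, something like the following could read it into an array. It is a sketch only: LoadIgnoreWords is a made-up name, it assumes the list is saved to a file with the XML declaration on the very first line, and it assumes the MSXML parser is installed on the machine (it normally is on Windows).

```vbscript
' Hypothetical helper: reads every <Word> element from the XML file at sPath
' and returns the words as an array (an empty array if nothing could be read).
Function LoadIgnoreWords(sPath)
    Dim oXml, oNodes, aWords, i

    Set oXml = CreateObject("MSXML2.DOMDocument")
    oXml.async = False   ' wait for the load to finish before carrying on

    ' load returns False if the file is missing or is not well-formed XML
    If Not oXml.load(sPath) Then
        LoadIgnoreWords = Array()
        Exit Function
    End If

    Set oNodes = oXml.selectNodes("/IgnoreWords/Word")
    If oNodes.length = 0 Then
        LoadIgnoreWords = Array()
        Exit Function
    End If

    ReDim aWords(oNodes.length - 1)
    For i = 0 To oNodes.length - 1
        aWords(i) = LCase(oNodes.item(i).text)
    Next

    LoadIgnoreWords = aWords
End Function
```

In an ASP page the path would typically come from Server.MapPath with whatever file name you saved the list under. Checking that UBound of the returned array is 0 or more tells you whether the ignore words loaded successfully.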