Search - stop words - array/database/text file?

Lo all,

Ok - I'm adding site search functionality to a database-driven website.

I have a list of 390 stop/ignore words. Having looked at ASPFAQ already, I
see that the example uses an array. What I was wondering was whether this
would still be the best practice for this quantity of stop words?

There is a larger overhead in defining the array initially, as I will
have to hard-code them all in. Alternatively, I thought I could import them
into a SQL Server table from the Excel file they are currently in and then
query that, but I believe the ASPFAQ article gave a good reason for not
doing that. My last thought was to read them in from a text file...

Anyone got any thoughts? Would an array be as efficient for 390 stop
words as it is for 10-20? Is it better to hard-code them rather than grab
them from a database?

Any help / advice would be appreciated.

Regards

Rob
Jul 19 '05 #1
If the list will be static, I would store it in an Application variable,
making the decision as to whether to store it in a database or a textfile
superfluous. If the list has no relationship to any of your database data,
then a text file on your web server seems to be indicated.

Moreover, I would suggest storing it as an XML DOMDocument, allowing you to
use the XML Parser DOM methods to easily search for values in the list.

Bob Barrows

--
Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"
Jul 19 '05 #2
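A minimal sketch of the Application-variable idea Bob describes, assuming the words sit one per line in a text file. The file name `stopwords.txt`, the Application key `"StopWords"`, and the helper name `GetStopWords` are hypothetical, chosen only for illustration. The file is read the first time the function is called and the resulting array is reused from the Application object afterwards.

```vbscript
<%
' Sketch only. Assumptions: stopwords.txt (hypothetical name) lives in the
' same folder as this page, one word per line, CRLF line endings;
' "StopWords" is a hypothetical Application key.
Function GetStopWords()
    Dim fso, ts, text
    If Not IsArray(Application("StopWords")) Then
        ' Not cached yet: read the whole file once.
        Set fso = Server.CreateObject("Scripting.FileSystemObject")
        Set ts = fso.OpenTextFile(Server.MapPath("stopwords.txt"), 1) ' 1 = ForReading
        text = ts.ReadAll
        ts.Close
        ' Store the array in the Application object for later requests.
        Application.Lock
        Application("StopWords") = Split(text, vbCrLf)
        Application.UnLock
    End If
    GetStopWords = Application("StopWords")
End Function
%>
```

If the file ends with a blank line, the last array element will be an empty string, so whatever consumes the array may want to skip empty entries.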
"Bob Barrows" wrote ...
> If the list will be static, I would store it in an Application variable,
> making the decision as to whether to store it in a database or a textfile
> superfluous.

Hi Bob,

Yes, initially this list will definitely be static; I do not plan to add to
the list dynamically at this stage.

> If the list has no relationship to any of your database data,
> then a text file on your web server seems to be indicated.

OK.

> Moreover, I would suggest storing it as an XML DOMDocument, allowing you to
> use the XML Parser DOM methods to easily search for values in the list.

Hmmm... hadn't thought of XML for this.

Wouldn't this be quite a bit of extra code, considering all I want to do is
iterate through the list and chop those words out of the original search
string? Maybe not, not sure... I can't see the advantages of this method.

Any further info appreciated..

Regards

Rob
Jul 19 '05 #3
Ah! I see. I was thinking you would need to do the opposite: find specific
words in the list.

To find a word in an array:

    for i = 0 to ubound(ar)
        if ar(i) = <something> then
            exit for
        end if
    next

To find a word in an XML Document:

    xmldoc.selectSingleNode("/root/node[value='<something>']")

There is no extra code involved in looping through a DOM Document:

    for each oNode in xmldoc.documentElement.childNodes
        'do something with oNode.Text
    next

Given the comparative sizes of the array and xml document, if I did not need
search capabilities, I would go with the array.

Bob Barrows

Jul 19 '05 #4
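For comparison, a rough sketch of the XML variant Bob outlines. The element names (`stopwords`, `word`) and the helper names `LoadStopWordXml` and `FindStopWord` are assumptions made for illustration; Bob's own XPath implies a `/root/node/value` layout instead. The sketch also assumes MSXML 6.0 is installed on the server. Note that the XPath comparison is case-sensitive, and a word containing an apostrophe would break the quoted literal as written.

```vbscript
' Sketch only: element names and helper names are hypothetical.
' Parses an XML string such as
'   <stopwords><word>the</word><word>and</word></stopwords>
Function LoadStopWordXml(xmlText)
    Dim doc
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.loadXML(xmlText) Then
        Err.Raise vbObjectError + 1, "LoadStopWordXml", doc.parseError.reason
    End If
    Set LoadStopWordXml = doc
End Function

' Returns True when word is the exact text of some <word> element.
Function FindStopWord(doc, word)
    Dim node
    Set node = doc.selectSingleNode("/stopwords/word[. = '" & word & "']")
    FindStopWord = Not (node Is Nothing)
End Function
```

Usage would look something like `Set doc = LoadStopWordXml(xmlText)` followed by `If FindStopWord(doc, "the") Then ...`.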
"Bob Barrows" wrote ...
> Given the comparative sizes of the array and xml document, if I did not
> need search capabilities, I would go with the array.

Hi Bob,

Many thanks for the reply and the examples. I will use the array method for
now then. If you have time, I've another question - see Search
(part2) :oD

Cheers

Rob
Jul 19 '05 #5
"Bob Barrows" wrote:
: To find a word in an array:
:     for i = 0 to ubound(ar)
:         if ar(i) = <something> then
:             exit for
:         end if
:     next
:
: To find a word in an XML Document:
:     xmldoc.selectSingleNode("/root/node[value='<something>']")
:
: There is no extra code involved in looping through a DOM Document:
:
:     for each oNode in xmldoc.documentElement.childNodes
:         'do something with oNode.Text
:     next
:
: Given the comparative sizes of the array and xml document, if I did not
: need search capabilities, I would go with the array.

Or you could use Filter and eliminate the For...Next loop:

<%@ Language=VBScript %>
<%
Option Explicit
Response.Buffer = True

sub lPrt(strMsg)
    Response.Write(strMsg & "<br />" & vbCrLf)
end sub

sub Prt(strMsg)
    Response.Write(strMsg)
end sub

sub findWord(arr, fWord)
    if isFound(arr, fWord) = fWord Then
        Response.Write(fWord & " found in array.<br />" & vbCrLf)
    else
        Response.Write(fWord & " not found in array.<br />" & vbCrLf)
    end if
end sub

function isFound(arr, fWord)
    dim f
    f = Filter(arr, fWord)
    if ubound(f) <> 0 Then
        isFound = ""
    else
        isFound = f(0)
    end if
end function

dim str, myarray, fWord
str = "one two three four five six seven eight nine ten"
myarray = Split(str)

lPrt("Using Filter to find words in an array")
lPrt("Array elements: " & str)
Prt("Testing eleven: ")
findWord myarray, "eleven"
Prt("Testing five: ")
findWord myarray, "five"
%>

http://kiddanger.com/lab/filter.asp

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
Jul 19 '05 #6
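One thing to watch with the Filter approach on a real stop-word list: Filter returns every element that contains the search value as a substring. With a list holding "the", "then" and "other", Filter(arr, "the") returns three elements, so the isFound function above (which only accepts exactly one hit) would report "the" as not found. A hedged sketch of one way around that, using Filter as a pre-filter and then checking the hits for an exact match; the function name `HasExactWord` is hypothetical, and the case-insensitive comparison is an assumption about what a site search would want.

```vbscript
' Sketch only: HasExactWord is a hypothetical helper name.
' Returns True when word matches a whole array element, ignoring case.
Function HasExactWord(arr, word)
    Dim hits, i
    HasExactWord = False
    ' Filter narrows the array to elements containing word as a substring.
    hits = Filter(arr, word, True, vbTextCompare)
    ' UBound(hits) is -1 when nothing matched, so the loop is skipped.
    For i = 0 To UBound(hits)
        If StrComp(hits(i), word, vbTextCompare) = 0 Then
            HasExactWord = True
            Exit For
        End If
    Next
End Function
```

This brings back a small For...Next loop, but only over the handful of substring hits rather than the full list.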
Hah! I had forgotten about that. Thanks for the heads-up.

Bob Barrows
Jul 19 '05 #7
"Bob Barrows" wrote:
: Hah! I had forgotten about that. Thanks for the heads-up.

No problem. I was workin' on it a couple of days ago so it was fresh in my
mind.

Roland
Jul 19 '05 #8
"Roland Hall" wrote ...
> No problem. I was workin' on it a couple of days ago so it was fresh in my
> mind.

Thanks for that Roland,

Tell me, is using the filter method more efficient perhaps than what I
bashed out with two pencils stuck to my head yesterday (I'm guessing so but
figured I would ask)...

<!--INSERT VERY LARGE ARRAY UP HERE-->

intMatch = 0

For intLoop = 0 To UBound(aSearchCriteria)
    For intLoop2 = 0 To UBound(aIgnoreWords)
        If UCase(aSearchCriteria(intLoop)) = UCase(aIgnoreWords(intLoop2)) Then
            intMatch = 1
            strIgnoredWords = strIgnoredWords & aIgnoreWords(intLoop2) & ", "
            Exit For
        End If
    Next

    If intMatch = 0 Then
        strTempSearchCriteria = strTempSearchCriteria & aSearchCriteria(intLoop) & " "
    End If

    intMatch = 0
Next

strSearchCriteria = Trim(strTempSearchCriteria)

In this I'm obviously iterating through the entire array of ignore words for
each word in the search criteria. I then build a new string from the words
that are not found, which eventually gets used as the criteria, and I also
build a string of 'ignored' words which then gets dumped on the page to make
it look really clevaaarrr :oD

Your example has fewer lines of code, so I suspect it's far more efficient and
probably the preferred way. Mine was bashed out whilst drinking stella :o)

Regards

Rob
Jul 19 '05 #9
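Rob's closing question is not answered in the thread. Fewer lines of code does not by itself mean less work: Filter still has to examine every element of the array on each call, much as the inner loop does. One alternative, sketched here under stated assumptions, is to load the stop words into a Scripting.Dictionary once and test each search term with Exists, which avoids scanning the list per term. The function names `BuildStopWordSet` and `StripStopWords` are hypothetical; the variable roles mirror Rob's code (kept terms and a comma-separated list of ignored words).

```vbscript
' Sketch only: function names are hypothetical.
' Builds a case-insensitive lookup set from an array of stop words.
Function BuildStopWordSet(words)
    Dim d, i
    Set d = CreateObject("Scripting.Dictionary")
    d.CompareMode = vbTextCompare   ' must be set before any items are added
    For i = 0 To UBound(words)
        If Len(words(i)) > 0 Then
            If Not d.Exists(words(i)) Then d.Add words(i), True
        End If
    Next
    Set BuildStopWordSet = d
End Function

' Returns the search text with stop words removed.
' The ignored words come back through the ByRef parameter as "w1, w2, ".
Function StripStopWords(searchText, stopSet, ByRef ignored)
    Dim terms, i, kept
    kept = ""
    ignored = ""
    terms = Split(Trim(searchText), " ")
    For i = 0 To UBound(terms)
        If Len(terms(i)) > 0 Then   ' skip empties left by double spaces
            If stopSet.Exists(terms(i)) Then
                ignored = ignored & terms(i) & ", "
            Else
                kept = kept & terms(i) & " "
            End If
        End If
    Next
    StripStopWords = Trim(kept)
End Function
```

With a set built from Array("the", "and", "a"), calling StripStopWords("The cat and a hat", stopSet, strIgnoredWords) returns "cat hat" and leaves "The, and, a, " in strIgnoredWords. Whether this is measurably faster than the nested loops for 390 words and a few search terms would need timing on the actual server.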
