The Ultimate HTML Search Script. But.......!

A year or two back I needed a search script to scan through the HTML
files on a client site. Usual sort of thing. A quick search turned up
a neat script that provided great search results. It was fast, and
returned the hyperlinked page title, the filename, and the body text
(30 preceding and following words) in context, with the search word
highlighted. Excellent.!

See it working at: http://www.ipt.co.za
Just search for "firearm"

I now need to use it on a different site but I've run into two
limitations which make it unsuitable. It will only search the root
folder of a domain, and it won't cascade the search down through
sub-folders. Hmmmm. Pity. Checked the code but couldn't modify it
successfully. Aah well, look elsewhere.....

After hours and hours of scouring through the script sites I began to
realise what a lucky find that original script was. Almost
everything new that seemed promising either had major limitations,
presented poor results or just would not work properly. And, most
importantly, none of them offered that *excellent* feature where the
search term was displayed embedded in the context.

I now believe that this is a great script that needs two
modifications:

1. The ability to be pointed to a sub-folder of a domain as the
starting point for the search. As a compromise, the starting point
could simply be the folder in which the script itself is lying.

2. The ability for the search to cascade on through sub-folders
lying below the folder in which the search is initiated.
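
A rough sketch of how both modifications might be approached, assuming
nothing beyond the standard Scripting.FileSystemObject. The names
CollectFiles, dictFound and strStart are made up for illustration and
are not part of results.asp. It is written to run under cscript; in an
ASP page one would presumably use Server.CreateObject and
Response.Write in place of CreateObject and WScript.Echo, and pick the
starting folder with Server.MapPath.

```vbscript
Option Explicit

' Hypothetical helper (not part of the original results.asp):
' adds every file at or below objFolder to dictFound (path -> name).
Sub CollectFiles(objFolder, dictFound)
    Dim objFile, objSubFolder

    ' Same convention as the original script: skip folders starting with "_"
    If Left(objFolder.Name, 1) = "_" Then Exit Sub

    ' Files directly inside this folder
    For Each objFile In objFolder.Files
        dictFound.Add objFile.Path, objFile.Name
    Next

    ' Modification 2: cascade into every sub-folder below this one
    For Each objSubFolder In objFolder.SubFolders
        CollectFiles objSubFolder, dictFound
    Next
End Sub

Dim objFSO, dictFound, strStart
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set dictFound = CreateObject("Scripting.Dictionary")

' Modification 1: choose the starting folder.
' Assumption: the current folder is used here; under ASP this could be
' Server.MapPath("/some/sub-folder") or Server.MapPath(".") instead.
strStart = objFSO.GetAbsolutePathName(".")
CollectFiles objFSO.GetFolder(strStart), dictFound

WScript.Echo dictFound.Count & " file(s) found under " & strStart
```

In results.asp itself, each result is linked with objFile.Name alone,
so once files in sub-folders are returned the link would presumably
have to be built from the full path instead, for example with the
script's own FormatURL(objFile.Path).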

It would also be great if the original casing for the page title and
context could be preserved but that really isn't a critical issue.
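
A minimal sketch of one way the casing might be kept, assuming the
script held on to an un-lowercased copy of the page text (the posted
code lowercases it with LCase, both in GetFiles and in RemoveHTML, so
those calls would have to change too). ContextSnippet and Highlight are
hypothetical names, not functions from results.asp. Matching is done
case-insensitively with vbTextCompare, and the matched text is copied
from the source rather than replaced by the search term.

```vbscript
Option Explicit

' Hypothetical helper: returns up to iRadius characters either side of the
' first case-insensitive match of strTerm, in the text's original casing.
Function ContextSnippet(strText, strTerm, iRadius)
    Dim iWhere, iStart
    ContextSnippet = ""
    iWhere = InStr(1, strText, strTerm, vbTextCompare)
    If iWhere = 0 Then Exit Function
    iStart = iWhere - iRadius
    If iStart < 1 Then iStart = 1
    ContextSnippet = Mid(strText, iStart, (iWhere - iStart) + Len(strTerm) + iRadius)
End Function

' Hypothetical helper: wraps every case-insensitive match of strTerm in
' <b>...</b>, copying the matched characters from strText unchanged.
Function Highlight(strText, strTerm)
    Dim iPos, iNext, strOut
    If Len(strTerm) = 0 Then
        Highlight = strText
        Exit Function
    End If
    strOut = ""
    iPos = 1
    iNext = InStr(iPos, strText, strTerm, vbTextCompare)
    Do While iNext > 0
        strOut = strOut & Mid(strText, iPos, iNext - iPos) & _
                 "<b>" & Mid(strText, iNext, Len(strTerm)) & "</b>"
        iPos = iNext + Len(strTerm)
        iNext = InStr(iPos, strText, strTerm, vbTextCompare)
    Loop
    Highlight = strOut & Mid(strText, iPos)
End Function

' Under ASP, Response.Write would take the place of WScript.Echo.
WScript.Echo Highlight("Firearm control and firearms", "firearm")
' prints: <b>Firearm</b> control and <b>firearm</b>s
```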

Maybe one could even specify the file extensions for searching to
speed things up. Suggestions would be gratefully received.!
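
One possible shape for such a filter, as a sketch only. The constant
SEARCH_EXTENSIONS, its contents, and the function IsSearchable are all
assumptions made for illustration. The idea is to compare the file's
extension against a comma-separated list instead of the fixed
Right(objFile.Name,4) = ".htm" test in the posted code.

```vbscript
Option Explicit

' Assumption: an illustrative list; edit to suit the site being searched.
Const SEARCH_EXTENSIONS = "htm,html,shtml,asp"

' Hypothetical helper: True if strFileName ends in one of the listed
' extensions, compared without regard to case.
Function IsSearchable(strFileName)
    Dim iDot, strExt
    IsSearchable = False
    iDot = InStrRev(strFileName, ".")
    If iDot = 0 Or iDot = Len(strFileName) Then Exit Function
    strExt = LCase(Mid(strFileName, iDot + 1))
    ' Surround both sides with commas so "htm" cannot match inside "shtml"
    IsSearchable = (InStr(1, "," & SEARCH_EXTENSIONS & ",", "," & strExt & ",", vbTextCompare) > 0)
End Function

' Under ASP, Response.Write would take the place of WScript.Echo.
WScript.Echo IsSearchable("index.HTM")   ' prints: True
WScript.Echo IsSearchable("logo.gif")    ' prints: False
```

Inside the file loop this would be called as
If IsSearchable(objFile.Name) Then, so the length of the extension no
longer has to be kept in step by hand.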

Please can somebody out there help with some ideas as to where this
script can be modified to provide these functions. I'm sure this
would be an excellent resource for the ASP community and it certainly
fits my requirements.

Here's hoping ...... code follows below.!

TIA

.les.
Durban
South Africa
------------< clip: code to launch results.asp starts >------------

<FORM NAME="frmSearch" METHOD="GET" ACTION="results.asp">
<table border=1 width=175 bordercolor="#336633" cellspacing=0><tr><td
align=center background="green.gif" bgcolor="#009900"><div
align=center>
<font color="#FFFFFF"><b> search IPT
</b></font></div></td></tr><tr><td align=center ><div
align=center><br><img src="clear.gif" width=1 height=4>
<INPUT NAME="boolean" TYPE="hidden" VALUE="AND">
<INPUT NAME="selSearchWhere" TYPE="hidden" value="">
<INPUT NAME="terms" TYPE="TEXT" VALUE=" Search IPT Here" SIZE="14"
onFocus=clearfield() TITLE="Enter the keyword to search on here!">
<INPUT TYPE="SUBMIT" VALUE="find"></div><img src="clear.gif" width=1
height=4>
</td></tr></table>
</FORM>

------------< /clip: code to launch results.asp ends >------------

------------< file: results.asp starts >------------

<%@ Language=VBScript %>
<% Option Explicit %>
<html>
<head>
<TITLE>Search IPT</TITLE>
<SCRIPT LANGUAGE="javascript" src="jAvAsCrIpTs.js"></SCRIPT>
<link rel="stylesheet" href="ipt.css">
<title>http://www.ipt.co.za</title>
</head>
<body background="bg.jpg" bgcolor="#FFFFFF" link="#006600"
alink="#006600" vlink="#006600" marginwidth=0 marginheight=0
topmargin=0 leftmargin=0 bottommargin=0 rightmargin=0>

<%
Dim myCounter
If Request.QueryString("myCounter") <> "" Then
myCounter = Request.QueryString("myCounter")
Else
myCounter = 1
End If
%>
<%
Function FormatURL(strPath)
'Cut off everything before wwwroot and replace all \ with /
Dim iPos
iPos = InStr(1,strPath,"wwwroot",1)
Dim str
str = Mid(strPath,iPos+7,Len(strPath))
FormatURL = Replace(str,"\","/")
End Function

Function RemoveHTML( strDesc )
Dim RegEx
Set RegEx = New RegExp
RegEx.Pattern = "<[^>]*>"
RegEx.Global = True
strDesc = Replace(LCase(strDesc), "<br>", chr(10))
RemoveHTML = RegEx.Replace(strDesc, "")
End Function

Function GetFiles(objFolder, aLookFor, strLF, bolLFFound, bolAnd, iCount)
    If Left(objFolder.Name,1) = "_" then exit function
    Const iListPerPage = 9

    if iCount > iListPerPage then Exit Function

    'Now, loop through each file
    Dim objFile, objTextStream, objFSO, strContents, iUBound, iLoop, bolValid
    Dim strTitle, iPos, strDesc
    iUBound = UBound(aLookFor)

    For Each objFile in objFolder.Files
        on error resume next
        'Which files to search; if you change the .htm, remember to change the 4 to correspond
        If (Right(objFile.Name,4)) = ".htm" then
        'If Ucase(Right(objFile.Name,6)) = ".SHTML" or Ucase(right(objFile.Name,4)) = ".ASP" then

            If bolLFFound then
                if objFile.Size > 0 then
                    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
                    Set objTextStream = objFSO.OpenTextFile(objFile.Path,1)
                    strContents = objTextStream.ReadAll
                    strContents = LCase(strContents)

                    'get the page title then remove all HTML tags
                    iPos = InStr(1,strContents,"<title>")
                    Dim iWhere, iLength, DescPosition, DescLast
                    If iPos = 0 then
                        strTitle = "Untitled (" & objFile.Name & ")"
                    Else
                        strTitle = Mid(strContents,iPos+7,InStr(iPos,strContents,"</title>")-iPos-7)
                    End If
                    strContents=RemoveHTML( strContents )
                    objTextStream.Close
                    Set objFSO = Nothing

                    'AND search starts valid and fails on a miss; OR search starts invalid and passes on a hit
                    if bolAnd then bolValid = True else bolValid = False
                    For iLoop = 0 to iUBound
                        If InStr(1,strContents,aLookFor(iLoop),1) then
                            if Not bolAnd then bolValid = True
                        Else
                            If bolAnd then bolValid = False
                        End If
                    Next

                    If bolValid then
                        'Build the context: up to 200 characters either side of the keyword
                        iWhere = InStr(strContents,strKeywords)
                        iLength = 200
                        if iWhere > 0 then
                            if iWhere < 200 then iLength = iWhere-1
                            strDesc = mid(strContents,iWhere-iLength,iLength) & mid(strContents,iWhere,200)
                            'strDesc = mid(strContents,iWhere-iLength,iLength) & mid(strContents,iWhere,200+len(strKeywords))
                            strDesc=Replace(strDesc, "<", "&lt;")
                            strDesc=Replace(strDesc, ">", "&gt;")
                            strDesc=Trim(strDesc)
                            'Drop the partial word at each end of the context
                            strDesc = Mid( strDesc, InStr(strDesc," ") + 1 )
                            strDesc = Left( strDesc, InStrRev(strDesc," ") - 1 )
                            strDesc=Replace(strDesc, strKeywords, "<font class=highlight><b>"&strKeywords&"</b></font>")
                        end if

                        Response.Write "<font size=-1><b>" & myCounter & ". <a href=""" & objFile.Name & """>" & _
                            strTitle & "</b></a><font size=1>&nbsp; (" & objFile.Name & ")</font><br>" & vbCrLf
                        Response.Write "<div class=results><font size=-1>" & strDesc
                        Response.Write "</font></div><br>" & vbCrLf
                        strDesc = ""
                        strTitle = ""

                        myCounter = myCounter + 1
                        iCount = iCount + 1
                    End if
                    If iCount > iListPerPage then
                        'Remember the last file shown so the next page can resume after it
                        strLF = FormatURL(objFile.Path)
                        exit function
                    End If
                End If
            Elseif FormatURL(objFile.Path) = strLF then
                bolLFFound = True
            End If
        End if
    Next

    'Dim objSubFolder
    'For Each objSubFolder in objFolder.SubFolders
    '    GetFiles objSubFolder,aLookFor,strLF,bolLFFound,bolAnd,iCount
    'Next
End Function

'Search the site get keywords and take the first word!
Dim strKeywords
strKeywords = Request("terms")
strKeywords = LCase(strKeywords)
strKeywords = trim(strKeywords)
do while InStr(1,strKeywords," ")
If InStr(1,strKeywords," ") Then
strKeywords = Left( strKeywords, InStrRev(strKeywords," ") - 1 )
End If
loop

'Split the terms on spaces
Dim termsArray
termsArray = split(strKeywords," ")

'Set the boolean search option
Dim bolAnd
If Request("boolean") = "AND" then bolAnd = True else bolAnd = False

Dim section
section = Request("selSearchWhere")

'Get the dirs to search
If section = "ipt" then
section = Server.MapPath("/")
else
section = Server.MapPath("/")
end if

'What page are we on?
Dim strLastFile
strLastFile=Request("lf")

Dim objFSO, objFolder
Set objFSO = Server.CreateObject("Scripting.FileSystemObject")

Set objFolder = objFSO.GetFolder(section)
Set objFSO = Nothing
%>

<blockquote>
<table width="550" ><tr><td>
<br>
<center><font size=+2><b>
IPT Search Results For "<% =Server.HTMLEncode(strKeywords) %>"
</b></font><br><form name="frmSearch" method="GET"
action="results.asp">
New Search:
<input name="boolean" type="hidden" size="20" value="AND">
<input name="selSearchWhere" type="hidden" size="20" value="">
<input name="terms" type="text" size="20" title="Enter the keyword to
search on here!">
<input name="submit" type="submit" value="Search!">
</form>
</center>
<p>
<center>
<a href="index.htm" >home</a> |
<a href="schools.htm" >schools</a> |
<a href="police.htm" >police</a> |
<a href="Government.htm" >government</a> |
<a href="hiv_aids.htm" >HIV/Aids</a> |
<a href="public.htm" >publications</a> |
<a href="staff.htm" >staff</a> |
<a href="services.htm" >services</a> |
<a href="links.htm" >links</a> |
<a href="contact.htm" >contact</a>
</center>
<hr><p>
Below are the results of your search in no particular order...
<p>
<p>

<%
Dim iResults
iResults = 0

'Now, recurse the directories
If Len(strLastFile) = 0 then
GetFiles objFolder,termsArray,strLastFile,True,bolAnd,iResults
Else
GetFiles objFolder,termsArray,strLastFile,False,bolAnd,iResults
End If

Set objFolder = Nothing

If iResults = 10 then
'Show next page link
%>
<center>
<p><hr><p><b>
<a
href="results.asp?terms=<%=Server.URLEncode(strKeywords)%>&boolean=<%=Request("boolean")%>&selSearchWhere=<%=Request("selSearchWhere")%>&lf=<%=Server.URLEncode(strLastFile)%>&myCounter=<%=myCounter%>" >
Next 10 results...</b>
</a></center>
<p>
<% Elseif iResults = 0 then %>
<b>No results found!</b><br>

<% Elseif iResults < 10 then %>

<center>
<p><hr><p>
<b><form name="frmSearch" method="GET" action="results.asp">
New Search:
<input name="boolean" type="hidden" size="20" value="AND">
<input name="selSearchWhere" type="hidden" size="20" value="">
<input name="terms" type="text" size="20" title="Enter the keyword to
search on here!">
<input name="submit" type="submit" value="Search!">
</form>
</center>
<% End IF %>
<hr><p>
<center>
<a href="index.htm" >home</a> |
<a href="schools.htm" >schools</a> |
<a href="police.htm" >police</a> |
<a href="Government.htm" >government</a> |
<a href="hiv_aids.htm" >HIV/Aids</a> |
<a href="public.htm" >publications</a> |
<a href="staff.htm" >staff</a> |
<a href="services.htm" >services</a> |
<a href="links.htm" >links</a> |
<a href="contact.htm" >contact</a>
</center>
<div align="right">
<p><font size=1>All material Copyright Independent Projects Trust
1990-2002</font></p>
</div>
</td></tr></table>
</blockquote>

</body>
</html>

------------< /file: results.asp ends >------------

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Les Juby
Webpro Internet / Prosoft Microsystems
Durban, KwaZulu-Natal, South Africa
we****@webpro.co.za
Tel: +27 (31) 563-8344 Fax: +27 (31) 563-1684
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
Les Juby le*****@anti-spam.iafrica.com
Webpro Internet - - - Prosoft Microsystems
Durban, KwaZulu-Natal, South Africa
P.O.Box 35243, Northway 4065, South Africa
Tel: +27 31 563-8344 Fax: +27 31 564-4928
o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
(you *do* know to take "anti-spam" out the address....?
Jul 19 '05 #1
1 Reply


OK - Not one response from you lot.! Sigh.! But then a post with
over 400 lines is a bit daunting. Can't see myself ever responding to
something like that.

So there was no option but to roll up one's sleeves and get to it.
And, surprisingly, it only took a couple of hours and we'd solved all
the glitches.

This is a GREAT search text, beats anything I've seen in the script
archives at about eight ASP sites that we cruised thru. And we tried
out five of the best scripts.

If anyone would like a copy then mail me at we****@webpro.co.za and
I'll be happy to send you the (improved) ASP file. The only proviso
is that if you have any ideas on how to improve it, then let me know.
I'm even prepared to put up a discussion board for it if there's
enough interest.!

.les.

On Fri, 18 Jul 2003 06:38:25 GMT, we****@webpro.co.za (Les Juby)
wrote:
A year or two back I needed a search script to scan thru HTML files
on a client site. Usual sorta thing. A quick search turned up a
neat script that provided great search results. It was fast,
returned the hyperlinked page title, filename, and the body txt (30
preceding and following words) in context with the search word
highlighted. Excellent.!

See it working at: http://www.ipt.co.za
Just search for "firearm"

I now need to use it on a different site but I've run into two
limitations which made it unsuitable. It will only search the root
folder of a domain and won't cascade the search down through sub-
folders. Hmmmm. Pity. Checked the code but couldn't modify it
successfully. Aah well, look elsewhere.....

After hours and hours of scouring through the script sites I began to
realise what a lucky find that original script was. Almost
everything new that seemed promising either had major limitations,
presented poor results or just would not work properly. And, most
importantly, none of them offered that *excellent* feature where the
search term was displayed embedded in the context.

I now believe that this is a great script that needs two
modifications:

1. The ability to be pointed to a sub-folder of a domain as the
starting point for the search. But, as a compromise, this could also
be achieved as the folder in which the script itself is lying.

(fixed)

2. The ability for the search to cascade on through sub-folders
lying below the folder in which the search is initiated.

(fixed)

It would also be great if the original casing for the page title and
context could be preserved but that really isn't a critical issue.

(on the Wish List)

Maybe one could even specify the file extensions for searching to
speed things up. Suggestions would be gratefully received.!

(fixed)

Please can somebody out there help with some ideas as to where this
script can be modified to provide these functions. I'm sure this
would be an excellent resource for the ASP community and it certainly
fits my requirements.

Here's hoping ...... code follows below.!

TIA

.les.
Durban
South Africa
------------< clip: code to launch results.asp starts >------------

Lotsa blah, blah code all snipped out........................

------------< /file: results.asp ends >------------

o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
Les Juby le*****@anti-spam.iafrica.com
Webpro Internet - - - Prosoft Microsystems
Durban, KwaZulu-Natal, South Africa
P.O.Box 35243, Northway 4065, South Africa
Tel: +27 31 563-8344 Fax: +27 31 564-4928
o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
(you *do* know to take "anti-spam" out the address....?
Jul 19 '05 #2
