"McKirahan" <Ne**@McKirahan.com> wrote in message
news:ad******************************@comcast.com...
"Carstonio" <Ca*******@discussions.microsoft.com> wrote in message
news:ED**********************************@microsoft.com...
I use ASP to display links to Word documents on an intranet. Is there a way
in ASP to do text searches on the documents' contents? I want the results to
have the link to the Word document plus two or three lines from the document
that include the search terms.
The following script may get you started.
Option Explicit
Const cVBS = "Word_doc.vbs"
Const cDOC = "Word_doc.doc"
Dim objMSW
Set objMSW = CreateObject("Word.Application.8")
Dim objDOC
Set objDOC = objMSW.Documents.Open(cDOC)
Dim strDOC
strDOC = objDOC.Content
objDOC.Close False
objMSW.Application.Quit True
Set objDOC = Nothing
Set objMSW = Nothing
WScript.Echo strDOC
Since the above script is slow, I might suggest that
you have a process that uses the above to preprocess
each of your MS-Word documents and stores the
result in a database or text file then use that for your
searches.
Here's a script that will process a list of MS-Word documents.
It will generate a Tab Separated Variable file with a header row of:
Document Line Text
Optionally, the file can be opened in MS-Excel for analysis
or review (via Data + Get External Data + Import Text File...).
Watch for word wrap.
Option Explicit
'*
'* Declare Constants
'*
Const cVBS = "Document.vbs"
Const cTXT = "Document.txt"
Const cCSV = "Document.csv"
'*
'* Declare Variables
'*
Dim arrDOC
Dim intDOC
Dim strDOC
Dim intINS
Dim strOTF
Dim intTOT(1)
intTOT(0) = 0
intTOT(1) = 0
Dim strTOT
strTOT = "# Documents; ## Lines"
Dim arrTXT
Dim intTXT
Dim strTXT
'*
'* Declare Objects
'*
Dim objDOC
Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")
Dim objMSW
Set objMSW = CreateObject("Word.Application.8")
Dim objOTF
'*
'* Read list of documents
'*
Set objOTF = objFSO.OpenTextFile(cTXT,1)
strOTF = objOTF.ReadAll
Set objOTF = Nothing
'*
'* Documents, Lines
'*
Set objOTF = objFSO.OpenTextFile(cCSV,2,True)
objOTF.WriteLine("Document" & vbTab & "Line" & vbTab & "Text")
arrDOC = Split(strOTF,vbCrLf)
For intDOC = 0 To UBound(arrDOC)
strDOC = arrDOC(intDOC)
If InStr(LCase(strDOC),".doc") > 0 Then
'* Drop any "attrib" attribute columns; keep from the drive letter on
intINS = InStr(strDOC,":")
If intINS > 1 Then strDOC = Mid(strDOC,intINS-1)
intTOT(0) = intTOT(0) + 1
objOTF.WriteLine(intTOT(0) & vbTab & "0" & vbTab & strDOC)
Set objDOC = objMSW.Documents.Open(strDOC)
strTXT = objDOC.Content
arrTXT = Split(strTXT,vbCr)
For intTXT = 0 To UBound(arrTXT)
If Trim(arrTXT(intTXT)) <> "" Then
intTOT(1) = intTOT(1) + 1
objOTF.WriteLine(intTOT(0) & vbTab & intTOT(1) & vbTab & arrTXT(intTXT))
End If
Next
objDOC.Close False
Set objDOC = Nothing
End If
Next
objOTF.Close
Set objOTF = Nothing
'*
'* Destroy Objects
'*
'* Quit Word only after every document has been processed
objMSW.Application.Quit True
Set objMSW = Nothing
Set objFSO = Nothing
'*
'* Finish
'*
strTOT = Replace(strTOT,"##",FormatNumber(intTOT(1),0))
strTOT = Replace(strTOT,"#",FormatNumber(intTOT(0),0))
MsgBox strTOT,vbInformation,cVBS
The input file ("Document.txt") can be generated via the MS-DOS
command "attrib". To identify all MS-Word documents on a drive,
run the following from a Command prompt:
attrib \*.doc /s > Document.txt
Alternatively, you can just enter the filenames of the documents
that you're interested in into a text file one per line; for example:
C:\My Documents\Document1.doc
C:\My Documents\Document2.doc
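If you would rather not type the list by hand or use "attrib", a rough sketch along these lines might build it with the FileSystemObject instead. The starting folder (cROOT) and the ListDocs routine are my own placeholders, not part of the script above, so adjust to taste:

```vbscript
Option Explicit
'* Assumed starting folder; change to wherever the documents live
Const cROOT = "C:\My Documents"
Const cTXT = "Document.txt"
Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")
Dim objOUT
Set objOUT = objFSO.OpenTextFile(cTXT,2,True)
ListDocs objFSO.GetFolder(cROOT)
objOUT.Close
Set objOUT = Nothing
Set objFSO = Nothing

'* Hypothetical helper: writes the full path of every .doc file
'* in a folder, then repeats for each of its subfolders
Sub ListDocs(objFolder)
    Dim objFile
    Dim objSub
    For Each objFile In objFolder.Files
        If LCase(objFSO.GetExtensionName(objFile.Name)) = "doc" Then
            objOUT.WriteLine objFile.Path
        End If
    Next
    For Each objSub In objFolder.SubFolders
        ListDocs objSub
    Next
End Sub
```

Either way, the result is a "Document.txt" with one full path per line, which is all the script above needs.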
An example of the output follows:
Document Line Text
1 0 C:\MYDOCU~1\Document1.doc
1 1 First line
1 2 Last line
2 0 C:\MYDOCU~1\Document2.doc
2 1 line number one
2 2 line number two
2 3 line number three
To reduce space, the document's filename is identified only once.
The filename is always on a "Line" of "0". Any questions?
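To tie this back to the original question, here is a minimal sketch of how the generated file might be searched. It assumes the layout described above (a header row, then Document, Line and Text separated by tabs, with the filename on a "Line" of "0"). The SearchIndex function, the cMAX limit and the sample search term are my own inventions for illustration:

```vbscript
Option Explicit
Const cCSV = "Document.csv"
'* Assumed limit: show at most three matching lines per document
Const cMAX = 3
'* "line number" is only a sample search term
WScript.Echo SearchIndex(cCSV,"line number")

'* Hypothetical helper: returns each matching document's filename
'* followed by up to cMAX lines that contain the search term
Function SearchIndex(strFile,strTerm)
    Dim objFSO, objINP
    Dim arrCOL, strDOC, intHIT, strOUT
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objINP = objFSO.OpenTextFile(strFile,1)
    '* Skip the header row
    If Not objINP.AtEndOfStream Then objINP.SkipLine
    Do Until objINP.AtEndOfStream
        '* Split into at most 3 columns so tabs inside the text survive
        arrCOL = Split(objINP.ReadLine,vbTab,3)
        If UBound(arrCOL) = 2 Then
            If arrCOL(1) = "0" Then
                '* A "Line" of 0 carries the document's filename
                strDOC = arrCOL(2)
                intHIT = 0
            ElseIf InStr(1,arrCOL(2),strTerm,vbTextCompare) > 0 Then
                If intHIT < cMAX Then
                    '* Emit the filename once, ahead of its first match
                    If intHIT = 0 Then strOUT = strOUT & strDOC & vbCrLf
                    strOUT = strOUT & "  " & arrCOL(2) & vbCrLf
                    intHIT = intHIT + 1
                End If
            End If
        End If
    Loop
    objINP.Close
    Set objINP = Nothing
    Set objFSO = Nothing
    SearchIndex = strOUT
End Function
```

In an ASP page the same function should work if you drop the WScript.Echo line, use Server.CreateObject, and Response.Write the filename as a link. The comparison is case-insensitive and matches the term as a plain substring, which may or may not be what you want.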