Keyword Parsing with ASP

ARK
I am writing a search program in ASP (VBScript). The user can enter keywords
and press submit.
The keywords can be separated by spaces and/or commas, and each keyword may be
a plain word, a single-quoted phrase, or a double-quoted phrase.
For example:

Keywords:

Jack, Jill, Jim, "Timothy Brown", Mary OR
Jack Jill Jim 'Timothy Brown' Mary OR
Jack, Jill Jim, 'Timothy Brown' "Mary"

When I parse the input, I store the keywords in an array. The results must be:

Jack
Jill
Jim
Timothy Brown
Mary

I have tried doing this using Split but am unable to get the Phrases. Some
suggestions, code examples or links would help.

Thanks in advance

ARK.
Jul 19 '05 #1
19 Replies


You might want to replace the spaces the user puts in with commas and
then use the Split function.

strVariable = Replace(strVariable, " ", ",")
arrKeywords = Split(strVariable, ",")

Then you should have your array of items in arrKeywords.

hth,
Andrew

* * * Sent via DevBuilder http://www.devbuilder.org * * *
Developer Resources for High End Developers.
Jul 19 '05 #2

Unfortunately, that would put a comma in "Timothy Brown" as well.

My suggestion would be to work your way through the string a character at a
time. If the character is a space and is not within quotes (" or '), replace
it with a comma; otherwise move along.
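
The character-at-a-time idea could be sketched roughly as follows. This is only an illustration, not code posted in the thread: the function name SplitKeywords and the Chr(1) separator are assumptions, and rather than inserting commas it collects each finished keyword and splits the collected list once at the end. A quote character opens a phrase that runs to the next matching quote, so an apostrophe inside a plain word (O'Malley) will still confuse it.

```vbscript
' Hypothetical helper: returns an array of keywords. Spaces and commas
' separate keywords unless they appear inside a '...' or "..." phrase.
Function SplitKeywords(sInput)
    Dim i, ch, sQuote, sToken, sJoined, sSep
    sSep = Chr(1)      ' assumed never to occur in user input
    sQuote = ""        ' the open quote character, or "" outside a phrase
    sToken = ""
    sJoined = ""
    ' One extra pass past the end acts as a final delimiter.
    For i = 1 To Len(sInput) + 1
        If i > Len(sInput) Then
            ch = " "
            sQuote = ""   ' an unclosed phrase simply ends with the input
        Else
            ch = Mid(sInput, i, 1)
        End If
        If sQuote <> "" Then
            If ch = sQuote Then
                sQuote = ""              ' closing quote ends the phrase
            Else
                sToken = sToken & ch     ' keep spaces and commas in a phrase
            End If
        ElseIf ch = """" Or ch = "'" Then
            sQuote = ch                  ' opening quote starts a phrase
        ElseIf ch = " " Or ch = "," Then
            If Len(sToken) > 0 Then      ' delimiter: store the finished keyword
                If Len(sJoined) > 0 Then sJoined = sJoined & sSep
                sJoined = sJoined & sToken
                sToken = ""
            End If
        Else
            sToken = sToken & ch
        End If
    Next
    SplitKeywords = Split(sJoined, sSep)
End Function

' Usage inside an ASP page:
Dim arWords, n
arWords = SplitKeywords("Jack, Jill Jim, 'Timothy Brown' ""Mary""")
For n = 0 To UBound(arWords)
    Response.Write Server.HTMLEncode(arWords(n)) & "<br>"
Next
```

Collecting into one delimited string and calling Split once avoids repeated ReDim Preserve calls; an empty input gives an empty array, so the For loop is simply skipped.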

"Andrew J Durstewitz" <ad******@devbuilder.org> wrote in message
news:3f***********************@news.frii.net...
> You might want to replace the spaces the user puts in with commas and
> then use the split command.
>
> strVariable = Replace(strVariable," ",",")
> Split(strVariable,",")

Jul 19 '05 #3

> My suggestion would be to work your way through the string a character at a
> time. If the character is a space, and not within quotes (" or ') then add
> a comma, otherwise move along


This can get infinitely complex, e.g.

Bob, Mary, "Timothy, Brown" 'franke, "tom, hula hoop" mea, culpa"

You never know what a user is going to enter, and it's hard to write code to
understand exactly what they mean.

I'd be really interested to see how Google's parsing algorithm works. I
wasn't brave enough to do that: www.aspfaq.com supports all words, any
words, or exact phrase... but no combination of the three.
Jul 19 '05 #4

I agree, keep it as simple as possible. Supporting delimiter characters
such as " will require that you analyze the string one character at a
time.

Andrew

Jul 19 '05 #5

Google is a non-stop source of awe. This is why I buy Google t-shirts. The
calculator can also do some math in addition to unit conversion (okay,
that's also math, but fine), e.g.
5 percent of 343

Ray at work

"TomB" <sh*****@hotmail.com> wrote in message
news:u1**************@tk2msftngp13.phx.gbl...
Yes you are right, that would be complex.
Speaking of Google's parsing have you seen the calculator? Try "searching" for

100 kilometers in miles

Very cool.

Jul 19 '05 #6

SWEET! Thanks for the tip.

"TomB" <sh*****@hotmail.com> wrote in message
news:u1**************@tk2msftngp13.phx.gbl...
> Yes you are right, that would be complex.
> Speaking of Google's parsing have you seen the calculator? Try "searching" for
>
> 100 kilometers in miles
>
> Very cool.
>
> "Aaron Bertrand - MVP" <aa***@TRASHaspfaq.com> wrote in message
> news:%2****************@TK2MSFTNGP11.phx.gbl...
>> This can get infinitely complex, e.g.
>>
>> Bob, Mary, "Timothy, Brown" 'franke, "tom, hula hoop" mea, culpa"

Jul 19 '05 #7

Yeah, I think that's why they call it a calculator ;)
I thought the fact that it was able to determine that I wanted a calculation
rather than a search for the words was the cool part.

"Ray at <%=sLocation%>" <myfirstname at lane34 dot com> wrote in message
news:uv**************@tk2msftngp13.phx.gbl...
> The calculator can also do some math in addition to unit conversion (okay,
> that's also math, but fine), e.g.
> 5 percent of 343

Jul 19 '05 #8

It even works with slightly more complex phrases, like 100 degrees
fahrenheit in celsius

"TomB" <sh*****@hotmail.com> wrote in message
news:ef**************@TK2MSFTNGP09.phx.gbl...
> Yeah, I think that's why they call it a calculator ;)
> I thought the fact that it was able to determine that I wanted a calculation
> rather than a search for the words was the cool part.

Jul 19 '05 #9

OK, I've come up with the following function that returns an array
containing the keywords. However, in order for this to work, you need to set
some ground rules:
1. Don't mix delimiters for a phrase. This will work correctly:
Jack, Jill Jim, "Timothy Brown", 'Mary'
but this will not:
Jack, Jill Jim, "Timothy Brown', 'Mary'

2. If literal delimiter characters are used, then they must not match the
delimiters used. For example, this will work:
"O'Malley"
but this will not:
'O'Malley'
Also, if literal delimiter characters are used, all delimiters in the entire
list must be the same. This will work:
Jim, "Tom Brown", "Pat O'Malley"
This won't:
Jim, 'Tom Brown', "Pat O'Malley"

Anyways, the function appears below my signature. You can use this code to
test it:
Dim iCount, arResult, sWords
sWords = "Jack, Jill Jim, ""Timothy Brown"", 'Mary'"
Response.Write sWords & "<BR>"
arResult = ParseKeywords(sWords)
if IsArray(arResult) then
    for iCount = 0 to UBound(arResult)
        Response.Write arResult(iCount) & "<BR>"
    next
end if

HTH,
Bob Barrows

Function ParseKeywords(pKeywords)
    Dim sKeywords, iQuotes, arQuoted(), i, j, k, sTmp, bQfound, bSQFound
    dim arCommas, arSpaces, bArrayDefined, arKeywords()
    bArrayDefined = false
    sKeywords = pKeywords

    'first see if sKeywords contains quoted sections - if so, make
    'sure they are paired, ie, there is an even number of quotes
    iQuotes = len(sKeywords) - len(Replace(sKeywords,"""",""))
    bQfound = false
    if iQuotes > 0 then
        if iQuotes mod 2 = 0 then
            bQfound = true
            redim arQuoted(iQuotes/2 - 1)
            i = instr(sKeywords,"""")
            k = 0
            'pull each "..." section out of sKeywords and store it
            Do Until i = 0
                j = instr(i+1,sKeywords,"""")
                sTmp = mid(sKeywords,i,j+1-i)
                arQuoted(k) = sTmp
                k = k+1
                sKeywords = replace(sKeywords,sTmp,"")
                i = instr(sKeywords,"""")
            Loop
            for i = 0 to ubound(arQuoted)
                arQuoted(i) = replace(arQuoted(i),"""","")
            next
        end if
    end if

    'now find single-quoted sections
    iQuotes = len(sKeywords) - len(Replace(sKeywords,"'",""))
    bSQFound = false
    if iQuotes > 0 then
        if iQuotes mod 2 = 0 then
            bSQFound = true
            if bQfound = false then
                redim arQuoted(iQuotes/2 - 1)
                k = 0
            else
                k = ubound(arQuoted) + 1
                Redim preserve arQuoted(UBound(arQuoted) + iQuotes/2)
            end if
            i = instr(sKeywords,"'")
            Do Until i = 0
                j = instr(i+1,sKeywords,"'")
                sTmp = mid(sKeywords,i,j+1-i)
                arQuoted(k) = sTmp
                k = k+1
                sKeywords = replace(sKeywords,sTmp,"")
                i = instr(sKeywords,"'")
            Loop
            for i = 0 to ubound(arQuoted)
                arQuoted(i) = replace(arQuoted(i),"'","")
            next
        end if
    end if

    'strip trailing commas left behind by removed phrases
    sKeywords = RTrim(sKeywords)
    do until right(sKeywords,1) <> ","
        sKeywords = rtrim(left(sKeywords,len(sKeywords)-1))
    loop

    'add quoted sections to result array
    if bQfound or bSQFound then
        redim arKeywords(UBound(arQuoted))
        for i = 0 to ubound(arQuoted)
            arKeywords(i) = arQuoted(i)
        next
        bArrayDefined = true
    end if

    'now process commas and spaces
    arCommas = split(sKeywords,",")
    for i = 0 to ubound(arCommas)
        arCommas(i) = RTrim(LTrim(arCommas(i)))
        if len(arCommas(i)) > 0 then
            if instr(arCommas(i)," ") = 0 then
                if bArrayDefined then
                    redim preserve arKeywords(UBound(arKeywords) + 1)
                else
                    redim arKeywords(0)
                    'remember the array exists so later words are appended
                    bArrayDefined = true
                end if
                arKeywords(ubound(arKeywords)) = arCommas(i)
            else
                arSpaces = split(arCommas(i)," ")
                for j = 0 to ubound(arSpaces)
                    arSpaces(j) = RTrim(LTrim(arSpaces(j)))
                    if len(arSpaces(j)) > 0 then
                        if bArrayDefined then
                            redim preserve arKeywords(UBound(arKeywords) + 1)
                        else
                            redim arKeywords(0)
                            bArrayDefined = true
                        end if
                        arKeywords(ubound(arKeywords)) = arSpaces(j)
                    end if
                next
            end if
        end if
    next
    ParseKeywords = arKeywords
end function
Jul 19 '05 #10

Chris Hohmann wrote:
Here's a regular expression alternative:
<%
Dim s,oRE,oMatches,oMatch
s = "Jack, Jill, Jim, 'Timothy Brown', Mary"
Set oRE = New RegExp
oRE.Global=True
oRE.Pattern = "\w+|('|"")([^\1]|\1{2})+\1"
Set oMatches = oRE.Execute(s)
For Each oMatch In oMatches
Response.Write oMatch.Value & "<br>"
Next
%>

Showoff! ;-)

Actually, I have to dive into this regexp stuff. I've been meaning to but I
just haven't had the time.

If you have a few min. could you break down that pattern you used and
explain each element?

I'm assuming the same ground rules I laid out still apply to your solution
here ... ?

Bob Barrows
Jul 19 '05 #11

Sure...

\w+ = a series of one(1) or more word characters, i.e.
[a-zA-Z0-9_]

| = OR

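('|") = a quote (") OR an apostrophe ('); let's call this submatch
QUALIFIER

([^\1]|\1{2})+ = intended as one (1) or more characters that are either not
the QUALIFIER OR a double occurrence of the QUALIFIER (escaping quotes).
Caveat: inside a character class, \1 is generally read as an octal escape
rather than a back-reference, so [^\1] may match the QUALIFIER as well.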

\1 = a closing instance of the QUALIFIER
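
A possible variant of the pattern, offered as a sketch rather than as the solution posted here. Back-references are generally not expanded inside a character class, so this version excludes the qualifier with an ordinary negated class for each quote style, which also stops a phrase at its own closing qualifier instead of at the last one in the string. It then strips the qualifiers from each match. The function name MatchKeywords and the Chr(1) separator are made up for illustration, and SubMatches is deliberately avoided on the assumption that the server may only have an early RegExp implementation.

```vbscript
' Hypothetical helper: returns an array of keywords found in sInput.
Function MatchKeywords(sInput)
    Dim oRE, oMatch, sValue, sFirst, sJoined, sSep
    sSep = Chr(1)      ' assumed never to occur in user input
    sJoined = ""
    Set oRE = New RegExp
    oRE.Global = True
    ' Raw pattern:  "([^"]|"")*"|'([^']|'')*'|[^\s,"']+
    '   a "..." phrase, OR a '...' phrase (a doubled qualifier is an
    '   escape), OR a run of characters that are not delimiters or quotes
    oRE.Pattern = """([^""]|"""")*""|'([^']|'')*'|[^\s,""']+"
    For Each oMatch In oRE.Execute(sInput)
        sValue = oMatch.Value
        sFirst = Left(sValue, 1)
        If sFirst = """" Or sFirst = "'" Then
            ' Drop the enclosing qualifiers, then collapse doubled ones.
            sValue = Mid(sValue, 2, Len(sValue) - 2)
            sValue = Replace(sValue, sFirst & sFirst, sFirst)
        End If
        If Len(sValue) > 0 Then
            If Len(sJoined) > 0 Then sJoined = sJoined & sSep
            sJoined = sJoined & sValue
        End If
    Next
    MatchKeywords = Split(sJoined, sSep)
End Function
```

With this pattern an unbalanced qualifier is skipped rather than starting a phrase, so O'Malley typed without surrounding double quotes comes back as two keywords, O and Malley.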

A perennial favorite for those interested in regular expressions is
O'Reilly's "Mastering Regular Expressions" (ISBN:0596002890)

"HTH".replace(/HTH/g,"Hope that helps,");
-Chris
Jul 19 '05 #12

"Bob Barrows" <re*******@yahoo.com> wrote in message
news:eB**************@TK2MSFTNGP12.phx.gbl...
I'm assuming the same ground rules I laid out still apply to your solution here ... ?


Sorry, I forgot to answer this in my previous post. Your first rule
about balanced (matched) text qualifiers applies to my solution as well.
However, your second rule does not apply. The value list can contain a
mixture of quote-qualified phrases and apostrophe-qualified phrases.
Also, a qualifier can be embedded in a phrase by doubling it up
(escaping). Finally, regular expressions are greedy by default
(although you can override this behavior), so the expression will
match as much of the string as possible. Having said all
that, the following should be a valid value list:

Bob, Barrows, "Bob 'The Man' Barrows", 'Bob "The Man" Barrows', "Bob
""The Man"" Barrows", 'Bob ''The Man'' Barrows'

HTH
-Chris
Jul 19 '05 #13

ARK
Hi! Everyone,

Thanks for the replies. I will try out the code and post my findings. From
which version onwards does VBScript support regular expressions?

Thanks again!
ARK.
Jul 19 '05 #14

ARK
I tried the Function that uses RegExp but the following does not work
on my server (windows 2000 Prof./IIS 5.0) -

Set re = new RegExp

This Object is supposed to be supported by VBScript 5.0 which comes in
Windows 2000 / IE 5.0 upwards, how come it does not work on my server?
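
One way to collect more detail about a failure like this is to trap the error and print it together with the script engine version. A minimal sketch, assuming it is saved as its own test page on the server in question; ScriptEngine, ScriptEngineMajorVersion, ScriptEngineMinorVersion and ScriptEngineBuildVersion are built-in VBScript functions, while the output wording is arbitrary and it is an assumption that the failure surfaces as a trappable run-time error.

```vbscript
<%
Dim re

' Report which script engine and version the server is actually running.
Response.Write "Script engine: " & ScriptEngine & " " & _
    ScriptEngineMajorVersion & "." & ScriptEngineMinorVersion & _
    " (build " & ScriptEngineBuildVersion & ")<br>"

' Try to create the RegExp object and show the error details if it fails.
On Error Resume Next
Set re = New RegExp
If Err.Number <> 0 Then
    Response.Write "New RegExp failed: 0x" & Hex(Err.Number) & " " & _
        Server.HTMLEncode(Err.Description) & "<br>"
    Err.Clear
Else
    Response.Write "RegExp object created.<br>"
End If
On Error GoTo 0
%>
```

Posting the engine version and the exact error number and description alongside the failing line gives readers something concrete to work from.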
Jul 19 '05 #15

It's very difficult to troubleshoot when all we are told is that something
"does not work." If a user called you and said one of your programs did not
work, what would be your first response?

Bob Barrows
Jul 19 '05 #16

Can I have your card # and expiry date ???

--
William Tasso - http://WilliamTasso.com
Jul 19 '05 #17

LOL
OK - I meant "after that"
Jul 19 '05 #18

"Bob Barrows" <re*******@yahoo.com> wrote in message
news:OM**************@TK2MSFTNGP12.phx.gbl...
> So the users will have to be trained to escape their quotes, eh? I don't
> know ... it's hard enough to train some of the programmers to do this .... ;-)
>
> Bob

Only if they want to embed quotes in the phrase they're looking for.
Most users (and programmers) can remain ignorantly blissful about the
concept. Perhaps you should teach your users about "stored procedure as
method", then they wouldn't have to worry about quotes/apostrophes in
their parameters. :)
Jul 19 '05 #19

ARK
Well, the error shown is:

Technical Information (for support personnel)

Error Type:
(0x8002801D)
Library not registered.
/regexp.asp, line 5

Browser Type:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

Page:
GET /regexp.asp

Line 5 contains the following:

Set re = new RegExp

I guess the DLL is there somewhere but did not get registered during the
Windows 2000 install?
Jul 19 '05 #20
