Regex help needed...

JS
I am writing a C# app that needs to parse a sentence entered by the user
for a simple boolean search.
I need to capture all of the AND words that are not inside of double
quotes. However, I am having a heck of a time figuring out a regex for it.
Can anyone assist with a regex to find all the AND's not in double quotes?

An example sentence might be:

red and blue and "crazy elephant" and "orange and red" and stuff.

I would need the 1st, 2nd, 3rd and 5th AND in the sentence, but not the 4th
one that is in "orange AND red".

I have several other parsing expressions in this program, but for some
reason, this particular regex eludes me, and I have been at it for some
time.

Any help would be appreciated.

TIA
-JS

PS: if there is a better usenet group for this question, please advise, as
I could not find one just for regex.
Dec 2 '05 #1
You don't want a Regular Expression here. For example, as a human user is
inputting the string, what happens when the user inputs the following:

red and blue and "crazy elephant and "orange and red" and stuff.

Note that there are THREE double-quote characters in the input. So, what's
inside double-quotes, and what is not? Is the "and" after "elephant" inside
double-quotes? Is the "and" between "orange" and "red" inside double-quotes?
Are both? Are neither?

Your only option here is to split the string on the double-quotes, and
then count. When you hit a double-quote, everything between it and the next
double-quote is "inside double-quotes." If there IS no next double-quote,
NOTHING after that opening double-quote is inside double-quotes.

You will need to split the string in order to parse it in any case.
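
The split-and-count rule above can be sketched roughly as follows. This is only an illustration, written in Python rather than C#; the function name `find_unquoted_ands` is hypothetical, and it assumes "and" should match as a whole word, case-insensitively.

```python
import re

def find_unquoted_ands(text):
    """Return the start offset of every whole-word 'and' outside double quotes."""
    pieces = text.split('"')
    # pieces = quote characters + 1, so an even piece count means an odd
    # number of quotes: the last quote never closes.
    unbalanced = len(pieces) % 2 == 0
    offsets = []
    pos = 0  # offset of the current piece within the original text
    for i, piece in enumerate(pieces):
        is_last = i == len(pieces) - 1
        # Odd-indexed pieces sit between an opening and a closing quote,
        # except the tail after an unmatched quote, which counts as unquoted.
        quoted = i % 2 == 1 and not (unbalanced and is_last)
        if not quoted:
            for m in re.finditer(r'\band\b', piece, re.IGNORECASE):
                offsets.append(pos + m.start())
        pos += len(piece) + 1  # +1 for the quote character removed by split
    return offsets
```

On the original example sentence this should pick out the 1st, 2nd, 3rd and 5th "and". On the three-quote input it treats `crazy elephant and ` as quoted and everything after the second quote as unquoted, which is only one possible way to resolve the ambiguity.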

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
You can lead a fish to a bicycle,
but you can't make it stink.
Dec 2 '05 #2
JS
After over an hour of working on this one...it comes to me minutes after I
post...Murphy's Law I guess...

Anyway, in case anyone needs this, the answer is...

(?:".+?")?(\s+and\s+)(?:".+?")?
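
A quick way to check the pattern above is to run it against the example sentence. The sketch below uses Python's `re` module rather than .NET's `Regex` class, so it is an approximation of the C# behavior, and the helper names are hypothetical. It also includes an alternative pattern (not from the post) that consumes each quoted phrase whole, because the posted pattern seems to depend on every quoted phrase sitting directly next to an "and".

```python
import re

SENTENCE = 'red and blue and "crazy elephant" and "orange and red" and stuff.'

# The pattern from the post, unchanged apart from Python raw-string quoting.
POSTED = re.compile(r'(?:".+?")?(\s+and\s+)(?:".+?")?')

def posted_and_offsets(text):
    """Start offset of the word 'and' in each match of the posted pattern."""
    offsets = []
    for m in POSTED.finditer(text):
        captured = m.group(1)  # includes the surrounding whitespace
        leading_ws = len(captured) - len(captured.lstrip())
        offsets.append(m.start(1) + leading_ws)
    return offsets

# Alternative (an assumption, not the poster's method): the first branch
# swallows a whole quoted phrase, so an "and" inside it is never matched alone.
TOKEN = re.compile(r'"[^"]*"|\b(and)\b', re.IGNORECASE)

def unquoted_and_offsets(text):
    """Start offset of each 'and' that the second branch captured."""
    return [m.start(1) for m in TOKEN.finditer(text) if m.group(1)]
```

Both helpers find the same four occurrences in `SENTENCE`. They differ on an input such as `"a and b" or c`, where no "and" follows the quoted phrase: there the posted pattern reports the "and" inside the quotes, while the alternation reports nothing.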
Dec 2 '05 #3
JS
Thanks for the reply. You make a very good point about the quotes.
Such a string as you provided would not pass my initial validator. To
help prevent SQL injection, I do not allow the user to enter symbols
within the quoted sets. I also check for an even number of double quotes.
Both checks produce an end-user syntax validator message. All symbols
outside of the double quotes are allowed, but are subsequently removed or
replaced before this regex is applied.
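
The two checks described above might look something like the sketch below. It is written in Python rather than C#, the name `validate_query` and the message strings are hypothetical, and "symbol" is assumed to mean any character other than a letter, digit, underscore or whitespace; the actual validator may define it differently.

```python
import re

# Assumption: a "symbol" is anything that is not a word character or whitespace.
SYMBOL = re.compile(r'[^\w\s]')

def validate_query(text):
    """Return a list of problems found; an empty list means the input passes."""
    problems = []
    if text.count('"') % 2 != 0:
        problems.append('unbalanced double quotes')
        return problems  # quoted spans are ambiguous, so stop here
    # With balanced quotes, the odd-indexed pieces of the split are the
    # quoted phrases.
    for phrase in text.split('"')[1::2]:
        if SYMBOL.search(phrase):
            problems.append('symbol inside quoted phrase: ' + phrase)
    return problems
```

With checks like these, the three-quote string from the earlier reply is rejected before the regex ever runs, which is why the ambiguity it raises does not arise for validated input.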
Dec 2 '05 #4
Hi JS,

How many search engines have you seen that throw an exception or do not
allow certain characters to be input by the user? I haven't seen any. The
reason is, users are not always very smart people, and get discouraged
easily. It's more user-friendly to simply accept the input and deal with the
inconsistencies and possible attacks internally. Just a suggestion.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
You can lead a fish to a bicycle,
but you can't make it stink.
Dec 2 '05 #5
