RegEx to find a word not enclosed in parentheses

I have a text and I need to find a word that is not enclosed in
parentheses. Can it be done with a regex? Could someone help me?
I am not familiar with regex...

Example looking for WORD:
(there is a WORD in ( my string WORD )) and * WORD * to (find WORD)
and * WORD *

Should give me the two WORDs between stars (the stars are not part of the string)

thanks a lot

Oct 31 '06 #1
I don't believe this can be done using Regular Expressions, at least not
practically. I'll tell you why:

In order to identify the WORD you're looking for, the only rule that can be
applied is that it is preceded by the exact same number of left and right
parentheses. That means that the number of left parentheses before the WORD
and the number of right parentheses before the WORD must be the same,
whether 0 or more, but the exact same number of each.

In addition, the left and right parentheses have to be in order, that is, if
there are 2 left parentheses, they must be followed (at some point) by 2
right parentheses. In other words, you can't have 1 left parenthesis
followed by 2 right parentheses followed by one left parenthesis. And you
can't start with right parentheses. Any enclosed section must begin with
one or more left parentheses, followed by some sequence of 0 or more
characters, followed by the exact same number of right parentheses. An
occurrence of "WORD" inside such a section does not count.
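
To make the rule concrete, here is a minimal sketch of it as a check on a
single position. The class and method names are made up for illustration,
and treating a stray right parenthesis as "not outside" is my own assumption
about how to handle malformed input.

```csharp
using System;

public static class ParenthesisRule
{
    // Hypothetical helper: true when the character at 'index' is preceded by
    // equal numbers of '(' and ')', and no ')' appears before its '('.
    public static bool IsOutsideParentheses(string text, int index)
    {
        int leftCount = 0, rightCount = 0;
        for (int i = 0; i < index && i < text.Length; i++)
        {
            if (text[i] == '(') leftCount++;
            else if (text[i] == ')') rightCount++;

            // A right parenthesis with nothing to close breaks the ordering rule.
            if (rightCount > leftCount) return false;
        }
        return leftCount == rightCount;
    }
}
```

For "(a WORD) WORD", the position of the first WORD (index 3) fails the check
and the position of the second (index 9) passes it.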

Since Regular Expressions do not, in general, have the capacity to count,
this can't practically be done using Regular Expressions. However, as I was
able to determine the rule for identifying WORD, I also have some idea of
how it might be done using string and character manipulation.
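
One caveat: the .NET regex engine has a feature called balancing group
definitions that can approximate counting. The following is only a sketch,
assuming the parentheses in the input are balanced; the class and method
names are made up, and I would not claim it is more practical than a plain
loop. The lookahead scans from the end of each candidate match to the end of
the string and fails if it meets a right parenthesis with no left parenthesis
to pair it with, which is what happens when the match sits inside parentheses.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BalancingGroupSketch
{
    // Hypothetical helper: start indices of searchString outside parentheses,
    // assuming the parentheses in 'origin' are balanced.
    public static List<int> Find(string origin, string searchString)
    {
        // (?<open>\()   pushes a capture for every '(' seen after the match.
        // (?<-open>\))  pops one for every ')', and fails if there is none to pop.
        string pattern = Regex.Escape(searchString) +
            @"(?=(?:[^()]|(?<open>\()|(?<-open>\)))*\z)";

        var result = new List<int>();
        foreach (Match m in Regex.Matches(origin, pattern))
            result.Add(m.Index);
        return result;
    }
}
```

Because the lookahead rescans the rest of the string for every candidate,
this does quadratic work on long inputs.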

Since you're looking for the occurrences of a string within a string, you
don't need to actually match the string, but only to know the indices of
its occurrences within the origin string. That is, once you know the
indices of the occurrences, and you know what the search string is, you can
find them all within the string any time you need to.
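
As a small illustration of that point, the sketch below rebuilds each
occurrence from nothing more than its start index and the length of the
search string. The class and method names are made up for the example.

```csharp
using System;

public static class IndexUsage
{
    // Hypothetical helper: recovers each occurrence from its start index
    // and the known length of the search string.
    public static string[] Extract(string origin, string searchString, int[] indices)
    {
        string[] found = new string[indices.Length];
        for (int i = 0; i < indices.Length; i++)
            found[i] = origin.Substring(indices[i], searchString.Length);
        return found;
    }
}
```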

You would need 2 variables, one to keep a count of left parentheses, and one
to keep a count of right parentheses. When you hit a left parenthesis,
increment the left parenthesis variable; when you hit a right parenthesis,
increment the right parenthesis variable. If the 2 variables are not of equal
value, you don't do anything. If they are, you begin to check the characters
following for the search string ("WORD"). Here's an example. I've tested
this using all possible combinations, with one exception. It assumes that
left and right parentheses will always be in left-right order. That is, if
there is a stray parenthesis, or if the parentheses are somehow reversed in
the string, it may not work as advertised, and you may need to revise it:

/// <summary>
/// Finds the indices of all occurrences of <paramref name="searchString"/>
/// found in <paramref name="origin"/> that are not
/// enclosed within parentheses.
/// </summary>
/// <param name="origin">String to Search.</param>
/// <param name="searchString">String to Find.</param>
/// <returns>An array of the indices of all occurrences of
/// <paramref name="searchString"/> found in <paramref name="origin"/>
/// that are not enclosed within parentheses,
/// or an empty integer array if not found.</returns>
public static int[] IndicesWithoutParentheses(string origin, string searchString)
{
    // Nothing can be found in, or with, a null or empty string.
    if (string.IsNullOrEmpty(origin) || string.IsNullOrEmpty(searchString))
        return new int[0];

    int count = 0;                     // number of indices found so far
    int leftCount = 0, rightCount = 0; // parentheses seen so far
    int originLength = origin.Length;
    int searchLength = searchString.Length;
    int[] indices = new int[originLength]; // holds indices found

    // Iterate through the origin string.
    // Assumes searchString itself contains no parentheses.
    int originIndex = 0;
    while (originIndex < originLength)
    {
        char c = origin[originIndex];    // Current char
        if (c == '(') leftCount++;       // Count left parentheses
        else if (c == ')') rightCount++; // Count right parentheses
        else if (leftCount == rightCount
            && originIndex + searchLength <= originLength
            && string.CompareOrdinal(origin, originIndex,
                searchString, 0, searchLength) == 0)
        {
            // Outside all parentheses and searchString starts here: record it
            // and continue after the match, so that no character (and no
            // parenthesis) is skipped when a comparison fails part-way.
            indices[count++] = originIndex;
            originIndex += searchLength;
            continue;
        }
        originIndex++;
    }

    // Copy only the indices that were actually found.
    int[] result = new int[count];
    Array.Copy(indices, result, count);
    return result;
}
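
If stray right parentheses are a concern, one possible revision is to track
a single nesting depth instead of two separate counts. The sketch below is
only that, a sketch: the class and method names are made up, and ignoring a
right parenthesis that has nothing to close is my own assumption about the
behaviour wanted. It also assumes the search string contains no parentheses.

```csharp
using System;
using System.Collections.Generic;

public static class DepthCounterSketch
{
    // Hypothetical variant: start indices of searchString at nesting depth 0.
    public static int[] IndicesOutsideParentheses(string origin, string searchString)
    {
        var indices = new List<int>();
        if (string.IsNullOrEmpty(origin) || string.IsNullOrEmpty(searchString))
            return indices.ToArray();

        int depth = 0; // number of currently open parentheses
        int i = 0;
        while (i < origin.Length)
        {
            char c = origin[i];
            if (c == '(') depth++;
            else if (c == ')') { if (depth > 0) depth--; } // ignore a stray ')'
            else if (depth == 0
                && i + searchString.Length <= origin.Length
                && string.CompareOrdinal(origin, i, searchString, 0, searchString.Length) == 0)
            {
                indices.Add(i);
                i += searchString.Length; // continue after the match
                continue;
            }
            i++;
        }
        return indices.ToArray();
    }
}
```

Called with the example string from the question (joined onto one line) and
"WORD", this should return the indices 46 and 74, which are the two
occurrences between the stars.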

--
HTH,

Kevin Spencer
Microsoft MVP
Short Order Coder
http://unclechutney.blogspot.com

The devil is in the yada yada yada

Oct 31 '06 #2
