I need a pattern that matches a string that has the same number of '('
as ')':
findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
'((2x+2)sin(x))', '(log(2)/log(5))' ]
Can anybody help me out?
Thanks for any help!
Chris wrote:
> I need a pattern that matches a string that has the same number of '('
> as ')':
> findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
> '((2x+2)sin(x))', '(log(2)/log(5))' ]
> Can anybody help me out?
This is not possible with regular expressions - they can't "remember"
how many parens they have already encountered.
You will need a real parser for this - pyparsing seems to be the most
popular choice today; I personally like spark. I'm sure you will find an
example grammar that parses simple arithmetic expressions like
the one above.
Diez
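Short of a full parser, the output Chris asked for can be approximated with a small hand-written scanner. This is only a sketch, not pyparsing or spark: the function name top_level_groups is made up for illustration, and it assumes that "matches" means "each outermost parenthesised group".

```python
def top_level_groups(text):
    """Return each outermost '(...)' substring of text, in order."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # remember where an outermost group opens
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unmatched ')' at index %d" % i)
            depth -= 1
            if depth == 0:
                # the outermost group just closed: keep it whole
                groups.append(text[start:i + 1])
    if depth != 0:
        raise ValueError("unmatched '(' at index %d" % start)
    return groups


result = top_level_groups("42^((2x+2)sin(x)) + (log(2)/log(5))")
print(result)  # ['((2x+2)sin(x))', '(log(2)/log(5))']
```

Raising on unbalanced input is a design choice of this sketch; returning the groups found so far would be just as defensible.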
Chris wrote:
> I need a pattern that matches a string that has the same number of '('
> as ')':
> findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
> '((2x+2)sin(x))', '(log(2)/log(5))' ]
> Can anybody help me out?
No, there is no such pattern. You will have to code up a function.
Consider what your spec really is: '42^((2x+2)sin(x)) +
(log(2)/log(5))' has the same number of left and right parentheses; so
does the zero-length string; so does ') + (' - perhaps you need to add
'and starts with a "("'.
Consider what you are going to do with input like this:
print '(' + some_text + ')'
Maybe you need to do some lexical analysis and work at the level of
tokens rather than individual characters.
Which then raises the usual question: you have a perception that
regular expressions are the solution - to what problem??
HTH,
John
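To make the token-level idea concrete, here is a rough sketch. The token set (quoted strings, the two parentheses, everything else) and the names TOKEN_RE, paren_tokens and balanced_tokens are assumptions for illustration only; the string rule does not handle escapes or triple quotes.

```python
import re

# Hypothetical token set: quoted strings, '(' and ')', and anything else.
TOKEN_RE = re.compile(r"""
    (?P<string>'[^']*'|"[^"]*")   # a quoted string literal (no escapes)
  | (?P<open>\()
  | (?P<close>\))
  | (?P<other>[^'"()]+|['"])      # everything else, or a stray quote
""", re.VERBOSE)


def paren_tokens(text):
    """Yield 'open' or 'close' for each parenthesis outside a string."""
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind in ("open", "close"):
            yield kind


def balanced_tokens(text):
    """True if the parentheses outside string literals are balanced."""
    depth = 0
    for kind in paren_tokens(text):
        depth += 1 if kind == "open" else -1
        if depth < 0:
            return False
    return depth == 0


print(balanced_tokens("print '(' + some_text + ')'"))  # True
print(balanced_tokens("f(')')"))                       # True
```

Working on tokens means the ')' inside the string in f(')') is never counted, which a character-by-character walk would get wrong.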
On 7 Oct 2006 15:00:29 -0700, Diez B. Roggisch <de***@web.de> wrote:
> Chris wrote:
> > I need a pattern that matches a string that has the same number of '('
> > as ')':
> > findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
> > '((2x+2)sin(x))', '(log(2)/log(5))' ]
> > Can anybody help me out?
> This is not possible with regular expressions - they can't "remember"
> how many parens they already encountered.
Remember that regular expressions are used to represent regular
grammars. Most regex engines actually aren't regular, in that they
support fancy things like lookbehind/lookahead and capture groups... IIRC,
these cannot be part of a true regular expression library.
With that said, the quote-unquote regexes in Lua have a special
feature that supports balanced expressions. I believe Python has a
PCRE lib somewhere; you may be able to use the experimental (??{ })
construct in that case.
-- Theerasak
In article <11*********************@e3g2000cwe.googlegroups.com>,
"Chris" <ch*********@gmail.com> wrote:
> I need a pattern that matches a string that has the same number of '('
> as ')':
> findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
> '((2x+2)sin(x))', '(log(2)/log(5))' ]
> Can anybody help me out?
> Thanks for any help!
Why does it need to be a regex? There is a very simple and well-known
algorithm which does what you want.
Start with i=0. Walk the string one character at a time, incrementing i
each time you see a '(', and decrementing it each time you see a ')'. At
the end of the string, the count should be back to 0. If at any time
during the process, the count goes negative, you've got mismatched
parentheses.
The algorithm runs in O(n), same as a regex.
Regex is a wonderful tool, but it's not the answer to all problems.
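A variant of the walk described above, sketched under one assumption: instead of a bare counter it keeps the positions of the open parentheses, so it can also say where things went wrong. The name first_mismatch is hypothetical.

```python
def first_mismatch(text):
    """Return the index of the first offending parenthesis, or None."""
    open_positions = []  # indexes of '(' still waiting for a ')'
    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if not open_positions:
                return i  # a plain counter would go negative here
            open_positions.pop()
    # Any '(' left over was never closed; report the earliest one.
    return open_positions[0] if open_positions else None


print(first_mismatch("42^((2x+2)sin(x)) + (log(2)/log(5))"))  # None
print(first_mismatch("(a))"))                                 # 3
```

It is still a single O(n) pass; the list only costs extra memory proportional to the nesting depth.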
> Why does it need to be a regex? There is a very simple and well-known
> algorithm which does what you want.
> Start with i=0. Walk the string one character at a time, incrementing i
> each time you see a '(', and decrementing it each time you see a ')'. At
> the end of the string, the count should be back to 0. If at any time
> during the process, the count goes negative, you've got mismatched
> parentheses.
> The algorithm runs in O(n), same as a regex.
> Regex is a wonderful tool, but it's not the answer to all problems.
Following Roy's suggestion, one could use something like:
>>> s = '42^((2x+2)sin(x)) + (log(2)/log(5))'
>>> d = {'(': 1, ')': -1}
>>> sum(d.get(c, 0) for c in s)
0
If you get a sum() > 0, then you have too many "(", and if you
have sum() < 0, you have too many ")" characters. A sum() of 0
means there's the same number of parens. It still doesn't solve
the aforementioned problem of things like ')))(((' which is
balanced, but psychotic. :)
tkc
hanumizzle wrote:
> On 7 Oct 2006 15:00:29 -0700, Diez B. Roggisch <de***@web.de> wrote:
> > Chris wrote:
> > > I need a pattern that matches a string that has the same number of '('
> > > as ')':
> > > findall( compile('...'), '42^((2x+2)sin(x)) + (log(2)/log(5))' ) = [
> > > '((2x+2)sin(x))', '(log(2)/log(5))' ]
> > > Can anybody help me out?
> > This is not possible with regular expressions - they can't "remember"
> > how many parens they already encountered.
> Remember that regular expressions are used to represent regular
> grammars. Most regex engines actually aren't regular in that they
> support fancy things like lookbehind/ahead and capture groups...IIRC,
> these cannot be part of a true regular expression library.
Certainly true, and it always gives me a hard time, because I don't know
to what extent a regular expression might do the job nowadays thanks to
these extensions. It was so much easier back in the old times....
> With that said, the quote-unquote regexes in Lua have a special
> feature that supports balanced expressions. I believe Python has a
> PCRE lib somewhere; you may be able to use the experimental ??{ }
> construct in that case.
Even if it has - I'm not sure it really does you any good, for several
reasons:
- regexes, even enhanced ones, don't build trees. But that is what
  you ultimately want from an expression like sin(log(x))
- even if they are more powerful these days, the theory of context-free
  grammars still applies. So if what you need isn't LL(k) but LR(k),
  how do you specify that to the regex engine?
- regexes are useful because of their compact notation; parsers
  allow for a better-structured outcome
Diez
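To illustrate the point about trees, here is a minimal sketch that turns parenthesised text into nested Python lists. It is not a real expression parser (no operators, no precedence), and the name parse_nested and the list-of-lists output shape are assumptions made for illustration.

```python
def parse_nested(text):
    """Turn parenthesised text into nested lists of strings."""
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    buf = []        # characters of the text run being collected

    def flush():
        # Move any collected characters into the current list.
        if buf:
            stack[-1].append("".join(buf))
            del buf[:]

    for i, ch in enumerate(text):
        if ch == "(":
            flush()
            child = []
            stack[-1].append(child)  # the group becomes a child node
            stack.append(child)
        elif ch == ")":
            flush()
            if len(stack) == 1:
                raise ValueError("unmatched ')' at index %d" % i)
            stack.pop()
        else:
            buf.append(ch)
    flush()
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return root


print(parse_nested("sin(log(x))"))  # ['sin', ['log', ['x']]]
```

The nesting of the result mirrors the nesting of the parentheses, which is exactly the structure a flat list of regex matches cannot carry.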
On 8 Oct 2006 01:49:50 -0700, Diez B. Roggisch <de***@web.de> wrote:
> Even if it has - I'm not sure if it really does you good, for several
> reasons:
> - regexes - even enhanced ones - don't build trees. But that is what
>   you ultimately want from an expression like sin(log(x))
> - even if they are more powerful these days, the theory of context
>   free grammars still applies. so if what you need isn't LL(k) but
>   LR(k), how do you specify that to the regex engine?
> - the regexes are useful because of their compact notations, parsers
>   allow for better structured outcome
Just wait for Perl 6 :D
-- Theerasak
Tim Chase:
> It still doesn't solve the aforementioned problem
> of things like ')))(((' which is balanced, but psychotic. :)
This may solve the problem:
    def balanced(txt):
        # +1 for every '(', -1 for every ')', 0 for anything else
        d = {'(': 1, ')': -1}
        tot = 0
        for c in txt:
            tot += d.get(c, 0)
            if tot < 0:
                # a ')' appeared before its matching '('
                return False
        return tot == 0

    print(balanced("42^((2x+2)sin(x)) + (log(2)/log(5))"))   # True
    print(balanced("42^((2x+2)sin(x) + (log(2)/log(5))"))    # False
    print(balanced("42^((2x+2)sin(x))) + (log(2)/log(5))"))  # False
    print(balanced(")))((("))                                # False
A possible alternative for Py 2.5. The dict solution looks better, but
this may be faster:

    def balanced2(txt):
        tot = 0
        for c in txt:
            # conditional expressions are new in Python 2.5
            tot += 1 if c == "(" else (-1 if c == ")" else 0)
            if tot < 0:
                return False
        return tot == 0
Bye,
bearophile
be************@lycos.com wrote:
> The dict solution looks better, but this may be faster:
it's slightly faster, but both your alternatives are about 10x slower
than a straightforward:

    def balanced(txt):
        return txt.count("(") == txt.count(")")
</F>
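For anyone who wants to check the speed claim on their own machine, a rough timeit sketch follows. The names balanced_loop and balanced_count, the sample string and the repeat count are all assumptions; no timings are quoted here because they depend on the interpreter and the input.

```python
import timeit


def balanced_loop(txt):
    """Character walk: rejects a ')' that comes before its '('."""
    tot = 0
    for c in txt:
        if c == "(":
            tot += 1
        elif c == ")":
            tot -= 1
            if tot < 0:
                return False
    return tot == 0


def balanced_count(txt):
    """Only compares totals, so ordering is not checked."""
    return txt.count("(") == txt.count(")")


# Hypothetical workload: the thread's example repeated 100 times.
sample = "42^((2x+2)sin(x)) + (log(2)/log(5))" * 100
for func in (balanced_loop, balanced_count):
    seconds = timeit.timeit(lambda: func(sample), number=200)
    print("%-15s %.4f s" % (func.__name__, seconds))
```

Note that the two functions answer slightly different questions: balanced_count(")))(((") returns True, because it checks "the same number of '(' as ')'" as literally asked, while balanced_loop returns False for that input.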