A bug in Python's regular expression engine?

This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Thank you...

Nov 27 '07 #1
Just Another Victim of the Ambient Morality wrote:
> This won't compile for me:
> regex = re.compile('(.*\\).*')
> I get the error:
> sre_constants.error: unbalanced parenthesis
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads
> me to believe the problem is Python. Can anyone confirm or deny this bug?

It pretty much says what the problem is - you escaped the closing
parenthesis, resulting in an invalid regex.

Either use raw-strings or put the proper amount of backslashes in your
string:

regex = re.compile(r'(.*\\).*') # raw string literal

regex = re.compile('(.*\\\\).*') # two consecutive \es, meaning an escaped one
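
As a rough sketch of the two spellings above (assuming a current Python 3 interpreter rather than the 2.5 used in this thread; the variable names and the sample path are made up for illustration), both literals turn out to be the same string, so either one compiles:

```python
import re

raw_pattern = r'(.*\\).*'        # raw literal: both backslashes reach re untouched
escaped_pattern = '(.*\\\\).*'   # normal literal: each \\ pair collapses to one backslash

# Both literals spell the same eight-character string.
same_string = raw_pattern == escaped_pattern

# Hypothetical sample path; group 1 captures everything up to and
# including the last backslash.
match = re.compile(raw_pattern).match('C:\\temp\\file.txt')
prefix = match.group(1)
```

The raw form is usually easier to read because the regex escapes are the only escapes left to count.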

Diez
Nov 27 '07 #2
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> This won't compile for me:
>
> regex = re.compile('(.*\\).*')
>
> I get the error:
>
> sre_constants.error: unbalanced parenthesis
>
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads me
> to believe the problem is Python. Can anyone confirm or deny this bug?

Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closing
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).
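
The explanation above can be checked with a small sketch (assuming a current Python 3 interpreter; the variable names are illustrative only). It confirms that the original literal and the raw spelling are the same string, and that compiling it raises the regex error instead of succeeding:

```python
import re

original = '(.*\\).*'   # the literal from the original post
as_raw = r'(.*\).*'     # the same seven characters written as a raw literal

literals_match = original == as_raw

try:
    re.compile(original)
    compile_error = None
except re.error as exc:   # shown as sre_constants.error in the thread's traceback
    compile_error = exc
```

The exact wording of the error message differs between Python versions, so the sketch only checks that an error is raised.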

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this to split
file paths under Windows, you should look at os.path.split
instead of writing your own.
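
A minimal sketch of that suggestion, assuming a made-up sample path: on Windows, os.path is the ntpath module, so importing ntpath directly gives the same Windows-style splitting on any platform.

```python
import ntpath  # the Windows implementation of os.path, importable everywhere

# Hypothetical sample path, used only for illustration.
directory, filename = ntpath.split('C:\\temp\\file.txt')
```

On a Windows machine, calling os.path.split with the same argument should behave the same way.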

HTH
--
Paul Hankin
Nov 27 '07 #3

"Paul Hankin" <pa*********@gmail.com> wrote in message
news:31*********************************@j20g2000hsi.googlegroups.com...
> On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> This won't compile for me:
>>
>> regex = re.compile('(.*\\).*')
>>
>> I get the error:
>>
>> sre_constants.error: unbalanced parenthesis
>>
>> I'm running Python 2.5 on WinXP. I've tried this expression with
>> another RE engine in another language and it works just fine which leads
>> me to believe the problem is Python. Can anyone confirm or deny this bug?
>
> Your code is equivalent to:
> regex = re.compile(r'(.*\).*')
>
> Written like this, it's easier to see that you've started a regular
> expression group with '(', but it's never closed since your closing
> parenthesis is escaped (which causes it to match a literal ')' when
> used). Hence the reported error (which isn't a bug).
>
> Perhaps you meant this?
> regex = re.compile(r'(.*\\).*')
>
> This matches any number of characters followed by a backslash (group
> 1), and then any number of characters. If you're using this to split
> file paths under Windows, you should look at os.path.split
> instead of writing your own.

Indeed, I did end up using os.path functions, instead.
I think I see what's going on. Backslash has special meaning in both
the regular expression and Python string declarations. So, my version
should have been something like this:
regex = re.compile('(.*\\\\).*')
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

Nov 27 '07 #4
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:

> That is funny. Thank you for your help...
> Just for clarification, what does the "r" in your code do?

It means a "raw" string (since you know Ruby, think of it like %w{}):

This page explains string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html
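
To make the "r" prefix concrete, here is a small sketch (assuming a current Python 3 interpreter; the variable names are illustrative). In a raw literal the backslash is kept as an ordinary character, so no escape sequences are processed:

```python
plain = '\\'    # escape processing turns \\ into a single backslash
raw = r'\\'     # raw literal keeps both backslashes

plain_length = len(plain)
raw_length = len(raw)

# '\n' is one newline character; r'\n' is a backslash followed by the letter n.
newline_equals_raw = ('\n' == r'\n')
raw_newline_length = len(r'\n')
```

This is why regex patterns are normally written as raw strings: the pattern reaches the re module exactly as typed.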

HTH,
Jordan
Nov 27 '07 #5
On Nov 27, 10:52 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> That is funny. Thank you for your help...
>> Just for clarification, what does the "r" in your code do?
>
> It means a "raw" string (since you know Ruby, think of it like %w{}):
>
> This page explains string literal prefixes (see especially the
> end-notes):
>
> http://docs.python.org/ref/strings.html
>
> HTH,
> Jordan

Arg! %w{} should have said %q{}
Nov 27 '07 #6
