A bug in Python's regular expression engine?

This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Thank you...

Nov 27 '07 #1
Just Another Victim of the Ambient Morality wrote:
> This won't compile for me:
> regex = re.compile('(.*\\).*')
> I get the error:
> sre_constants.error: unbalanced parenthesis
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads me
> to believe the problem is Python. Can anyone confirm or deny this bug?
It pretty much says what the problem is: you escaped the closing
parenthesis, resulting in an invalid regex.

Either use raw-strings or put the proper amount of backslashes in your
string:

regex = re.compile(r'(.*\\).*') # raw string literal

regex = re.compile('(.*\\\\).*') # each \\ yields one backslash, so the regex sees an escaped backslash
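A quick way to check that the two spellings above describe the same pattern. This is only a sketch, assuming Python 3; the variable names and the sample path are made up for illustration:

```python
import re

raw_pattern = r'(.*\\).*'        # raw string: backslashes are kept exactly as typed
escaped_pattern = '(.*\\\\).*'   # normal string: each \\ collapses to one backslash

# Both literals should produce the very same pattern text.
same_text = raw_pattern == escaped_pattern

regex = re.compile(raw_pattern)
match = regex.match('C:\\temp\\file.txt')   # sample input: C:\temp\file.txt
group_one = match.group(1)       # everything up to and including the last backslash
print(same_text, repr(group_one))
```

The raw-string form is usually easier to read, because what you type is what the regex engine receives.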

Diez
Nov 27 '07 #2
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> This won't compile for me:
>
> regex = re.compile('(.*\\).*')
>
> I get the error:
>
> sre_constants.error: unbalanced parenthesis
>
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads me
> to believe the problem is Python. Can anyone confirm or deny this bug?
Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closed
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).
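The explanation above can be reproduced in a few lines. This is a sketch assuming Python 3 (the `except ... as` syntax does not exist in Python 2.5), and the exact wording of the error message varies between Python versions, so it is only printed, not relied on:

```python
import re

original = '(.*\\).*'    # the literal from the question; \\ becomes a single backslash
equivalent = r'(.*\).*'  # the same pattern text written as a raw string

compile_failed = False
try:
    re.compile(original)
except re.error as exc:
    # The group opened by '(' is never closed, because '\)' is a literal ')'.
    compile_failed = True
    print('compile failed:', exc)

print(original == equivalent, compile_failed)
```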

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this for path
splitting filenames under Windows, you should look at os.path.split
instead of writing your own.
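For comparison, here is a sketch of both approaches on a made-up path. It imports `ntpath`, the Windows implementation behind `os.path`, directly so that it behaves the same on any platform; on Windows itself `os.path.split` gives the same result:

```python
import ntpath
import re

path = r'C:\Documents\report.txt'   # hypothetical example path

# Regex approach from this thread: group 1 is greedy, so it runs to the LAST backslash.
directory_with_slash = re.match(r'(.*\\).*', path).group(1)

# Library approach: split into (directory, file name) without a hand-written regex.
head, tail = ntpath.split(path)
print(directory_with_slash, head, tail)
```

Note that the regex keeps the trailing backslash in group 1, while `split` drops it from the directory part.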

HTH
--
Paul Hankin
Nov 27 '07 #3

"Paul Hankin" <pa*********@gmail.com> wrote in message
news:31**********************************@j20g2000hsi.googlegroups.com...
> On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> This won't compile for me:
>>
>> regex = re.compile('(.*\\).*')
>>
>> I get the error:
>>
>> sre_constants.error: unbalanced parenthesis
>>
>> I'm running Python 2.5 on WinXP. I've tried this expression with
>> another RE engine in another language and it works just fine which leads me
>> to believe the problem is Python. Can anyone confirm or deny this bug?
>
> Your code is equivalent to:
> regex = re.compile(r'(.*\).*')
>
> Written like this, it's easier to see that you've started a regular
> expression group with '(', but it's never closed since your closed
> parenthesis is escaped (which causes it to match a literal ')' when
> used). Hence the reported error (which isn't a bug).
>
> Perhaps you meant this?
> regex = re.compile(r'(.*\\).*')
>
> This matches any number of characters followed by a backslash (group
> 1), and then any number of characters. If you're using this for path
> splitting filenames under Windows, you should look at os.path.split
> instead of writing your own.
Indeed, I did end up using os.path functions, instead.
I think I see what's going on. Backslash has special meaning in both
the regular expression and Python string declarations. So, my version
should have been something like this:
regex = re.compile('(.*\\\\).*')
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

Nov 27 '07 #4
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> That is funny. Thank you for your help...
> Just for clarification, what does the "r" in your code do?
It means a "raw" string (as you know ruby, think of it like %w{}):

This page explains about string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html
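A small sketch of what the prefix changes, assuming Python 3 (the variable names are only illustrative). The prefix affects how the literal is read in source code; the result is an ordinary string either way:

```python
newline = '\n'            # normal literal: one character, a line feed
raw_newline = r'\n'       # raw literal: two characters, a backslash and the letter n

one_backslash = '\\'      # normal literal: the escape \\ yields a single backslash
two_backslashes = r'\\'   # raw literal: both backslashes are kept

print(len(newline), len(raw_newline), len(one_backslash), len(two_backslashes))
```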

HTH,
Jordan
Nov 27 '07 #5
On Nov 27, 10:52 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> That is funny. Thank you for your help...
>> Just for clarification, what does the "r" in your code do?
>
> It means a "raw" string (as you know ruby, think of it like %w{}):
>
> This page explains about string literal prefixes (see especially the
> end-notes):
>
> http://docs.python.org/ref/strings.html
>
> HTH,
> Jordan
Arg! That %w{} should have been %q{}.
Nov 27 '07 #6
