472,980 Members | 1,969 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,980 software developers and data experts.

A bug in Python's regular expression engine?

This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Thank you...

Nov 27 '07 #1
5 5323
Just Another Victim of the Ambient Morality wrote:
This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads
me
to believe the problem is Python. Can anyone confirm or deny this bug?
It pretty much says what the problem is - you escaped the closing
parenthesis, resulting in an invalid rex.

Either use raw-strings or put the proper amount of backslashes in your
string:

regex = re.compile(r'(.*\\).*') # raw string literal

regex = re.compile('(.*\\\\).*') # two consecutive \es, meaning an escaped
one

Diez
Nov 27 '07 #2
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.comwrote:
This won't compile for me:

regex = re.compile('(.*\\).*')

I get the error:

sre_constants.error: unbalanced parenthesis

I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closed
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this for path
splitting filenames under Windows, you should look at os.path.split
instead of writing your own.

HTH
--
Paul Hankin
Nov 27 '07 #3

"Paul Hankin" <pa*********@gmail.comwrote in message
news:31**********************************@j20g2000 hsi.googlegroups.com...
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.comwrote:
> This won't compile for me:

regex = re.compile('(.*\\).*')

I get the error:

sre_constants.error: unbalanced parenthesis

I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads
me
to believe the problem is Python. Can anyone confirm or deny this bug?

Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closed
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this for path
splitting filenames under Windows, you should look at os.path.split
instead of writing your own.
Indeed, I did end up using os.path functions, instead.
I think I see what's going on. Backslash has special meaning in both
the regular expression and Python string declarations. So, my version
should have been something like this:
regex = re.compile('(.*\\\\).*')
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

Nov 27 '07 #4
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.comwrote:

That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?
It means a "raw" string (as you know ruby, think of it like %w{}):

This page explains about string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html

HTH,
Jordan
Nov 27 '07 #5
On Nov 27, 10:52 am, MonkeeSage <MonkeeS...@gmail.comwrote:
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"

<ihates...@hotmail.comwrote:
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

It means a "raw" string (as you know ruby, think of it like %w{}):

This page explains about string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html

HTH,
Jordan
Arg! %w{} should have said %q{}
Nov 27 '07 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

9
by: Dieter Vanderelst | last post by:
Dear all, I'm currently comparing Python versus Perl to use in a project that involved a lot of text processing. I'm trying to determine what the most efficient language would be for our...
14
by: olekristianvillabo | last post by:
I have a regular expression that is approximately 100k bytes. (It is basically a list of all known norwegian postal numbers and the corresponding place with | in between. I know this is not the...
9
by: Pete Davis | last post by:
I'm using regular expressions to extract some data and some links from some web pages. I download the page and then I want to get a list of certain links. For building regular expressions, I use...
5
by: Cylix | last post by:
I am going to write a function that the search engine done. in search engine, we may using double quotation to specify a pharse like "I love you", How can I using regular expression to sperate...
32
by: Licheng Fang | last post by:
Basically, the problem is this: 'do' Python's NFA regexp engine trys only the first option, and happily rests on that. There's another example: 'oneself' The Python regular expression...
1
by: Wehrdamned | last post by:
Hi, As I understand it, python uses a pcre engine to work with regular expression. My question is, then, why expressions like : Traceback (most recent call last): File "<stdin>", line 1, in...
6
by: rorymo | last post by:
I have a regular expression that allows only certain characters to be valid in an xml doc as follows: <xs:pattern value="^*" /> What I want to do is also allow any unicode character that is...
13
by: Wiseman | last post by:
I'm kind of disappointed with the re regular expressions module. In particular, the lack of support for recursion ( (?R) or (?n) ) is a major drawback to me. There are so many great things that can...
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 4 Oct 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
3
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 1 Nov 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM) Please note that the UK and Europe revert to winter time on...
3
by: nia12 | last post by:
Hi there, I am very new to Access so apologies if any of this is obvious/not clear. I am creating a data collection tool for health care employees to complete. It consists of a number of...
0
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...
0
isladogs
by: isladogs | last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, Mike...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.