A bug in Python's regular expression engine?

This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Thank you...

Nov 27 '07 #1
5 Replies


Just Another Victim of the Ambient Morality wrote:
> This won't compile for me:
> regex = re.compile('(.*\\).*')
> I get the error:
> sre_constants.error: unbalanced parenthesis
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads
> me to believe the problem is Python. Can anyone confirm or deny this bug?

It pretty much says what the problem is - you escaped the closing
parenthesis, resulting in an invalid regex.

Either use raw-strings or put the proper amount of backslashes in your
string:

regex = re.compile(r'(.*\\).*') # raw string literal

regex = re.compile('(.*\\\\).*') # each \\ in the literal becomes one \, so the regex sees an escaped backslash

Diez
Nov 27 '07 #2
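
A minimal sketch of the two fixes above, assuming a current Python 3 interpreter, where the compile failure can be caught as `re.error` (the Python 2.5 in this thread reports it as `sre_constants.error`). The variable names are illustrative only.

```python
import re

# The original literal: Python turns \\ into one backslash, so the regex
# engine receives (.*\).*  where \) is a literal ")" and "(" is never closed.
broken = '(.*\\).*'
print(broken)  # the pattern text the regex engine actually sees

try:
    re.compile(broken)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("compile failed:", exc)

# Both fixes hand the regex engine the same text: (.*\\).*
raw_fix = r'(.*\\).*'
doubled_fix = '(.*\\\\).*'
print(raw_fix == doubled_fix)

regex = re.compile(raw_fix)
```

Printing the string before compiling it is a quick way to check what is left after Python has processed the literal.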

On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> This won't compile for me:
>
> regex = re.compile('(.*\\).*')
>
> I get the error:
>
> sre_constants.error: unbalanced parenthesis
>
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads me
> to believe the problem is Python. Can anyone confirm or deny this bug?

Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closed
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this for path
splitting filenames under Windows, you should look at os.path.split
instead of writing your own.

HTH
--
Paul Hankin
Nov 27 '07 #3
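
To illustrate the comparison above, here is a rough sketch that runs the corrected pattern and the library function on the same path. The path is a made-up example, and `ntpath` is used only so the sketch behaves the same on any platform (on Windows, `os.path` is that module).

```python
import re
import ntpath  # Windows flavour of os.path, importable on any platform

path = r'C:\Users\victim\notes.txt'  # hypothetical example path

regex = re.compile(r'(.*\\).*')
match = regex.match(path)
# .* is greedy, so group 1 runs up to and including the last backslash.
directory_with_slash = match.group(1)

# ntpath.split does the same job without a hand-written pattern.
head, tail = ntpath.split(path)

print(directory_with_slash)
print(head, tail)
```

Note that the regex keeps the trailing backslash in group 1, while the split function drops it from the directory part.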


"Paul Hankin" <pa*********@gmail.com> wrote in message
news:31**********************************@j20g2000 hsi.googlegroups.com...
> On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> This won't compile for me:
>>
>> regex = re.compile('(.*\\).*')
>>
>> I get the error:
>>
>> sre_constants.error: unbalanced parenthesis
>>
>> I'm running Python 2.5 on WinXP. I've tried this expression with
>> another RE engine in another language and it works just fine which leads
>> me to believe the problem is Python. Can anyone confirm or deny this bug?
>
> Your code is equivalent to:
> regex = re.compile(r'(.*\).*')
>
> Written like this, it's easier to see that you've started a regular
> expression group with '(', but it's never closed since your closed
> parenthesis is escaped (which causes it to match a literal ')' when
> used). Hence the reported error (which isn't a bug).
>
> Perhaps you meant this?
> regex = re.compile(r'(.*\\).*')
>
> This matches any number of characters followed by a backslash (group
> 1), and then any number of characters. If you're using this for path
> splitting filenames under Windows, you should look at os.path.split
> instead of writing your own.

Indeed, I did end up using os.path functions, instead.
I think I see what's going on. Backslash has special meaning in both
the regular expression and Python string declarations. So, my version
should have been something like this:
regex = re.compile('(.*\\\\).*')
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

Nov 27 '07 #4

On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> That is funny. Thank you for your help...
> Just for clarification, what does the "r" in your code do?

It means a "raw" string (as you know ruby, think of it like %w{}):

This page explains about string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html

HTH,
Jordan
Nov 27 '07 #5
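
A small sketch of what the `r` prefix changes, assuming Python 3. The variable names are only for illustration.

```python
# In a normal literal the backslash starts an escape sequence;
# with the r prefix it is kept as an ordinary character.
normal = '\n'    # one character: a newline
raw = r'\n'      # two characters: a backslash and the letter n
print(len(normal), len(raw))

# The r prefix changes only how the source text is read, not the type:
# both spellings below produce the same string.
same = (r'(.*\\).*' == '(.*\\\\).*')
print(same)

# One caveat: a raw string cannot end in an odd number of
# backslashes, so the source text r'\' is a syntax error.
try:
    eval("r'\\'")
    ends_in_backslash_ok = True
except SyntaxError:
    ends_in_backslash_ok = False
print(ends_in_backslash_ok)
```

Because regular expressions use backslashes heavily, writing patterns as raw strings avoids having to double every one of them.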

On Nov 27, 10:52 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
> <ihates...@hotmail.com> wrote:
>> That is funny. Thank you for your help...
>> Just for clarification, what does the "r" in your code do?
>
> It means a "raw" string (as you know ruby, think of it like %w{}):
>
> This page explains about string literal prefixes (see especially the
> end-notes):
>
> http://docs.python.org/ref/strings.html
>
> HTH,
> Jordan

Arg! %w{} should have said %q{}
Nov 27 '07 #6
