A bug in Python's regular expression engine?

This won't compile for me:
regex = re.compile('(.*\\).*')
I get the error:
sre_constants.error: unbalanced parenthesis
I'm running Python 2.5 on WinXP. I've tried this expression with
another RE engine in another language and it works just fine which leads me
to believe the problem is Python. Can anyone confirm or deny this bug?
Thank you...

Nov 27 '07 #1
Just Another Victim of the Ambient Morality wrote:
> This won't compile for me:
> regex = re.compile('(.*\\).*')
> I get the error:
> sre_constants.error: unbalanced parenthesis
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads
> me to believe the problem is Python. Can anyone confirm or deny this bug?

It pretty much says what the problem is - you escaped the closing
parenthesis, resulting in an invalid regex.

Either use raw-strings or put the proper amount of backslashes in your
string:

regex = re.compile(r'(.*\\).*') # raw string literal

regex = re.compile('(.*\\\\).*') # four backslashes in the literal give two in the regex, meaning one escaped backslash
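
A minimal sketch for checking this yourself (the variable names are only illustrative, not from the original post): it prints what the failing literal actually hands to the regex engine and confirms that the two fixes produce the same pattern string.

```python
import re

broken = '(.*\\).*'     # Python turns \\ into one backslash, so the regex is (.*\).*
raw = r'(.*\\).*'       # raw literal: both backslashes reach the regex engine
doubled = '(.*\\\\).*'  # four backslashes in source, two in the resulting string

print(broken)           # shows the single backslash that escapes ')'
print(raw == doubled)   # the two fixes build the same pattern string

try:
    re.compile(broken)
    compiled_ok = True
except re.error as exc:  # reported as sre_constants.error in the Python 2.5 traceback
    compiled_ok = False
    print("compile failed:", exc)

re.compile(raw)          # compiles without error
```

Printing the string before compiling it is a quick way to see which layer (Python or the regex engine) consumed a backslash.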

Diez
Nov 27 '07 #2
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> This won't compile for me:
>
> regex = re.compile('(.*\\).*')
>
> I get the error:
>
> sre_constants.error: unbalanced parenthesis
>
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads me
> to believe the problem is Python. Can anyone confirm or deny this bug?

Your code is equivalent to:
regex = re.compile(r'(.*\).*')

Written like this, it's easier to see that you've started a regular
expression group with '(', but it's never closed since your closing
parenthesis is escaped (which causes it to match a literal ')' when
used). Hence the reported error (which isn't a bug).
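
As a small illustration of that point (the sample strings are made up for the example), an escaped parenthesis matches the literal character, and the pattern compiles once an unescaped ')' closes the group:

```python
import re

# An escaped parenthesis matches the literal character and closes no group.
literal_paren = re.compile(r'\)')
position = literal_paren.search('f(x)').start()

# Adding one unescaped ')' closes the group, so this pattern compiles.
closed = re.compile(r'(.*\)).*')
captured = closed.match('f(x) + 1').group(1)

print(position)
print(captured)
```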

Perhaps you meant this?
regex = re.compile(r'(.*\\).*')

This matches any number of characters followed by a backslash (group
1), and then any number of characters. If you're using this to split
Windows file paths, you should look at os.path.split instead of
writing your own.
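
A rough side-by-side sketch, assuming a made-up example path. It uses ntpath, the module that os.path refers to on Windows, so the result is the same on any platform:

```python
import ntpath
import re

path = r'C:\Users\demo\report.txt'   # hypothetical example path

# Regex approach: the greedy .* in group 1 runs up to the last backslash.
match = re.match(r'(.*\\).*', path)
directory = match.group(1)

# Standard-library approach: split into (head, tail) at the last separator.
head, tail = ntpath.split(path)

print(directory)
print(head, tail)
```

Note that the regex keeps the trailing backslash in group 1, while the split function drops it from the head.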

HTH
--
Paul Hankin
Nov 27 '07 #3

"Paul Hankin" <pa*********@gmail.comwrote in message
news:31**********************************@j20g2000 hsi.googlegroups.com...
> Perhaps you meant this?
> regex = re.compile(r'(.*\\).*')
>
> This matches any number of characters followed by a backslash (group
> 1), and then any number of characters. If you're using this to split
> Windows file paths, you should look at os.path.split instead of
> writing your own.

Indeed, I did end up using os.path functions, instead.
I think I see what's going on. Backslash has special meaning in both
the regular expression and Python string literals. So, my version
should have been something like this:
regex = re.compile('(.*\\\\).*')
That is funny. Thank you for your help...
Just for clarification, what does the "r" in your code do?

Nov 27 '07 #4
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
> That is funny. Thank you for your help...
> Just for clarification, what does the "r" in your code do?

It means a "raw" string: backslashes in the literal are kept as-is
rather than treated as escapes (as you know Ruby, think of it like %w{}).

This page explains string literal prefixes (see especially the
end-notes):

http://docs.python.org/ref/strings.html
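
A short sketch of the difference (the names are illustrative only): the same characters between the quotes yield different strings depending on the prefix.

```python
newline = '\n'     # one character: a line feed
raw_pair = r'\n'   # two characters: a backslash followed by 'n'

print(len(newline), len(raw_pair))

# A raw literal equals the ordinary literal with every backslash doubled.
same = r'\d+\\' == '\\d+\\\\'
print(same)
```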

HTH,
Jordan
Nov 27 '07 #5
On Nov 27, 10:52 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> It means a "raw" string (as you know ruby, think of it like %w{}):

Arg! %w{} should have said %q{}
Nov 27 '07 #6
