re.sub and empty groups

Hi!

I'm trying to do a search-replace in places where some groups are
optional... Here's an example:
>>> re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola").groups()
('ola', None)
>>> re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola|").groups()
('ola', '')
>>> re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola|ole").groups()
('ola', 'ole')

The second and third results are what I want, but the first one is
not: it should be equal to the second (i.e., the second group should
be an empty string instead of None). I need this because I want to use
re.sub(), and when the group is None it blows up with a stack trace...

Maybe I'm not getting the essence of groups and non-capturing groups.
Would someone care to explain (and give the correct solution :))?

Thanks in advance,

Hugo Ferreira

Jan 16 '07 #1

From the documentation:
groups([default])
Return a tuple containing all the subgroups of the match, from 1 up to
however many groups are in the pattern. The default argument is used
for groups that did not participate in the match; it defaults to None.

Your second group is optional and does not take part in the match in
your first example, which is why it comes back as None. You can,
however, still use this regular expression if you call groups('')
rather than groups().
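
As a rough sketch of the groups('') suggestion (the names PATTERN and show_groups are made up here for illustration), passing a default fills in any group that did not take part in the match, so all three inputs from the question come back as strings:

```python
import re

# Pattern from the original question; the second capturing group sits
# inside an optional non-capturing group, so it may not participate.
PATTERN = re.compile(r"Image:([^\|]+)(?:\|(.*))?")

def show_groups(text):
    """Return the match groups, using '' for groups that did not participate."""
    match = PATTERN.match(text)
    return match.groups('')

print(show_groups("Image:ola"))      # ('ola', '')
print(show_groups("Image:ola|"))     # ('ola', '')
print(show_groups("Image:ola|ole"))  # ('ola', 'ole')
```

Note that the default only affects what groups() returns. It does not by itself change how re.sub() expands \2 in a replacement template.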

A better way is probably to use a simplified regular expression:

re.match(r"Image:([^\|]+)\|?(.*)", "Image:ola").groups()

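i.e. match the text "Image:", followed by at least one character other
than "|", followed by an optional "|", followed by any remaining
characters. Here the second group always takes part in the match, so
it is '' rather than None when nothing follows.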
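
Since the goal was re.sub(), here is a minimal sketch of two ways it might be done. The <img ...> output format and the names SIMPLE, ORIGINAL, to_tag and to_tag_callable are assumptions for illustration only; the question does not say what the replacement should look like. The first function uses the simplified pattern with an ordinary template; the second keeps the original pattern and uses a replacement function that turns None into ''. (Python versions before 3.5 raise an error when a template refers to an unmatched group; from 3.5 on, re.sub() substitutes an empty string instead.)

```python
import re

# Simplified pattern from the reply: group 2 always participates (possibly as '').
SIMPLE = re.compile(r"Image:([^\|]+)\|?(.*)")
# Original pattern from the question: group 2 may not participate at all.
ORIGINAL = re.compile(r"Image:([^\|]+)(?:\|(.*))?")

def to_tag(text):
    """Substitute with a plain template; safe because group 2 is never None."""
    # Hypothetical output format, chosen only for this example.
    return SIMPLE.sub(r'<img src="\1" alt="\2">', text)

def to_tag_callable(text):
    """Substitute with a function, mapping a non-participating group to ''."""
    def repl(match):
        return '<img src="%s" alt="%s">' % (match.group(1), match.group(2) or '')
    return ORIGINAL.sub(repl, text)

print(to_tag("Image:ola"))
print(to_tag("Image:ola|ole"))
print(to_tag_callable("Image:ola"))
```

Both patterns end in (.*), so in a longer string the second group will swallow everything up to the end of the line; that may or may not be what you want.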

Jan 16 '07 #2
