
# Some more odd behaviour from the Regexp library

 P: n/a Can anyone explain why it won't give me my captured group?

    In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
    In [2]: import re
    In [3]: b = re.search(r'exit: (.*?)', a)
    In [4]: b.group(0)
    Out[4]: 'exit: '
    In [5]: b.group(1)
    Out[5]: ''
    In [6]: b.group(2)
    IndexError: no such group

Oct 20 '05 #1
4 Replies

 P: n/a In article <11**********************@g44g2000cwa.googlegroups.com>, "David Veerasingam" wrote:

> Can anyone explain why it won't give me my captured group?
>
> In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
> In [2]: import re
> In [3]: b = re.search(r'exit: (.*?)', a)
> In [4]: b.group(0)
> Out[4]: 'exit: '
> In [5]: b.group(1)
> Out[5]: ''
> In [6]: b.group(2)
> IndexError: no such group

The ? tells (.*?) to match as little as possible, and since nothing follows the group in the pattern, that is nothing. If you change it to (.*) it should do what you want.

--
Doug Schwarz
dmschwarz&urgrad,rochester,edu
Make obvious changes to get real email address.

Oct 20 '05 #2

 P: n/a "David Veerasingam" writes:

> Can anyone explain why it won't give me my captured group?
>
> In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
> In [2]: import re
> In [3]: b = re.search(r'exit: (.*?)', a)
> In [4]: b.group(0)
> Out[4]: 'exit: '
> In [5]: b.group(1)
> Out[5]: ''
> In [6]: b.group(2)
> IndexError: no such group

It is giving you your captured group. While the * operator matches as long a string as possible, the *? operator matches as *short* a string as possible. Since '' matches .*?, that's all it's ever going to capture. So b.group(1) is '', which is what it's giving you.

    a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
    import re
    b = re.search(r'exit: (.*)', a)
    b.group(0)
    'exit: gkdfjgfjdfsgdjglkghdfgkd'
    b.group(1)
    'gkdfjgfjdfsgdjglkghdfgkd'

which I suspect is what you actually want.

Of course, being the founder of SPARE, I have to point out that a.split(': ') will get you the same two strings as the re I used above.

http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Oct 20 '05 #3
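To make the comparison with a.split(': ') concrete, here is a small sketch using the string from the thread. One caveat worth seeing in the output: the pieces are not quite identical, since group(0) is the whole match including the 'exit: ' prefix, while split returns the text on either side of the separator. The captured payload is the same either way. The use of str.partition is an addition not mentioned in the thread, shown on the assumption that only the first ': ' should act as the separator.

```python
import re

a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Greedy capture, as in the reply above.
m = re.search(r'exit: (.*)', a)
whole, captured = m.group(0), m.group(1)

# Plain string splitting on the separator.
parts = a.split(': ')

# partition() splits only at the first occurrence of the separator
# (not from the thread; an assumption about what is wanted).
head, sep, tail = a.partition(': ')

print(whole)     # exit: gkdfjgfjdfsgdjglkghdfgkd
print(captured)  # gkdfjgfjdfsgdjglkghdfgkd
print(parts)     # ['exit', 'gkdfjgfjdfsgdjglkghdfgkd']
print(tail)      # gkdfjgfjdfsgdjglkghdfgkd
```

If the payload itself may contain ': ', split would produce more than two pieces, which is where partition or the regex is the safer choice.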

 P: n/a Mike Meyer wrote:

> "David Veerasingam" writes:
> [...]
> Of course, being the founder of SPARE, I have to point out that a.split(': ') will get you the same two strings as the re I used above.

Let me guess: the Society for the Prevention of Abuse of Regular Expressions?

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Oct 20 '05 #4

 P: n/a Thanks for all your replies.

I guess I've always used .*? as sort of an idiom for a non-greedy match, but I guess it only works if I specify the end point (which I didn't in the above case).

e.g. re.search(r'exit: (.*?)$', a)

Thanks for pointing that out!

David

Oct 20 '05 #5
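A minimal sketch of the "specify the end point" idea, using the thread's string and an unescaped `$` as the end anchor (an escaped `\$` would instead look for a literal dollar sign and find no match here). The second string `s` and its `;` delimiter are hypothetical, added only to show a case where lazy and greedy matching actually give different results.

```python
import re

a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Nothing follows the lazy group, so it stops at once with an empty match.
unanchored = re.search(r'exit: (.*?)', a).group(1)

# With $ after the group, the lazy match must expand to the end of the string.
anchored = re.search(r'exit: (.*?)$', a).group(1)

# Hypothetical input with a delimiter: lazy stops at the first ';',
# greedy runs on to the last one.
s = 'exit: first; second; third'
lazy = re.search(r'exit: (.*?);', s).group(1)
greedy = re.search(r'exit: (.*);', s).group(1)

print(repr(unanchored))  # ''
print(anchored)          # gkdfjgfjdfsgdjglkghdfgkd
print(lazy)              # first
print(greedy)            # first; second
```

With only `$` as the end point, (.*?)$ and (.*) capture the same text on a single-line string, so the lazy form pays off mainly when a delimiter can occur more than once.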
