
November 21st, 2008, 05:45 PM
Steve Holden wrote:
Please keep this on the list.
>
scsoce wrote:
>Steve Holden wrote:
>>scsoce wrote:
>>>
>>>say, when I try to search and match every char from variable length
>>>string, such as string '123456', i tried re.findall( r'(\d)*, '12346' )
>>>>
>>I think you will find you missed a quote out there. Always better to
>>copy and paste ...
>>>
>>>
>>>, but only get '6' and Python doc indeed say: "If a group is contained
>>>in a part of the pattern that matched multiple times, the last match is
>>>returned."
>>>>
>>So use
>>>
>> r'(\d*)'
>>>
>>instead and then the group includes all the digits you match.
>>>
>>>
>>>Is that because the regex engine cannot remember all the past history? Is it
>>>inherent to all regex engines or only to Python's?
>>>>
>>Different regex engines have different capabilities, so I can't speak to
>>them all. If you wanted *all* the matches of *all* groups, how would you
>>have them returned? As a list? That would make the case where there was
>>only one match much trickier to handle. And what would you do with
>>>
>> r'((\w)*\d)*'
>>>
>>Also, what about named groups? I can see enough potential implementation
>>issues that I can perfectly understand why Python works the way it does,
>>so I'd be interested to know why it doesn't make sense to you, and what
>>you would prefer it to do.
>>>
>>regards
>> Steve
>>>
>Maybe my explanation was not clear. I want to capture every matched part
>in a repeated pattern, not only the last. Say, for the string '123456', I
>want to back-reference any one char, not only the '6'. I know the
>example is very simple, so we can get the whole string using a regex and
>get every char using other Python statements, but what if the pattern in
>the group is complex?
>I tested in Vim, and it can do the 'back reference':
>== your text in Vim:
>123456
>== pattern:
>:%s/\(\d\)*/$2
>the text will turn into:
>2
>>
'Fraid the Python re implementers just decided not to do it that way.
>
Nor Perl.
Probably what you want is re.findall(r"(\d)", "123456"), which returns a
list of what it captured.
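
For reference, here is a small sketch comparing the three patterns discussed in this thread. The last part shows one possible way to handle a more complex repeated group by matching the whole run first and then re-scanning it; the key=value pattern and the sample string there are made up for illustration and are not from the original posts.

```python
import re

text = "123456"

# A repeated group keeps only its final iteration. The trailing '' comes
# from the empty match found at the end of the string.
last_only = re.findall(r"(\d)*", text)

# Moving the * inside the group captures the whole run of digits at once.
whole_run = re.findall(r"(\d*)", text)

# Dropping the repetition lets findall do the repeating: one item per digit.
each_digit = re.findall(r"(\d)", text)

print(last_only)   # ['6', '']
print(whole_run)   # ['123456', '']
print(each_digit)  # ['1', '2', '3', '4', '5', '6']

# For a more complex repeated group, one option is two passes: match the
# whole repeated run, then scan that run with the inner pattern alone.
# The pattern and input below are hypothetical examples.
item = r"(\w+)=(\d+);"
run = re.match(r"(?:%s)*" % item, "a=1;b=22;c=333;")
pairs = re.findall(item, run.group(0))

print(pairs)       # [('a', '1'), ('b', '22'), ('c', '333')]
```

The two-pass approach stays within the standard re module, at the cost of scanning the matched text twice.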