Re: Re: how to get all repeated group with regular expression

MRAB wrote:
<div class="moz-text-flowed" style="font-family: -moz-fixed">Steve Holden wrote:
>Please keep this on the list.

scsoce wrote:
>>Steve Holden wrote:
scsoce wrote:

say, when I try to search and match every char from a variable-length
string, such as the string '123456', I tried re.findall( r'(\d)*, '12346' )
>
I think you will find you missed a quote out there. Always better to
copy and paste ...
, but only get '6', and the Python doc indeed says: "If a group is contained
in a part of the pattern that matched multiple times, the last match is
returned."
>
So use

r'(\d*)'

instead and then the group includes all the digits you match.
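
A minimal sketch of the difference between the two patterns, using only the standard re module. The variable names are made up for illustration; the test string is the one from the question.

```python
import re

text = "123456"

# Group inside the repetition: each iteration overwrites the group,
# so only the last digit is left in group 1.
last_only = re.match(r"(\d)*", text).group(1)

# Repetition inside the group: the group spans the whole run of digits.
whole_run = re.match(r"(\d*)", text).group(1)

print(last_only)  # 6
print(whole_run)  # 123456
```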
Is that because the regex engine cannot remember all the past history, then?
Is that the nature of all regex engines, or only of Python's?
>
Different regex engines have different capabilities, so I can't speak to
them all. If you wanted *all* the matches of *all* groups, how would you
have them returned? As a list? That would make the case where there was
only one match much trickier to handle. And what would you do with

r'((\w)*\d)*'

Also, what about named groups? I can see enough potential implementation
issues that I can perfectly understand why Python works the way it does,
so I'd be interested to know why it doesn't make sense to you, and what
you would prefer it to do.

regards
Steve

Maybe my explanation was not clear. I want to capture every matched part
in a repeated pattern, not only the last one. Say, for the string '123456',
I want to be able to back-reference any one char, not only the '6'. I know
the example is very simple, so we can get the whole string using a regex and
then get every char using other Python statements, but what if the pattern
in the group is complex?
And I tested in Vim, which can do the 'back reference':
== your text in vim:
123456
== pattern:
:%s/\(\d\)*/$2
the text will turn into:
2
'Fraid the Python re implementers just decided not to do it that way.
Nor Perl.

Probably what you want is re.findall(r"(\d)", "123456"), which returns
a list of what it captured.
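
A short sketch of that suggestion, assuming the same '123456' test string. The re.finditer variant and the variable names are additions for illustration, not something from the posts above; it is one way to also keep the position of each captured digit.

```python
import re

text = "123456"

# With no repetition around the group, findall returns one entry per digit.
digits = re.findall(r"(\d)", text)

# finditer yields match objects, so each capture's offset is available too.
positions = [(m.start(), m.group(1)) for m in re.finditer(r"(\d)", text)]

print(digits)
print(positions)
```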
</div>
Yes, you are right, but this way findall() captures only the 'top' group.
What I really need to do is capture nested and repeated patterns. Say, a
<table> tag in HTML contains many <tr>, each <tr> contains many <td>, and
the data in <td> is what I need, so I wrote the regex like this:
regx ='''
<table.*\n
(
(\s*<tr.*\n
(\s*<td.*</td>\n|\n)*
\s*</tr>\n
|\n)*
)
\s*</table>
'''
Steve Holden wrote:
I can see enough potential implementation
issues that I can perfectly understand why Python works the way it does,
so I'd be interested to know why it doesn't make sense to you, and what
you would prefer it to do.
As Steve said, if re really cannot do this kind of work, I have to
split the one-line regex up: capture <table> first, then loop to capture
<tr>, then <td>, and so on. I do not like this way compared with the one
clean regex above.
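
For what it's worth, the split-and-loop approach described above can stay fairly compact. This is only a sketch under assumptions: the sample HTML, the three patterns, and the extract_cells helper are all hypothetical (the real page is not shown in the thread), and it does not handle nested tables or malformed markup.

```python
import re

# Hypothetical sample input, made up for this example.
html = """<table border="1">
  <tr>
    <td>a1</td>
    <td>a2</td>
  </tr>
  <tr>
    <td>b1</td>
  </tr>
</table>
"""

# One simple pattern per nesting level; DOTALL lets '.' cross newlines.
FLAGS = re.DOTALL | re.IGNORECASE
TABLE_RE = re.compile(r"<table\b[^>]*>(.*?)</table>", FLAGS)
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", FLAGS)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", FLAGS)


def extract_cells(text):
    """Return a list of tables, each a list of rows, each a list of cell strings."""
    tables = []
    for table_body in TABLE_RE.findall(text):
        rows = []
        for row_body in ROW_RE.findall(table_body):
            rows.append([cell.strip() for cell in CELL_RE.findall(row_body)])
        tables.append(rows)
    return tables


cells = extract_cells(html)
print(cells)
```

Each level keeps every repetition because findall is applied once per level, rather than relying on a repeated group that only remembers its last match.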

Nov 22 '08 #1