returning regex matches as lists

I am in the last phase of building a Django app based on something I
wrote in Java a while back. Right now I am stuck on how to return the
matches of a regular expression as a list *at all*, and in particular
given that the regex has a number of groupings. The only method I've
seen that returns a list is .findall(string), but then I get back the
groups as tuples, which is sort of a problem.

Thank you,
Jonathan
Feb 15 '08 #1
On Feb 16, 6:07 am, Jonathan Lukens <jonathan.luk...@gmail.com> wrote:
> I am in the last phase of building a Django app based on something I
> wrote in Java a while back. Right now I am stuck on how to return the
> matches of a regular expression as a list *at all*, and in particular
> given that the regex has a number of groupings. The only method I've
> seen that returns a list is .findall(string), but then I get back the
> groups as tuples, which is sort of a problem.

It would help if you explained what you want the contents of the list
to be, why you want a list as opposed to a tuple or a generator or
whatever ... we can't be expected to imagine why getting groups as
tuples is "sort of a problem".

Use a concrete example, e.g.
>>> import re
>>> regex = re.compile(r'(\w+)\s+(\d+)')
>>> text = 'python 1 junk xyzzy 42 java 666'
>>> r = regex.findall(text)
>>> r
[('python', '1'), ('xyzzy', '42'), ('java', '666')]
>>>
What would you like to see instead?
Feb 15 '08 #2
On Fri, 15 Feb 2008 17:07:21 -0200, Jonathan Lukens
<jo*************@gmail.com> wrote:
> I am in the last phase of building a Django app based on something I
> wrote in Java a while back. Right now I am stuck on how to return the
> matches of a regular expression as a list *at all*, and in particular
> given that the regex has a number of groupings. The only method I've
> seen that returns a list is .findall(string), but then I get back the
> groups as tuples, which is sort of a problem.

Do you want something like this?

py> import re
py> re.findall(r"([a-z]+)([0-9]+)", "foo bar3 w000 no abc123")
[('bar', '3'), ('w', '000'), ('abc', '123')]
py> re.findall(r"(([a-z]+)([0-9]+))", "foo bar3 w000 no abc123")
[('bar3', 'bar', '3'), ('w000', 'w', '000'), ('abc123', 'abc', '123')]
py> groups = re.findall(r"(([a-z]+)([0-9]+))", "foo bar3 w000 no abc123")
py> groups
[('bar3', 'bar', '3'), ('w000', 'w', '000'), ('abc123', 'abc', '123')]
py> [group[0] for group in groups]
['bar3', 'w000', 'abc123']
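
If lists rather than tuples are what is needed, one possible approach is to convert the findall result afterwards. This is only a sketch of that idea, not necessarily what the original poster had in mind; the variable names are illustrative.

```python
import re

text = "foo bar3 w000 no abc123"
pattern = re.compile(r"([a-z]+)([0-9]+)")

# findall returns one tuple per match when the pattern has several groups
tuples = pattern.findall(text)

# Convert each tuple to a list, giving a list of lists
as_lists = [list(t) for t in tuples]

# Or flatten every group of every match into a single flat list
flat = [item for t in tuples for item in t]

print(as_lists)
print(flat)
```

Which shape is right depends on what the calling code expects, which is the question the earlier reply was asking.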

--
Gabriel Genellina

Feb 16 '08 #3
John,

> (1) raw string for improved legibility
> ru'(?u)\b([á-ñ]{2,}\s+)([<<"][Á-Ñá-ñ]+)(\s*-?[Á-Ñá-ñ]+)*([>>"])'

This escaped my notice until after I had posted: the letters with
diacritics are incorrectly decoded Cyrillic letters. I suppose I
could use the Unicode escape sequences (the sets [á-ñ] and [Á-Ñá-ñ] are
the Cyrillic equivalents of [a-z] and [A-Za-z]), but then the
legibility goes out the window again.
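
One way to keep escape sequences readable is to give the ranges names and build the pattern from them. The sketch below assumes the intended sets are lowercase а-я and mixed-case А-Яа-я, as described above; it is written for Python 3, where str patterns are Unicode by default, and the names LOWER, UPPER, and sample are made up for illustration.

```python
import re

# Assumption: the intended sets are lowercase а-я and mixed-case А-Яа-я.
# Note that these ranges do not include ё (U+0451) or Ё (U+0401).
LOWER = r"\u0430-\u044f"   # а-я
UPPER = r"\u0410-\u042f"   # А-Я

# Build the character classes once, under readable names
word_lower = "[" + LOWER + "]{2,}"
word_any = "[" + UPPER + LOWER + "]+"

# A lowercase word, whitespace, then a word in either case
pattern = re.compile(word_lower + r"\s+" + word_any)

# "город Москва", written with escapes so the source file stays ASCII
sample = "\u0433\u043e\u0440\u043e\u0434 \u041c\u043e\u0441\u043a\u0432\u0430"
match = pattern.search(sample)
print(match.group(0) if match else None)
```

This covers only the first part of the posted regex; the quotation-mark sets could be handled the same way.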
> (3) what appears between [] is a set of characters, so [<<"] is the
> same as [<"] and probably isn't doing what you expect; have you tested
> this regex for correctness?

These were angled quotation marks in the original Unicode. Sorry
again. The regex matches everything it is supposed to. The extra
parentheses were there because I had somehow missed the .group method, and
it had only been returning what was in the one needed set of
parentheses.
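
For reference, a minimal sketch of what the .group method makes possible without the extra outer parentheses. It reuses the Latin-letter pattern from earlier in the thread rather than the Cyrillic one, and the list names are illustrative.

```python
import re

pattern = re.compile(r"([a-z]+)([0-9]+)")  # no extra outer parentheses
text = "foo bar3 w000 no abc123"

# group(0) is the entire match, regardless of how many groups there are
whole = [m.group(0) for m in pattern.finditer(text)]

# group(1) is the first parenthesised group only
first = [m.group(1) for m in pattern.finditer(text)]

# groups() returns all groups of one match as a tuple; list() converts it
pairs = [list(m.groups()) for m in pattern.finditer(text)]

print(whole)
print(first)
print(pairs)
```

finditer yields match objects one at a time, so the choice between the whole match and individual groups can be made per match.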
> I can't imagine how "not a programmer" implies "interested to know if
> there is a more elegant way".

More carefully stated: "I am self-taught, have no real training or
experience as a programmer, and would be interested in seeing how a
programmer with training and experience would go about this."

Thank you,
Jonathan
Feb 16 '08 #4
