Working with named groups in re module

I found some clues on lexing using the re module in Python in an
article by Martin Löwis.

http://www.python.org/community/sigs...ards-standard/

He writes:
[...]
A scanner based on regular expressions is usually implemented
as an alternative of all token definitions. For XPath, a
fragment of this expressions looks like this:
(?P<Number>\\d+(\\.\\d*)?|\\.\\d+)|
(?P<VariableReference>\\$""" + QName + """)|
(?P<NCName>"""+NCName+""")|
(?P<QName>"""+QName+""")|
(?P<LPAREN>\\()|

Here, each alternative in the regular expression defines a
named group. Scanning proceeds in the following steps:

1. Given the complete input, match the regular expression
with the beginning of the input.
2. Find out which alternative matched.
[...]

Item 2 is where I get stuck. There doesn't seem to be an obvious
way to do it, which I understand is a bad thing in Python.
Whatever source code went with the article originally is not
linked from the above page, so I don't know what Martin did.

Here's what I came up with (with a trivial example regex):

import re
r = re.compile('(?P<x>x+)|(?P<a>a+)')
m = r.match('aaxaxx')
if m:
    for k in r.groupindex:
        if m.group(k):
            # Find the token type.
            token = (k, m.group())

I wish I could do something obvious instead, like m.name().

--
Neil Cerutti
After finding no qualified candidates for the position of principal, the
school board is pleased to announce the appointment of David Steele to the
post. --Philip Streifer
Jan 10 '07 #1
Neil Cerutti wrote:
> I found some clues on lexing using the re module in Python in an
> article by Martin Löwis.
> Here, each alternative in the regular expression defines a
> named group. Scanning proceeds in the following steps:
>
> 1. Given the complete input, match the regular expression
> with the beginning of the input.
> 2. Find out which alternative matched.

you can use lastgroup, or lastindex:

http://effbot.org/zone/xml-scanner.htm
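
As a rough illustration of the lastgroup suggestion, here is a minimal sketch of a tokenizer built from one alternation of named groups. The token names and the simplified Name pattern are assumptions made for the example; they are not the XPath definitions from the article, and this is not necessarily how the linked page does it.

```python
import re

# Hypothetical token definitions; each alternative is a named group.
TOKEN_RE = re.compile(r"""
    (?P<Number>\d+(\.\d*)?|\.\d+)   # integer or decimal literal
  | (?P<Name>[A-Za-z_]\w*)          # simplified identifier, an assumption
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<WS>\s+)                     # whitespace, skipped below
""", re.VERBOSE)


def tokenize(text):
    """Yield (token_type, lexeme) pairs, using lastgroup to name the match."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at position %d"
                             % (text[pos], pos))
        if m.lastgroup != "WS":
            # lastgroup is the name of the alternative that matched.
            yield m.lastgroup, m.group()
        pos = m.end()


tokens = list(tokenize("count(3.14)"))
print(tokens)
```

Because every alternative is wrapped in its own named group, m.lastgroup replaces the loop over r.groupindex from the original post.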

there's also a "hidden" ready-made scanner class inside the SRE module
that works pretty well for simple cases; see:

http://aspn.activestate.com/ASPN/Coo.../Recipe/457664
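
For the "hidden" scanner class, a small sketch may help. It assumes the class is reachable as re.Scanner, which is undocumented and could change between Python versions; the lexicon entries are hypothetical and use non-capturing groups to stay clear of the scanner's own group bookkeeping.

```python
import re

# re.Scanner is undocumented; treat this interface as an assumption.
# Each lexicon entry is (pattern, action). A callable action receives
# the scanner and the matched text; a None action discards the match.
scanner = re.Scanner([
    (r"\d+(?:\.\d*)?|\.\d+", lambda s, tok: ("Number", tok)),
    (r"[A-Za-z_]\w*",        lambda s, tok: ("Name", tok)),
    (r"\(",                  lambda s, tok: ("LPAREN", tok)),
    (r"\)",                  lambda s, tok: ("RPAREN", tok)),
    (r"\s+",                 None),  # drop whitespace
])

# scan() returns the collected tokens and whatever input it could not match.
tokens, remainder = scanner.scan("count(3.14) ?")
print(tokens)
print(repr(remainder))
```

Scanning stops at the first position where no pattern matches, so a non-empty remainder signals unrecognized input.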

</F>

Jan 10 '07 #2
On 2007-01-10, Fredrik Lundh <fr*****@pythonware.com> wrote:
> Neil Cerutti wrote:
>> I found some clues on lexing using the re module in Python in
>> an article by Martin Löwis.
>> Here, each alternative in the regular expression defines a
>> named group. Scanning proceeds in the following steps:
>>
>> 1. Given the complete input, match the regular expression
>> with the beginning of the input.
>> 2. Find out which alternative matched.
>
> you can use lastgroup, or lastindex:
>
> http://effbot.org/zone/xml-scanner.htm
>
> there's also a "hidden" ready-made scanner class inside the SRE
> module that works pretty well for simple cases; see:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/457664

Thanks for the excellent pointers.

I got tripped up:
>>> m = re.match('(a+(b*)a+)', 'abbbbaa')
>>> dir(m)
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

There are some notable omissions there. That's not much of an
excuse for my not understanding the handy docs, but I guess it
can function as a warning against relying on the interactive
help.

I'd seen the lastgroup definition in the documentation, but I
didn't realize it was exactly what I needed. I didn't think
carefully enough about what "last matched capturing group"
actually meant, given my regex. I don't think I saw "name" there
either. ;-)

lastgroup

The name of the last matched capturing group, or None if the
group didn't have a name, or if no group was matched at all.
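
To make "last matched capturing group" concrete, here is a small sketch using the regex from the interactive session above. The named variant, with the group names outer and inner, is a hypothetical addition for illustration.

```python
import re

# The pattern from the session above: group 2 is nested inside group 1.
m = re.match(r"(a+(b*)a+)", "abbbbaa")
print(m.lastindex)   # index of the group that finished matching last
print(m.lastgroup)   # None, because that group has no name

# Hypothetical named variant of the same pattern.
named = re.match(r"(?P<outer>a+(?P<inner>b*)a+)", "abbbbaa")
print(named.lastgroup)
print(named.group("inner"))
```

The enclosing group closes after the nested one, so it is the one reported. In a scanner where each token alternative is one named group, nested groups inside an alternative therefore do not hide the token's name.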

--
Neil Cerutti
We dispense with accuracy --sign at New York drug store
Jan 10 '07 #3
