Working with named groups in re module

I found some clues on lexing using the re module in Python in an
article by Martin Löwis.

http://www.python.org/community/sigs...ards-standard/

He writes:
[...]
A scanner based on regular expressions is usually implemented
as an alternative of all token definitions. For XPath, a
fragment of this expressions looks like this:
(?P<Number>\\d+(\\.\\d*)?|\\.\\d+)|
(?P<VariableReference>\\$""" + QName + """)|
(?P<NCName>"""+NCName+""")|
(?P<QName>"""+QName+""")|
(?P<LPAREN>\\()|

Here, each alternative in the regular expression defines a
named group. Scanning proceeds in the following steps:

1. Given the complete input, match the regular expression
with the beginning of the input.
2. Find out which alternative matched.
[...]

Item 2 is where I get stuck. There doesn't seem to be an obvious
way to do it, which I understand is a bad thing in Python.
Whatever source code went with the article originally is not
linked from the above page, so I don't know what Martin did.

Here's what I came up with (with a trivial example regex):

import re
r = re.compile('(?P<x>x+)|(?P<a>a+)')
m = r.match('aaxaxx')
if m:
    for k in r.groupindex:
        if m.group(k):
            # Find the token type.
            token = (k, m.group())

I wish I could do something obvious instead, like m.name().

--
Neil Cerutti
After finding no qualified candidates for the position of principal, the
school board is pleased to announce the appointment of David Steele to the
post. --Philip Streifer
Jan 10 '07 #1
Neil Cerutti wrote:
> I found some clues on lexing using the re module in Python in an
> article by Martin Löwis.
>
> Here, each alternative in the regular expression defines a
> named group. Scanning proceeds in the following steps:
>
> 1. Given the complete input, match the regular expression
> with the beginning of the input.
> 2. Find out which alternative matched.

you can use lastgroup, or lastindex:

http://effbot.org/zone/xml-scanner.htm

there's also a "hidden" ready-made scanner class inside the SRE module
that works pretty well for simple cases; see:

http://aspn.activestate.com/ASPN/Coo.../Recipe/457664
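
For readers who cannot follow the link, here is a minimal sketch of what using that scanner class might look like. It assumes the class is reachable as re.Scanner, which is undocumented and could change between Python versions; the token names and the whitespace rule are made up for illustration and are not taken from the recipe.

```python
import re

# re.Scanner is undocumented, so treat this as a sketch, not a stable API.
# Each lexicon entry is (pattern, action); the action receives the scanner
# and the matched text. An action of None discards the match.
scanner = re.Scanner([
    (r"x+", lambda s, tok: ("x", tok)),
    (r"a+", lambda s, tok: ("a", tok)),
    (r"\s+", None),  # hypothetical rule: skip whitespace
])

# scan() returns the list of action results plus any unmatched remainder.
tokens, remainder = scanner.scan("aa x axx")
print(tokens)
print(repr(remainder))
```

The remainder is the part of the input where no rule matched, so a non-empty remainder is the signal for a lexing error.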

</F>

Jan 10 '07 #2
On 2007-01-10, Fredrik Lundh <fr*****@pythonware.com> wrote:
> Neil Cerutti wrote:
>> I found some clues on lexing using the re module in Python in
>> an article by Martin Löwis.
>>
>> Here, each alternative in the regular expression defines a
>> named group. Scanning proceeds in the following steps:
>>
>> 1. Given the complete input, match the regular expression
>> with the beginning of the input.
>> 2. Find out which alternative matched.
>
> you can use lastgroup, or lastindex:
>
> http://effbot.org/zone/xml-scanner.htm
>
> there's also a "hidden" ready-made scanner class inside the SRE
> module that works pretty well for simple cases; see:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/457664

Thanks for the excellent pointers.

I got tripped up:
>>> m = re.match('(a+(b*)a+)', 'abbbbaa')
>>> dir(m)
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

There are some notable omissions there. That's not much of an
excuse for my not understanding the handy docs, but I guess it
can function as a warning against relying on the interactive
help.

I'd seen the lastgroup definition in the documentation, but I
didn't realize it was exactly what I needed. I didn't think carefully
enough about what "last matched capturing group" actually meant,
given my regex. I don't think I saw "name" there either. ;-)

lastgroup

The name of the last matched capturing group, or None if the
group didn't have a name, or if no group was matched at all.
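
Applied to the trivial regex from the first post, that definition suggests a loop along these lines. This is only a sketch: the tokenize helper and the error handling are illustrative names of my own, not something from the article or the linked pages, and it assumes every alternative is a top-level named group that cannot match the empty string.

```python
import re

# One named group per alternative, as in the trivial example regex.
TOKEN_RE = re.compile(r"(?P<x>x+)|(?P<a>a+)")

def tokenize(text):
    """Yield (group_name, matched_text) pairs, scanning left to right."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"no token matches at position {pos}")
        # Exactly one top-level alternative matched, and its group is the
        # last one to close, so lastgroup is the name of that alternative.
        yield m.lastgroup, m.group()
        pos = m.end()

tokens = list(tokenize("aaxaxx"))
print(tokens)
```

Unnamed groups nested inside an alternative, as in the Number pattern from the article, should not get in the way, because the enclosing named group still closes after them.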

--
Neil Cerutti
We dispense with accuracy --sign at New York drug store
Jan 10 '07 #3
