Working with named groups in re module

Neil Cerutti

I found some clues on lexing using the re module in Python in an
article by Martin Löwis.

http://www.python.org/community/sigs...ards-standard/

He writes:
[...]
A scanner based on regular expressions is usually implemented
as an alternative of all token definitions. For XPath, a
fragment of this expressions looks like this:
(?P<Number>\\d+(\\.\\d*)?|\\.\\d+)|
(?P<VariableReference>\\$""" + QName + """)|
(?P<NCName>"""+NCName+""")|
(?P<QName>"""+QName+""")|
(?P<LPAREN>\\()|

Here, each alternative in the regular expression defines a
named group. Scanning proceeds in the following steps:

1. Given the complete input, match the regular expression
with the beginning of the input.
2. Find out which alternative matched.
[...]

Item 2 is where I get stuck. There doesn't seem to be an obvious
way to do it, which I understand is a bad thing in Python.
Whatever source code went with the article originally is not
linked from the above page, so I don't know what Martin did.

Here's what I came up with (with a trivial example regex):

import re
r = re.compile('(?P<x>x+)|(?P<a>a+)')
m = r.match('aaxaxx')
if m:
    for k in r.groupindex:
        if m.group(k):
            # Find the token type.
            token = (k, m.group())

I wish I could do something obvious instead, like m.name().

--
Neil Cerutti
After finding no qualified candidates for the position of principal, the
school board is pleased to announce the appointment of David Steele to the
post. --Philip Streifer

Jan 10 '07 #1


Fredrik Lundh

Neil Cerutti wrote:

> I found some clues on lexing using the re module in Python in an
> article by Martin Löwis.
>
> Here, each alternative in the regular expression defines a
> named group. Scanning proceeds in the following steps:
>
> 1. Given the complete input, match the regular expression
> with the beginning of the input.
> 2. Find out which alternative matched.

you can use lastgroup, or lastindex:

http://effbot.org/zone/xml-scanner.htm
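
A minimal sketch of the lastgroup approach, assuming a toy token set. The group names (NUMBER, NAME, LPAREN, RPAREN, SKIP) and the tokenize helper are made up for illustration and are not taken from the linked page:

```python
import re

# Hypothetical token definitions; each alternative is a top-level named group.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(\.\d*)?|\.\d+)   # contains a nested, unnamed group
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Yield (token_type, lexeme) pairs, using m.lastgroup to name the token."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at %d" % (text[pos], pos))
        if m.lastgroup != "SKIP":
            # lastgroup is the name of the alternative that matched.
            yield m.lastgroup, m.group()
        pos = m.end()


tokens = list(tokenize("f(3.14)"))
```

Because the enclosing named group is the last one to close, lastgroup should still report NUMBER even though that alternative contains a nested unnamed group.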

there's also a "hidden" ready-made scanner class inside the SRE module
that works pretty well for simple cases; see:

http://aspn.activestate.com/ASPN/Coo.../Recipe/457664
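
For the scanner class, a rough sketch of how re.Scanner can be used. It is undocumented, so treat the details as an assumption based on how it behaves in current CPython; the token names are invented for the example:

```python
import re

# re.Scanner takes a list of (pattern, action) pairs.
# An action is called as action(scanner, matched_text); None drops the match.
# Non-capturing groups are used inside the patterns because the scanner
# appears to rely on group numbering internally.
scanner = re.Scanner([
    (r"\d+(?:\.\d*)?|\.\d+", lambda s, text: ("NUMBER", text)),
    (r"[A-Za-z_]\w*", lambda s, text: ("NAME", text)),
    (r"\(", lambda s, text: ("LPAREN", text)),
    (r"\)", lambda s, text: ("RPAREN", text)),
    (r"\s+", None),  # skip whitespace
])

# scan() returns the collected tokens plus whatever input it could not match.
tokens, remainder = scanner.scan("f(3.14) x")
```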

</F>

Jan 10 '07 #2

Neil Cerutti

On 2007-01-10, Fredrik Lundh <fr*****@pythonware.com> wrote:

> Neil Cerutti wrote:
>> I found some clues on lexing using the re module in Python in
>> an article by Martin Löwis.
>>
>> Here, each alternative in the regular expression defines a
>> named group. Scanning proceeds in the following steps:
>>
>> 1. Given the complete input, match the regular expression
>> with the beginning of the input.
>> 2. Find out which alternative matched.
>
> you can use lastgroup, or lastindex:
>
> http://effbot.org/zone/xml-scanner.htm
>
> there's also a "hidden" ready-made scanner class inside the SRE
> module that works pretty well for simple cases; see:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/457664

Thanks for the excellent pointers.

I got tripped up:

>>> m = re.match('(a+(b*)a+)', 'abbbbaa')
>>> dir(m)
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

There are some notable omissions there. That's not much of an
excuse for my not understanding the handy docs, but I guess it
can function as a warning against relying on the interactive
help.

I'd seen the lastgroup definition in the documentation, but I
didn't realize it was exactly what I needed. I didn't think carefully
enough about what "last matched capturing group" actually meant,
given my regex. I don't think I saw "name" there either. ;-)

lastgroup

The name of the last matched capturing group, or None if the
group didn't have a name, or if no group was matched at all.
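
To make that definition concrete, here is a small sketch comparing the unnamed regex from the session above with the trivial named example from the first post. The variable names unnamed and named are just for illustration:

```python
import re

# Unnamed groups: the outer group is the last one to close, so lastindex
# refers to it, and lastgroup is None because that group has no name.
m = re.match(r"(a+(b*)a+)", "abbbbaa")
unnamed = (m.lastindex, m.lastgroup)

# Named top-level alternatives: lastgroup names the alternative that matched.
r = re.compile(r"(?P<x>x+)|(?P<a>a+)")
m2 = r.match("aaxaxx")
named = (m2.lastgroup, m2.group())
```

With top-level named alternatives, (m2.lastgroup, m2.group()) gives the same (name, text) pair as the loop over r.groupindex, without the loop.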

--
Neil Cerutti
We dispense with accuracy --sign at New York drug store

Jan 10 '07 #3
