Working with named groups in re module

I found some clues on lexing using the re module in Python in an
article by Martin Löwis.

http://www.python.org/community/sigs...ards-standard/

He writes:
[...]
A scanner based on regular expressions is usually implemented
as an alternative of all token definitions. For XPath, a
fragment of this expressions looks like this:
(?P<Number>\\d+(\\.\\d*)?|\\.\\d+)|
(?P<VariableReference>\\$""" + QName + """)|
(?P<NCName>"""+NCName+""")|
(?P<QName>"""+QName+""")|
(?P<LPAREN>\\()|

Here, each alternative in the regular expression defines a
named group. Scanning proceeds in the following steps:

1. Given the complete input, match the regular expression
with the beginning of the input.
2. Find out which alternative matched.
[...]

Item 2 is where I get stuck. There doesn't seem to be an obvious
way to do it, which I understand is a bad thing in Python.
Whatever source code went with the article originally is not
linked from the above page, so I don't know what Martin did.

Here's what I came up with (with a trivial example regex):

import re
r = re.compile('(?P<x>x+)|(?P<a>a+)')
m = r.match('aaxaxx')
if m:
    for k in r.groupindex:
        if m.group(k):
            # Find the token type.
            token = (k, m.group())

I wish I could do something obvious instead, like m.name().

--
Neil Cerutti
After finding no qualified candidates for the position of principal, the
school board is pleased to announce the appointment of David Steele to the
post. --Philip Streifer
Jan 10 '07 #1
Neil Cerutti wrote:
> I found some clues on lexing using the re module in Python in an
> article by Martin Löwis.
> Here, each alternative in the regular expression defines a
> named group. Scanning proceeds in the following steps:
>
> 1. Given the complete input, match the regular expression
> with the beginning of the input.
> 2. Find out which alternative matched.

you can use lastgroup, or lastindex:

http://effbot.org/zone/xml-scanner.htm
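
As a rough illustration of the lastgroup suggestion, here is a minimal tokenizer sketch. The token names and patterns in TOKEN_SPEC and the tokenize() helper are made up for this example; they are not the XPath grammar from the article, and this is not necessarily how the linked scanner is written. The inner group of the number pattern is non-capturing so that only the named groups are counted.

```python
import re

# Hypothetical token definitions, for illustration only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d*)?|\.\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),
]

# One named group per alternative, joined with "|".
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Yield (token_type, lexeme) pairs; lastgroup names the alternative that matched."""
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at position %d" % (text[pos], pos))
        if m.lastgroup != "SKIP":  # drop whitespace tokens
            yield m.lastgroup, m.group()
        pos = m.end()


tokens = list(tokenize("f(3.14 x)"))
```

Each pass through the loop covers steps 1 and 2 of the quoted procedure: match at the current position, then read the token type from m.lastgroup.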

there's also a "hidden" ready-made scanner class inside the SRE module
that works pretty well for simple cases; see:

http://aspn.activestate.com/ASPN/Coo.../Recipe/457664
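
For reference, the "hidden" class is reachable as re.Scanner. It is undocumented, so the sketch below shows how it appears to behave rather than a guaranteed interface. The token names and patterns are invented for the example. Each entry pairs a pattern with an action called as action(scanner, matched_text); an action of None discards the match.

```python
import re

# re.Scanner is undocumented; treat its behaviour as subject to change.
# Patterns avoid capturing groups, since the scanner wraps each one in its own group.
scanner = re.Scanner([
    (r"\d+(?:\.\d*)?|\.\d+", lambda s, tok: ("NUMBER", tok)),
    (r"[A-Za-z_]\w*",        lambda s, tok: ("NAME", tok)),
    (r"\(",                  lambda s, tok: ("LPAREN", tok)),
    (r"\)",                  lambda s, tok: ("RPAREN", tok)),
    (r"\s+",                 None),  # match whitespace and discard it
])

# scan() returns the collected action results plus whatever input it could not match.
tokens, remainder = scanner.scan("f(3.14 x) ?")
```

Scanning stops at the first position where no pattern matches, so a non-empty remainder signals a lexing error.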

</F>

Jan 10 '07 #2
On 2007-01-10, Fredrik Lundh <fr*****@pythonware.com> wrote:
> Neil Cerutti wrote:
>> I found some clues on lexing using the re module in Python in
>> an article by Martin Löwis.
>> Here, each alternative in the regular expression defines a
>> named group. Scanning proceeds in the following steps:
>>
>> 1. Given the complete input, match the regular expression
>> with the beginning of the input.
>> 2. Find out which alternative matched.
>
> you can use lastgroup, or lastindex:
>
> http://effbot.org/zone/xml-scanner.htm
>
> there's also a "hidden" ready-made scanner class inside the SRE
> module that works pretty well for simple cases; see:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/457664

Thanks for the excellent pointers.

I got tripped up:
>>> m = re.match('(a+(b*)a+)', 'abbbbaa')
>>> dir(m)
['__copy__', '__deepcopy__', 'end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

There are some notable omissions there. That's not much of an
excuse for my not understanding the handy docs, but I guess it
can function as a warning against relying on the interactive
help.

I'd seen the lastgroup definition in the documentation, but I
didn't realize it was exactly what I needed. I didn't think carefully
enough about what "last matched capturing group" actually meant,
given my regex. I don't think I saw "name" there either. ;-)

lastgroup

The name of the last matched capturing group, or None if the
group didn't have a name, or if no group was matched at all.
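
To make "last matched capturing group" concrete, here is a small sketch comparing the nested, unnamed pattern from the session above with the named alternation from the first post. In the nested pattern the outer group closes last, so lastindex points at group 1. In the alternation only one named group takes part in the match, so lastgroup names it directly.

```python
import re

# Nested, unnamed groups: the outer group (index 1) is the last one to close.
m = re.match(r"(a+(b*)a+)", "abbbbaa")
nested = (m.lastindex, m.lastgroup)        # (1, None)

# Named alternatives: only the second group participates in the match.
m2 = re.compile(r"(?P<x>x+)|(?P<a>a+)").match("aaxaxx")
alternation = (m2.lastindex, m2.lastgroup)  # (2, 'a')
```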

--
Neil Cerutti
We dispense with accuracy --sign at New York drug store
Jan 10 '07 #3
