473,761 Members | 4,511 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Re: template strings for matching?

Wow, this was harder than I thought (at least for a rusty Pythoneer
like myself). Here's my stab at an implementation. Remember, the
goal is to add a "match" method to Template which works like
Template.substi tute, but in reverse: given a string, if that string
matches the template, then it should return a dictionary mapping each
template field to the corresponding value in the given string.

Oh, and as one extra feature, I want to support a ".greedy" attribute
on the Template object, which determines whether the matching of
fields should be done in a greedy or non-greedy manner.

------------------------------------------------------------
#!/usr/bin/python

from string import Template
import re

def templateMatch(s elf, s):
# start by finding the fields in our template, and building a map
# from field position (index) to field name.
posToName = {}
pos = 1
for item in self.pattern.fi ndall(self.temp late):
# each item is a tuple where item 1 is the field name
posToName[pos] = item[1]
pos += 1

# determine if we should match greedy or non-greedy
greedy = False
if self.__dict__.h as_key('greedy' ):
greedy = self.greedy

# now, build a regex pattern to compare against s
# (taking care to escape any characters in our template that
# would have special meaning in regex)
pat = self.template.r eplace('.', '\\.')
pat = pat.replace('(' , '\\(')
pat = pat.replace(')' , '\\)') # there must be a better way...

if greedy:
pat = self.pattern.su b('(.*)', pat)
else:
pat = self.pattern.su b('(.*?)', pat)
p = re.compile(pat)

# try to match this to the given string
match = p.match(s)
if match is None: return None
out = {}
for i in posToName.keys( ):
out[posToName[i]] = match.group(i)
return out
Template.match = templateMatch

t = Template("The $object in $location falls mainly in the $subloc.")
print t.match( "The rain in Spain falls mainly in the train." )
------------------------------------------------------------

This sort-of works, but it won't properly handle $$ in the template,
and I'm not too sure whether it handles the ${fieldname} form,
either. Also, it only escapes '.', '(', and ')' in the template...
there must be a better way of escaping all characters that have
special meaning to RegEx, except for '$' (which is why I can't use
re.escape).

Probably the rest of the code could be improved too. I'm eager to
hear your feedback.

Thanks,
- Joe
Oct 9 '08 #1
1 3451
On Oct 9, 5:20*pm, Joe Strout <j...@strout.ne twrote:
Wow, this was harder than I thought (at least for a rusty Pythoneer *
like myself). *Here's my stab at an implementation. *Remember, the *
goal is to add a "match" method to Template which works like *
Template.substi tute, but in reverse: given a string, if that string *
matches the template, then it should return a dictionary mapping each *
template field to the corresponding value in the given string.

Oh, and as one extra feature, I want to support a ".greedy" attribute *
on the Template object, which determines whether the matching of *
fields should be done in a greedy or non-greedy manner.

------------------------------------------------------------
#!/usr/bin/python

from string import Template
import re

def templateMatch(s elf, s):
* * * * # start by finding the fields in our template, and building a map
* * * * # from field position (index) to field name.
* * * * posToName = {}
* * * * pos = 1
* * * * for item in self.pattern.fi ndall(self.temp late):
* * * * * * * * # each item is a tuple where item 1 is the field name
* * * * * * * * posToName[pos] = item[1]
* * * * * * * * pos += 1

* * * * # determine if we should match greedy or non-greedy
* * * * greedy = False
* * * * if self.__dict__.h as_key('greedy' ):
* * * * * * * * greedy = self.greedy

* * * * # now, build a regex pattern to compare against s
* * * * # (taking care to escape any characters in our template that
* * * * # would have special meaning in regex)
* * * * pat = self.template.r eplace('.', '\\.')
* * * * pat = pat.replace('(' , '\\(')
* * * * pat = pat.replace(')' , '\\)') # there must be a better way...

* * * * if greedy:
* * * * * * * * pat = self.pattern.su b('(.*)', pat)
* * * * else:
* * * * * * * * pat = self.pattern.su b('(.*?)', pat)
* * * * p = re.compile(pat)

* * * * # try to match this to the given string
* * * * match = p.match(s)
* * * * if match is None: return None
* * * * out = {}
* * * * for i in posToName.keys( ):
* * * * * * * * out[posToName[i]] = match.group(i)
* * * * return out

Template.match = templateMatch

t = Template("The $object in $location falls mainly in the $subloc.")
print t.match( "The rain in Spain falls mainly in the train." )
------------------------------------------------------------

This sort-of works, but it won't properly handle $$ in the template, *
and I'm not too sure whether it handles the ${fieldname} form, *
either. *Also, it only escapes '.', '(', and ')' in the template... *
there must be a better way of escaping all characters that have *
special meaning to RegEx, except for '$' (which is why I can't use *
re.escape).

Probably the rest of the code could be improved too. *I'm eager to *
hear your feedback.

Thanks,
- Joe
How about something like:

import re

def placeholder(m):
if m.group(1):
return "(?P<%s>.+) " % m.group(1)
elif m.group(2):
return "\\$"
else:
return re.escape(m.gro up(3))

regex = re.compile(r"\$ (\w+)|(\$\$)")

t = "The $object in $location falls mainly in the $subloc."
print regex.sub(place holder, t)
Oct 9 '08 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
2348
by: Joachim Spoerhase | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I am a XSLT-beginner and i read the XSLT-recommendation of the W3C through. But I did'nt really understand section 5.5 of the latest specification. The Problem is what template rule to be used if there are more than one matching template rule. You will find the following:
8
2590
by: Bob | last post by:
I need to create a Regex to extract all strings (including quotations) from a C# or C++ source file. After being unsuccessful myself, I found this sample on the internet: @"@?""""|@?"".*?(?!\\).""|''|'.*?(?!\\).'" I am inputting the entire source file string and using it with RegexOptions.Singleline. This works OK with, unless the string ends with a back-slash. For example: "This is a test\\". Can anybody see how to fix this...
6
1831
by: Neal | last post by:
Hi All, I used an article on XSLT and XML and creating a TOC written on the MSDN CodeCorner. ms-help://MS.VSCC.2003/MS.MSDNQTR.2003FEB.1033/dncodecorn/html/corner042699.htm However, it did'nt quite answer all my questions. How would one create a 3 level TOC when each item level / node was differently named (They used Template match and for-each, but the template match worked as on a 3 level structure they usedf the same named xml...
9
2682
by: | last post by:
I am interested in scanning web pages for content of interest, and then auto-classifying that content. I have tables of metadata that I can use for the classification, e.g. : "John P. Jones" "Jane T. Smith" "Fred Barzowsky" "Department of Oncology" "Office of Student Affairs" "Lewis Hall" etc. etc. etc. I am wondering what the efficient way to do this in code might be. The dumb and brute-force way would be to loop through the content...
1
1226
by: George2 | last post by:
Hello everyone, I am feeling template function is more tricky than template class. For the reason that the compiler will do the matching automatically for template function, but for template class, developer can assign how to match. Sometimes compiler is doing mysterious matching rules for template function, which makes us confused. Does anyone have the same
4
2061
by: abir | last post by:
I am matching a template, and specializing based of a template, rather than a single class. The codes are like, template<template<typename T,typename Alloc = std::allocator<T> class pix{ }; template<> class pix<vector>{
6
2616
by: abir | last post by:
i have a template as shown template<typename Sclass Indexer{}; i want to have a specialization for std::vector both const & non const version. template<typename T,typename Aclass Indexer<std::vector<T,A {} matches only with nonconst version. anyway to match it for both? and get if it is const or nonconst? Actually i want 2 specialization, one for std::vector<T,Aconst & non const other for std::deque<T,Aconst & non const.
2
1245
by: Joe Strout | last post by:
Catching up on what's new in Python since I last used it a decade ago, I've just been reading up on template strings. These are pretty cool! However, just as a template string has some advantages over % substitution for building a string, it seems like it would have advantages over manually constructing a regex for string matching. So... is there any way to use a template string for matching? I expected something like: templ =...
0
207
by: Robin Becker | last post by:
Joe Strout wrote: ........ you could use something like this to record the lookups .... def __new__(cls,*args,**kwds): .... self = dict.__new__(cls,*args,**kwds) .... self.__record = set() .... return self .... def _record_clear(self):
2
465
by: Bruce !C!+ | last post by:
as we known , we can use function pointer as: float Minus (float a, float b) { return a-b; } float (*getOp())(float, float) { return &Minus; } int main() { float (*opFun)(float, float) = NULL; opFun= getOp();
0
9554
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9377
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10136
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9989
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9925
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9811
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7358
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5405
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
3
3509
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.