Wow, this was harder than I thought (at least for a rusty Pythoneer
like myself). Here's my stab at an implementation. Remember, the
goal is to add a "match" method to Template which works like
Template.substitute, but in reverse: given a string, if that string
matches the template, then it should return a dictionary mapping each
template field to the corresponding value in the given string.
Oh, and as one extra feature, I want to support a ".greedy" attribute
on the Template object, which determines whether the matching of
fields should be done in a greedy or non-greedy manner.
------------------------------------------------------------
#!/usr/bin/python
from string import Template
import re

def templateMatch(self, s):
    # start by finding the fields in our template, and building a map
    # from field position (index) to field name.
    posToName = {}
    pos = 1
    for item in self.pattern.findall(self.template):
        # each item is a tuple where item 1 is the field name
        posToName[pos] = item[1]
        pos += 1
    # determine if we should match greedy or non-greedy
    greedy = False
    if 'greedy' in self.__dict__:
        greedy = self.greedy
    # now, build a regex pattern to compare against s
    # (taking care to escape any characters in our template that
    # would have special meaning in regex)
    pat = self.template.replace('.', '\\.')
    pat = pat.replace('(', '\\(')
    pat = pat.replace(')', '\\)')  # there must be a better way...
    if greedy:
        pat = self.pattern.sub('(.*)', pat)
    else:
        pat = self.pattern.sub('(.*?)', pat)
    p = re.compile(pat)
    # try to match this to the given string
    match = p.match(s)
    if match is None: return None
    out = {}
    for i in posToName.keys():
        out[posToName[i]] = match.group(i)
    return out

Template.match = templateMatch

t = Template("The $object in $location falls mainly in the $subloc.")
print(t.match("The rain in Spain falls mainly in the train."))
------------------------------------------------------------
This sort-of works, but it won't properly handle $$ in the template,
and I'm not too sure whether it handles the ${fieldname} form,
either. Also, it only escapes '.', '(', and ')' in the template...
there must be a better way of escaping all characters that have
special meaning to RegEx, except for '$' (which is why I can't use
re.escape).
Probably the rest of the code could be improved too. I'm eager to
hear your feedback.
Thanks,
- Joe
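One possible way around the escaping problem, sketched under a few assumptions: instead of escaping the whole template and then substituting, walk the matches of Template.pattern and call re.escape only on the literal text between fields. The function name template_match and its greedy parameter are made up for this illustration. The group names "escaped", "named", "braced" and "invalid" are the ones string.Template documents for its pattern. A template that repeats a field name would fail here, because a regex cannot define the same named group twice.

```python
import re
from string import Template


def template_match(template, s, greedy=False):
    """Return a dict of field values if s matches template, else None."""
    field = "(?P<%s>.*)" if greedy else "(?P<%s>.*?)"
    text = template.template
    parts = []
    last = 0
    # Template.pattern exposes four named groups:
    # escaped ($$), named ($name), braced (${name}) and invalid.
    for m in template.pattern.finditer(text):
        # Literal text between fields can be escaped wholesale.
        parts.append(re.escape(text[last:m.start()]))
        name = m.group("named") or m.group("braced")
        if name is not None:
            parts.append(field % name)
        elif m.group("escaped") is not None:
            parts.append(re.escape("$"))  # "$$" stands for one literal "$"
        else:
            raise ValueError("invalid placeholder in template")
        last = m.end()
    parts.append(re.escape(text[last:]))
    # fullmatch, so a non-greedy final field still has to reach the end of s.
    match = re.fullmatch("".join(parts), s, re.DOTALL)
    return match.groupdict() if match else None


t = Template("The $object in ${location} costs $$5 (approx.)")
print(template_match(t, "The rain in Spain costs $5 (approx.)"))
```

Because the field names become named groups, groupdict() gives the result directly and no position-to-name map is needed.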
On Oct 9, 5:20 pm, Joe Strout <j...@strout.net> wrote:
[snip]
How about something like:
import re

def placeholder(m):
    if m.group(1):
        # $name becomes a named capturing group
        return "(?P<%s>.+)" % m.group(1)
    elif m.group(2):
        # $$ becomes a literal dollar sign
        return "\\$"
    else:
        # anything else is literal text and gets escaped
        return re.escape(m.group(3))

# the third alternative picks up the literal text between fields,
# so that placeholder() has a group 3 to escape
regex = re.compile(r"\$(\w+)|(\$\$)|([^$]+)")
t = "The $object in $location falls mainly in the $subloc."
print(regex.sub(placeholder, t))
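To go from the generated pattern to the dictionary Joe asked for, something along these lines might do. It is only a sketch: match_fields and FIELD_OR_TEXT are names invented here, the third alternative ([^$]+) is assumed so that literal text reaches re.escape, and the fields use the non-greedy .+? rather than .+.

```python
import re

# $name | $$ | a run of literal text (the third alternative is an assumption)
FIELD_OR_TEXT = re.compile(r"\$(\w+)|(\$\$)|([^$]+)")


def placeholder(m):
    if m.group(1):
        return "(?P<%s>.+?)" % m.group(1)  # non-greedy named group per field
    elif m.group(2):
        return "\\$"                       # "$$" means a literal "$"
    else:
        return re.escape(m.group(3))       # literal text, fully escaped


def match_fields(template_text, s):
    """Return {field: value} if s fits template_text, else None."""
    pattern = FIELD_OR_TEXT.sub(placeholder, template_text)
    m = re.fullmatch(pattern, s)
    return m.groupdict() if m else None


print(match_fields("The $object in $location falls mainly in the $subloc.",
                   "The rain in Spain falls mainly in the train."))
```

This handles $$ but not the ${fieldname} form, which would need another alternative in the regex.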