Wow, this was harder than I thought (at least for a rusty Pythoneer
like myself). Here's my stab at an implementation. Remember, the
goal is to add a "match" method to Template which works like
Template.substitute, but in reverse: given a string, if that string
matches the template, then it should return a dictionary mapping each
template field to the corresponding value in the given string.
Oh, and as one extra feature, I want to support a ".greedy" attribute
on the Template object, which determines whether the matching of
fields should be done in a greedy or non-greedy manner.
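To make the greedy versus non-greedy distinction concrete, here is a minimal sketch using only the standard `re` module (Python 3 assumed). The sample text and the two patterns are illustrative and not part of the implementation that follows.

```python
import re

text = "rain in Spain in June"

# Greedy: the first group takes as much as it can, so it swallows
# the first " in " and splits on the last one.
greedy = re.fullmatch(r"(.*) in (.*)", text)

# Non-greedy: the first group stops at the earliest " in " that still
# lets the whole string match.
lazy = re.fullmatch(r"(.*?) in (.*?)", text)

print(greedy.groups())  # ('rain in Spain', 'June')
print(lazy.groups())    # ('rain', 'Spain in June')
```

The same ambiguity arises whenever the literal text between two template fields can also occur inside a field value, which is what a ".greedy" switch would resolve.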
------------------------------------------------------------
#!/usr/bin/python
from string import Template
import re

def templateMatch(self, s):
    # start by finding the fields in our template, and building a map
    # from field position (index) to field name.
    posToName = {}
    pos = 1
    for item in self.pattern.findall(self.template):
        # each item is a tuple where item 1 is the field name
        posToName[pos] = item[1]
        pos += 1
    # determine if we should match greedy or non-greedy
    # (defaults to non-greedy when no .greedy attribute is set)
    greedy = getattr(self, 'greedy', False)
    # now, build a regex pattern to compare against s
    # (taking care to escape any characters in our template that
    # would have special meaning in regex)
    pat = self.template.replace('.', '\\.')
    pat = pat.replace('(', '\\(')
    pat = pat.replace(')', '\\)')  # there must be a better way...
    if greedy:
        pat = self.pattern.sub('(.*)', pat)
    else:
        pat = self.pattern.sub('(.*?)', pat)
    p = re.compile(pat)
    # try to match this to the given string
    match = p.match(s)
    if match is None:
        return None
    out = {}
    for i in posToName.keys():
        out[posToName[i]] = match.group(i)
    return out

Template.match = templateMatch
t = Template("The $object in $location falls mainly in the $subloc.")
print(t.match("The rain in Spain falls mainly in the train."))
------------------------------------------------------------
This sort-of works, but it won't properly handle $$ in the template,
and I'm not too sure whether it handles the ${fieldname} form,
either. Also, it only escapes '.', '(', and ')' in the template...
there must be a better way of escaping all characters that have
special meaning to RegEx, except for '$' (which is why I can't use
re.escape).
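One possible way around the escaping problem is to avoid escaping the template as a whole: walk the placeholders with `Template.pattern.finditer` and apply `re.escape` only to the literal text between them. The sketch below assumes Python 3 and the default `string.Template` pattern with its `escaped`, `named`, `braced`, and `invalid` groups; the helper name `template_to_regex` is made up for illustration. It does not handle a field name that appears twice, since a regex group name can only be defined once.

```python
import re
from string import Template

def template_to_regex(template, greedy=False):
    """Build a regex source string from a string.Template (sketch)."""
    wildcard = ".*" if greedy else ".*?"
    parts = []
    last_end = 0
    for m in template.pattern.finditer(template.template):
        # Literal text before this placeholder: safe to escape completely.
        parts.append(re.escape(template.template[last_end:m.start()]))
        name = m.group("named") or m.group("braced")
        if name:
            # $name or ${name} becomes a named capture group.
            parts.append("(?P<%s>%s)" % (name, wildcard))
        elif m.group("escaped") is not None:
            # $$ stands for a single literal dollar sign.
            parts.append(re.escape("$"))
        else:
            raise ValueError("invalid placeholder at offset %d" % m.start())
        last_end = m.end()
    # Literal text after the last placeholder.
    parts.append(re.escape(template.template[last_end:]))
    return "".join(parts)

t = Template("Cost: $$${amount} (for $item).")
pat = template_to_regex(t)
print(re.fullmatch(pat, "Cost: $5 (for tea).").groupdict())
# {'amount': '5', 'item': 'tea'}
```

Because the literal segments never contain a placeholder, `re.escape` can be used on them directly, and the `$$` and `${fieldname}` forms fall out of the same loop.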
Probably the rest of the code could be improved too. I'm eager to
hear your feedback.
Thanks,
- Joe
On Oct 9, 5:20 pm, Joe Strout <j...@strout.net> wrote:
[snip]
How about something like:

import re

def placeholder(m):
    if m.group(1):
        # $name becomes a named capture group
        return "(?P<%s>.+)" % m.group(1)
    elif m.group(2):
        # $$ becomes a literal dollar sign
        return "\\$"
    else:
        # everything else is literal text, so escape it
        return re.escape(m.group(3))

# the third alternative captures the literal text between fields,
# so that placeholder() has a group(3) to escape
regex = re.compile(r"\$(\w+)|(\$\$)|([^$]+)")
t = "The $object in $location falls mainly in the $subloc."
print(regex.sub(placeholder, t))
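The suggestion above only prints the generated pattern. A minimal sketch of the remaining step, assuming Python 3, is to run that pattern against the input and return `groupdict()`, attached to `Template` as the `match` method from the original post. The `TOKEN` pattern and the name `template_match` are assumptions made for this example: `TOKEN` extends the reply's regex with a `${name}` alternative and is not `string.Template`'s own pattern, so a stray `$` that is not part of a field is not validated.

```python
import re
from string import Template

# Assumed token pattern: $name, ${name}, the $$ escape,
# or a run of literal text.
TOKEN = re.compile(r"\$([A-Za-z_]\w*)|\$\{([A-Za-z_]\w*)\}|(\$\$)|([^$]+)")

def template_match(self, s):
    # Honor the optional .greedy attribute described in the original post.
    wildcard = ".+" if getattr(self, "greedy", False) else ".+?"

    def placeholder(m):
        name = m.group(1) or m.group(2)
        if name:
            return "(?P<%s>%s)" % (name, wildcard)
        if m.group(3):
            return r"\$"
        return re.escape(m.group(4))

    # fullmatch requires the whole string to fit the template.
    found = re.fullmatch(TOKEN.sub(placeholder, self.template), s)
    return found.groupdict() if found else None

Template.match = template_match

t = Template("The $object in $location falls mainly in the $subloc.")
print(t.match("The rain in Spain falls mainly in the train."))
# {'object': 'rain', 'location': 'Spain', 'subloc': 'train'}
```

Using named groups means the result dictionary comes straight from the match object, so no separate position-to-name map is needed.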