Pattern matching with string and list

Hi,

I'd need to perform simple pattern matching within a string using a
list of possible patterns. For example, I want to know if the substring
starting at position n matches any of the strings I have in a list, as
below:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find($)
# here I need to find whether what's after 'pos' matches any of the
# strings of my 'patterns' list
bmatch = ismatching( sentence[pos:], patterns)

Does an equivalent of this ismatching() function exist in some Python
lib?

Thanks,

Olivier.

Dec 12 '05 #1
5 Replies


ol****@gmail.com wrote:
Hi,

I'd need to perform simple pattern matching within a string using a
list of possible patterns. For example, I want to know if the substring
starting at position n matches any of the strings I have in a list, as
below:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find($)
# here I need to find whether what's after 'pos' matches any of the
# strings of my 'patterns' list
bmatch = ismatching( sentence[pos:], patterns)

Does an equivalent of this ismatching() function exist in some Python
lib?

Thanks,

Olivier.

As I think you define it, ismatching can be written as:

>>> import re
>>> def ismatching(sentence, patterns):
...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
...     return bool(re_pattern.match(sentence))
...
>>> ismatching(sentence[pos+1:], patterns)
True
>>> ismatching(sentence[pos+1:], ["green", "blue"])
False

(For help with regular expressions, see: http://www.amk.ca/python/howto/regex/)

or, you can ask the regexp engine to start looking at a point you specify:

>>> def ismatching(sentence, patterns, startingpos=0):
...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
...     return bool(re_pattern.match(sentence, startingpos))
...
>>> ismatching(sentence, patterns, pos+1)
True

but, you may be able to save the separate step of determining pos, by including
it in the regexp, e.g.,
>>> def matching(patterns, sentence):
...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
...     return bool(re_pattern.search(sentence))
...
>>> matching(patterns, sentence)
True
>>> matching(["green", "blue"], sentence)
False

then, it might be more generally useful to return the match, rather than the
boolean value - you can still use it in truth testing, since a failed search
returns None, which evaluates to False

>>> def matching(patterns, sentence):
...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
...     return re_pattern.search(sentence)
...
>>> if matching(patterns, sentence): print("Match")
...
Match
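
A minimal sketch of pulling the matched colour out of the returned match object, assuming Python 3 and the sentence and patterns values from the question. The re.escape call is an addition made here as a precaution, so that patterns containing regex metacharacters are treated literally; it is not part of the function above.

```python
import re

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]

def matching(patterns, sentence):
    # Escape each pattern so characters like "+" or "." are matched literally.
    alternatives = "|".join(re.escape(p) for p in patterns)
    return re.compile(r"\$(%s)" % alternatives).search(sentence)

m = matching(patterns, sentence)
if m:
    # group(1) is the text captured by the parentheses, without the "$";
    # start(1) is the index in the sentence where that text begins.
    print(m.group(1), m.start(1))
```

Returning the match object keeps the truth test working while also exposing what matched and where.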

Finally, if you are going to be doing a lot of these it would be faster to take
the pattern compilation out of the function, and simply use the pre-compiled
regexp, or as below, its bound method: search:
>>> matching = re.compile(r"\$(%s)\Z" % "|".join(patterns)).search
>>> matching(sentence)
<_sre.SRE_Match object at 0x01847E60>
>>> bool(_)
True
>>> bool(matching("the color is $red but there is more"))
False
>>> bool(matching("the color is $pink"))
False
>>> bool(matching("the $color is $red"))
True


HTH

Michael


Dec 13 '05 #2

On Mon, 12 Dec 2005 ol****@gmail.com wrote:
I'd need to perform simple pattern matching within a string using a list
of possible patterns. For example, I want to know if the substring
starting at position n matches any of the strings I have in a list, as
below:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find($)
I assume that's a typo for "sentence.find('$')", rather than some new
syntax i've not learned yet!
# here I need to find whether what's after 'pos' matches any of the
# strings of my 'patterns' list
bmatch = ismatching( sentence[pos:], patterns)

Does an equivalent of this ismatching() function exist in some Python
lib?


I don't think so, but it's not hard to write:

def ismatching(target, patterns):
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

You don't say what bmatch should be at the end of this, so i'm going with
a boolean; it would be straightforward to return the pattern which
matched, or the index of the pattern which matched in the pattern list, if
that's what you want.
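
As a sketch of the variant just mentioned, here is a version that returns the pattern which matched rather than a boolean. The name first_match is made up for illustration, and Python 3 is assumed. For the plain boolean case, str.startswith also accepts a tuple of prefixes, which avoids the loop altogether.

```python
def first_match(target, patterns):
    """Return the first pattern that target starts with, or None."""
    for pattern in patterns:
        if target.startswith(pattern):
            return pattern
    return None

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")

# Skip the "$" itself, hence pos + 1.
rest = sentence[pos + 1:]
print(first_match(rest, patterns))

# Boolean-only check: startswith takes a tuple of candidate prefixes.
print(rest.startswith(tuple(patterns)))
```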

The tough guy way to do this would be with regular expressions (in the re
module); you could do the find-the-$ and the match-a-pattern bit in one
go:

import re
patternsRe = re.compile(r"\$((blue)|(red)|(yellow))")
bmatch = patternsRe.search(sentence)

At the end, bmatch is None if it didn't match, or an instance of re.Match
(from which you can get details of the match) if it did.

If i was doing this myself, i'd be a bit cleaner and use non-capturing
groups:

patternsRe = re.compile(r"\$(?:(?:blue)|(?:red)|(?:yellow))")

And if i did want to capture the colour string, i'd do it like this:

patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")
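
A small check of how | binds in expressions like these, assuming Python 3. Alternation has the lowest precedence, so a prefix such as \$ applies to every alternative only when the alternatives are wrapped in one enclosing group. The names loose and strict are just labels for this sketch.

```python
import re

# Without an enclosing group, "\$" belongs to the first alternative only.
loose = re.compile(r"\$blue|red|yellow")
# With one enclosing group, "\$" is required before every alternative.
strict = re.compile(r"\$(?:blue|red|yellow)")

text = "the color is red"  # no "$" marker anywhere

print(bool(loose.search(text)))   # True: bare "red" is accepted
print(bool(strict.search(text)))  # False: "$red" is required
```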

If this all looks like utter gibberish, DON'T PANIC! Regular expressions
are quite scary to begin with (and certainly not very regular-looking!),
but they're actually quite simple, and often a very powerful tool for text
processing (don't get carried away, though; regular expressions are a bit
like absinthe, in that a little helps your creativity, but overindulgence
makes you use perl).

In fact, we can tame the regular expressions quite neatly by writing a
function which generates them:

def regularly_express_patterns(patterns):
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(patterns)
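
A short usage sketch for a generator function of this kind, assuming Python 3. It repeats the function with a list comprehension so the block runs on its own; the "c++" pattern and the second test sentence are invented here to show why re.escape is worth having.

```python
import re

def regularly_express_patterns(patterns):
    # Same idea as the function above: escape each pattern, then join with "|".
    pattern_regexps = ["(?:%s)" % re.escape(pattern) for pattern in patterns]
    return re.compile(r"\$(" + "|".join(pattern_regexps) + ")")

patterns_re = regularly_express_patterns(["blue", "red", "c++"])

m = patterns_re.search("the color is $red")
print(m.group(1))

# Without re.escape, the "+" characters would be read as quantifiers
# rather than as literal plus signs.
print(patterns_re.search("written in $c++").group(1))
```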

tom

--
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
Dec 13 '05 #3

Taking you literally, I'm not sure you need regex. If you know or can
find position n, then can't you just:

sentence = "the color is $red"
patterns = ["blue","red","yellow"]
pos = sentence.find("$")
for x in patterns:
    if x == sentence[pos+1:]:
        print(x, pos+1)

But maybe I'm oversimplifying.

rpd

Dec 13 '05 #4

Even without the marker, can't you do:

sentence = "the fabric is red"
colors = ["red", "white", "blue"]

for color in colors:
    # find() returns -1 when there is no match and 0 for a match at the start
    if sentence.find(color) >= 0:
        print(color, sentence.find(color))

Dec 13 '05 #5

BartlebyScrivener wrote:
Even without the marker, can't you do:

sentence = "the fabric is red"
colors = ["red", "white", "blue"]

for color in colors:
    if sentence.find(color) >= 0:
        print(color, sentence.find(color))

That depends on whether you're only looking for whole words:
>>> colors = ['red', 'green', 'blue']
>>> def findIt(sentence):
...     for color in colors:
...         if sentence.find(color) >= 0:
...             print(color, sentence.find(color))
...
>>> findIt("This is red")
red 8
>>> findIt("Fredrik Lundh")
red 1


It's easy to see all the cases that this approach will fail for...
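
One way to restrict the search to whole words, sketched here with the \b word-boundary assertion from the re module. The name find_whole_words is hypothetical, Python 3 is assumed, and this is only one of several possible fixes.

```python
import re

colors = ["red", "green", "blue"]

def find_whole_words(sentence, words):
    """Return (word, index) pairs for words that occur as whole words."""
    found = []
    for word in words:
        # \b matches between a word character and a non-word character,
        # or at either edge of the string.
        m = re.search(r"\b%s\b" % re.escape(word), sentence)
        if m:
            found.append((word, m.start()))
    return found

print(find_whole_words("This is red", colors))
print(find_whole_words("Fredrik Lundh", colors))
```

With this check, "Fredrik Lundh" no longer counts as containing "red", while a colour at the very start of a sentence is still found.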
Dec 13 '05 #6
