
# How do I find possible matches using regular expression?

Hi there,

I'm trying to do some prediction work over user input. Here's my question: for the pattern r'match me', the string 'no' will definitely fail to match, but 'ma' still has a chance if the user keeps typing characters after 'ma'. So how do I mark 'ma' as a possible match?

Thanks a lot,
Andy

Nov 23 '06 #1
9 Replies


The problem is that the input will be much more complex than the example. It could be something like "30 minutes later", where any string starting with a number is a possible match.

Andy wrote:

> The problem is the input will be much more complex than the example, it could be something like "30 minutes later" where any string starting with a number is a possible match.

so if I type "1", are you going to suggest all possible numbers that start with that digit? doesn't strike me as very practical.

maybe you could post a more detailed example, where you clearly explain what a pattern is and how it is defined, what prediction means, and what you want to happen as new input arrives, so we don't have to guess?

Nov 23 '06 #4

Andy wrote:

> Hi there, I'm trying to do some predicting work over user input, here's my question: for pattern r'match me', the string 'no' will definitely fail to match, but 'ma' still has a chance if user keep on inputting characters after 'ma', so how do I mark 'ma' as a possible match string?

The answer is: using regular expressions doesn't seem like a good idea.

If you want to match against only one target, then target.startswith(user_input) is, as already mentioned, just fine. However, if you have multiple targets, like a list of computer-language keywords, or the similar problem of an IME for a language like Chinese, then you can set up a prefix-tree dictionary so that you can search the multiple target keywords in parallel. All you need to do is keep a "finger" pointed at the node you have reached along the path; after each input character, either the finger gets pointed at the next relevant node (if the input character is valid) or you return/raise a failure indication.

HTH,
John

Nov 23 '06 #5
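A minimal sketch of the prefix-tree-plus-finger idea described above, assuming the targets are plain strings. The class names `PrefixTree` and `Finger`, the nested-dict layout, and the sample keywords are illustrative assumptions, not something from the thread.

```python
class PrefixTree:
    """Prefix tree (trie) stored as nested dicts; END marks a complete keyword."""

    END = object()  # sentinel key, cannot collide with any input character

    def __init__(self, keywords):
        self.root = {}
        for word in keywords:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self.END] = True


class Finger:
    """Tracks the node reached so far as characters arrive one at a time."""

    def __init__(self, tree):
        self.node = tree.root
        self.typed = ""

    def feed(self, ch):
        """Advance on one input character; raise ValueError if no keyword fits."""
        if ch not in self.node:
            raise ValueError("no keyword starts with %r" % (self.typed + ch))
        self.node = self.node[ch]
        self.typed += ch

    def is_complete(self):
        """True if the characters typed so far spell a whole keyword."""
        return PrefixTree.END in self.node

    def completions(self):
        """Return, sorted, every keyword still reachable from the current node."""
        found = []
        stack = [(self.node, self.typed)]
        while stack:
            node, prefix = stack.pop()
            for key, child in node.items():
                if key is PrefixTree.END:
                    found.append(prefix)
                else:
                    stack.append((child, prefix + key))
        return sorted(found)


# Hypothetical keyword list, for illustration only.
tree = PrefixTree(["match me", "match you", "maybe", "no way"])
finger = Finger(tree)
for ch in "ma":
    finger.feed(ch)
print(finger.completions())  # keywords still possible after typing "ma"
```

Each keystroke costs one dictionary lookup, and the finger's current node doubles as the starting point for listing the remaining candidates.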

Andy wrote:

> I'm trying to do some predicting work over user input, here's my question: for pattern r'match me', the string 'no' will definitely fail to match, but 'ma' still has a chance if user keep on inputting characters after 'ma', so how do I mark 'ma' as a possible match string?

The following may or may not work in the real world:

```python
import re

def parts(regex, flags=0):
    # Compile every prefix of the pattern that is itself a valid regex,
    # anchored at the end so it must consume the whole input.
    candidates = []
    for stop in reversed(range(1, len(regex)+1)):
        partial = regex[:stop]
        try:
            r = re.compile(partial + "$", flags)
        except re.error:
            pass  # this prefix is not a valid regex on its own
        else:
            candidates.append(r)
    candidates.reverse()  # shortest prefix first
    return candidates

if __name__ == "__main__":
    candidates = parts(r"[a-z]+\s*=\s*\d+", re.IGNORECASE)

    def check(*args):
        # Green if any partial pattern matches the current text, red otherwise.
        s = var.get()
        for c in candidates:
            m = c.match(s)
            if m:
                entry.configure(foreground="#008000")
                break
        else:
            entry.configure(foreground="red")

    import tkinter as tk  # named Tkinter on Python 2
    root = tk.Tk()
    var = tk.StringVar()
    var.trace_add("write", check)  # Python 3.6+; trace_variable("w", ...) on older versions
    entry = tk.Entry(textvariable=var)
    entry.pack()
    root.mainloop()
```

The example lets you write an assignment of a numerical value, e.g.

meaning = 42

and colours the text in green or red for legal/illegal entries.

Peter

Nov 23 '06 #6

OK, here's what I want...

I'm building an auto-tasking tool in which the user can specify the execution rule by typing English instead of using a complex GUI (normally a combination of many controls). This makes for much better user interaction. So the problem comes down to "understanding" user input and "suggesting" possible inputs while the user is writing a rule.

Rules will be like "30 minutes later", "Next Monday", "Every 3 hours", "3pm"... Sure, this is an infinite collection, but it doesn't have to be perfect; it might make mistakes given inputs like "10 minutes after Every Monday noon".

The "suggesting" feature is even harder; I'm still investigating possibilities. I tried NLTK_Lite. I'm sure it can understand good user input well, but it does not do well with bad input such as "2 hours later here", which the toolkit fails to parse. NLTK also does not help with the suggesting part.

Now I'm thinking about using regular expressions. I think it's possible to come up with a collection of REs to understand basic execution rules. The question, again, is how to do suggestions with the RE collection.

Any thoughts on this subject?

I'm not a native English speaker, so please be tolerant of mistakes in my post :-)

Nov 23 '06 #7
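A rough sketch of what such a "collection of REs" might look like for the four sample rules in the post. The rule names, the patterns, and the `understand` function are all hypothetical; they only cover the literal examples given and are not a complete grammar.

```python
import re

# Hypothetical rule table: (rule name, compiled pattern). Not from the thread.
DAYS = "monday|tuesday|wednesday|thursday|friday|saturday|sunday"
RULES = [
    ("delay", re.compile(r"(?P<n>\d+)\s+(?P<unit>minute|hour|day)s?\s+later$", re.I)),
    ("every", re.compile(r"every\s+(?P<n>\d+)\s+(?P<unit>minute|hour|day)s?$", re.I)),
    ("next_day", re.compile(r"next\s+(?P<day>" + DAYS + r")$", re.I)),
    ("clock", re.compile(r"(?P<hour>1[0-2]|0?[1-9])\s*(?P<ampm>am|pm)$", re.I)),
]


def understand(text):
    """Return (rule name, captured fields) for the first matching rule, else None."""
    text = text.strip()
    for name, pattern in RULES:
        m = pattern.match(text)
        if m:
            return name, m.groupdict()
    return None


for sample in ["30 minutes later", "Next Monday", "Every 3 hours", "3pm",
               "2 hours later here"]:
    print(repr(sample), "->", understand(sample))
```

Because every pattern is anchored with `$`, an input with trailing junk such as "2 hours later here" is rejected as a whole rather than partially accepted; whether that is the right behaviour for bad input is a design decision the thread leaves open.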

That seems good to me. I'll try it out; thanks for posting.

Nov 23 '06 #8

This works well as a checking strategy, but what I want is a suggestion list... Maybe what I want is not practical at all?

Thanks anyway, Peter.

Andy Wu

Nov 24 '06 #9
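One possible way to get a suggestion list, sketched under strong assumptions: the rules are written as a fixed set of templates, and "{n}" marks the spot where the user supplies a number. The `TEMPLATES` list and the `suggest` function are hypothetical and only illustrate prefix filtering over templates; this is not a technique proposed by anyone in the thread.

```python
import re

# Hypothetical templates; "{n}" stands for a number supplied by the user.
TEMPLATES = [
    "{n} minutes later",
    "{n} hours later",
    "every {n} minutes",
    "every {n} hours",
    "next monday",
    "next friday",
]


def suggest(typed):
    """Return the templates that the text typed so far could still grow into."""
    typed = typed.lower().lstrip()
    found = re.search(r"\d+", typed)
    # Show the typed number if there is one, else a visible placeholder.
    number = found.group() if found else "<n>"
    results = []
    for template in TEMPLATES:
        candidate = template.replace("{n}", number)
        # Without a number, compare only the fixed text before the placeholder.
        comparable = candidate if found else template.split("{n}")[0]
        if comparable.startswith(typed):
            results.append(candidate)
    return results


for typed in ["3", "30 m", "ev", "Every 3 h", "no"]:
    print(repr(typed), "->", suggest(typed))
```

This sidesteps the "all numbers starting with 1" concern raised earlier: the number is echoed back as typed, and only the fixed words around it are suggested.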

Andy wrote:

> This works well as a checking strategy, but what I want is a suggesting list...

Indeed, I was grossly misreading your question.

Peter

Nov 24 '06 #10
