re Insanity

For some reason, I am having the hardest time doing something that should
be obvious. (Note time of posting ;)

Given an arbitrary string, I want to find each individual instance of
text in the form: "[PROMPT:optional text]"

I tried this:

y=re.compile(r'\[PROMPT:.*\]')

Which works fine when the text is exactly "[PROMPT:whatever]" but
does not match on:

"something [PROMPT:foo] something [PROMPT:bar] something ..."

The overall goal is to identify the beginning and end of each [PROMPT...]
string in the line.

Ideas anyone?
--
----------------------------------------------------------------------------
Tim Daneliuk tu****@tundraware.com
PGP Key: http://www.tundraware.com/PGP/

Jul 18 '05 #1
4 Replies


Tim Daneliuk wrote:
Given an arbitrary string, I want to find each individual instance of
text in the form: "[PROMPT:optional text]"

I tried this:

y=re.compile(r'\[PROMPT:.*\]')

Which works fine when the text is exactly "[PROMPT:whatever]"

didn't you leave something out here? "compile" only compiles that pattern;
it doesn't match it against your string...

but does not match on:

"something [PROMPT:foo] something [PROMPT:bar] something ..."

The overall goal is to identify the beginning and end of each [PROMPT...]
string in the line.


if the pattern can occur anywhere in the string, you need to use "search",
not "match". if you want multiple matches, you can use "findall" or, better
in this case, "finditer":

import re

s = "something [PROMPT:foo] something [PROMPT:bar] something"

for m in re.finditer(r'\[PROMPT:[^]]*\]', s):
    print(m.span(0))

prints

(10, 22)
(33, 45)

which looks reasonably correct.

(note the "[^x]*x" form, which is an efficient way to spell "non-greedy match"
for cases like this)
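
To make the match/search/finditer distinction concrete, here is a minimal sketch. The variable names are only illustrative and are not from the original post:

```python
import re

pattern = re.compile(r'\[PROMPT:[^]]*\]')
s = "something [PROMPT:foo] something [PROMPT:bar] something"

# match() only succeeds if the pattern matches at position 0.
at_start = pattern.match(s)

# search() scans forward and returns the first match anywhere in the string.
first = pattern.search(s)

# finditer() yields every non-overlapping match, left to right.
spans = [(m.start(), m.end()) for m in pattern.finditer(s)]

print(at_start)        # None
print(first.group(0))  # [PROMPT:foo]
print(spans)           # [(10, 22), (33, 45)]
```

Each (start, end) pair can be used directly as a slice, so s[10:22] gives back '[PROMPT:foo]'.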

</F>

Jul 18 '05 #2

Tim Daneliuk wrote:

I tried this:

y=re.compile(r'\[PROMPT:.*\]')

Which works fine when the text is exactly "[PROMPT:whatever]" but
does not match on:

"something [PROMPT:foo] something [PROMPT:bar] something ..."

The overall goal is to identify the beginning and end of each [PROMPT...]
string in the line.


The answer sort of depends on exactly what can be in your optional text:

>>> import re
>>> s = "something [PROMPT:foo] something [PROMPT:bar] something ..."
>>> y=re.compile(r'\[PROMPT:.*\]')
>>> y.findall(s)
['[PROMPT:foo] something [PROMPT:bar]']
>>> y=re.compile(r'\[PROMPT:.*?\]')
>>> y.findall(s)
['[PROMPT:foo]', '[PROMPT:bar]']
>>> y=re.compile(r'\[PROMPT:[^]]*\]')
>>> y.findall(s)
['[PROMPT:foo]', '[PROMPT:bar]']


.* will match as long a string as possible.

.*? will match as short a string as possible. By default this won't match
any newlines.

[^]]* will match as long a string that doesn't contain ']' as possible.
This will match newlines.
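
The newline difference is easy to miss, so here is a small sketch of it. The test string with an embedded line break is a made-up example, not something from the original question:

```python
import re

# Hypothetical input: the first prompt contains a line break.
s = "a [PROMPT:foo\nbar] b [PROMPT:baz] c"

# Without re.DOTALL, '.' stops at '\n', so the first prompt is never matched.
greedy = re.findall(r'\[PROMPT:.*\]', s)
lazy = re.findall(r'\[PROMPT:.*?\]', s)

# With re.DOTALL, '.' also matches '\n'.
greedy_dotall = re.findall(r'\[PROMPT:.*\]', s, re.DOTALL)
lazy_dotall = re.findall(r'\[PROMPT:.*?\]', s, re.DOTALL)

# The negated class matches anything except ']', newlines included.
negated = re.findall(r'\[PROMPT:[^]]*\]', s)

print(greedy)         # ['[PROMPT:baz]']
print(lazy)           # ['[PROMPT:baz]']
print(greedy_dotall)  # ['[PROMPT:foo\nbar] b [PROMPT:baz]']
print(lazy_dotall)    # ['[PROMPT:foo\nbar]', '[PROMPT:baz]']
print(negated)        # ['[PROMPT:foo\nbar]', '[PROMPT:baz]']
```

So if the optional text may span lines, either pass re.DOTALL with the non-greedy form or use the negated character class.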
Jul 18 '05 #3

In article <4l*************@eskimo.tundraware.com>,
Tim Daneliuk <tu****@tundraware.com> wrote:

Given an arbitrary string, I want to find each individual instance of
text in the form: "[PROMPT:optional text]"

I tried this:

y=re.compile(r'\[PROMPT:.*\]')

Which works fine when the text is exactly "[PROMPT:whatever]" but
does not match on:

"something [PROMPT:foo] something [PROMPT:bar] something ..."

The overall goal is to identify the beginning and end of each [PROMPT...]
string in the line.

Ideas anyone?


Yeah, read the Friedl book. (Okay, so that's not gonna help right now,
but trust me, if you're going to write lots of regexes, READ THAT BOOK.)
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis
Jul 18 '05 #4

Aahz wrote:
In article <4l*************@eskimo.tundraware.com>,
Tim Daneliuk <tu****@tundraware.com> wrote:
Given an arbitrary string, I want to find each individual instance of
text in the form: "[PROMPT:optional text]"

I tried this:

y=re.compile(r'\[PROMPT:.*\]')

Which works fine when the text is exactly "[PROMPT:whatever]" but
does not match on:

"something [PROMPT:foo] something [PROMPT:bar] something ..."

The overall goal is to identify the beginning and end of each [PROMPT...]
string in the line.

Ideas anyone?

Yeah, read the Friedl book. (Okay, so that's not gonna help right now,
but trust me, if you're going to write lots of regexes, READ THAT BOOK.)


I've read significant parts of it. The problem is that I don't write
regexes often enough to recall all the subtle details ... plus I am getting
old and feeble... ;)

--
----------------------------------------------------------------------------
Tim Daneliuk tu****@tundraware.com
PGP Key: http://www.tundraware.com/PGP/
Jul 18 '05 #5
