Bytes IT Community

a string problem

hi

If I have some lines like this
a ) "here is first string"
b ) "here is string2"
c ) "here is string3"

I only want to print the lines that contain "string", i.e. the first
line and not the others. If I use the re module, how do I compile the
expression to do this? I tried the re module with a simple search(),
and every time it gives me all 3 lines that have "string" in them,
whereas I only need line 1.
If the re module is not needed, how can I use string manipulation to do
this? thanks

Jun 13 '06 #1
10 Replies


mi*******@hotmail.com wrote:
> hi
>
> If I have some lines like this
> a ) "here is first string"
> b ) "here is string2"
> c ) "here is string3"
>
> I only want to print the lines that contain "string", i.e. the first
> line and not the others. If I use the re module, how do I compile the
> expression to do this? I tried the re module with a simple search(),
> and every time it gives me all 3 lines that have "string" in them,
> whereas I only need line 1.
> If the re module is not needed, how can I use string manipulation to do
> this? thanks


As far as re goes, you can search for the pattern '\bstring\b', which
will find just the word 'string' itself. Not sure if there's a better
way to do it with REs.

And I'm actually ashamed to admit that I know the RE way, but not the
regular string manipulation way, if there is one! This seems like
something easy enough to do without REs though.
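
For reference, here is a minimal sketch of that pattern applied to the three sample lines. The names `lines`, `word` and `matching` are only illustrative, and the pattern is written as a raw string so that \b reaches the re module as a word boundary rather than as a backspace character:

```python
import re

lines = [
    'a ) "here is first string"',
    'b ) "here is string2"',
    'c ) "here is string3"',
]

# \b only matches between a word character and a non-word character,
# so "string2" and "string3" are rejected: a digit follows "string".
word = re.compile(r'\bstring\b')

matching = [line for line in lines if word.search(line)]
for line in matching:
    print(line)
```

Only line a) is printed, because the closing double quote after "string" is not a word character.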
Jun 13 '06 #2

John Salerno wrote:
> mi*******@hotmail.com wrote:
>> hi
>>
>> If I have some lines like this
>> a ) "here is first string"
>> b ) "here is string2"
>> c ) "here is string3"
>>
>> I only want to print the lines that contain "string", i.e. the first
>> line and not the others. If I use the re module, how do I compile the
>> expression to do this? I tried the re module with a simple search(),
>> and every time it gives me all 3 lines that have "string" in them,
>> whereas I only need line 1.
>> If the re module is not needed, how can I use string manipulation to do
>> this? thanks
>
> As far as re goes, you can search for the pattern '\bstring\b', which
> will find just the word 'string' itself. Not sure if there's a better
> way to do it with REs.
>
> And I'm actually ashamed to admit that I know the RE way, but not the
> regular string manipulation way, if there is one! This seems like
> something easy enough to do without REs though.


thanks !

Jun 13 '06 #3

MTD
> I only want to print the lines that contain "string", i.e. the first
> line and not the others. If I use the re module, how do I compile the
> expression to do this? I tried the re module with a simple search(),
> and every time it gives me all 3 lines that have "string" in them,
> whereas I only need line 1.


That's because all three lines DO include the substring "string"
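
To see this concretely, here is a small sketch using the sample lines from the question (the variable names are made up for illustration). Both a plain substring test and re.search with the bare pattern 'string' accept every line:

```python
import re

lines = [
    'a ) "here is first string"',
    'b ) "here is string2"',
    'c ) "here is string3"',
]

# "in" and an unanchored re.search both look for the substring anywhere.
substring_hits = [line for line in lines if 'string' in line]
regex_hits = [line for line in lines if re.search('string', line)]

print(len(substring_hits), len(regex_hits))
```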

Jun 13 '06 #4

John Salerno wrote:
> mi*******@hotmail.com wrote:
>> hi
>>
>> If I have some lines like this
>> a ) "here is first string"
>> b ) "here is string2"
>> c ) "here is string3"
>>
>> I only want to print the lines that contain "string", i.e. the first
>> line and not the others. If I use the re module, how do I compile the
>> expression to do this? I tried the re module with a simple search(),
>> and every time it gives me all 3 lines that have "string" in them,
>> whereas I only need line 1.
>> If the re module is not needed, how can I use string manipulation to do
>> this? thanks
>
> As far as re goes, you can search for the pattern '\bstring\b', which
> will find just the word 'string' itself. Not sure if there's a better
> way to do it with REs.
>
> And I'm actually ashamed to admit that I know the RE way, but not the
> regular string manipulation way, if there is one! This seems like
> something easy enough to do without REs though.


if RE has the \b and it works, can we look into the source of re and
see how it's done for \b ?

Jun 13 '06 #5

John Salerno wrote:
> mi*******@hotmail.com wrote:
>> hi
>>
>> If I have some lines like this
>> a ) "here is first string"
>> b ) "here is string2"
>> c ) "here is string3"
>>
>> I only want to print the lines that contain "string", i.e. the first
>> line and not the others. If I use the re module, how do I compile the
>> expression to do this? I tried the re module with a simple search(),
>> and every time it gives me all 3 lines that have "string" in them,
>> whereas I only need line 1.
>> If the re module is not needed, how can I use string manipulation to do
>> this? thanks
>
> As far as re goes, you can search for the pattern '\bstring\b', which
> will find just the word 'string' itself. Not sure if there's a better
> way to do it with REs.
>
> And I'm actually ashamed to admit that I know the RE way, but not the
> regular string manipulation way, if there is one! This seems like
> something easy enough to do without REs though.


just curious, if RE has the \b and it works, can we look into the
source of re and see how it's done for \b ?

Jun 13 '06 #6

mi*******@hotmail.com wrote:
> just curious, if RE has the \b and it works, can we look into the
> source of re and see how it's done for \b ?


I had a look in the sre module (which re seems to import), but I
couldn't find much. I'm not the best at analyzing source code, though. :)

What is it you want to know about \b? It matches the empty string
at the start or end of a word (a word being a sequence of alphanumeric
characters and underscores).

A little more specific info is in the docs:

Matches the empty string, but only at the beginning or end of a word. A
word is defined as a sequence of alphanumeric or underscore characters,
so the end of a word is indicated by whitespace or a non-alphanumeric,
non-underscore character. Note that \b is defined as the boundary
between \w and \W, so the precise set of characters deemed to be
alphanumeric depends on the values of the UNICODE and LOCALE flags.
Inside a character range, \b represents the backspace character, for
compatibility with Python's string literals.
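
A few of the points in that passage can be checked directly. The following is a small sketch with made-up sample strings and variable names, not an excerpt from the docs:

```python
import re

# \b matches at the start and end of a word and consumes no characters.
found = re.findall(r'\bstring\b', 'string, substring, string_2, string')
print(found)

# Digits and underscores are word characters, so there is no boundary
# between "string" and "_2".
no_match = re.search(r'\bstring\b', 'string_2')
print(no_match)

# Inside a character class, \b stands for the backspace character.
backspace_match = re.search(r'[\b]', 'a\x08b')
print(backspace_match is not None)

# In an ordinary (non-raw) Python string literal, '\b' is also a backspace,
# which is why regex patterns are usually written as raw strings.
print('\b' == '\x08')
```

The first call returns only the two standalone occurrences; "substring" and "string_2" are skipped.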
Jun 13 '06 #7

John Salerno wrote:
> mi*******@hotmail.com wrote:
>> just curious, if RE has the \b and it works, can we look into the
>> source of re and see how it's done for \b ?
>
> I had a look in the sre module (which re seems to import), but I
> couldn't find much. I'm not the best at analyzing source code, though. :)
>
> What is it you want to know about \b? It matches the empty string
> at the start or end of a word (a word being a sequence of alphanumeric
> characters and underscores).
>
> A little more specific info is in the docs:
>
> Matches the empty string, but only at the beginning or end of a word. A
> word is defined as a sequence of alphanumeric or underscore characters,
> so the end of a word is indicated by whitespace or a non-alphanumeric,
> non-underscore character. Note that \b is defined as the boundary
> between \w and \W, so the precise set of characters deemed to be
> alphanumeric depends on the values of the UNICODE and LOCALE flags.
> Inside a character range, \b represents the backspace character, for
> compatibility with Python's string literals.


thanks.. actually I had seen \b in the docs before, it just slipped
my mind when I was doing the coding. I was even meddling with lookaheads,
which are not the answer anyway.
well, since re has \b, I was wondering why there is no equivalent
of it for strings. So the idea of looking at the source of re to see how
it's done came to my mind.. I suppose we have to go down to viewing the
C source then.. :-)
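
Without going down to the C source, the idea behind \b can be sketched in plain Python. This is only an illustration of the concept, not how the re module implements it. The function name `contains_word` is made up for this example, and it assumes a non-empty `word` that itself starts and ends with a word character:

```python
def contains_word(text, word):
    """Return True if `word` occurs in `text` with no word character
    (letter, digit or underscore) directly before or after it."""
    def is_word_char(ch):
        return ch.isalnum() or ch == '_'

    start = text.find(word)
    while start != -1:
        end = start + len(word)
        # A boundary exists at the edge of the text or next to a
        # non-word character.
        before_ok = start == 0 or not is_word_char(text[start - 1])
        after_ok = end == len(text) or not is_word_char(text[end])
        if before_ok and after_ok:
            return True
        # Otherwise keep looking for a later occurrence.
        start = text.find(word, start + 1)
    return False


for line in ('here is first string', 'here is string2', 'here is string3'):
    print(line, contains_word(line, 'string'))
```

Note that str.isalnum() follows Python's own notion of alphanumeric characters, which may not agree with \w for every character under every combination of regex flags.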

Jun 13 '06 #8

mi*******@hotmail.com wrote:
> hi
>
> If I have some lines like this
> a ) "here is first string"
> b ) "here is string2"
> c ) "here is string3"
>
> I only want to print the lines that contain "string", i.e. the first
> line and not the others. If I use the re module, how do I compile the
> expression to do this? I tried the re module with a simple search(),
> and every time it gives me all 3 lines that have "string" in them,
> whereas I only need line 1.
> If the re module is not needed, how can I use string manipulation to do
> this? thanks


If this is the real-life situation, then
if mystring.endswith('string'): will do
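
A minimal sketch of the endswith() approach, assuming the lines are stored without the surrounding double quotes shown in the question and that the word always comes last (the list name `lines` is illustrative):

```python
lines = ['here is first string', 'here is string2 ', 'here is string3']

# rstrip() guards against trailing whitespace or a newline read from a file.
matching = [line for line in lines if line.rstrip().endswith('string')]
print(matching)
```

This only works when the word is the last thing on the line. A line such as 'string comes first' or 'here is substring' would be misjudged.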

Jun 13 '06 #9

On Tuesday 13 June 2006 at 15:59, John Salerno wrote:
> And I'm actually ashamed to admit that I know the RE way, but not the
> regular string manipulation way, if there is one!
eheh,

In [39]: import string

In [40]: sub, s1, s2 = 'string', 're string2, ,string1', 're string2, ,string'

In [41]: sub in [ e.strip(string.punctuation) for e in s1.split() ]
Out[41]: False

In [42]: sub in [ e.strip(string.punctuation) for e in s2.split() ]
Out[42]: True
> This seems like
> something easy enough to do without REs though.

Yes, and the plain Python way even seems a little faster:

python2.4 -mtimeit -s "import re" "re.match('\bstring\b', 're string2, ,string1') and True"
100000 loops, best of 3: 7.3 usec per loop
python2.4 -mtimeit -s "import string" "'string' in [ e.strip(string.punctuation) for e in 're string2, ,string1'.split() ]"
100000 loops, best of 3: 6.99 usec per loop
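
One caveat about the regex command timed above: as written, it does not perform a word-boundary search. The pattern is not a raw string, so Python turns '\b' into a backspace character, and re.match() only tries the start of the subject string. The sketch below reuses the s1 and s2 strings from the session earlier in this post to show the difference; the other variable names are made up:

```python
import re

s1 = 're string2, ,string1'
s2 = 're string2, ,string'

# As timed: backspace characters in the pattern, anchored at position 0.
# This cannot match either string.
as_timed = re.match('\bstring\b', s2)

# Raw string plus re.search() gives the intended word-boundary test.
fixed_s1 = re.search(r'\bstring\b', s1)
fixed_s2 = re.search(r'\bstring\b', s2)

print(as_timed, fixed_s1, fixed_s2 is not None)
```

A fairer comparison would time the re.search() form, ideally with the pattern precompiled in the setup string. No timings for that variant are claimed here.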
--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
Jun 13 '06 #10

John Salerno wrote:
> mi*******@hotmail.com wrote:
>> hi
>>
>> If I have some lines like this
>> a ) "here is first string"
>> b ) "here is string2"
>> c ) "here is string3"
>>
>> I only want to print the lines that contain "string", i.e.
>
> ...
> And I'm actually ashamed to admit that I know the RE way, but not the
> regular string manipulation way, if there is one! This seems like
> something easy enough to do without REs though.


I'd just split it on whitespace, just like with the RE:
if "string" in "here is first string".split(): ...
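
Applied to all three sample lines, the split() test can be sketched as follows. The variable names are illustrative. Note that if the double quotes from the question are really part of each line, the last token becomes 'string"' and the plain test fails, so punctuation has to be stripped from each token first:

```python
import string

lines = ['here is first string', 'here is string2', 'here is string3']

# split() with no argument splits on any run of whitespace.
matching = [line for line in lines if 'string' in line.split()]
print(matching)

# With the literal quotes kept, the word is glued to the closing quote.
quoted = 'a ) "here is first string"'
plain_test = 'string' in quoted.split()
stripped_test = 'string' in [tok.strip(string.punctuation) for tok in quoted.split()]
print(plain_test, stripped_test)
```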
Jun 13 '06 #11
