Need help with RE

Hi all,
I have a string like

"(word1 & (Word2|woRd3))"

How can I use the re module to split it into

['word1', 'word2', 'word3']

Jul 19 '05 #1

On May 29, 2005, at 12:39 AM, cheng wrote:
hi all
a string like

"(word1 & (Word2|woRd3))"

how can i use the re to split it to

['word1', 'word2', 'word3']


Could you be more exact about what the string is like? Does it
literally contain the characters '&' and '|' ? If so, just split at
them.
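
If the separators really are just '&' and '|', a minimal Python 3 sketch of that idea might look like the following. The variable names and the strip("() ") cleanup step are illustrative assumptions, not something from the original post:

```python
import re

s = "(word1 & (Word2|woRd3))"

# Split at the literal '&' and '|' characters.
pieces = re.split(r"[&|]", s)

# Each piece still carries parentheses and spaces, so strip those
# from both ends and lowercase what is left.
words = [p.strip("() ").lower() for p in pieces]
print(words)
```

This only works while the separators are known in advance, which is why the follow-up question about what the string can contain matters.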

-- Elliot Temple
http://www.curi.us/

Jul 19 '05 #2
I'm sorry, my English is not very good.

The string does not contain only '&' and '|'; it can be anything.

I just want to split out the "whole words" inside the string.

Jul 19 '05 #3
i try

query = query.lower()
print re.split(r'\W+',theString)

the result is:
['', 'word1', 'word2', 'word3', '']

how can I fix the statement to get

['word1', 'word2', 'word3']

Jul 19 '05 #4
i try

theString= theString.lower()
print re.split(r'\W+',theString)

the result is:
['', 'word1', 'word2', 'word3', '']

how can I fix the statement to get

['word1', 'word2', 'word3']

Jul 19 '05 #5

On May 29, 2005, at 12:57 AM, cheng wrote:
I'm sorry, my English is not very good.

The string does not contain only '&' and '|'; it can be anything.

I just want to split out the "whole words" inside the string.


If the string could be anything, how do you know where the words are?

If it's whitespace that separates words, try out str.split()
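
For reference, here is a small Python 3 sketch of what str.split() gives on the example string from the question (the variable names are just for illustration):

```python
s = "(word1 & (Word2|woRd3))"

# With no arguments, str.split() splits on runs of whitespace only.
tokens = s.split()
print(tokens)
```

For this input the punctuation stays attached to the tokens, so whitespace splitting on its own is only enough when whitespace is the sole separator.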

-- Elliot Temple
http://www.curi.us/

Jul 19 '05 #6
cheng wrote:
I'm sorry, my English is not very good.
That's alright, you could have been French. ;)
The string does not contain only '&' and '|'; it can be anything.

I just want to split out the "whole words" inside the string.


Look at the example for split function of re module in the doc [1].

In short:
>>> import re
>>> s="(word1 & (Word2|woRd3))"
>>> re.split("\W+",s)
['', 'word1', 'Word2', 'woRd3', '']
>>> [w.lower() for w in re.split("\W+",s) if w != '']
['word1', 'word2', 'word3']

[1]http://python.org/doc/lib/node114.html
Jul 19 '05 #7
On Sun, 29 May 2005 at 07:39 GMT, cheng wrote:
hi all
a string like

"(word1 & (Word2|woRd3))"

how can i use the re to split it to

['word1', 'word2', 'word3']


This splits the string on one or more occurrences of any character
that is not alphanumeric:

import re
str = "(word1 & (Word2|woRd3))"
s = re.sub("[^a-zA-Z0-9]+"," ",str).split()
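
One thing worth knowing when choosing between the explicit class above and \W: the two do not treat every character the same way. The sketch below (Python 3, with a made-up sample string) shows that \w also matches the underscore and, for str patterns, non-ASCII letters, while [a-zA-Z0-9] does not:

```python
import re

# Hypothetical sample containing an underscore and a non-ASCII letter.
sample = "foo_bar (Baz9|caf\u00e9)"

# Explicit ASCII class: '_' and the accented letter act as separators.
explicit = re.sub(r"[^a-zA-Z0-9]+", " ", sample).split()

# \w+: '_' and the accented letter count as word characters.
word_class = re.findall(r"\w+", sample)

print(explicit)
print(word_class)
```

Which behaviour is right depends on what counts as a "word" in the input.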

--
Chris F.A. Johnson <http://cfaj.freeshell.org>
==================================================================
Shell Scripting Recipes: A Problem-Solution Approach, 2005, Apress
<http://www.torfree.net/~chris/books/cfaj/ssr.html>
Jul 19 '05 #8
thx for help..i got it now :)

Jul 19 '05 #9

"cheng" <ma******@spymac.com> wrote in message
news:11**********************@z14g2000cwz.googlegroups.com...
| hi all
| a string like
|
| "(word1 & (Word2|woRd3))"
|
| how can i use the re to split it to
|
| ['word1', 'word2', 'word3']
|

import re
s = "(word1 & (Word2|woRd3))"
parts = re.split(r'\W+', s)
print [p for p in parts if p]

--

Vincent Wehren
Jul 19 '05 #10
cheng wrote:
hi all
a string like

"(word1 & (Word2|woRd3))"

how can i use the re to split it to

['word1', 'word2', 'word3']


OK, so you know about the re module.

Look in the manual: there's a module-level function called
"split", with an example similar to yours. Did you try that?
Let's do it now:
>>> import re
>>> text = "(word1 & (Word2|woRd3))".lower() # you seem to want downshifting ...
>>> re.split(r"\W+", text)
['', 'word1', 'word2', 'word3', '']
Hmmm ... near, but not exactly what you want. We need to throw away
those empty strings, which will appear if you have non-word characters
at the ends of your text.
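
A quick way to see where the empty strings come from is to compare an input with and without non-word characters at its ends. This is only an illustrative Python 3 sketch with made-up inputs:

```python
import re

# No separators at the ends: no empty strings in the result.
no_edges = re.split(r"\W+", "word1 & word2")

# Separators at both ends: re.split reports the empty text
# before the leading "(" and after the trailing ")".
with_edges = re.split(r"\W+", "(word1 & word2)")

print(no_edges)
print(with_edges)
```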

Two ways of doing that:
>>> filter(None, re.split(r"\W+", text))
['word1', 'word2', 'word3']

or
>>> [x for x in re.split(r"\W+", text) if x]
['word1', 'word2', 'word3']

Forget about "filter". Read about "list comprehensions" and "generator
expressions" -- they are more general and powerful.
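
To make the comparison concrete, here is a hedged Python 3 sketch (the variable names are invented for the example). Note that in Python 3 filter returns an iterator, so it needs a list() call to produce the list shown in the session above:

```python
import re

text = "(word1 & (Word2|woRd3))"
parts = re.split(r"\W+", text)

# List comprehension: drop the empty strings and lowercase in one pass.
words = [p.lower() for p in parts if p]

# Generator expression: feed the non-empty parts straight into sum()
# without building an intermediate list.
total_chars = sum(len(p) for p in parts if p)

# filter can only select; any transformation needs a separate step.
filtered = list(filter(None, parts))

print(words, total_chars, filtered)
```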

Cheers,
John

Jul 19 '05 #11
John Machin wrote:
>>> import re
>>> text = "(word1 & (Word2|woRd3))".lower() # you seem to want downshifting ...
>>> re.split(r"\W+", text)
['', 'word1', 'word2', 'word3', '']


Hmmm ... near, but not exactly what you want. We need to throw away
those empty strings, which will appear if you have non-word characters
at the ends of your text.


You can also avoid the empty strings at the end by using re.findall with
\w instead of re.split with \W:

py> import re
py> text = "(word1 & (Word2|woRd3))".lower()
py> re.findall(r"\w+", text)
['word1', 'word2', 'word3']
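
If this is needed in more than one place, the findall approach could be wrapped in a small helper. The function name extract_words is hypothetical, and the sketch is written for Python 3:

```python
import re


def extract_words(text):
    """Return the lowercased runs of word characters found in text."""
    return re.findall(r"\w+", text.lower())


print(extract_words("(word1 & (Word2|woRd3))"))
```

Because findall only returns matches, an empty string or a string made up entirely of separators simply gives an empty list, with no empty strings to filter out.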

STeVe
Jul 19 '05 #12
