Split function to split sentence into words

11
Hi,

I don't have much experience writing code in Python, but I'm trying to see how I can start using it.
I've written a simple program that displays a sentence. My problem now is how to use the split function to split that sentence into words and then print each word separately. Let me give you an example:

>>>sentence=" My question is to know how to write a code in Python"

Then the output for this sentence must be:

sentence[1]=My
sentence[2]=question
sentence[3]=is
sentence[4]=to
sentence[5]=know
......
.......

Can someone help me with this?
Nov 15 '08 #1
oler1s
671 Expert 512MB
Always check the documentation first. In this case, look at the available string methods ( http://www.python.org/doc/2.5.2/lib/string-methods.html ) and you will see a split method. Here's a quick example.
    >>> sent = "Jack ate the apple."
    >>> splitsent = sent.split(' ')
    >>> splitsent
    ['Jack', 'ate', 'the', 'apple.']
Simple as that.
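
One detail worth checking with the original poster's sentence, which starts with a space: split(' ') and split() with no argument do not behave the same way. The sketch below is only an illustration, written in Python 3 syntax (the thread itself uses Python 2 print statements); the variable names by_space and by_whitespace are made up for this example.

```python
# Python 3 syntax; names here are illustrative, not from the thread.
sentence = " My question is to know how to write a code in Python"

# Splitting on a single space keeps an empty string for the leading space.
by_space = sentence.split(' ')

# Splitting with no argument splits on any run of whitespace and
# ignores leading and trailing whitespace.
by_whitespace = sentence.split()

print(by_space[:3])       # ['', 'My', 'question']
print(by_whitespace[:3])  # ['My', 'question', 'is']
```

For splitting a sentence into words, the no-argument form is usually the safer choice, since stray spaces do not produce empty entries.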
Nov 15 '08 #2
fellya
11
Thank you for the help, but the question is not fully answered! This program splits the sentence, but if we have, say, a = jack ate the apple, I would like the output to be:
a[0]=jack
a[1]=ate
a[2]=the
a[3]=apple

Can you please see if it's possible to get the above output?
Nov 15 '08 #3
bvdet
2,851 Expert Mod 2GB
The answer is: string formatting!

Example:
    >>> sentence = "The dog ate my homework"
    >>> for i,word in enumerate(sentence.split()):
    ...     print "Word #%d: %s" % (i, word)
    ...
    Word #0: The
    Word #1: dog
    Word #2: ate
    Word #3: my
    Word #4: homework
    >>>
Nov 15 '08 #4
fellya
11
Thank you for the help, but as you can see in the output below, when I type sentence[0] to show the first word, it gives me "T". This is not what I want! When I type sentence[0] I would like it to display "The", and when I type sentence[1] it should give me "dog".

Can you please help?
    >>> sentence="The dog ate my homework"
    >>> for i, word in enumerate(sentence.split()):
    ...     print " word #%d: %s" % (i,word)
    ...
     word #0: The
     word #1: dog
     word #2: ate
     word #3: my
     word #4: homework
    >>> sentence[0]
    'T'
    >>> sentence[1];
    'h'
    >>>
Nov 16 '08 #5
bvdet
2,851 Expert Mod 2GB
How about this?
    >>> split_sentence = sentence.split()
    >>> split_sentence[0]
    'The'
    >>> split_sentence[1]
    'dog'
    >>>
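
The reason sentence[0] gave 'T' is that indexing a string returns a single character, and split() does not change the string; it returns a new list. A minimal sketch in Python 3 syntax (the name words is illustrative):

```python
sentence = "The dog ate my homework"

# split() returns a new list; the original string is left unchanged.
words = sentence.split()

print(sentence[0])  # T    (first character of the string)
print(words[0])     # The  (first item of the list)

# If the original string is no longer needed, the same name can be
# rebound to the list, after which indexing yields whole words.
sentence = sentence.split()
print(sentence[1])  # dog
```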
Nov 16 '08 #6
fellya
11
Ohhh, thank you so much.
That's what I wanted.
May God bless you.
Once again, thank you.
Nov 16 '08 #7
fellya
11
Hi, I have another question related to the above:
I have created a file containing more than 100 sentences and saved it with the extension .py. I am opening the file like this:
    f=open("example.py")
    try:
        for line in f:
            print line
    finally:
        f.close()
With the above commands I'm able to open my file. I know how to split a sentence into words; the problem now is how to do it on a file containing more than 100 sentences. With 1 or 2 sentences I can type the sentences in and split them, but what about a file with many sentences?

Can someone help me?
Nov 16 '08 #8
bvdet
2,851 Expert Mod 2GB
Please use code tags around code. It will make your code much easier to read.

[CODE]..code goes here..[/CODE]

In your code, you are iterating over each line in the file. On each iteration, the variable line holds one line of the file (one sentence, if each sentence is on its own line). Do you want to save the sentence in a list? What do you want to do with 100 sentences?

The following will save a list of lists. You can access each word by list index.
    lineList = [line.strip().split() for line in open("your_file").readlines()]
    # print the first word in the first line.
    print lineList[0][0]
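
A self-contained variant of the same idea, sketched in Python 3 syntax. The temporary file and its two sample lines are made up so the example can run on its own, and the with statement is used here only so the file is closed automatically; neither is part of the code above.

```python
import os
import tempfile

# Hypothetical sample file, created only so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("jack is a hard worker\nthe dog ate my homework\n")
    path = tmp.name

# One inner list of words per line of the file.
with open(path) as f:
    line_list = [line.split() for line in f]

os.remove(path)  # clean up the sample file

print(line_list[0][0])  # jack  (first word of the first line)
print(line_list[1][1])  # dog   (second word of the second line)
```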
Nov 16 '08 #9
fellya
11
I don't need to save a sentence in a list. What I want to do is this: I take a document with any number of sentences, and using Python I would like to split it into words where each word has a number, e.g. word1 = the, word2 = apple, etc. With this output I will then use another program that can help me identify whether word1 is a noun or not, and so on. In brief, after getting all the words in a document, I will try to identify the nouns and extract only the nouns from the doc.
Nov 16 '08 #10
bvdet
2,851 Expert Mod 2GB
The previous code I posted above will work fine for your purpose. To get all the words in a single list:
    wordList = reduce(lambda x,y: x+y, lineList, [])
Now you have a list of all the words. To iterate on the list of words:
    >>> lineList = [['1','2','3'],['4','5','6']]
    >>> reduce(lambda x,y: x+y, lineList, [])
    ['1', '2', '3', '4', '5', '6']
    >>> wordList = reduce(lambda x,y: x+y, lineList, [])
    >>> for i,word in enumerate(wordList):
    ...     print "Word[%d]: %s" % (i,word)
    ...
    Word[0]: 1
    Word[1]: 2
    Word[2]: 3
    Word[3]: 4
    Word[4]: 5
    Word[5]: 6
    >>>
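
The same flattening works on words just as it does on the digit strings above. As an alternative sketch in Python 3 syntax (where reduce has moved to the functools module), itertools.chain.from_iterable joins the inner lists; the sample data and the names line_list and word_list are illustrative.

```python
from itertools import chain

# Illustrative data: one inner list of words per line.
line_list = [["The", "dog"], ["ate", "my", "homework"]]

# Join all inner lists into a single flat list of words.
word_list = list(chain.from_iterable(line_list))

# One loop numbers every word, so nothing has to be typed per word.
for i, word in enumerate(word_list):
    print("Word[%d]: %s" % (i, word))
```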
Nov 16 '08 #11
fellya
11
Hey, thanks for the help.
Your last solution works perfectly with numbers!
But the one I was looking for is the solution you gave in reply number 9:
    lineList = [line.strip().split() for line in open("your_file").readlines()]
    # print the first word in the first line.
    print lineList[0][0]
This solution lets me find one word at a time. Imagine I have a doc of two pages; the above code will take a long time, because when I type print lineList[0][3] it gives me the fourth word of the first line, which is perfect, but I have to type print lineList[0][1] up to print lineList[0][n], with n the last word in my doc! I want code like the above, but which does not require me to type print lineList[][] to get each single word in my doc.
Can you please help? I know the code you gave me works, but the problem is that I have to type print lineList[][] for each word.
Nov 23 '08 #12
bvdet
2,851 Expert Mod 2GB
If you have a list of lists:
    >>> list_of_lists = [[1,2,3],[4,5,6],[7,8,9]]
    >>> for i, item in enumerate(list_of_lists):
    ...     for j, word in enumerate(item):
    ...         print "List item #%d, Word #%d: %s" % (i,j,word)
    ...
    List item #0, Word #0: 1
    List item #0, Word #1: 2
    List item #0, Word #2: 3
    List item #1, Word #0: 4
    List item #1, Word #1: 5
    List item #1, Word #2: 6
    List item #2, Word #0: 7
    List item #2, Word #1: 8
    List item #2, Word #2: 9
    >>>
Nov 24 '08 #13
fellya
11
Okay, thank you so much for the help, bvdet!
But I think you didn't get my question. Let me be clear and simple. Assume I have a file called ex1.py, and in this doc I have more than one paragraph. I know the procedure for opening a file. Now I would like to know if there is a way to open the file, read one sentence or paragraph of the doc, and then split that sentence so that if the sentence was "jack is a hard worker" I get output like:
word 1: jack
word 2: is
word 3: a
word 4: hard
word 5: worker.

Then, after reading and splitting that sentence, I go to the next sentence in the file and do the same thing.

Is there any way to do this in Python?
I need help, please!
Nov 30 '08 #14
bvdet
2,851 Expert Mod 2GB
You will need to establish rules for determining what is a sentence. If the file is not too big, you can read the entire file into a string and split on sentence-ending punctuation (periods, question marks and exclamation marks).
    >>> import re
    >>> s = 'This is a paragraph. How will we split it? We can use re module split()! We should get four sentences.'
    >>> sList = [item.strip() for item in re.split('[!?.]', s) if item]
    >>> sList
    ['This is a paragraph', 'How will we split it', 'We can use re module split()', 'We should get four sentences']
    >>>
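
Combining this sentence split with the earlier enumerate loop gives the "word 1, word 2, ..." output requested for each sentence in turn. This is only a sketch in Python 3 syntax: the helper name split_sentences and the sample text are made up, and treating every '.', '!' or '?' as the end of a sentence is a simplification (abbreviations and decimals would be split too).

```python
import re

text = "jack is a hard worker. is he tired? no!"

def split_sentences(text):
    """Split on '.', '!' or '?' and drop empty pieces (a simplification)."""
    return [part.strip() for part in re.split(r"[.!?]", text) if part.strip()]

for sentence in split_sentences(text):
    print(sentence)
    # start=1 numbers the words from 1 instead of 0.
    for number, word in enumerate(sentence.split(), start=1):
        print("word %d: %s" % (number, word))
```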
Nov 30 '08 #15
fellya
11
Thank you for this answer, but that is not what I want.
I have a file called ex1.py, and to open it I do:
    f=open("ex1.py")
    try:
    ........................
Then, after opening and reading the file, I have:

jack is a brother of carine, .................................................. ...


My question is: after opening this file, which contains about 5 paragraphs, is there any way to split it into words?
Nov 30 '08 #16
bvdet
2,851 Expert Mod 2GB
It seems that we already covered this:
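    import re
    s = open('your_file').read()
    wordList = []
    # Split into sentences on !, ? or . after turning newlines into spaces;
    # "if item.strip()" skips pieces that are empty or only whitespace.
    for sentence in [item.strip() for item in re.split(r'[!?.]', s.replace('\n',' ')) if item.strip()]:
        # Each entry of wordList is the list of words in one sentence.
        wordList.append(sentence.split())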
Nov 30 '08 #17
fellya
11
Ohhh, thank you, but using the code I couldn't get anything. Maybe I used it wrong. What do you mean when you put [!?.] or ' '?
It seems like I was supposed to put something in place of those symbols.
Please look at what I did in the code below. ntbs1.py is my file.
    >>> import re
    >>> s=open('ntbs1.py').read()
    >>> wordList=[]
    >>> for sentence in [item.strip() for item in re.split(r'[!?.]', s.replace('\n','')) if item]:
    ...       wordList.append(sentence.split())
    ...
    >>>
Dec 1 '08 #18
bvdet
2,851 Expert Mod 2GB
The sentences are split on the characters inside the brackets (!?.). Each newline character (\n) is replaced with a space character.
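
Two things may explain why nothing seemed to happen. First, the loop only fills wordList; it prints nothing by itself, so the list has to be displayed afterwards. Second, if the newline is replaced with an empty string rather than a space (as the posted session appears to show), the last word of one line is glued to the first word of the next. A sketch in Python 3 syntax, with a made-up sample text and a hypothetical helper named words_by_sentence:

```python
import re

# Made-up sample: one sentence is broken across two lines.
text = "jack is a brother\nof carine. he works hard.\n"

def words_by_sentence(text, newline_replacement):
    """Return one list of words per sentence (hypothetical helper)."""
    flat = text.replace("\n", newline_replacement)
    sentences = [p.strip() for p in re.split(r"[!?.]", flat) if p.strip()]
    return [s.split() for s in sentences]

with_space = words_by_sentence(text, " ")
without_space = words_by_sentence(text, "")

# Nothing is shown until the result is printed explicitly.
print(with_space)     # 'brother' and 'of' stay separate words
print(without_space)  # 'brotherof' appears as a single word
```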
Dec 1 '08 #19
NeoPa
32,171 Expert Mod 16PB
Fellya,

You have been asked to enclose all your code within [ CODE ] tags. We have rules on this site that demand you do that. Please pay attention in future to making sure all your code is posted that way to ensure it is easier to understand and doesn't waste the time of our experts trying to decipher it.

-Administrator.
Dec 3 '08 #20
