Help With EOF character: URGENT

Hi Buddies,
I am facing a problem: I don't know what to use
as EOF in Python.
I want to read a file and put all the individual
words in a dictionary with their index.
For example, if the file is:

Hello there I am doing fine
How are you?

So I want to make an index like this:

1 Hello
2 there
3 I
4 am
5 doing
6 fine
7 How
8 are
9 you
10 ?

To do this, I have written a small piece of code,
which is here:
-------------------------------------------------------
# python code for creating dictionary of words from an
#input file
------------------------------------------------------

import os
import sys
try:
    fread = open('training_data', 'r')
except IOError:
    print 'Cant open file for reading'
    sys.exit(0)
print 'Okay reading the file'
s=""
a=fread.read(1)
while (a!="\003"):
#while 1:
    s=s+a

    if(a=='\012'): #newline
        #print s
        #print 'The Line Ends'
        fwrite=open('dictionary', 'a')
        fwrite.write(s)
        s=""
    if(a=='\040'): #blank character
        #print s
        fwrite=open('dictionary', 'a')
        fwrite.write(s)
        fwrite.write("\n")
        s=""
    a=fread.read(1)

print 'Wrote to Dictionary\n'
fwrite.close()
fread.close()
---------------------------------------------------

My problem is that I don't know what to use in place of
EOF. I have tried using octal "\003" and "\004", but
that does not work. The code keeps on running. I want
it to stop reading when EOF has been reached.
Can someone help me out with this?
Also, I have to create a list (a map kind of thing,
with an index associated with each word). Can someone
offer a tip or snippet for that?
I will be really grateful.

Thanks
Dont

Jul 18 '05 #1
2 Replies


On Sunday 22 February 2004 20:10, dont bother wrote:
> My problem is that I don't know what to use in place of
> EOF. I have tried using octal "\003" and "\004", but
> that does not work. The code keeps on running. I want
> it to stop reading when EOF has been reached.
> Can someone help me out with this?


Have you tried just using "while a:"? When there is nothing left to read,
i.e. EOF was reached, read() returns an empty string, which counts as
false, so the loop will end.
Be seeing you,
--
Godoy. <go***@ieee.org>
Jul 18 '05 #2

dont bother wrote:
> Hi Buddies,
> I am facing a problem: I don't know what to use
> as EOF in Python.


There's no "EOF character" in Python. When the end of a file is
reached, reading from it returns an empty string. To process
a file one character at a time, you can do

    while 1:
        c = f.read(1)
        if not c:
            break
        # process c here
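
To see the empty-string behaviour concretely, here is a small sketch written for present-day Python 3 rather than the Python of this thread. It uses io.StringIO as a stand-in for a real file, and the helper name read_chars is made up for illustration.

```python
import io

def read_chars(f):
    """Collect characters one at a time until read(1) returns an empty string."""
    chars = []
    while True:
        c = f.read(1)
        if not c:  # an empty string signals end of file; there is no EOF character
            break
        chars.append(c)
    return chars

f = io.StringIO("Hi\n")   # stand-in for open('training_data')
print(read_chars(f))      # ['H', 'i', '\n']
print(repr(f.read(1)))    # '' (reads at end of file keep returning an empty string)
```

Comparing against "\003" or "\004" never matches here, because read(1) at end of file yields '' rather than a control character.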

In your case you seem to be dealing with words, so you can
take advantage of two Python features: (1) You can read
a line at a time with the readline() method. (2) You can
split a string into words with the split() method of strings.

    while 1:
        line = f.readline()
        if not line:
            break
        words = line.split()
        for word in words:
            # process word here

If you have a recent enough Python (2.2 or later), you can
also iterate directly over the file, which yields its lines
one at a time, so the above reduces to just

    for line in f:
        words = line.split()
        for word in words:
            # process word here
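
As for the map of index to word that the original post asks about, one possible sketch in present-day Python 3 follows. The function name build_index and its split_punctuation flag are hypothetical, and the regular expression is only an assumption about how "you?" should be divided into "you" and "?" as in the example index; plain split() would keep "you?" as a single word.

```python
import io
import re

def build_index(f, split_punctuation=False):
    """Map each token's 1-based position to the token, in reading order."""
    index = {}
    position = 1
    for line in f:
        if split_punctuation:
            # assumption: a token is a run of word characters
            # or a single punctuation mark
            words = re.findall(r"\w+|[^\w\s]", line)
        else:
            words = line.split()
        for word in words:
            index[position] = word
            position += 1
    return index

# io.StringIO stands in for open('training_data')
sample = io.StringIO("Hello there I am doing fine\nHow are you?\n")
for number, word in build_index(sample, split_punctuation=True).items():
    print(number, word)
```

With split_punctuation=True this prints the ten numbered entries from the question, ending with "9 you" and "10 ?". If the lookup should go from word to index instead, the assignment can be reversed, at the cost of deciding what to do with repeated words.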

Note: The readline() method, and also "for line in f", returns
lines including the newline character on the end. That doesn't
matter here, because line.split() gets rid of all the whitespace,
but you need to be aware of it if you do other things with
the line. You can use

line = line.strip()

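to remove the newline if you need to. Be aware that strip() also
removes any other leading and trailing whitespace; line.rstrip('\n')
drops only the newline.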
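
A quick illustration of how these string methods treat the trailing newline, as a Python 3 sketch with a made-up sample line:

```python
line = "  How are you?  \n"   # a line as returned by readline() or "for line in f"

stripped = line.strip()        # removes all leading and trailing whitespace
no_newline = line.rstrip("\n") # removes only the trailing newline
words = line.split()           # splits on whitespace, newline included

print(repr(stripped))    # 'How are you?'
print(repr(no_newline))  # '  How are you?  '
print(words)             # ['How', 'are', 'you?']
```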

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Jul 18 '05 #3
