
file.tell() ?

Is this a bug? (file is an open text file):
>>> for i in range(0,5):
...     var = file.next()
...     file.tell()
...
1675L
1675L
1675L
1675L
1675L

I would have thought the value would increase as the position in the
file advances.

When I use readline, it works as I would expect:
>>> for i in range(0,5):
...     var = file.readline()
...     file.tell()
...
18L
31L
53L
67L
85L

The reason I ask is, I have a very large file to parse line by line.
I thought I'd try and use an iterator, but it looks like the iterator
is really reading the entire file into memory before it starts
iterating. So my best option is still to use file.readline().

Am I understanding this correctly? Am I using the iterator
incorrectly?

Thanks,
Chris
Jul 18 '05 #1


Chris McAvoy wrote:
The reason I ask is, I have a very large file to parse line by line.
I thought I'd try and use an iterator, but it looks like the iterator
is really reading the entire file into memory before it starts
iterating. So my best option is still to use file.readline().

Am I understanding this correctly? Am I using the iterator
incorrectly?


The iterating methods of file input read ahead into an internal buffer,
so calling .tell() or mixing in manual reads while iterating is not
going to give the results you expect.

If it's important to you that you have total control over the current
"read pointer" in the file, call .readline manually. If you don't care
and just want to read through everything, use the iterators.
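
For anyone reading this later on Python 3, here is a minimal sketch of the "call .readline manually" approach. It is only an illustration under a few assumptions: the helper name line_offsets and the temporary demo file are made up for this example, and the file is opened in binary mode so that tell() returns plain byte offsets.

```python
import os
import tempfile


def line_offsets(path):
    """Return (start_offset, line) pairs using readline() and tell().

    Hypothetical helper for illustration; binary mode keeps tell()
    values as plain byte offsets.
    """
    results = []
    with open(path, "rb") as f:
        while True:
            start = f.tell()      # position before reading the line
            line = f.readline()   # reads exactly one line
            if not line:          # empty bytes object means end of file
                break
            results.append((start, line))
    return results


# Small demo file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

pairs = line_offsets(demo_path)
for offset, line in pairs:
    print(offset, line)

os.remove(demo_path)
```

Because nothing iterates over the file object here, tell() reports the start of each line rather than the end of a read-ahead block.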

--
__ Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ The average dog is a nicer person than the average person.
-- Andrew A. Rooney
Jul 18 '05 #2

When you use a file as an iterator, it reads a block of several lines
at a time. For a long file, it does not read all the lines at once.
When I wrote xreadlines (for Python 2.1 or 2.2) it was defined in terms of
readlines(SIZEHINT), but I am no longer familiar with the implementation.
I don't think the exact details are documented anywhere, or guaranteed
not to change between releases.

I wrote a small program to read all the lines in /usr/share/dict/words
and keep a record of all the positions returned by tell(). Here are the
results:
[jepler@parrot jepler]$ cat /tmp/mcavoy.py
d = {}
f = file("/usr/share/dict/words")
for l in f:
    d[f.tell()] = None
dk = d.keys()
dk.sort()
print dk
print dk[-1] * 1.0 / len(dk) # Average block size

[jepler@parrot jepler]$ python /tmp/mcavoy.py
[8196L, 16393L, 24596L, 32793L, 40994L, 49186L, 57379L, 65576L,
73776L, 81972L, 90167L, 98362L, 106557L, 114750L, 122945L, 131137L,
139332L, 147532L, 155729L, 163922L, 172119L, 180316L, 188515L,
196710L, 204910L, 213105L, 221306L, 229505L, 237697L, 245896L,
254088L, 262291L, 270486L, 278688L, 286893L, 295092L, 303288L,
311488L, 319687L, 327884L, 336082L, 344277L, 352474L, 360675L,
368869L, 377061L, 385261L, 393459L, 401656L, 409305L]
8186.1

As you can see, my Python reads about 8K at a time, which is a perfectly
reasonable amount on any machine I still use.
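
If you want both the iterator and a per-line position, one option is to count the bytes yourself instead of asking tell(). The sketch below is written for Python 3 and rests on some assumptions: the generator name iter_with_offsets is made up, the file object is binary, and it starts at position 0. An in-memory io.BytesIO stands in for a real file.

```python
import io


def iter_with_offsets(f):
    """Yield (start_offset, line) for a binary file object.

    Hypothetical helper: the offset is tracked by summing line lengths,
    so it does not depend on what tell() reports during iteration.
    Assumes f is positioned at the start of the file.
    """
    offset = 0
    for line in f:
        yield offset, line
        offset += len(line)   # in binary mode, len(line) is a byte count


sample = io.BytesIO(b"one\ntwo\nthree\n")
result = list(iter_with_offsets(sample))
for start, line in result:
    print(start, line)
```

This keeps the memory behaviour of plain iteration (one buffered block at a time) while still giving an offset you could later pass to seek().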

Jeff

Jul 18 '05 #3
