file.tell() ?

Is this a bug? (file is an open text file):

>>> for i in range(0,5):
...     var = file.next()
...     file.tell()
...
1675L
1675L
1675L
1675L
1675L

I would have thought that it would increase as the position in the
file advances.

When I use readline, it works as I would expect:

>>> for i in range(0,5):
...     var = file.readline()
...     file.tell()
...
18L
31L
53L
67L
85L

The reason I ask is, I have a very large file to parse line by line.
I thought I'd try and use an iterator, but it looks like the iterator
is really reading the entire file into memory before it starts
iterating. So my best option is still to use file.readline().

Am I understanding this correctly? Am I using the iterator
incorrectly?

Thanks,
Chris
Jul 18 '05 #1
Chris McAvoy wrote:
> The reason I ask is, I have a very large file to parse line by line.
> I thought I'd try and use an iterator, but it looks like the iterator
> is really reading the entire file into memory before it starts
> iterating. So my best option is still to use file.readline().
>
> Am I understanding this correctly? Am I using the iterator
> incorrectly?


The iterating methods of file input tend to buffer input, so calling
.tell() or trying to read data manually while you are iterating is not
going to work properly: the file position has already moved past the
line you were just handed.

If it's important to you that you have total control over the current
"read pointer" in the file, call .readline manually. If you don't care
and just want to read through everything, use the iterators.
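
As a rough illustration of the "call .readline manually" approach, here is a minimal sketch that records where each line starts and later seeks back to one of them. It is written for a current Python 3 interpreter rather than the Python 2 of this thread, and it opens the file in binary mode so that tell() reports plain byte offsets. The helper names (index_line_offsets, read_line_at) and the small temporary file are made up for the example.

```python
import os
import tempfile


def index_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    # Binary mode keeps tell() in plain byte offsets.
    with open(path, "rb") as f:
        while True:
            pos = f.tell()          # position before the next readline()
            line = f.readline()
            if not line:            # empty result means end of file
                break
            offsets.append(pos)
    return offsets


def read_line_at(path, offset):
    """Seek to a recorded offset and return the line that starts there."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline()


# Small throwaway file so the sketch is self-contained.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"alpha\nbeta\ngamma\n")

demo_offsets = index_line_offsets(demo_path)
third_line = read_line_at(demo_path, demo_offsets[2])
os.remove(demo_path)
```

Because every read goes through readline(), the position reported by tell() always corresponds to the line boundary, which is the "total control" case described above. Only one line is held in memory at a time, so this also suits a very large file.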

--
__ Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ The average dog is a nicer person than the average person.
-- Andrew A. Rooney
Jul 18 '05 #2
When using a file as an iterator, multiple lines are read at a time,
but if you have a long file, not all the lines will be read at once.
When I wrote xreadlines (for Python 2.1 or 2.2) it was defined in terms of
readlines(SIZEHINT), but I am no longer familiar with the implementation.
I don't think the exact details are documented anywhere, or guaranteed
not to change between releases.
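
To make the "defined in terms of readlines(SIZEHINT)" idea concrete, here is a minimal sketch of a line iterator built that way. It is not the actual xreadlines or file-iterator implementation, only an illustration of the general technique. The function name iter_lines_in_blocks is hypothetical, the code targets a current Python 3 interpreter, and an in-memory io.BytesIO stands in for a large file.

```python
import io


def iter_lines_in_blocks(f, sizehint=8192):
    """Yield lines one at a time while reading the file a block at a time."""
    while True:
        # readlines(hint) returns whole lines totalling roughly `hint` bytes.
        block = f.readlines(sizehint)
        if not block:          # empty list means end of file
            break
        for line in block:
            yield line


# In-memory stand-in for a large file: 1000 lines of 10 bytes each.
data = b"".join(b"line %04d\n" % i for i in range(1000))
stream = io.BytesIO(data)

lines = []
positions = set()
for line in iter_lines_in_blocks(stream, sizehint=1000):
    lines.append(line)
    positions.add(stream.tell())   # jumps a block at a time, not per line
```

Every line is still delivered individually, but tell() only changes when a new block is fetched, so many consecutive lines report the same position. At no point is more than about one block of lines held in memory.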

I wrote a small program to read all the lines in /usr/share/dict/words
and keep a record of all the positions returned by tell(). Here are the
results:
[jepler@parrot jepler]$ cat /tmp/mcavoy.py
d = {}
f = file("/usr/share/dict/words")
for l in f:
    # tell() reports how far the buffered reader has got, not where l ended
    d[f.tell()] = None
dk = d.keys()
dk.sort()
print dk
print dk[-1] * 1.0 / len(dk) # Average block size

[jepler@parrot jepler]$ python /tmp/mcavoy.py
[8196L, 16393L, 24596L, 32793L, 40994L, 49186L, 57379L, 65576L,
73776L, 81972L, 90167L, 98362L, 106557L, 114750L, 122945L, 131137L,
139332L, 147532L, 155729L, 163922L, 172119L, 180316L, 188515L,
196710L, 204910L, 213105L, 221306L, 229505L, 237697L, 245896L,
254088L, 262291L, 270486L, 278688L, 286893L, 295092L, 303288L,
311488L, 319687L, 327884L, 336082L, 344277L, 352474L, 360675L,
368869L, 377061L, 385261L, 393459L, 401656L, 409305L]
8186.1

As you can see, my Python reads about 8K at a time, which is a perfectly
reasonable amount on any machine I still use.

Jeff

Jul 18 '05 #3