file position *tell()* works different

Hi all,
I'm working under W2k with
Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32

I have a file *test_data.txt* with the following content:
0123456789
0123456789
abcdefghi
ABCDEFGHIJKLMNOPQ

and I work on it with the following python script:

# Open NOT in binary mode
fp=file('test_data.txt','r')
a='xx'
while a:
    print 'Filepointer: %3d' % fp.tell()
    a=fp.readline()
fp.close()

print

# Open IN binary mode
fp=file('test_data.txt','r+b')
a='xx'
while a:
    print 'Filepointer: %3d' % fp.tell()
    a=fp.readline()
fp.close()

Now, when test_data.txt is saved in PC mode with 0xD, 0xA as newline,
it works correctly.
But when I save the file in Unix mode with 0xA as newline,
my script gives me the following output, where the first block,
from the file not opened in binary mode, is wrong:
Filepointer: 0
Filepointer: 7
Filepointer: 19
Filepointer: 30
Filepointer: 49
Filepointer: 51

Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50
Filepointer: 51

When I try this under HP-UX it works fine in both cases.
So I wonder if the function *tell()* is not correctly implemented under win32.

Regards
Peter
Jul 18 '05 #1
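
For anyone who wants to reproduce the binary-mode half of this on a current interpreter, here is a minimal sketch. It assumes Python 3 (the thread itself uses Python 2.2/2.3), and the helper names `write_test_file` and `binary_offsets` are made up for illustration. It writes the four test lines with a chosen newline and records tell() before each readline(), the same loop as in the post.

```python
import os
import tempfile

# The four lines of test_data.txt, without their line terminators.
LINES = [b"0123456789", b"0123456789", b"abcdefghi", b"ABCDEFGHIJKLMNOPQ"]


def write_test_file(directory, newline=b"\n"):
    """Write the test lines to directory/test_data.txt using the given newline."""
    path = os.path.join(directory, "test_data.txt")
    with open(path, "wb") as fp:
        for line in LINES:
            fp.write(line + newline)
    return path


def binary_offsets(path):
    """Return the tell() value seen before each readline() in binary mode."""
    offsets = []
    with open(path, "rb") as fp:
        line = b"xx"
        while line:
            offsets.append(fp.tell())
            line = fp.readline()
    return offsets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(binary_offsets(write_test_file(tmp)))           # Unix newlines
        print(binary_offsets(write_test_file(tmp, b"\r\n")))  # PC newlines
```

In binary mode the values are plain byte counts, so they should differ between the two files only by the extra 0xD per line.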
I'm not sure if that's the reason, but the binary mode for reading is 'rb'.
Actually, the order of 'r' and 'b' shouldn't matter. But the '+' has a different
meaning: the file should allow "opposite" access as well, e.g. 'r+', 'rb+'
means that you can write to the file too, while 'w+' means: open it for
writing but permit reading too. You can try to say 'rt' for the read/text mode.

Anyway, your program works under Windows XP/Python 2.3 as expected:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Home\Programming\Python\2>py
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
^Z

C:\Home\Programming\Python\2>test.py
Filepointer: 0
Filepointer: 12
Filepointer: 24
Filepointer: 35
Filepointer: 54

Filepointer: 0
Filepointer: 12
Filepointer: 24
Filepointer: 35
Filepointer: 54

I'm sorry if it doesn't help. The bug must be somewhere else then.

M-a-S
Jul 18 '05 #2
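
The mode strings discussed above can be checked directly. This is a small sketch assuming Python 3, where open() replaces the file() builtin used in the thread; `describe_mode` is a hypothetical helper, not part of any library. It only uses modes that start with 'r', so the file is never truncated.

```python
import os
import tempfile


def describe_mode(path, mode):
    """Open path with the given mode and report what the file object allows."""
    with open(path, mode) as fp:
        return {
            # read(0) returns b"" for binary files and "" for text files.
            "binary": isinstance(fp.read(0), bytes),
            "readable": fp.readable(),
            "writable": fp.writable(),
        }


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"0123456789\n")
    os.close(fd)
    try:
        for mode in ("r", "rt", "rb", "r+b", "rb+"):
            print(mode, describe_mode(path, mode))
    finally:
        os.remove(path)
```

The point of the comparison is that 'r+b' and 'rb+' describe the same thing (binary, readable and writable), while plain 'rb' is read-only.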
M-a-S wrote:
I'm not sure if that's the reason, but the binary mode for reading is 'rb'.
Actually, the order of 'r' and 'b' shouldn't matter. But the '+' has a different
meaning: the file should allow "opposite" access as well, e.g. 'r+', 'rb+'
means that you can write to the file too, while 'w+' means: open it for
writing but permit reading too. You can try to say 'rt' for the read/text mode.

Sorry, but no you can't: the default is to open the file in text mode, and you
can change it with a 'b', but 't' has no meaning at all. BTW, 'b' also has no
meaning at all on all Unices: the so-called "binary" and "text" modes are the
same, i.e. what is read is what is in the file. Windows needs it only because of
its superfluous \r's at the end of each line.

I do confirm the methods tell and/or seek are broken on Win2K when you open the
file in text mode: doing myFile.seek(myFile.tell()) is not a no-op. The "ghost"
\r's at the end of lines seem to be taken into account by one of the methods and
not by the other one. The problem also happens on Win98.

I don't know if it's a Python bug or a bug in the underlying C API. Knowing
Windows, and considering the Python wrapper must be quite trivial, I'd bet on
the C API...

The only workaround I found was to always open the files in binary mode, and
explicitly ignore the \r's.

HTH
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

Jul 18 '05 #3
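
The workaround Eric describes (open in binary mode, drop the \r's by hand) could look roughly like this. It is a sketch under assumptions: Python 3, ASCII content, and the helper names `strip_eol`, `read_lines_with_offsets` and `line_at` are invented here. Because the file is opened in binary mode, the offsets are plain byte counts and can be passed back to seek().

```python
import os
import tempfile


def strip_eol(raw):
    """Remove one trailing LF and then one trailing CR, if present."""
    if raw.endswith(b"\n"):
        raw = raw[:-1]
    if raw.endswith(b"\r"):
        raw = raw[:-1]
    return raw


def read_lines_with_offsets(path):
    """Return a list of (byte_offset, text) pairs, one per line."""
    result = []
    with open(path, "rb") as fp:
        while True:
            offset = fp.tell()
            raw = fp.readline()
            if not raw:
                break
            result.append((offset, strip_eol(raw).decode("ascii")))
    return result


def line_at(path, offset):
    """Seek to a previously recorded offset and return the line found there."""
    with open(path, "rb") as fp:
        fp.seek(offset)
        return strip_eol(fp.readline()).decode("ascii")


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"one\r\ntwo\nthree\r\n")  # mixed PC and Unix endings
    os.close(fd)
    try:
        for offset, text in read_lines_with_offsets(path):
            print(offset, repr(text), repr(line_at(path, offset)))
    finally:
        os.remove(path)
```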

[Peter]
I wonder if the function *tell()* is not correctly implemented under win32.
[M-a-S]
Anyway, your program works under Windows XP/Python 2.3 as expected:

M-a-S, are you sure you saved test_data.txt with Unix line endings? I
tested Peter's script under WinXP/Python2.3 as well, and it failed as
expected (though with slightly different results):
pythonw -u peter.py

Filepointer: 0
Filepointer: 8
Filepointer: 20
Filepointer: 31
Filepointer: 50

Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50

--
Richie Hindle
ri****@entrian.com
Jul 18 '05 #4

[Peter]
I wonder if the function *tell()* is not correctly implemented under win32.
[Tim, quoting the standard]
For a text stream, its file position indicator contains
unspecified information, usable by the fseek function for returning
the file position indicator for the stream to its position at the
time of the ftell call


It still doesn't seem to work as specified:

------------------------------ peter.py ------------------------------

# Open the file in text mode, read a line, and store the position.
fp = file('test_data.txt', 'rt')
line = fp.readline()
storedPosition = fp.tell()
print 'Line: %r, file pointer after read: %d' % (line, storedPosition)

# Read some more and print it.
print 'Read another line from this position: %r' % fp.readline()

# Now seek back and read the same line again.
fp.seek(storedPosition)
print 'Another read from the same position: %r' % fp.readline()

----------------------------------------------------------------------

This prints:

Line: '0123456789\n', file pointer after read: 8
Read another line from this position: '0123456789\n'
Another read from the same position: '89\n'

I'd expect doing readline/tell/readline/seek/readline to read the same
line the second two times. And however you implement tell and seek, a
tell value of 8 after reading 11 bytes looks pretty weird.

I'd write the same code in C if I had the time, so at least we could be
*sure* we can blame Microsoft. 8-)

--
Richie Hindle
ri****@entrian.com
Jul 18 '05 #5
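
The same readline/tell/readline/seek/readline check can be written as a function that returns the two reads for comparison. This sketch assumes Python 3, whose text layer handles newlines itself and does not go through the C runtime the thread is about, so it shows the expected behaviour and does not reproduce the 2003 failure. The name `seek_tell_round_trip` is made up for illustration.

```python
import os
import tempfile


def seek_tell_round_trip(path):
    """Read a line, store tell(), read again, seek back, and read once more.

    Returns the two lines read from the stored position; they should be equal.
    """
    with open(path, "r", encoding="ascii") as fp:
        fp.readline()
        stored_position = fp.tell()  # opaque value in text mode, only good for seek()
        first_read = fp.readline()
        fp.seek(stored_position)
        second_read = fp.readline()
    return first_read, second_read


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"0123456789\n0123456789\nabcdefghi\nABCDEFGHIJKLMNOPQ\n")
    os.close(fd)
    try:
        first, second = seek_tell_round_trip(path)
        print("Round trip OK" if first == second else "Round trip FAILED")
    finally:
        os.remove(path)
```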
Richie Hindle <ri****@entrian.com> wrote in message news:<ma**********************************@python.org>...
[Peter]
I wonder if the function *tell()* is not correctly implemented under win32.
[M-a-S]
Anyway, your program works under Windows XP/Python 2.3 as expected:
[Richie]
M-a-S, are you sure you saved test_data.txt with Unix line endings?

This is exactly the point.
It works under win32 with PC line endings, but not with Unix line endings.
The workaround is to open the file in binary mode. I know there are differences
between the open modes 'rb', 'r+b' ... but that's not the problem. Both
work fine.

[Richie]
I tested Peter's script under WinXP/Python2.3 as well, and it failed as
expected (though with slightly different results):
pythonw -u peter.py

Filepointer: 0
Filepointer: 8

A file position of *8* here doesn't make any sense to me.
The line has 10 chars, 0123456789, plus one newline, which makes 11
if the newline is only a 0xA and 12 if the newline is 0xD, 0xA. So let's suppose
that *file(file_name).readline()* reads until it detects a 0xA and then subtracts
an OS-dependent number of bytes, namely one for a Unix newline and two for a PC newline.
Then it would result in 10 or 9 but never in 8. I think it must be a bug.

[Richie]
Filepointer: 20
Filepointer: 31
Filepointer: 50

Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50

When the file is opened in binary mode, the above output shows that Python
does the right thing.

Peter
Jul 18 '05 #6
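
Peter's arithmetic (10 characters plus a 1-byte or 2-byte newline) can be spelled out for the whole file. A minimal sketch in Python 3; `expected_offsets` is a hypothetical helper, and the line contents are taken from test_data.txt as shown in the first post.

```python
# The four lines of test_data.txt, without their line terminators.
LINES = ["0123456789", "0123456789", "abcdefghi", "ABCDEFGHIJKLMNOPQ"]


def expected_offsets(lines, newline_length):
    """Return the byte offset at the start of the file and after each line."""
    offsets = [0]
    for line in lines:
        offsets.append(offsets[-1] + len(line) + newline_length)
    return offsets


if __name__ == "__main__":
    print(expected_offsets(LINES, 1))  # 0xA only:  [0, 11, 22, 32, 50]
    print(expected_offsets(LINES, 2))  # 0xD, 0xA:  [0, 12, 24, 35, 54]
```

These are the values the binary-mode runs in the thread report, and none of them is 8.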

"Eric Brunel" <er*********@pragmadev.N0SP4M.com> wrote in message news:bk**********@news-reader2.wanadoo.fr...
Sorry, but no you can't: the default is to open the file in text mode, and you
can change it with a 'b', but 't' has no meaning at all. BTW, 'b' also has no
meaning at all on all Unices: the so-called "binary" or "text" mode are the
same, i.e. what is read is what is in the file. Windows needs it only because of
its superfluous \r's at the end of each line.
<...>
HTH
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

The idea was to tell the humans that it's text. It won't hurt either Unix or Windows.
I know that nobody cares, though.

M-a-S
Jul 18 '05 #7

"Richie Hindle" <ri****@entrian.com> wrote in message news:ma**********************************@python.o rg...

M-a-S, are you sure you saved test_data.txt with Unix line endings? I
tested Peter's script under WinXP/Python2.3 as well, and it failed as
expected (though with slightly different results):
<......>
--
Richie Hindle
ri****@entrian.com


Oops! Stupid me! With '\n' it behaves really weird. Same values as yours.

Filepointer: 0
Filepointer: 8 (-3 off the right value - M-a-S)
Filepointer: 20 (-2)
Filepointer: 31 (-1)
Filepointer: 50 (-0)

Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50

For the file '0123456789\n'*12 it prints:

Filepointer: 0
Filepointer: 0 (-11 off the right value)
Filepointer: 12 (-10)
Filepointer: 24 (-9)
Filepointer: 36 (-8)
Filepointer: 48 (-7)
Filepointer: 60 (-6)
Filepointer: 72 (-5)
Filepointer: 84 (-4)
Filepointer: 96 (-3)
Filepointer: 108 (-2)
Filepointer: 120 (-1)
Filepointer: 132 (-0)

Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 33
Filepointer: 44
Filepointer: 55
Filepointer: 66
Filepointer: 77
Filepointer: 88
Filepointer: 99
Filepointer: 110
Filepointer: 121
Filepointer: 132

If I add another line, it breaks:

Filepointer: 0
Traceback (most recent call last):
  File "C:\Home\Programming\Python\t\t.py", line 5, in ?
    print 'Filepointer:%4d' % fp.tell()
IOError: (0, 'Error')

M-a-S
Jul 18 '05 #8
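
One way to read M-a-S's numbers, offered as a guess and not something the thread confirms: the text-mode value looks like the true byte offset minus the number of newlines that have not been read yet, as if every remaining 0xA were assumed to have been a 0xD, 0xA pair. The sketch below (Python 3, hypothetical function name `modeled_text_tell`, whole file assumed to fit in one read buffer) only models that arithmetic; it does not call any Windows API.

```python
def modeled_text_tell(line_lengths):
    """Model: reported position = true offset - number of unread newlines.

    line_lengths holds the length of each line without its single 0xA
    terminator. The first entry of the result is the position before any
    read, which the thread always shows as 0.
    """
    total = len(line_lengths)
    reported = [0]
    true_offset = 0
    for lines_read, length in enumerate(line_lengths, start=1):
        true_offset += length + 1            # line plus its 0xA
        unread_newlines = total - lines_read
        reported.append(true_offset - unread_newlines)
    return reported


if __name__ == "__main__":
    print(modeled_text_tell([10, 10, 9, 17]))  # the four-line test file
    print(modeled_text_tell([10] * 12))        # '0123456789\n' * 12
    print(modeled_text_tell([10] * 13)[1])     # first value after adding a line
```

Under this model the four-line file gives 0, 8, 20, 31, 50 and the twelve-line file gives 0, 0, 12, ... 132, which matches the listings above. With a thirteenth line the first computed value is -1, which may be related to the IOError, though that is speculation. Peter's original W2k numbers also fit if his file ended with one extra empty line, which is likewise an assumption.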
M-a-S wrote:
The idea was to tell the humans that it's text. It won't hurt either Unix or Windows.
I know that nobody cares, though.


Not only to tell humans: Windows automatically removes the '\r' at the end of
each line when a file is opened in text mode. It won't happen in binary mode.

And I wish I could stop caring, but I occasionally run into problems just
because of this behaviour, and I know I'm not the only one.
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

Jul 18 '05 #9
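
The difference Eric points out, '\r' removed in text mode but kept in binary mode, can be seen by reading the same bytes both ways. A sketch assuming Python 3, where the translation is done by Python's own text layer on every platform and not by the Windows C runtime as in the thread; `read_both_ways` is a made-up name.

```python
import os
import tempfile


def read_both_ways(path):
    """Return the file content as read in text mode and in binary mode."""
    # Default newline handling in text mode translates "\r\n" to "\n".
    with open(path, "r", encoding="ascii") as fp:
        as_text = fp.read()
    # Binary mode returns the bytes exactly as stored.
    with open(path, "rb") as fp:
        as_binary = fp.read()
    return as_text, as_binary


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"0123456789\r\nabcdefghi\r\n")
    os.close(fd)
    try:
        text, raw = read_both_ways(path)
        print(repr(text))
        print(repr(raw))
    finally:
        os.remove(path)
```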
