Hi all,
I'm working under W2k with
Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
I have a file *test_data.txt* with the following content:
0123456789
0123456789
abcdefghi
ABCDEFGHIJKLMNOPQ
and I work on it with the following python script:
# Open NOT in binary mode
fp=file('test_data.txt','r')
a='xx'
while a:
    print 'Filepointer: %3d' % fp.tell()
    a=fp.readline()
fp.close()
print
# Open IN binary mode
fp=file('test_data.txt','r+b')
a='xx'
while a:
    print 'Filepointer: %3d' % fp.tell()
    a=fp.readline()
fp.close()
Now, when test_data.txt is saved in PC mode with 0xD, 0xA (CR LF) as newline,
it works correctly.
But when I save the file in UNIX mode with 0xA (LF) as newline,
my script gives me the following output, where the first block,
from the file not opened in binary mode, is wrong:
Filepointer: 0
Filepointer: 7
Filepointer: 19
Filepointer: 30
Filepointer: 49
Filepointer: 51
Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50
Filepointer: 51
When I try this under HP-UX it works fine in both cases.
So I wonder if the function *tell()* is not correctly implemented under win32.
Regards
Peter
I'm not sure if that's the reason, but the binary mode for reading is 'rb'.
Actually, the order of 'r' and 'b' shouldn't matter. But the '+' has a different
meaning: the file should allow "opposite" access as well, e.g. 'r+', 'rb+'
means that you can write to the file too, while 'w+' means: open it for
writing but permit reading too. You can try to say 'rt' for the read/text mode.
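To make the difference between these mode strings concrete, here is a minimal sketch. It is written for a current Python 3 (open() and bytes literals), not the Python 2.2 used in this thread, and the helper name demo_modes and the scratch file are made up for illustration.

```python
import os
import tempfile


def demo_modes():
    """Show what 'rb', 'r+b' and 'w+b' allow on a scratch file (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    results = {}
    try:
        with open(path, 'wb') as fp:
            fp.write(b'0123456789\n')

        # 'rb': read-only, binary. An attempt to write is refused.
        with open(path, 'rb') as fp:
            try:
                fp.write(b'x')
                results['rb_write'] = 'allowed'
            except OSError:
                results['rb_write'] = 'refused'

        # 'r+b': the existing content is kept, and writing is allowed too.
        with open(path, 'r+b') as fp:
            fp.write(b'AB')  # overwrites the first two bytes
            fp.seek(0)
            results['r+b'] = fp.read()

        # 'w+b': the file is truncated on open, then readable and writable.
        with open(path, 'w+b') as fp:
            results['w+b_initial'] = fp.read()
            fp.write(b'new\n')
            fp.seek(0)
            results['w+b_after_write'] = fp.read()
    finally:
        os.remove(path)
    return results


print(demo_modes())
```

The point for this thread is that 'rb' and 'r+b' differ only in whether writing is permitted; both read the bytes untranslated.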
Anyway, your program works under Windows XP/Python 2.3 as expected:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Home\Programming\Python\2>py
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information. ^Z
C:\Home\Programming\Python\2>test.py
Filepointer: 0
Filepointer: 12
Filepointer: 24
Filepointer: 35
Filepointer: 54
Filepointer: 0
Filepointer: 12
Filepointer: 24
Filepointer: 35
Filepointer: 54
I'm sorry if it doesn't help. The bug must be somewhere else then.
M-a-S
M-a-S wrote: I'm not sure if that's the reason, but the binary mode for reading is 'rb'. Actually, the order of 'r' and 'b' shouldn't matter. But the '+' has a different meaning: the file should allow "opposite" access as well, e.g. 'r+', 'rb+' means that you can write to the file too, while 'w+' means: open it for writing but permit reading too. You can try to say 'rt' for the read/text mode.
Sorry, but no you can't: the default is to open the file in text mode, and you
can change it with a 'b', but 't' has no meaning at all. BTW, 'b' also has no
meaning at all on all Unices: the so-called "binary" or "text" mode are the
same, i.e. what is read is what is in the file. Windows needs it only because of
its superfluous \r's at the end of each line.
I do confirm the methods tell and/or seek are broken on Win2K when you open the
file in text mode: doing myFile.seek(myFile.tell()) is not a no-op. The "ghost"
\r's at the end of lines seem to be taken into account by one of the methods and
not by the other one. The problem also happens on Win98.
I don't know if it's a Python bug or a bug in the underlying C API. Knowing
Windows, and considering the Python wrapper must be quite trivial, I'd bet on
the C API...
The only workaround I found was to always open the files in binary mode, and
explicitly ignore the \r's.
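A minimal sketch of that workaround, assuming a current Python 3 and ASCII content. The function name read_lines_with_offsets is made up for illustration; the idea is only to read in binary mode, so that tell() returns real byte offsets, and to strip the line terminators by hand.

```python
import os
import tempfile


def read_lines_with_offsets(path):
    """Return (offset, text) pairs read in binary mode.

    Offsets are real byte positions in the file. Trailing CR and LF
    bytes are stripped by hand. ASCII content is an assumption.
    """
    pairs = []
    with open(path, 'rb') as fp:
        while True:
            offset = fp.tell()
            raw = fp.readline()
            if not raw:
                break
            pairs.append((offset, raw.rstrip(b'\r\n').decode('ascii')))
    return pairs


if __name__ == '__main__':
    # Scratch file with mixed line endings, purely for demonstration.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as fp:
            fp.write(b'0123456789\n0123456789\r\nabcdefghi\n')
        for offset, text in read_lines_with_offsets(path):
            print('Filepointer: %3d  %s' % (offset, text))
    finally:
        os.remove(path)
```

Offsets collected this way can be passed back to seek() on a file opened in binary mode without surprises, whichever line endings the file uses.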
HTH
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com
[Peter] I wonder if the function *tell()* is not correctly implemented under win32.
[M-a-S] Anyway, your program works under Windows XP/Python 2.3 as expected:
M-a-S, are you sure you saved test_data.txt with Unix line endings? I
tested Peter's script under WinXP/Python2.3 as well, and it failed as
expected (though with slightly different results):
pythonw -u peter.py
Filepointer: 0
Filepointer: 8
Filepointer: 20
Filepointer: 31
Filepointer: 50
Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50
--
Richie Hindle ri****@entrian.com
[Peter] I wonder if the function *tell()* is not correctly implemented under win32.
[Tim, quoting the standard] For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call
It still doesn't seem to work as specified:
------------------------------ peter.py ------------------------------
# Open the file in text mode, read a line, and store the position.
fp = file('test_data.txt', 'rt')
line = fp.readline()
storedPosition = fp.tell()
print 'Line: %r, file pointer after read: %d' % (line, storedPosition)
# Read some more and print it.
print 'Read another line from this position: %r' % fp.readline()
# Now seek back and read the same line again.
fp.seek(storedPosition)
print 'Another read from the same position: %r' % fp.readline()
----------------------------------------------------------------------
This prints:
Line: '0123456789\n', file pointer after read: 8
Read another line from this position: '0123456789\n'
Another read from the same position: '89\n'
I'd expect doing readline/tell/readline/seek/readline to read the same
line the second two times. And however you implement tell and seek, a
tell value of 8 after reading 11 bytes looks pretty weird.
I'd write the same code in C if I had the time, so at least we could be
*sure* we can blame Microsoft. 8-)
--
Richie Hindle ri****@entrian.com
Richie Hindle <ri****@entrian.com> wrote in message news:<ma**********************************@python.org>...
[Peter] I wonder if the function *tell()* is not correctly implemented under win32.
[M-a-S] Anyway, your program works under Windows XP/Python 2.3 as expected:
[Richie] M-a-S, are you sure you saved test_data.txt with Unix line endings?
This is exactly the point.
It works under win32 with PC line endings, but not with Unix line endings.
The workaround is to open the file in binary mode. I know there are differences
between the open modes 'rb', 'r+b' ... but that's not the problem. Both
work fine.
[Richie] I tested Peter's script under WinXP/Python2.3 as well, and it failed as expected (though with slightly different results):
pythonw -u peter.py
Filepointer: 0
Filepointer: 8
It doesn't make any sense to me that the file position is *8* here.
The line has 10 chars, 0123456789, plus one newline, which makes 11
if the newline is only a 0xa and 12 if the newline is 0xd, 0xa. So let's suppose
that *file(file_name).readline()* reads until it detects a 0xa and then subtracts
an OS-dependent number of bytes, namely one for a UNIX newline and two for a PC newline.
Then it would result in 10 or 9, but never in 8. It doesn't make any sense to
me. I think it must be a bug.
[Richie]
Filepointer: 20
Filepointer: 31
Filepointer: 50
Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50
When the file is opened in binary mode, the above output shows that Python
does the right thing.
Peter
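The arithmetic behind those expectations can be checked directly. This is a small sketch in current Python 3 syntax; expected_offsets is a made-up helper name, and the four lines are the content of test_data.txt from the first post.

```python
def expected_offsets(lines, newline_len):
    """Byte offset at the start of each line, followed by the offset at end of file."""
    offsets = [0]
    for text in lines:
        offsets.append(offsets[-1] + len(text) + newline_len)
    return offsets


# Content of test_data.txt as given in the first post.
LINES = ['0123456789', '0123456789', 'abcdefghi', 'ABCDEFGHIJKLMNOPQ']

print(expected_offsets(LINES, 1))  # LF only:  [0, 11, 22, 32, 50]
print(expected_offsets(LINES, 2))  # CR LF:    [0, 12, 24, 35, 54]
```

The LF values match the binary-mode output in the thread, and the CR LF values match the output M-a-S first posted for a file saved with PC line endings. None of the text-mode values for the LF file (8, 20, 31) fall on a line boundary under either assumption.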
"Eric Brunel" <er*********@pragmadev.N0SP4M.com> wrote in message news:bk**********@news-reader2.wanadoo.fr... Sorry, but no you can't: the default is to open the file in text mode, and you can change it with a 'b', but 't' has no meaning at all. BTW, 'b' also has no meaning at all on all Unices: the so-called "binary" or "text" mode are the same, i.e. what is read is what is in the file. Windows needs it only because of its superfluous \r's at the end of each line. <...> HTH -- - Eric Brunel <eric dot brunel at pragmadev dot com> - PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com
The idea was to tell the humans that it's text. It won't hurt either Unix or Windows.
I know that nobody cares, though.
M-a-S
"Richie Hindle" <ri****@entrian.com> wrote in message news:ma**********************************@python.org... M-a-S, are you sure you saved test_data.txt with Unix line endings? I tested Peter's script under WinXP/Python2.3 as well, and it failed as expected (though with slightly different results): <......> -- Richie Hindle ri****@entrian.com
Oops! Stupid me! With '\n' it behaves really weird. Same values as yours.
Filepointer: 0
Filepointer: 8 (-3 off the right value - M-a-S)
Filepointer: 20 (-2)
Filepointer: 31 (-1)
Filepointer: 50 (-0)
Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 32
Filepointer: 50
For the file '0123456789\n'*12 it prints:
Filepointer: 0
Filepointer: 0 (-11 off the right value)
Filepointer: 12 (-10)
Filepointer: 24 (-9)
Filepointer: 36 (-8)
Filepointer: 48 (-7)
Filepointer: 60 (-6)
Filepointer: 72 (-5)
Filepointer: 84 (-4)
Filepointer: 96 (-3)
Filepointer: 108 (-2)
Filepointer: 120 (-1)
Filepointer: 132 (-0)
Filepointer: 0
Filepointer: 11
Filepointer: 22
Filepointer: 33
Filepointer: 44
Filepointer: 55
Filepointer: 66
Filepointer: 77
Filepointer: 88
Filepointer: 99
Filepointer: 110
Filepointer: 121
Filepointer: 132
If I add another line, it breaks:
Filepointer: 0
Traceback (most recent call last):
File "C:\Home\Programming\Python\t\t.py", line 5, in ?
print 'Filepointer:%4d' % fp.tell()
IOError: (0, 'Error')
M-a-S
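Looking only at the numbers posted above, the text-mode values fit a simple pattern: after the first read, the reported position equals the true byte offset minus the number of lines that have not been read yet. The sketch below just reproduces that arithmetic (current Python 3 syntax; modelled_text_tell is a hypothetical name). It is an observation about the posted output, assuming the whole file was read into one buffer, and not a statement about how the C runtime actually computes the value.

```python
def modelled_text_tell(line_lengths):
    """Reproduce the text-mode tell() values seen in the thread.

    line_lengths: byte length of each line, including its LF terminator.
    Returns the value before the first read (0), then one value per line
    read: the true offset minus the number of lines still unread.
    This is a model fitted to the posted output, nothing more.
    """
    reported = [0]
    true_pos = 0
    total = len(line_lengths)
    for lines_read, length in enumerate(line_lengths, start=1):
        true_pos += length
        reported.append(true_pos - (total - lines_read))
    return reported


# The four-line test_data.txt with LF endings.
print(modelled_text_tell([11, 11, 10, 18]))  # [0, 8, 20, 31, 50]

# The file '0123456789\n' * 12.
print(modelled_text_tell([11] * 12))

# One more line: the value after the first read goes negative.
print(modelled_text_tell([11] * 13)[1])  # -1
```

For the 13-line file the same arithmetic gives -1 after the first read, which is at least consistent with tell() failing at that point, though nothing in the thread confirms the cause.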
M-a-S wrote: The idea was to tell the humans that it's text. It won't hurt either Unix or Windows. I know that nobody cares, though.
Not only to tell humans: Windows automatically removes the '\r' at the end of
each line when a file is opened in text mode. It won't happen in binary mode.
And I wish I could stop caring, but I occasionally run into problems just
because of this behaviour, and I know I'm not the only one.
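The text/binary difference described above can be seen on any platform with a current Python 3, with one caveat: there the newline translation is done by Python's own I/O layer rather than by the C runtime, so this sketch illustrates the effect and does not reproduce the Python 2.x behaviour discussed in the thread. The helper name read_both_ways is made up.

```python
import os
import tempfile


def read_both_ways(data):
    """Write data to a scratch file, then read it back in binary and in text mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as fp:
            fp.write(data)
        # Binary mode: the bytes come back exactly as stored.
        with open(path, 'rb') as fp:
            as_binary = fp.read()
        # Text mode with default newline handling: CR LF is returned as LF.
        with open(path, 'r', encoding='ascii') as fp:
            as_text = fp.read()
    finally:
        os.remove(path)
    return as_binary, as_text


print(read_both_ways(b'one\r\ntwo\r\n'))
```

In text mode the two characters of each CR LF pair come back as a single '\n', so string lengths no longer correspond to byte offsets in the file. That mismatch is why positions and text mode mix badly.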
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com