File Object behavior


When I open a csv or txt file with:

infile = open(sys.argv[1],'rb').readlines()
or
infile = open(sys.argv[1],'rb').read()

and then look at the first few lines of the file, there is a carriage return +
line feed at the end of each line - \r\n
This is fine and somewhat expected. My problem comes from then writing
infile out to a new file with:

outfile = open(sys.argv[2],'w')
outfile.writelines(infile)
outfile.close()

at which point an additional carriage return is inserted at the end of each
line - \r\r\n
The same behavior occurs with outfile.write(infile) also. I am doing no
processing between reading the input and writing to the output.
Is this expected behavior? The file.writelines() documentation says that it
doesn't add line separators. Is adding a carriage return something different?
At this point I have to filter out the additional carriage return, which
seems like extra and unnecessary effort.
I am using Python 2.4 on Windows XP SP2.
Can anybody help me understand this situation?

Thanks
--
View this message in context: http://www.nabble.com/File-Object-be....html#a9821538
Sent from the Python - python-list mailing list archive at Nabble.com.

Apr 3 '07 #1
Michael Castleton wrote:
> When I open a csv or txt file with:
>
> infile = open(sys.argv[1],'rb').readlines()
> or
> infile = open(sys.argv[1],'rb').read()
>
> and then look at the first few lines of the file there is a carriage return +
> line feed at the end of each line - \r\n
> This is fine and somewhat expected. My problem comes from then writing
> infile out to a new file with:
>
> outfile = open(sys.argv[2],'w')
> outfile.writelines(infile)
> outfile.close()
>
> at which point an additional carriage return is inserted to the end of each
> line - \r\r\n
Maybe because you're reading the file as binary ('rb') but writing it as
text ('w')::
    >>> open('temp.txt', 'w').write('hello\r\n')
    >>> open('temp.txt', 'rb').read()
    'hello\r\r\n'
    >>> open('temp.txt', 'wb').write('hello\r\n')
    >>> open('temp.txt', 'rb').read()
    'hello\r\n'
    >>> open('temp.txt', 'w').write('hello\r\n')
    >>> open('temp.txt', 'r').read()
    'hello\r\n'

Looks like if you match your writes and reads everything works out fine.

STeVe
Apr 3 '07 #2
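For readers who want to reproduce the session above away from a Windows box, here is a minimal sketch. It assumes Python 3 (the thread itself is about Python 2.4), where passing newline='\r\n' to open() applies the same '\n' to '\r\n' translation that Windows text mode performs on write. The temporary path is only for illustration.

```python
import os
import tempfile

# Assumption: Python 3. newline='\r\n' forces the translation that Windows
# text mode applies on write, so the result is the same on any platform.
path = os.path.join(tempfile.mkdtemp(), "temp.txt")

# Text-mode write: every '\n' becomes '\r\n', so an existing '\r' is doubled.
with open(path, "w", newline="\r\n") as f:
    f.write("hello\r\n")
with open(path, "rb") as f:
    mismatched = f.read()

# Binary-mode write: the bytes go to disk untouched.
with open(path, "wb") as f:
    f.write(b"hello\r\n")
with open(path, "rb") as f:
    matched = f.read()

print(repr(mismatched))  # b'hello\r\r\n'
print(repr(matched))     # b'hello\r\n'
```

The first result shows the doubled carriage return from the original question; the second shows that a binary write leaves the data alone.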
On Apr 3, 12:02 pm, Michael Castleton <fatuhe...@yahoo.com> wrote:
> The file.writelines() documentation says that it
> doesn't add line separators. Is adding a carriage return something
> different?
No.
> Is this expected behavior?
According to Python in a Nutshell (p. 217), it is. On Windows, in text
mode, when you write a \n to a file, the \n is converted to the
system-specific newline (which is specified in os.linesep). For Windows, a
newline is \r\n. Conversely, on Windows, in text mode, when you read
a \r\n newline from a file, it is converted to a \n.

Apr 3 '07 #3
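The read side of that translation can be checked with a small sketch. It assumes Python 3, where default text mode (newline=None) converts '\r\n' to '\n' on read on every platform, not only on Windows. The file name is made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "crlf.txt")
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# Default text mode (newline=None) turns '\r\n' into '\n' on read.
with open(path, "r") as f:
    text_lines = f.readlines()

# Binary mode returns the bytes exactly as stored.
with open(path, "rb") as f:
    raw_lines = f.readlines()

print(text_lines)        # ['one\n', 'two\n']
print(raw_lines)         # [b'one\r\n', b'two\r\n']
print(repr(os.linesep))  # platform dependent
```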
On Apr 3, 12:26 pm, "7stud" <bbxx789_0...@yahoo.com> wrote:
>> The file.writelines() documentation says that it
>> doesn't add line separators. Is adding a carriage return something
>> different?
> No.
>> Is this expected behavior?
> According to Python in a Nutshell (p. 217), it is. On Windows, in text
> mode, when you write a \n to a file, the \n is converted to the
> system-specific newline (which is specified in os.linesep). For Windows, a
> newline is \r\n. Conversely, on Windows, in text mode, when you read
> a \r\n newline from a file, it is converted to a \n.

I forgot to add that when you read or write in binary mode, no
conversion takes place. So, if you read \r\n from the file, your
input will contain the \r\n; and if you write \r\n to the file, then
the file will contain \r\n.

Apr 3 '07 #4
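Applied to the original script, matching the modes could look roughly like the sketch below. The helper name copy_unchanged and the sample file names are hypothetical; the original used sys.argv[1] and sys.argv[2] for the two paths. Written for Python 3, though 'rb' and 'wb' behave the same way in Python 2.

```python
import os
import tempfile

def copy_unchanged(src_path, dst_path):
    """Copy a file byte for byte; both sides use binary mode."""
    with open(src_path, "rb") as src:
        lines = src.readlines()
    with open(dst_path, "wb") as dst:
        dst.writelines(lines)

# Build a small CRLF file to stand in for the real input.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "in.csv")
dst_path = os.path.join(workdir, "out.csv")
with open(src_path, "wb") as f:
    f.write(b"a,b\r\n1,2\r\n")

copy_unchanged(src_path, dst_path)
with open(dst_path, "rb") as f:
    copied = f.read()
print(repr(copied))  # b'a,b\r\n1,2\r\n'
```

Since nothing is translated on either side, no extra \r shows up and no filtering step is needed.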

Thank you to both Steve and 7stud. You were right on about the binary flag!
I thought I had tried everything...

Mike
--
View this message in context: http://www.nabble.com/File-Object-be....html#a9825806
Sent from the Python - python-list mailing list archive at Nabble.com.

Apr 3 '07 #5
Michael Castleton wrote:
> When I open a csv or txt file with:
>
> infile = open(sys.argv[1],'rb').readlines()
> or
> infile = open(sys.argv[1],'rb').read()
>
> and then look at the first few lines of the file there is a carriage return +
> line feed at the end of each line - \r\n
Is there any reason you open your text files in binary mode?

Unless you're using the csv module (which requires such a mode - but
then you don't care since you're not working with the raw data
yourself), you should consider opening your files in text mode. This
should solve your problem (if not, then you have a problem with
universal newlines support in your Python install).

HTH
Apr 3 '07 #6
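A side note on the csv remark, as a sketch that assumes Python 3 rather than the Python 2.4 discussed here: in Python 3 the csv module is no longer used with binary mode. Its documentation asks for text mode with newline='' instead, which switches off newline translation and so avoids the same \r\r\n effect.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rows.csv")

# newline='' disables translation so the csv module controls line endings.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)  # default lineterminator is '\r\n'
    writer.writerows([["a", "b"], ["1", "2"]])

with open(path, "rb") as f:
    csv_bytes = f.read()

with open(path, "r", newline="") as f:
    rows = list(csv.reader(f))

print(repr(csv_bytes))  # b'a,b\r\n1,2\r\n'
print(rows)             # [['a', 'b'], ['1', '2']]
```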

Bruno Desthuilliers wrote:
> Michael Castleton wrote:
>> When I open a csv or txt file with:
>>
>> infile = open(sys.argv[1],'rb').readlines()
>> or
>> infile = open(sys.argv[1],'rb').read()
>>
>> and then look at the first few lines of the file there is a carriage
>> return + line feed at the end of each line - \r\n
> Is there any reason you open your text files in binary mode?
>
> Unless you're using the csv module (which requires such a mode - but
> then you don't care since you're not working with the raw data
> yourself), you should consider opening your files in text mode. This
> should solve your problem (if not, then you have a problem with
> universal newlines support in your Python install).
>
> HTH
> --
> http://mail.python.org/mailman/listinfo/python-list

Bruno,
No particular reason in this case. It was probably a holdover from using
the csv module in the past. I'm wondering, though, whether using binary mode
on very large files (>100Mb) would save any processing time, since no
conversion to the system newline takes place?
What do you think?
Thanks.

--
View this message in context: http://www.nabble.com/File-Object-be....html#a9827881
Sent from the Python - python-list mailing list archive at Nabble.com.

Apr 3 '07 #7
Michael Castleton wrote:
> Bruno Desthuilliers wrote:
>> Michael Castleton wrote:
>>> When I open a csv or txt file with:
>>>
>>> infile = open(sys.argv[1],'rb').readlines()
>>> or
>>> infile = open(sys.argv[1],'rb').read()
>>>
>>> and then look at the first few lines of the file there is a carriage
>>> return + line feed at the end of each line - \r\n
>> Is there any reason you open your text files in binary mode?
(snip)
>
> Bruno,
> No particular reason in this case. It was probably as a holdover from using
> the csv module in the past. I'm wondering though if using binary on very
> large files (>100Mb) would save any processing time - no conversion to
> system newline?
> What do you think?
I think that premature optimization is the root of all evil.

You'll have to do the processing by yourself then, and I doubt it'll be
as fast as the C-coded builtin newline processing.

Anyway, you can easily check it out by yourself - Python has timeit (for
micro-benchmarks) and a profiler.
Apr 4 '07 #8
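A rough way to run that check with timeit is sketched below. The file contents and sizes are invented for the test, and the code assumes Python 3, where text mode also decodes bytes to str, so the comparison measures more than newline conversion alone. Treat the numbers as indicative only.

```python
import os
import tempfile
import timeit

# Made-up sample data: 50,000 CRLF-terminated lines.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(path, "wb") as f:
    f.write(b"some,comma,separated,fields\r\n" * 50000)

def read_binary():
    with open(path, "rb") as f:
        return f.read()

def read_text():
    # latin-1 maps each byte to one character, so decoding never fails.
    with open(path, "r", encoding="latin-1") as f:
        return f.read()

binary_seconds = timeit.timeit(read_binary, number=20)
text_seconds = timeit.timeit(read_text, number=20)
print("binary: %.4f s, text: %.4f s" % (binary_seconds, text_seconds))
```

Whatever the timings show, the text-mode result is one character shorter per line, because each \r\n has been collapsed to \n.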
