file.read problem

f = open('myfile', 'r')
a = f.read(5000)

When I do this I get only the first 634 bytes. I tried using the:
f = open('myfile', 'rb')
option, but now there are a few extra 0x0D bytes in myfile. 0x0D =
carriage return. How can I make a program that does not put in the 0x0D
bytes on Windows?

On Linux the first two lines work perfectly.

Feb 17 '06 #1
> When I do this I get only the first 634 bytes. I tried using the:
> f = open('myfile', 'rb')
> option, but now there are a few extra 0x0D bytes in myfile. 0x0D =
> carriage return. How can I make a program that does not put in the 0x0D
> bytes on Windows?

Try opening the file in 'rbU' mode. This will use universal newline mode
and convert all carriage returns to line feeds.

-Farshid
Feb 17 '06 #2
> Try opening the file in 'rbU' mode. This will use universal newline mode
> and convert all carriage returns to line feeds.

I tried this but, as you say, now there are extra 0x0A bytes in my
files. Is there also a way to leave all these things out and
just get the file?

I am working on a script to get parts of raw data out of a file, and
the data I read has to be the data written in the file without CR or
LF.

Feb 17 '06 #3
> I am working on a script to get parts of raw data out of a file, and
> the data I read has to be the data written in the file without CR or
> LF.


So you just want to remove all the linefeeds? This should work then:

data = data.replace('\n','')

-Farshid

Feb 17 '06 #4

Farshid Lashkari wrote:
> > I am working on a script to get parts of raw data out of a file, and
> > the data I read has to be the data written in the file without CR or
> > LF.
>
> So you just want to remove all the linefeeds? This should work then:
>
> data = data.replace('\n','')
>
> -Farshid


The problem is that if I remove the linefeeds, I also delete real data
bytes that happen to be 0x0A, and I don't want this, because what I read
out has to be an unmodified part of the original data. Any other idea?

But my question still is: why does
f = open('myfile', 'r')
a = f.read(5000)
work on Linux?

Feb 17 '06 #5
"wscrsurfdude" wrote:
> > Try opening the file in 'rbU' mode. This will use universal newline mode
> > and convert all carriage returns to line feeds.
>
> I tried this, but as you say, now there are 0x0A bytes extra in my
> files, is there also a possibility to let all these things out, and
> just get the file.
>
> I am working on a script to get parts of raw data out of a file, and
> the data I read has to be the data written in the file without CR or
> LF.


what kind of file are you reading? if it's a text file, it's supposed to have
LF in it (or CR LF if you read it in binary mode); the LF:s are there to tell
you where each line ends.

if it's a binary file, open with mode "rb".

</F>

Feb 17 '06 #6
> if it's a binary file, open with mode "rb".
You are right about opening it in rb mode (that was a flaw in my first
post), but even when I do this on Windows, a 0x0D is put in front of
every 0x0A. I found an explanation of why it works on Linux; it is below
in my post.

What I take from it is that on Windows a 0x0D is put in front of every
0x0A as part of the line ending. I have to get rid of these. But if my
original file already contains the bytes 0x0D 0x0A as binary data, that
0x0D is deleted too. Does anyone have an idea?

############################################
The whole subject of newlines and text files is a murky area of
non-standard implementation by different operating systems. These
differences have their roots in the early days of data communications
and the control of mechanical teleprinters. Basically there are 3
different ways to indicate a new line:

1. Carriage Return (CR) character ('\r')
2. Line Feed (LF) character ('\n')
3. CR/LF pair ('\r\n')

All three techniques are used in different operating systems. MS DOS
(and therefore Windows) uses method 3. Unix (including Linux) uses
method 2. Apple in its original MacOS used method 1, but now uses
method 2 since MacOS X is really a variant of Unix.

So how can the poor programmer cope with this multiplicity of line
endings? In many languages she just has to do lots of tests and take
different action per OS. In more modern languages, including Python,
the language provides facilities for dealing with the mess for you. In
the case of Python the assistance comes in the form of the os module
which defines a variable called linesep which is set to whatever the
newline character is on the current operating system. This makes adding
newlines easy, and rstrip() takes account of the OS when it does its
work of removing them, so really the simple way to stay sane, so far as
newlines are concerned is: always use rstrip() to remove newlines from
lines read from a file and always add os.linesep to strings being
written to a file.

That still leaves the awkward situation where a file is created on one
OS and then processed on another, incompatible, OS and sadly, there
isn't much we can do about that except to compare the end of the line
with os.linesep to determine what the difference is.
######################################
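
A minimal sketch of the stripping advice in the quoted text, assuming Python 3 and three made-up sample lines. Calling rstrip("\r\n") removes an LF, CR LF, or CR ending on any operating system, so no comparison with os.linesep is needed when reading. For writing, the Python documentation advises a plain '\n' in text mode rather than os.linesep, because text mode already translates '\n' on output.

```python
# Hypothetical sample lines, one per line-ending convention.
lines = ["unix line\n", "windows line\r\n", "classic mac line\r"]

# Strip only CR and LF characters, so other trailing whitespace survives.
stripped = [line.rstrip("\r\n") for line in lines]

for text in stripped:
    print(repr(text))
```

This only applies to text files. For raw binary data, as in this thread, no stripping should happen at all.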

Feb 17 '06 #7
I have the solution: the flaw was not in opening the file but in
writing it. Stupid me, I opened the input with mode rb but wrote the
output with w instead of wb.

Thanks, everybody, for helping me.
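
The fix described above can be sketched as follows, assuming Python 3. The helper name extract_chunk, the file names, and the sample bytes are made up for illustration. Both files are opened in binary mode ("rb" and "wb"), so no CR is added in front of an LF on Windows and a Ctrl-Z byte does not end the read early.

```python
import os
import tempfile


def extract_chunk(src_path, dst_path, offset, length):
    """Copy `length` raw bytes starting at `offset` from src_path to dst_path."""
    # "rb" and "wb" disable newline translation on every platform.
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read(length)
    with open(dst_path, "wb") as dst:
        dst.write(chunk)
    return chunk


# Demo data containing LF, CR LF and Ctrl-Z bytes that text mode could alter.
payload = b"\x00\x01\n\x02\r\n\x03\x1a\x04\n\xff"

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "source.bin")
dst_path = os.path.join(workdir, "part.bin")

with open(src_path, "wb") as f:
    f.write(payload)

extracted = extract_chunk(src_path, dst_path, offset=2, length=6)

# Read the output back in binary mode to compare it byte for byte.
with open(dst_path, "rb") as f:
    written = f.read()

print(written == payload[2:8])
```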

Feb 17 '06 #8
On Fri, 17 Feb 2006 00:15:31 -0800, wscrsurfdude wrote:

> Farshid Lashkari wrote:
> > > I am working on a script to get parts of raw data out of a file, and
> > > the data I read has to be the data written in the file without CR or
> > > LF.
> > So you just want to remove all the linefeeds? This should work then:
> >
> > data = data.replace('\n','')
> >
> > -Farshid
>
> The problem is if I remove the linefeeds, I also delete readout data if
> it is 0x0A, and I don't want this, because the files I readout has to
> be a part of the original data. Another idea??


Er, have I understood you correctly? You seem to be saying that some
linefeeds are significant data, and some are not, and you want somebody to
tell you how to remove the insignificant "linefeed = end of line"
characters without removing the significant "linefeed = important data"
characters.

That's easy:

from blackmagic import readmymind, dowhatiwant
fp = file("data", "rb")
readmymind()
data = dowhatiwant(fp.read())

You'll need Python 3.0 for the blackmagic module.

*wink*

Seriously, if this is your problem, then you will have no choice but to
carefully analyse the file yourself, looking at each linefeed and tossing
it away if it is insignificant. We can't tell you how to do that, because
we don't know which linefeeds are data and which are not.

> But still my question is why is the:
> f = open('myfile', 'r')
> a = f.read(5000)
> working in linux??


Why shouldn't it work in Linux? The question should be, why is it not
working in Windows? (When did "my code is broken" become the expected
state of affairs, and "my code works" the mystery that needs solving?)

I wonder whether there is a ctrl-Z in your data, and Windows is
interpreting that as end of file.
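
One way to test that guess, as a sketch assuming Python 3: read the data in binary mode and look for the first 0x1A (Ctrl-Z) byte. If it sits at offset 634, a text-mode read that treats Ctrl-Z as end of file would stop exactly there. The helper name first_ctrl_z and the sample bytes are made up for illustration; for a real file, pass in open('myfile', 'rb').read().

```python
def first_ctrl_z(data):
    """Return the offset of the first 0x1A byte in `data`, or -1 if absent."""
    return data.find(b"\x1a")


# Hypothetical data: 634 ordinary bytes, one Ctrl-Z, then more data.
sample = b"A" * 634 + b"\x1a" + b"B" * 100

offset = first_ctrl_z(sample)
print(offset)
```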
--
Steven.

Feb 17 '06 #9
"wscrsurfdude" wrote:
> > if it's a binary file, open with mode "rb".
> You are right about opening it in the rb mode (flaw in the start post),
> but also when I do this in windows in front of every 0x0A is put a
> 0x0D. I found a explanation why it is working in linux it is below in
> my post.
>
> But what i get of this that in windows in front of every 0x0A is put a
> 0x0D as a line feed. I have to get rid of these.


if you open a file in binary mode ("rb"), you get the data that's in the
file. no more, no less. if someone's adding CR to the files, that happens
before you opened them in Python.

have you, perhaps, copied binary files between the systems using FTP
in text mode? if so, you've damaged the files, and there's no way to fix
them, in general.
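
A rough way to check whether a file was already altered before Python read it, sketched under the assumption of Python 3. The helper name newline_stats and the sample bytes are made up for illustration. It counts CR LF pairs, bare LF bytes, and bare CR bytes; a binary file with many CR LF pairs and no bare LF at all is a hint (not proof) that a text-mode write or transfer touched it. The counts also show why the damage cannot be undone in general: a CR LF that was in the original data looks the same as one that was inserted.

```python
def newline_stats(data):
    """Count CR LF pairs, bare LF and bare CR bytes in a bytes object."""
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf
    bare_cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "bare_lf": bare_lf, "bare_cr": bare_cr}


# Hypothetical binary data with two LF bytes and one lone CR byte.
original = b"\x10\n\x20\n\x30\r\x40"

# Simulate what a text-mode write or transfer does to every LF on Windows.
damaged = original.replace(b"\n", b"\r\n")

print(newline_stats(original))
print(newline_stats(damaged))
```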

</F>

Feb 17 '06 #10
