
getting rid of EOL character ?

hello,

In the previous language I used,
when reading a line by readline, the EOL character was removed.

Now I'm reading a text-file with CR+LF at the end of each line,
Datafile = open(filename,'r')
line = Datafile.readline()

now this gives an extra empty line
print line

and what I expect that should be correct, remove CR+LF,
gives me one character too much removed
print line[,-2]

while this gives what I need ???
print line[,-1]

Is it correct that the 2 characters CR+LF are converted to 1 character ?
Is there a more automatic way to remove the EOL from the string ?

thanks,
Stef Mientki
Apr 27 '07 #1
7 Replies


stef wrote:
> hello,
>
> In the previous language I used,
> when reading a line by readline, the EOL character was removed.
>
> Now I'm reading a text-file with CR+LF at the end of each line,
> Datafile = open(filename,'r')
> line = Datafile.readline()
>
> now this gives an extra empty line
> print line
>
> and what I expect that should be correct, remove CR+LF,
> gives me one character too much removed
> print line[,-2]
>
> while this gives what I need ???
> print line[,-1]
>
> Is it correct that the 2 characters CR+LF are converted to 1 character ?
> Is there a more automatic way to remove the EOL from the string ?

line = line.rstrip("\r\n") should take care of it. If you leave out the
parameter, it will strip out all whitespace at the end of the line,
which is what I do in most cases.
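
A minimal sketch of the difference between the two calls, written for Python 3 (the thread itself uses Python 2 syntax). The sample string is invented for illustration:

```python
# Hypothetical line with trailing spaces, a tab and a CR+LF terminator.
line = "some data \t\r\n"

# With an argument, rstrip removes only the listed characters from the end.
only_eol = line.rstrip("\r\n")

# Without an argument, rstrip removes every trailing whitespace character.
all_ws = line.rstrip()

print(repr(only_eol))
print(repr(all_ws))
```

The first form keeps trailing spaces and tabs that may be part of the data; the second discards them along with the line ending.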
--
Michael Hoffman
Apr 27 '07 #2

> line = line.rstrip("\r\n") should take care of it. If you leave out
> the parameter, it will strip out all whitespace at the end of the
> line, which is what I do in most cases.

thanks for the solution Michael,

cheers,
Stef
Apr 27 '07 #3

Jim
If you have a recent Python, see the documentation for open on the
library page for built-in functions.
http://docs.python.org/lib/built-in-funcs.html

Jim

Apr 27 '07 #4

On 27/04/2007 11:19 PM, Michael Hoffman wrote:
> stef wrote:
>> hello,
>>
>> In the previous language I used,
>> when reading a line by readline, the EOL character was removed.

Very interesting; how did you distinguish between EOF and an empty line?
Did you need to call an isEOF() method before each read?

>> Now I'm reading a text-file with CR+LF at the end of each line,
>> Datafile = open(filename,'r')
>> line = Datafile.readline()
>>
>> now this gives an extra empty line
>> print line
>>
>> and what I expect that should be correct, remove CR+LF,
>> gives me one character too much removed
>> print line[,-2]

Stef, that would give you a syntax error. I presume that you meant to
type line[:-2]

>> while this gives what I need ???
>> print line[,-1]
>>
>> Is it correct that the 2 characters CR+LF are converted to 1 character ?

In text mode (the default), whatever is the line ending on your platform
is converted to a single "newline" '\n' which is the same as LF.

Using line[:-1] is NOT recommended, as the last line in your file may
not be terminated, and in that case you would lose the last data character.
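
The pitfall with line[:-1] can be sketched as follows, assuming Python 3 and using io.StringIO as a stand-in for a real file whose last line has no terminator (the sample content is made up):

```python
import io

# Simulated text file: the final line is not terminated by a newline.
fake_file = io.StringIO("first\nsecond\nlast")

sliced = []
stripped = []
for line in fake_file:
    sliced.append(line[:-1])            # blindly drops the final character
    stripped.append(line.rstrip("\n"))  # drops a trailing newline only if present

print(sliced)
print(stripped)
```

Slicing silently damages the unterminated last line, while rstrip("\n") leaves it intact.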

>> Is there a more automatic way to remove the EOL from the string ?
>
> line = line.rstrip("\r\n") should take care of it. If you leave out the
> parameter, it will strip out all whitespace at the end of the line,
> which is what I do in most cases.

If you want *exactly* what is in the line, use line.rstrip('\n') -- this
will remove only the trailing newline (if it exists).

If you want to strip all trailing whitespace, use line.rstrip() as
Michael suggested.

Michael, note carefully that line.rstrip('\r\n') removes instances of
'\r' OR '\n' -- the arg is a set of characters to be removed, not a
suffix to be removed. In Stef's situation, it "works" only by accident.
Using that would not always give you the correct answer -- e.g. if your
(Windows) file had a line ending in CR CR LF [I've seen stranger].
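
To illustrate the "set of characters, not a suffix" point, here is a small Python 3 sketch. The helper strip_one_eol is a hypothetical name introduced only for this example; it is not part of the standard library:

```python
def strip_one_eol(line):
    """Remove exactly one trailing line terminator (CRLF, LF or CR), if present."""
    for eol in ("\r\n", "\n", "\r"):
        if line.endswith(eol):
            return line[: -len(eol)]
    return line


odd = "foo\r\r\n"                # a line ending in CR CR LF

as_set = odd.rstrip("\r\n")      # strips every trailing CR and LF character
as_suffix = strip_one_eol(odd)   # strips only the final CR LF pair

print(repr(as_set))
print(repr(as_suffix))
```

Which result is "correct" depends on whether the extra CR is considered data or noise; the point is only that the two operations differ.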

HTH,
John
Apr 27 '07 #5

hi John,

>>> In the previous language I used,
>>> when reading a line by readline, the EOL character was removed.
>
> Very interesting; how did you distinguish between EOF and an empty line?
> Did you need to call an isEOF() method before each read?

Yes indeed, and I admit it needs some more coding ;-)

>>> Now I'm reading a text-file with CR+LF at the end of each line,
>>> Datafile = open(filename,'r')
>>> line = Datafile.readline()
>>>
>>> now this gives an extra empty line
>>> print line
>>>
>>> and what I expect that should be correct, remove CR+LF,
>>> gives me one character too much removed
>>> print line[,-2]
>
> Stef, that would give you a syntax error. I presume that you meant to
> type line[:-2]

Yes, sorry.

>>> while this gives what I need ???
>>> print line[,-1]
>>>
>>> Is it correct that the 2 characters CR+LF are converted to 1 character ?
>
> In text mode (the default), whatever is the line ending on your platform
> is converted to a single "newline" '\n' which is the same as LF.

Aha, that was the answer I was looking for.

<snip>

thanks for the splendid explanation John,

cheers,
Stef Mientki
Apr 28 '07 #6

John Machin wrote:
> On 27/04/2007 11:19 PM, Michael Hoffman wrote:
>> stef wrote:
>>> hello,
>>>
>>> In the previous language I used,
>>> when reading a line by readline, the EOL character was removed.
>
> Very interesting; how did you distinguish between EOF and an empty line?
> Did you need to call an isEOF() method before each read?
>
>>> Now I'm reading a text-file with CR+LF at the end of each line,
>>> Datafile = open(filename,'r')
>>> line = Datafile.readline()
>>>
>>> now this gives an extra empty line
>>> print line
>>>
>>> and what I expect that should be correct, remove CR+LF,
>>> gives me one character too much removed
>>> print line[,-2]
>
> Stef, that would give you a syntax error. I presume that you meant to
> type line[:-2]
>
>>> while this gives what I need ???
>>> print line[,-1]
>>>
>>> Is it correct that the 2 characters CR+LF are converted to 1 character ?
>
> In text mode (the default), whatever is the line ending on your platform
> is converted to a single "newline" '\n' which is the same as LF.
>
> Using line[:-1] is NOT recommended, as the last line in your file may
> not be terminated, and in that case you would lose the last data character.
>
>>> Is there a more automatic way to remove the EOL from the string ?
>>
>> line = line.rstrip("\r\n") should take care of it. If you leave out
>> the parameter, it will strip out all whitespace at the end of the
>> line, which is what I do in most cases.
>
> If you want *exactly* what is in the line, use line.rstrip('\n') -- this
> will remove only the trailing newline (if it exists).
>
> If you want to strip all trailing whitespace, use line.rstrip() as
> Michael suggested.
>
> Michael, note carefully that line.rstrip('\r\n') removes instances of
> '\r' OR '\n' -- the arg is a set of characters to be removed, not a
> suffix to be removed. In Stef's situation, it "works" only by accident.
> Using that would not always give you the correct answer -- e.g. if your
> (Windows) file had a line ending in CR CR LF [I've seen stranger].

I knew that about line.rstrip, but didn't consider the possibility of
\r\r\n, while still wanting the first \r. Yuck.

Honestly, I almost always use line.rstrip()--it is seldom that I care
about closing whitespace.
--
Michael Hoffman
Apr 28 '07 #7

On Apr 28, 7:25 pm, Michael Hoffman <cam.ac...@mh391.invalid> wrote:
> John Machin wrote:
>> On 27/04/2007 11:19 PM, Michael Hoffman wrote:
>>> stef wrote:
>>>> hello,
>>>>
>>>> In the previous language I used,
>>>> when reading a line by readline, the EOL character was removed.
>>
>> Very interesting; how did you distinguish between EOF and an empty line?
>> Did you need to call an isEOF() method before each read?
>>
>>>> Now I'm reading a text-file with CR+LF at the end of each line,
>>>> Datafile = open(filename,'r')
>>>> line = Datafile.readline()
>>>>
>>>> now this gives an extra empty line
>>>> print line
>>>>
>>>> and what I expect that should be correct, remove CR+LF,
>>>> gives me one character too much removed
>>>> print line[,-2]
>>
>> Stef, that would give you a syntax error. I presume that you meant to
>> type line[:-2]
>>
>>>> while this gives what I need ???
>>>> print line[,-1]
>>>>
>>>> Is it correct that the 2 characters CR+LF are converted to 1 character ?
>>
>> In text mode (the default), whatever is the line ending on your platform
>> is converted to a single "newline" '\n' which is the same as LF.
>>
>> Using line[:-1] is NOT recommended, as the last line in your file may
>> not be terminated, and in that case you would lose the last data character.
>>
>>>> Is there a more automatic way to remove the EOL from the string ?
>>>
>>> line = line.rstrip("\r\n") should take care of it. If you leave out
>>> the parameter, it will strip out all whitespace at the end of the
>>> line, which is what I do in most cases.
>>
>> If you want *exactly* what is in the line, use line.rstrip('\n') -- this
>> will remove only the trailing newline (if it exists).
>>
>> If you want to strip all trailing whitespace, use line.rstrip() as
>> Michael suggested.
>>
>> Michael, note carefully that line.rstrip('\r\n') removes instances of
>> '\r' OR '\n' -- the arg is a set of characters to be removed, not a
>> suffix to be removed. In Stef's situation, it "works" only by accident.
>> Using that would not always give you the correct answer -- e.g. if your
>> (Windows) file had a line ending in CR CR LF [I've seen stranger].
>
> I knew that about line.rstrip, but didn't consider the possibility of
> \r\r\n, while still wanting the first \r. Yuck.

It would be unusual to want that first \r -- a possibly more likely
scenario might be where your text file contains an extract from a
database, and you need to check that there are no unwanted (e.g.
unprintable) characters in the data (whether at the end of the line,
the middle, or the start).

In any case I think that you are missing the point that when reading a
normal text file on Windows with readline, while the line in the file
may be 'foo bar\r\n', what you get from readline is 'foo bar\n' -- so
in normal usage, the \r in your line.rstrip('\r\n') is pointless.
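
A rough way to see this for yourself, sketched for Python 3 with a throwaway temporary file. Note one difference from the Python 2 setup discussed in this thread: Python 3's default text mode translates CR+LF to '\n' on every platform, not only on Windows.

```python
import os
import tempfile

# Write raw bytes so the file really contains CR+LF on any platform.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"foo bar\r\n")

with open(path, "rb") as f:   # binary mode: bytes exactly as stored
    raw = f.readline()

with open(path, "r") as f:    # text mode: default newline translation
    text = f.readline()

os.remove(path)

print(repr(raw))
print(repr(text))
```

Only the binary read still shows the '\r'; the text-mode read has already collapsed the pair to a single '\n'.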

> Honestly, I almost always use line.rstrip()--it is seldom that I care
> about closing whitespace.

Honestly, I almost always split a line into fields and then for each
field, strip leading and trailing whitespace, and change runs of 1 or
more whitespace characters to a single space -- where "whitespace"
includes the pesky U+00A0 aka &nbsp; which doesn't qualify as
whitespace in a str instance.
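
A sketch of that kind of per-field clean-up, under some assumptions not stated in the post: Python 3, tab-delimited fields, and a hypothetical helper name clean_fields. In Python 3, str.split() with no argument already treats U+00A0 as whitespace, unlike the Python 2 str type described above.

```python
def clean_fields(line, delimiter="\t"):
    """Split a line into fields and normalise the whitespace in each one."""
    fields = line.rstrip("\n").split(delimiter)
    # split() with no argument drops leading/trailing whitespace and splits
    # on runs of whitespace; joining with " " collapses each run to one space.
    return [" ".join(field.split()) for field in fields]


# Invented sample line containing no-break spaces (U+00A0).
sample = "  Ann\u00a0 Lee \t 42 \t\u00a0x  y\u00a0\n"
cleaned = clean_fields(sample)
print(cleaned)
```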

Cheers,
John

Apr 28 '07 #8
