
newlines in text files

Hi, I have this problem: when reading a M$ Windows text file under Windows in
text mode (fopen("blah", "r")) the Windows newline sequence, \r\n is returned as
is, i.e. it's not replaced by a single '\n'. Is this the correct behaviour? But
doesn't it make writing portable programs a bit harder? It would be nice if
whatever-newline-sequence-the-platform-has were replaced by a single \n, i.e.
the "C platform" line terminator. Like ints are always 32 bit in Java
regardless of what platform lies under the Java platform.

In case it is indeed like that, how is my program supposed to know which line
terminator is being used?
Nov 13 '05 #1
rihad wrote:
Hi, I have this problem: when reading a M$ Windows text file under Windows
in text mode (fopen("blah", "r")) the Windows newline sequence, \r\n is
returned as is, i.e. it's not replaced by a single '\n'. Is this the
correct behaviour?
No. Check your text file using a hex-capable file browser (e.g. LIST.COM if
you have it). If you don't have a hex-capable file browser, here's one:

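    #include <stdio.h>

    /* Minimal hex dump of the file named on the command line. */
    int main(int argc, char **argv)
    {
        const char *filename = (argc > 1) ? argv[1] : "blah";
        /* Binary mode: no newline translation, so \r\n shows as 0D 0A. */
        FILE *fp = fopen(filename, "rb");
        if(fp != NULL)
        {
            int ch = 0;
            int i = 0;
            while((ch = getc(fp)) != EOF)
            {
                printf(" %02X", ch);
                i++;
                if(16 == i) /* start a new output line every 16 bytes */
                {
                    putchar('\n');
                    i = 0;
                }
            }
            fclose(fp);
        }
        putchar('\n');
        return 0;
    }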

But doesn't it make writing portable programs a bit
harder? It would be nice if whatever-newline-sequence-the-platform-has
were replaced by a single \n, i.e. the "C platform" line terminator.


It would be nice, and indeed it /is/ nice. Check your input file, and check
your fopen call. It's far more likely that your data or code is broken than
that your compiler is broken.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Nov 13 '05 #2
rihad wrote:

Hi, I have this problem: when reading a M$ Windows text file under
Windows in text mode (fopen("blah", "r")) the Windows newline
sequence, \r\n is returned as is, i.e. it's not replaced by a
single '\n'. Is this the correct behaviour? But doesn't it make
writing portable programs a bit harder? It would be nice if
whatever-newline-sequence-the-platform-has were replaced by a
single \n, i.e. the "C platform" line terminator. Like ints are
always 32 bit in Java regardless of what platform lies under
the Java platform.

In case it is indeed like that, how is my program supposed to
know which line terminator is being used?


There is something wrong with your installation and/or library.
The only way you should see both the \r and \n is if you open the
file in binary mode. Similarly writing a \n should create the \r
\n sequence.
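
A quick way to check what a given implementation does is to write one
line in text mode and count the bytes that actually land on disk. The
following is only a sketch: the function name and the scratch file path
passed to it are made up for illustration, and the result depends on the
C library in use (2 bytes where '\n' is stored as is, 3 bytes where it
is expanded to \r\n).

```c
#include <stdio.h>

/* Hypothetical helper: writes "A\n" to `path` in text mode, then reads
   the file back in binary mode and counts the bytes actually stored.
   Returns the byte count, or -1 on error. */
long stored_size_of_one_line(const char *path)
{
    FILE *fp = fopen(path, "w");    /* text mode: '\n' may be translated */
    if (fp == NULL)
        return -1;
    if (fputs("A\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)
        return -1;

    fp = fopen(path, "rb");         /* binary mode: bytes come back untouched */
    if (fp == NULL)
        return -1;

    long n = 0;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

If this reports 2 on a Windows machine, the library is treating text
files the Unix way, which would match the symptom described above.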

However, you may be operating under Cygwin or something similar,
which attempts to create a complete Li/U-nix environment under
Windoze. I don't know just what provisions they make.

At any rate this is not a C language problem, and should be dealt
with in a newsgroup dedicated to your compiler/system.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 13 '05 #3
On Sat, 04 Oct 2003 17:31:14 GMT, CBFalconer <cb********@yahoo.com> wrote:

However, you may be operating under Cygwin or something similar,
which attempts to create a complete Li/U-nix environment under
Windoze. I don't know just what provisions they make.


You're right! I recompiled the program with mingw's gcc and the problem went
away. I'll have to ask why Cygwin worked that way in a different newsgroup.

Nov 13 '05 #4
rihad <ri***@mail.ru> writes:
On Sat, 04 Oct 2003 17:31:14 GMT, CBFalconer <cb********@yahoo.com> wrote:

However, you may be operating under Cygwin or something similar,
which attempts to create a complete Li/U-nix environment under
Windoze. I don't know just what provisions they make.


You're right! I recompiled the program with mingw's gcc and the problem went
away. I'll have to ask why Cygwin worked that way in a different newsgroup.


Because you probably told it to (IIRC, you are asked to specify
whether you want Cygwin to use Windows or Unix line-endings).

Cygwin is a different implementation from the implementation that
created the text file in question, so all bets are off. Cygwin
attempts to emulate a UNIX-like operating system, whereas if you
used Notepad or somesuch, you wrote the text file from within a
Windows operating system.

-Micah
Nov 13 '05 #5
On 04 Oct 2003 17:10:15 -0700, Micah Cowan <mi***@cowan.name> wrote:
Cygwin is a different implementation from the implementation that
created the text file in question, so all bets are off. Cygwin
attempts to emulate a UNIX-like operating system, whereas if you
used Notepad or somesuch, you wrote the text file from within a
Windows operating system.

Then because of C's Unix heritage, can it be said that \n text files are the
most portable?
Nov 13 '05 #6
rihad wrote:
Micah Cowan <mi***@cowan.name> wrote:
Cygwin is a different implementation from the implementation that
created the text file in question, so all bets are off. Cygwin
attempts to emulate a UNIX-like operating system, whereas if you
used Notepad or somesuch, you wrote the text file from within a
Windows operating system.

Then because of C's Unix heritage, can it be said that \n text
files are the most portable?


No. There are three common flavors of text files with line
termination sequences:

  <crlf>   \r\n   CP/M, MsDos, Windoze, others
  <lf>     \n     Linux, Unix, etc.
  <cr>     \r     Macintosh, possibly other Apples.

and filters are available on most systems to convert between the
standards. Bear in mind that many other systems do not even have
a line termination sequence - they use other means, such as a
count of chars in a fixed length record, or whatever.
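
If a program really has to find out which of the three conventions a
file uses, one approach is to read some of it in binary mode ("rb") and
look at the raw bytes. The sketch below assumes an ASCII-based encoding
(CR is 0x0D, LF is 0x0A); the enum and function names are invented for
this example and are not from any library.

```c
#include <stddef.h>

enum eol_kind { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR };

/* Classifies the first line terminator found in buf[0..len).
   buf is expected to hold bytes read from a stream opened with "rb".
   Note: a CR in the very last position is reported as EOL_CR, because
   the function cannot see whether an LF follows in the next chunk. */
enum eol_kind detect_eol(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x0A)                 /* bare LF */
            return EOL_LF;
        if (buf[i] == 0x0D) {               /* CR: is it followed by LF? */
            if (i + 1 < len && buf[i + 1] == 0x0A)
                return EOL_CRLF;
            return EOL_CR;
        }
    }
    return EOL_NONE;                        /* no terminator in this buffer */
}
```

This only inspects the first terminator, so a file with mixed line
endings would need a fuller scan.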

On MsDos/Windoze you can easily see the differences with such hex
capable tools as Buerg's LIST. TEXTPAD can generate, and convert
between, at least the Linux and Dos conventions, but can't switch
between hex and char displays.

As far as your C program is concerned, you end lines with a \n.
Nothing else enters into it. The i/o libraries will handle the
rest in a manner suitable for your system.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 13 '05 #7
rihad <ri***@mail.ru> writes:
On 04 Oct 2003 17:10:15 -0700, Micah Cowan <mi***@cowan.name> wrote:
Cygwin is a different implementation from the implementation that
created the text file in question, so all bets are off. Cygwin
attempts to emulate a UNIX-like operating system, whereas if you
used Notepad or somesuch, you wrote the text file from within a
Windows operating system.

Then because of C's Unix heritage, can it be said that \n text files are the
most portable?


No, it cannot. What can be said is that all text files must be
represented as having "\n" line-endings to any C program that
opens them as a text file. However, what constitutes a "text
file" is defined by the particular implementation. In this case,
the text files made in Notepad do not fit Cygwin's idea of a
"text file".
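
When a program may be handed "foreign" text files like this, a common
workaround is to strip a stray \r yourself after reading each line
with fgets. This is a minimal sketch, not something the standard
library provides; the name chomp is borrowed from other languages and
is purely illustrative.

```c
#include <string.h>

/* Removes one trailing '\n' (if present) and then one trailing '\r'
   (if present) from a NUL-terminated line, in place.
   Returns the new length of the string. */
size_t chomp(char *line)
{
    size_t n = strlen(line);
    if (n > 0 && line[n - 1] == '\n')
        line[--n] = '\0';
    if (n > 0 && line[n - 1] == '\r')
        line[--n] = '\0';
    return n;
}
```

Calling it on every line returned by fgets gives the same text whether
the file came from Notepad or from a Unix editor, at the cost of also
eating a \r that was genuinely the last character of a line.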

-Micah
Nov 13 '05 #8
