getline() and newlines


I have a text file with mixed carriage returns ('\n' and '\r\n').

On Linux, both the std::string getline() global function and the
std::iostream getline() member function are keeping some of the newlines in
the result (I suspect they look only for the '\n').

* Is there a quick way I can tell either function to gobble up both
Windows-style and Unix-style newlines?

* If not, what would be an efficient way of getting rid of them? Currently
I use string::find_last_of("\n\r") + string::erase() but this is not very
efficient.

Apr 5 '08 #1
On 2008-04-05 15:25, barcaroller wrote:
> I have a text file with mixed carriage returns ('\n' and '\r\n').
>
> On Linux, both the std::string getline() global function and the
> std::iostream getline() member function are keeping some of the newlines in
> the result (I suspect they look only for the '\n').
>
> * Is there a quick way I can tell either function to gobble up both
> Windows-style and Unix-style newlines?

While you can specify the delimiting character, you can only specify one
character.

> * If not, what would be an efficient way of getting rid of them? Currently
> I use string::find_last_of("\n\r") + string::erase() but this is not very
> efficient.

Since the Windows sequence is \r\n and getline() uses \n as the delimiter,
any line with a Windows line break will end with \r. Use this knowledge
to reduce the work required:

std::string str;
std::getline(file, str);

if (str[str.size() - 1] == '\r')
    str.resize(str.size() - 1);

--
Erik Wikström
Apr 5 '08 #2
On Sat, 05 Apr 2008 09:25:11 -0400, barcaroller wrote:
> I have a text file with mixed carriage returns ('\n' and '\r\n').
>
> On Linux, both the std::string getline() global function and the
> std::iostream getline() member function are keeping some of the newlines
> in the result (I suspect they look only for the '\n').
>
> * Is there a quick way I can tell either function to gobble up both
> Windows-style and Unix-style newlines?
>
> * If not, what would be an efficient way of getting rid of them?
> Currently I use string::find_last_of("\n\r") + string::erase() but this
> is not very efficient.

A simple and quick solution; adjust it to your own needs:

#include <iostream>
#include <sstream>
#include <string>

std::istream & getline(std::istream & in, std::string & out) {
    char c;
    while(in.get(c).good()) {
        if(c == '\n') {
            // Look one character ahead and swallow a following '\r'.
            c = in.peek();
            if(in.good()) {
                if(c == '\r') {
                    in.ignore();
                }
            }
            break;
        }
        out.append(1,c);
    }
    return in;
}

int main() {
    std::istringstream strm("alpha\nbeta\n\r...\n\romega\n\n");
    for(int i = 0; strm.good(); ++i) {
        std::string line;
        getline(strm,line);
        std::cout<<i<<"\t"<<line<<std::endl;
    }
    return 0;
}

--
OU
Apr 5 '08 #3
On Sat, 05 Apr 2008 13:45:39 +0000, Obnoxious User wrote:
> A simple and quick solution, adjust it to your own needs:
>
> [code snipped]

I realized after posting that I reversed the sequence (the code looks for
"\n\r" instead of "\r\n"), so it is flawed for your needs, although easily
fixed. Ignore it.

--
OU
Apr 5 '08 #4
On 5 Apr, 15:25, "barcaroller" <barcarol...@music.net> wrote:
> I have a text file with mixed carriage returns ('\n' and '\r\n').
> On Linux, both the std::string getline() global function and
> the std::iostream getline() member function are keeping some
> of the newlines in the result (I suspect they look only for
> the '\n').

Technically, it's implementation defined. Typically, however,
yes: Unix implementations treat a single 0x0A in the stream as a
newline; Windows implementations treat either a single 0x0A or
the sequence 0x0D, 0x0A as a newline.

Most of the time, this should not be a problem. In all of the
usual encodings (at least outside of the mainframe world), the
0x0D will result in an '\r' under Unix (and probably also under
Windows, if it isn't immediately followed by a 0x0A). In the
"C" locale, and probably in all other locales, '\r' is
whitespace. So it ends up ignored with the rest of the trailing
whitespace. (The one exception is C and C++ source code; for
some reason, the standard doesn't consider '\r' as whitespace in
source code.)
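
To illustrate the whitespace point, here is a small sketch, assuming C++11 for back() and pop_back(); the helper name rtrim is hypothetical. Since std::isspace reports '\r' as whitespace in the "C" locale, a generic trailing-whitespace trim removes a stray carriage return along with spaces and tabs.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: strip trailing whitespace from a line in place.
// In the "C" locale std::isspace is true for '\r', so a leftover carriage
// return from a "\r\n" line ending is removed with the rest.
void rtrim(std::string& s) {
    // The cast avoids undefined behavior for negative char values.
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.back()))) {
        s.pop_back();
    }
}
```

Note that this also removes trailing blanks and tabs, which may or may not be what a given program wants.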

> * Is there a quick way I can tell either function to gobble
> up both Windows-style and Unix-style newlines?

Is there ever a need to?

> * If not, what would be an efficient way of getting rid of
> them? Currently I use string::find_last_of("\n\r") +
> string::erase() but this is not very efficient.

I'd use an external program (e.g. tr). In practice, if a file
is on a shared file system, and thus being read by both Windows
and Unix, it's generally best (pragmatically, at least) to stick
with the Unix conventions.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Apr 5 '08 #5
> std::string str;
> std::getline(file, str);
>
> if (str[str.size() - 1] == '\r')
>     str.resize(str.size() - 1);

And with an empty line with a Unix end of line: SEGFAULT. The string is
empty, so str[str.size() - 1] indexes out of bounds.

The code fragment should be:

if ((str.size() > 0) && (str[str.size() - 1] == '\r'))
    str.resize(str.size() - 1);
Apr 7 '08 #6
