I realize this is a somewhat platform specific question, but I think it is
still of general enough interest to ask it here ... if I am wrong I guess I
will find out 8*).
As we all know, DOS uses two characters (carriage-return and line-feed), to
signal the end of a line, while UNIX uses only one (line-feed). When using
getline in C++, one can only specify a single character as the terminator
(default is '\n'), so if you read a line of text from a DOS-style text file
into a string, there is still a carriage return on the end of it. This then
causes problems, particularly if I want to later concatenate two strings
read in this way.
Perhaps Windoze-based compilers automatically set things up so that both of
the terminator characters are removed and added as needed, but I am using
g++ on cygwin, and I have to deal with this myself. So, is there a general
technique for dealing with this? I don't really want to have to check the
last character each time I read in a string with getline, and remove it if
it is a carriage-return. Actually, I don't even know how I would do that
offhand .. I guess look up ^CR in an ASCII table and check it using the
octal value? Any help would be appreciated.
TIA,
Dave Moore
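On the question of how to test for a carriage return: no ASCII table lookup should be needed, since C++ has the escape sequence '\r' for it. The sketch below is only an illustration. The helper name ends_with_cr is made up for this example, and the static_assert lines (C++11) assume an ASCII-based character set, where CR happens to be octal 015.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if 'line' ends with a carriage return.
// '\r' is the portable spelling of CR, so no numeric code is needed.
bool ends_with_cr(const std::string& line)
{
    return !line.empty() && line[line.size() - 1] == '\r';
}

// Assumption: ASCII-based execution character set (CR == 13).
static_assert('\r' == '\015', "CR is octal 015 in ASCII");
static_assert('\r' == '\x0D', "CR is hex 0D in ASCII");

// Small usage check; call it from your own main().
void demo_ends_with_cr()
{
    assert(ends_with_cr("text\r"));
    assert(!ends_with_cr("text"));
    assert(!ends_with_cr(""));
}
```

Writing '\r' rather than the octal value keeps the test readable and independent of the character set.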
Dave Moore wrote: I realize this is a somewhat platform specific question, but I think it is still of general enough interest to ask it here ... if I am wrong I guess I will find out 8*).
As we all know, DOS uses two characters (carriage-return and line-feed), to signal the end of a line, while UNIX uses only one (line-feed). When using getline in C++, one can only specify a single character as the terminator (default is '\n'), so if you read a line of text from a DOS-style text file into a string, there is still a carriage return on the end of it. This then causes problems, particularly if I want to later concatenate two strings read in this way.
Perhaps Windoze-based compilers automatically set things up so that both of the terminator characters are removed and added as needed, but I am using g++ on cygwin, and I have to deal with this myself. So, is there a general technique for dealing with this? I don't really want to have to check the last character each time I read in a string with getline, and remove it if it is a carriage-return. Actually, I don't even know how I would do that offhand .. I guess look up ^CR in an ASCII table and check it using the octal value? Any help would be appreciated.
If you open a file that you know _may_ contain \r, just discard them
from the lines before you process your lines further.
V
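One possible reading of "just discard them" is to erase every '\r' from each line after reading it. A minimal sketch follows, assuming the helper name discard_cr (invented here) and using an in-memory stream as a stand-in for a DOS-style file.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: removes every carriage return from 'line',
// wherever it occurs (erase-remove idiom).
void discard_cr(std::string& line)
{
    line.erase(std::remove(line.begin(), line.end(), '\r'), line.end());
}

// Small usage check; call it from your own main().
void demo_discard_cr()
{
    // Stand-in for a file with CR/LF line endings.
    std::istringstream in("first\r\nsecond\r\n");
    std::string a, b;
    std::getline(in, a);   // a still ends with '\r' here
    std::getline(in, b);
    discard_cr(a);
    discard_cr(b);
    assert(a + b == "firstsecond");
}
```

Note that this also drops a '\r' in the middle of a line, which may or may not be what you want.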
Dave Moore wrote: Perhaps Windoze-based compilers automatically set things up so that
both of the terminator characters are removed and added as needed, but I am
using g++ on cygwin, and I have to deal with this myself.
The OS should be doing it. I believe there is some hackery involving
the mount mode in cygwin.
So, is there a general technique for dealing with this?
Usually you open your file in text mode. With cygwin, I believe the
directory also has to be 'mounted' in text mode, or something to
that effect. Read the cygwin docs about mounting.
"Dave Moore" <dt*****@email. unc.edu> wrote in message
news:37*************@individual.net... I realize this is a somewhat platform specific question, but I think it is still of general enough interest to ask it here ... if I am wrong I guess
I will find out 8*).
As we all know, DOS uses two characters (carriage-return and line-feed),
to signal the end of a line, while UNIX uses only one (line-feed). When
using getline in C++, one can only specify a single character as the terminator (default is '\n'), so if you read a line of text from a DOS-style text
file into a string, there is still a carriage return on the end of it. This
then causes problems, particularly if I want to later concatenate two strings read in this way.
Perhaps Windoze-based compilers automatically set things up so that both
of the terminator characters are removed and added as needed, but I am using g++ on cygwin, and I have to deal with this myself. So, is there a
general technique for dealing with this? I don't really want to have to check the last character each time I read in a string with getline, and remove it if it is a carriage-return. Actually, I don't even know how I would do that offhand .. I guess look up ^CR in an ASCII table and check it using the octal value? Any help would be appreciated.
Standard C++ defines a single (abstract) type 'char' value which
denotes 'newline' ('\n'). It does not specify its numeric value
or a mapping to a particular character set. The implementation is
responsible for translating between an external 'end-of-line'
indicator and '\n'. (This happens for streams opened in 'text mode'
(the default).)
If a stream is opened in 'binary mode', no such translation occurs
(however, there may still be a conversion from the 'external' to
'internal' [i.e. in-memory] encoding). IOW in 'binary mode',
'newline' has no meaning.
If you're opening your streams in text mode, and your compiler
is failing to do the proper translations to/from '\n', then
it's non-compliant, broken, or not configured correctly.
Everything You Ever Wanted To Know About C++ Streams: http://www.langer.camelot.de/iostreams.html
-Mike
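To illustrate the binary-mode point, here is a rough sketch that writes CR/LF bytes to a file with no translation and reads the first line back, also with no translation. The function names and the scratch file name are made up for the example, and it assumes an ASCII-based system.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes two CR/LF-terminated lines to 'path' with no translation.
void write_dos_file(const char* path)
{
    std::ofstream out(path, std::ios::binary);
    out << "one\r\n" << "two\r\n";
}

// Reads the first line of 'path' in binary mode, so nothing is translated.
std::string first_line_binary(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    std::string line;
    std::getline(in, line);   // stops at '\n'; the '\r' stays in 'line'
    return line;
}

// Small usage check; call it from your own main().
void demo_binary_mode()
{
    const char* path = "crlf_demo.txt";   // hypothetical scratch file
    write_dos_file(path);
    assert(first_line_binary(path) == "one\r");
}
```

In text mode the result depends on what the implementation treats as its native end-of-line, which is the behavior described above.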
"Mike Wahler" <mk******@mkwah ler.net> wrote in message
news:PM*****************@newsread1.news.pas.earthlink.net... If you're opening your streams in text mode, and your compiler is failing to do the proper translations to/from '\n', then it's non-compliant, broken, or not configured correctly.
I think I spoke too soon. Rereading your message, I see
you're trying to read a 'foreign' file format. This means
you'll have to manage the translations yourself. Or alternatively
there exist utilities which can convert files between "DOS text"
and "UNIX text" formats. That might make things easier for you.
Check google.
-Mike
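If you would rather do the "DOS text" to "UNIX text" conversion in your own code, a sketch along these lines might do. The name crlf_to_lf is invented for this example. For real files, both streams would presumably be opened with std::ios::binary so that the runtime does not translate anything itself.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: copies 'in' to 'out', turning each CR/LF pair
// into a single LF. A lone CR not followed by LF is copied unchanged.
void crlf_to_lf(std::istream& in, std::ostream& out)
{
    char c;
    while (in.get(c))
    {
        if (c == '\r' && in.peek() == '\n')
            continue;   // drop the CR; the LF is copied on the next pass
        out.put(c);
    }
}

// Small usage check; call it from your own main().
void demo_crlf_to_lf()
{
    std::istringstream in("a\r\nb\r\nlone\rcr");
    std::ostringstream out;
    crlf_to_lf(in, out);
    assert(out.str() == "a\nb\nlone\rcr");
}
```

Converting once up front means the rest of the program can use plain std::getline without per-line checks.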
Dave Moore wrote: I realize this is a somewhat platform specific question, but I think it is still of general enough interest to ask it here.
This is a perfectly valid C++ question.
If I am wrong, I guess I will find out 8*).
As we all know, DOS uses two characters (carriage-return and line-feed), to signal the end of a line, while UNIX uses only one (line-feed). When using getline in C++, one can only specify a single character as the terminator (default is '\n'), so, if you read a line of text from a DOS-style text file into a string, there is still a carriage return on the end of it. This, then, causes problems, particularly if I want to later concatenate two strings read in this way.
Perhaps Windoze-based compilers automatically set things up so that both of the terminator characters are removed and added as needed, but I am using g++ on cygwin, and I have to deal with this myself.
No! The GNU C++ compiler on cygwin will do this for you too.
So, is there a general technique for dealing with this?
Open the file in text mode. This converts
the carriage-return/line-feed sequence to a line-feed on input and
the line-feed to a carriage-return/line-feed sequence on output.
I don't really want to have to check the last character each time I read in a string with getline and remove it if it is a carriage-return. Actually, I don't even know how I would do that offhand. I guess look up ^CR in an ASCII table and check it using the octal value? Any help would be appreciated.
If you need to see the carriage-return/linefeed sequence
in your program, open the file in binary mode:
std::ifstream input("input_file_name", std::ios::binary);
"Dave Moore" <dt*****@email. unc.edu> wrote in message
news:37*************@individual.net... I realize this is a somewhat platform specific question, but I think it is still of general enough interest to ask it here ... if I am wrong I guess
I will find out 8*).
As we all know, DOS uses two characters (carriage-return and line-feed),
to signal the end of a line, while UNIX uses only one (line-feed). When
using getline in C++, one can only specify a single character as the terminator (default is '\n'), so if you read a line of text from a DOS-style text
file into a string,
using a UNIX implementation
there is still a carriage return on the end of it. This then causes problems, particularly if I want to later concatenate two strings read in this way.
Perhaps Windoze-based compilers automatically set things up so that both
of the terminator characters are removed and added as needed,
Using a Windows implementation, 'end of line' indicators
in the file are automatically translated to '\n' (which
C++ does not assign a specific value).
but I am using g++ on cygwin, and I have to deal with this myself. So, is there a
general technique for dealing with this? I don't really want to have to check the last character each time I read in a string with getline, and remove it if it is a carriage-return. Actually, I don't even know how I would do that offhand .. I guess look up ^CR in an ASCII table and check it using the octal value? Any help would be appreciated.
#include <fstream>
#include <iostream>
#include <istream>
#include <string>

/*
  Extracts a string from the stream 'is', using
  default terminator '\n', and stores the string
  in 'line'. If the last character of the extracted
  string is equal to 'rem', removes it. Returns a
  reference to 'is'.
*/
std::istream& get_xlate_line(std::istream& is,
                             std::string& line,
                             char rem = '\r')
{
    std::getline(is, line);
    if(!line.empty())
    {
        std::string::iterator e(line.end() - 1);
        if(*e == rem)
            line.erase(e);
    }
    return is;
}

/* extract and output strings from a file */
int main()
{
    std::ifstream ifs("filename");
    std::string line;
    while(get_xlate_line(ifs, line))
        std::cout << line << '\n';
    return 0;
}
-Mike
"Noah Roberts" <nr******@stmar tin.edu> wrote in message
news:11*********************@c13g2000cwb.googlegroups.com... Dave Moore wrote:
Perhaps Windoze-based compilers automatically set things up so that both of the terminator characters are removed and added as needed, but I am using g++ on cygwin, and I have to deal with this myself.
The OS should be doing it. I believe there is hackery with mounting mode in cygwin.
So, is there a general technique for dealing with this?
Usually you open your file in text mode. With cygwin I believe that folder or whatever has to be 'mounted' in text mode as well...or something of that order. Read docs in cygwin about mounting.
It seems that something a bit different is going on, but your reply led me
in the right direction. I was compiling my executable to use the cygwin
run-time environment (cygwin.dll), rather than the windows environment. I
am pretty sure I set up my cygwin installation to use unix-style text files,
so that might well explain the confusion.
Once I compiled my program to use the windows environment
(using -mno-cygwin, as specified in the cygwin FAQ), everything was groovy.
Thanks for the suggestion!
Dave Moore
Noah Roberts wrote: Dave Moore wrote:
Perhaps Windoze-based compilers automatically set things up so that both of the terminator characters are removed and added as needed, but I am using g++ on cygwin, and I have to deal with this myself.
The OS should be doing it. I believe there is hackery with mounting mode in cygwin.
The OS might do it, but I rarely see that. The translation is done in the
language runtime library. What is most likely causing the confusion here is that
the CYGWIN environment has the compiler thinking that there is no conversion
needed, but he's giving it files from the DOS world.