What does "formatted" I/O really mean?

I'm still not completely sure what's going on with C++ I/O regarding the
extractors and inserters. The following document seems a bit inconsistent:
http://gcc.gnu.org/onlinedocs/libstd...o/howto.html#1

Copying a file:

WRONG WAY:
#include <fstream>
std::ifstream IN("input_file");
std::ofstream OUT("output_file");
OUT << IN; // undefined behavior

RIGHT WAY:
//[T]he easiest way to copy the file is:
OUT << IN.rdbuf();

HOWEVER:
"First, ios::binary has exactly one defined effect, no more and no less.
Normal text mode has to be concerned with the newline characters, and the
runtime system will translate between (for example) '\n' and the
appropriate end-of-line sequence (LF on Unix, CRLF on DOS, CR on Macintosh,
etc)....

Second, using << to write and >> to read isn't going to work with the
standard file stream classes, even if you use skipws during reading. Why
not? Because ifstream and ofstream exist for the purpose of formatting, not
reading and writing. Their job is to interpret the data into text
characters, and that's exactly what you don't want to happen during binary
I/O.

BUT IT SAID:
[T]he easiest way to copy the file is:
OUT << IN.rdbuf();

Does that only apply to "text" files?

"Third, using the get() and put()/write() member functions still aren't
guaranteed to help you. These are "unformatted" I/O functions, but still
character-based. (This may or may not be what you want, see below.)"

I saw below, but don't know what I was supposed to see. Is it the endian
stuff?

If I open a file in binary mode, then f.rdbuf() >> stringstrm, is the entire
file going to be faithfully represented bit-for-bit in the
std::stringstream? If not, how will it have been changed? Note that I
made no mention of unsetting skipws here.

You may think I'm just hard-headed, and can't understand that I shouldn't use
the overloaded shift operators for unformatted data. Well, suppose someone
else were to do that, and it worked for them. Is there a potential that it
could cause problems for me? I certainly have used the above method to
read "raw" data in the past.

Also, I _do_ want to "format" the data. I want to parse an ELF file into
its elementary components.
--
If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true.-Bertrand Russell
Aug 1 '05 #1

Steven T. Hatton wrote:
HOWEVER:
"First, ios::binary has exactly one defined effect, no more and no less.
Normal text mode has to be concerned with the newline characters, and the
runtime system will translate between (for example) '\n' and the
appropriate end-of-line sequence (LF on Unix, CRLF on DOS, CR on Macintosh,
etc)....
Yes, that is indeed the case. It does not mean that if you stream with
operator<< or operator>> it will write in "binary" format.
Second, using << to write and >> to read isn't going to work with the
standard file stream classes, even if you use skipws during reading. Why
not? Because ifstream and ofstream exist for the purpose of formatting, not
reading and writing. Their job is to interpret the data into text
characters, and that's exactly what you don't want to happen during binary
I/O.
Correct, but streambuf is there underneath as no more than an array of
characters.
BUT IT SAID:
[T]he easiest way to copy the file is:
OUT << IN.rdbuf();
If you can get the length of the buffer you can also use write() which
is used for binary I/O. You must beware of one thing though - those
nasty char_traits. I was using a basic_streambuf < unsigned char > for
binary I/O and found some characters missing. It turned out it was
randomly removing 0xff characters after interpreting them as "EOF". So
I had to write my own char_traits for unsigned char and attach that to
my stream as my second template parameter (thus basic_iostream<
unsigned char, uchtraits > where uchtraits is my own "traits" class).
Then it worked.
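
For comparison, here is a quick check, assuming the standard char_traits<char> specialization (the function name is made up for illustration). The traits requirements say eof() must not compare equal to the int_type of any character, so a plain char stream should be able to tell a 0xFF byte from end-of-file as long as the code goes through the traits:

```cpp
#include <string>   // std::char_traits

// Returns true when the byte 0xFF, seen through char_traits<char>,
// stays distinguishable from the end-of-file marker.
bool byte_ff_differs_from_eof() {
    typedef std::char_traits<char> traits;
    const char c = static_cast<char>(0xFF);
    // to_int_type() widens the character to int_type; eof() must not
    // compare equal to the result for any character value.
    return !traits::eq_int_type(traits::to_int_type(c), traits::eof());
}
```

This says nothing about traits for unsigned char, which the standard does not provide; that case still needs a user-written traits class as described above.
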
"Third, using the get() and put()/write() member functions still aren't
guaranteed to help you. These are "unformatted" I/O functions, but still
character-based. (This may or may not be what you want, see below.)"
They are based on characters that have traits. You are not forced to
use char_traits<char>.

I saw below, but don't know what I was supposed to see. Is it the endian
stuff?
Nothing to do with endian stuff, except that if you used basic_fstream
(basic_iostream) on a character type of 2 bytes or more to write
integers then endian stuff might come into play. (One reason why
wchar_t is generally not used as a character. Instead one-byte
characters and codepages are used).
If I open a file in binary mode, then f.rdbuf() >> stringstrm, is the entire
file going to be faithfully represented bit-for-bit in the
std::stringstream? If not, how will it have been changed? Note that I
made no mention of unsetting skipws here.
There is no operator>> overload for basic_streambuf/filebuf.
You may think I'm just hard-headed, and can't understand that I shouldn't use
the overloaded shift operators for unformatted data. Well, suppose someone
else were to do that, and it worked for them. Is there a potential that it
could cause problems for me? I certainly have used the above method to
read "raw" data in the past.
Your own objects can use operator>> and operator<< in whatever way they
want, writing in binary format if they choose. They do not need to be
humanly readable.
Also, I _do_ want to "format" the data. I want to parse an ELF file into
its elementary components.


Then get your objects to format binary data. How is STL supposed to
know your format? If you format your data to be a fixed size then use
read() and write(). If a variable size then put a "header" section
inside and resolve any endian issues by enforcing one particular endian
notation. (Normally I would choose big-endian unless you are going to
primarily be working on a little-endian system and can optimise for
that system).

Aug 1 '05 #2
Steven T. Hatton wrote:
Second, using << to write and >> to read isn't going to work with the
standard file stream classes, even if you use skipws during reading. Why
not? Because ifstream and ofstream exist for the purpose of formatting,
not reading and writing. Their job is to interpret the data into text
characters, and that's exactly what you don't want to happen during binary
I/O.

BUT IT SAID:
[T]he easiest way to copy the file is:
OUT << IN.rdbuf();

Does that only apply to "text" files?
No, it applies to all files. The thing here is that this output operator
considers the whole sequence of characters produced by 'IN.rdbuf()' as
one (unaltered) sequence of text characters.
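
As a sketch of that copy for arbitrary files, assuming both streams are opened in binary mode so no newline translation happens (copy_file is just an illustrative name):

```cpp
#include <fstream>
#include <string>

// Copy src to dst byte for byte. Returns true on success.
bool copy_file(const std::string& src, const std::string& dst) {
    std::ifstream in(src.c_str(), std::ios::binary);
    std::ofstream out(dst.c_str(), std::ios::binary);
    if (!in || !out) {
        return false;
    }
    // The inserter taking a stream buffer pointer copies characters
    // as they are: no whitespace skipping, no parsing.
    out << in.rdbuf();
    return static_cast<bool>(out);
}
```

One caveat: this inserter sets failbit when it inserts no characters at all, so the sketch reports an empty source file as a failure.
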
"Third, using the get() and put()/write() member functions still aren't
guaranteed to help you. These are "unformatted" I/O functions, but still
character-based. (This may or may not be what you want, see below.)"

I saw below, but don't know what I was supposed to see. Is it the endian
stuff?
I don't know for sure what they wanted to refer to, except maybe the
discussion about binary formatted I/O. The actual issue which I haven't
seen addressed on this page is that the bytes in the file are converted
into characters by processing them in a locale specific way. If you want
to do binary I/O you need this conversion to have no effect. This is
done by selecting the "C" locale.
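
Roughly like this; open_binary is a hypothetical helper, and imbuing before open() is a cautious choice on my part so the file buffer never reads anything under a different locale:

```cpp
#include <fstream>
#include <locale>
#include <string>

// Open a file for binary reading with the classic "C" locale imbued.
// The locale is set before open() so it is in effect from the first read.
void open_binary(std::ifstream& in, const std::string& path) {
    in.imbue(std::locale::classic());
    in.open(path.c_str(), std::ios::in | std::ios::binary);
}
```

If the global locale was never changed, the stream already uses the "C" locale and the imbue() call is redundant but harmless.
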
If I open a file in binary mode, then f.rdbuf() >> stringstrm, is the
entire file going to be faithfully represented bit-for-bit in the
std::stringstream?
No, it is not: First of all, there is no overload for 'operator>>()'
taking a stream buffer as first argument and a stream as second
argument. Assuming you wanted to write 'f >> stringstream.rdbuf()',
this still does not work because formatted input operators start by
skipping white space unless 'skipws' is turned off. It works the other
way around, though: 'stringstream << f.rdbuf()' (assuming 'stringstream'
is a variable of an appropriate type, e.g. 'std::ostringstream').
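
A small sketch of the direction that works, with an illustrative function name (slurp) and the file opened in binary mode:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Read a whole file into a string without skipping or altering any bytes.
std::string slurp(const std::string& path) {
    std::ifstream f(path.c_str(), std::ios::binary);
    std::ostringstream contents;
    // Output operator taking a stream buffer pointer: leading whitespace
    // and embedded null bytes are copied like any other character.
    contents << f.rdbuf();
    return contents.str();
}
```

A file that cannot be opened and an empty file both yield an empty string here, so a caller who needs to tell them apart has to check the ifstream separately.
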
You may think I'm just hard-headed, and can't understand that I shouldn't
use the overloaded shift operators for unformatted data. Well, suppose
someone else were to do that, and it worked for them.
You can use essentially just one of the predefined shift operators
reasonably for binary I/O and this is the output operator taking a
stream buffer pointer. Everything else is too error prone, IMO, to
be used reasonably although you can go a long way to come close to a
working implementation. For example, you could even make the inserters
and extractors for numeric types work on a binary format by creating
appropriate 'num_put' and 'num_get' facets. This is, however, not their
intended purpose and there are still sufficient problems left. It is
easier to create a new stream hierarchy for binary I/O and it avoids
a bunch of pitfalls (e.g. forgetting to unset 'skipws' or accidental
use of operators for text formatted I/O).
I certainly have used the above method to read "raw" data in the past.
If I had to read the whole file into a container, I would read "raw"
data like this:

std::vector<char> data((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());

Of course, this tends to be pretty slow because it requires a certain
optimization to be in place which is typically not implemented. Thus,
I'm using streams for a binary format.
Also, I _do_ want to "format" the data. I want to parse an ELF file
into its elementary components.


Yes, the distinction between formatted and unformatted does not really
fit well for this. The difference is between text formatted and binary
formatted. You want a binary format. This is not directly supported by
the IOStreams hierarchy although you can still use the stream buffer
hierarchy for the actual reading/writing from a file. You just have to
take care of opening the files in binary mode and suppressing any
character conversions.
--
<mailto:di***** ******@yahoo.co m> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Aug 1 '05 #3
