Reading from a stream til EOF

Hendrik Schober

Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?

TIA;

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #1


Rodrigo Dominguez

Hendrik Schober wrote:

Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?

TIA;

Schobi

well, I'm not an expert on STL, but here are some examples

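example 1:

char c;
// get() is unformatted: every character, whitespace included, is kept
while (your_istream.get(c))
    your_string.push_back(c);

example 2:

char c;
// operator>> skips whitespace by default, so blanks and newlines are dropped
while (your_istream >> c)
    your_string.push_back(c);

example 3:

std::string your_string;
// reads one whitespace-delimited word per iteration; foo() stands in for
// whatever is done with each word
while (your_istream >> your_string)
    foo();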

--
Rodrigo Dominguez
<ro***@rorra.com.ar>
Powered Hosting
(www.powered-design.com)

Jul 22 '05 #2

Hendrik Schober

Rodrigo Dominguez <ro***@rorra.com.ar> wrote:

Hendrik Schober wrote:
Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?

TIA;

Schobi

well, I'm not an expert on STL, but here are some examples
[...]

Actually I was hoping for something
that would promise more performance.

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #3

Jonathan Turkanis

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...

Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?

I'm afraid making a copy at some point is unavoidable. I wish you
could call reserve() and then write directly into the underlying
storage, as with vector -- at least if the string had never been
copied.

Jonathan

Jul 22 '05 #4

Hendrik Schober

Jonathan Turkanis <te******@kangaroologic.com> wrote:

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...
Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?
I'm afraid making a copy at some point is unavoidable. I wish you
could call reserve() and then write directly into the underlying
storage, as with vector -- at least if the string had never been
copied.

I suppose you mean 'resize()', where you
say 'reserve()'? The problem is, I don't
see how I can find out how much there is
to read from the stream in advance.
What I'm doing right now is this:

std::string f(std::istream& is)
{
    return std::string( std::istream_iterator<char>(is)
                      , std::istream_iterator<char>() );
}

However, I suppose this goes through all
the sentries etc. for each and every char?
One other thing I was thinking about is
that 'operator>>' seems to be overloaded
for a stream buffer on the RHS. So should
this

std::stringstream ss;
is >> ss.rdbuf();
return ss.str();

do what I think? And if so, can I expect
better performance from this compared to
copying the char myself?
Jonathan

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #5

Jonathan Turkanis

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...

Jonathan Turkanis <te******@kangaroologic.com> wrote:
"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...
Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?
I'm afraid making a copy at some point is unavoidable. I wish you
could call reserve() and then write directly into the underlying
storage, as with vector -- at least if the string had never been
copied.

I suppose you mean 'resize()', where you
say 'reserve()'?

Yes.

The problem is, I don't
see how I can find out how much there is
to read from the stream in advance.

Right. That's unavoidable. An exponential growth strategy is the way
to go. You should get this automatically with string, or you can do it
yourself.

What I'm doing right now is this:

std::string f(std::istream& is)
{
    return std::string( std::istream_iterator<char>(is)
                      , std::istream_iterator<char>() );
}

You definitely don't want to do this if you're concerned with
efficiency. At the very least, you should extract the underlying
streambuf using is.rdbuf(), and read into a char array using sgetn.

However, I suppose this goes through all
the sentries etc. for each and every char?
One other thing I was thinking about is
that 'operator>>' seems to be overloaded
for a stream buffer on the RHS. So should
this

std::stringstream ss;
is >> ss.rdbuf();
return ss.str();

I would have guessed that a good implementation would implement this
as I described above, but I checked Dinkumware and it does a
character-by-character extraction. So I would use a char buffer.

(In my first response, I thought you were mainly interested in avoiding
the final copy when you call ss.str())

Jonathan

Jul 22 '05 #6

Hendrik Schober

Jonathan Turkanis <te******@kangaroologic.com> wrote:

[...]
say 'reserve()'? The problem is, I don't
see how I can find out how much there is
to read from the stream in advance.
Right. That's unavoidable. An exponential growth strategy is the way
to go. You should get this automatically with string, or you can do it
yourself.

I planned to let 'std::string' take care
of this. :)

What I'm doing right now is this:

std::string f(std::istream& is)
{
return std::string( std::istream_iterator<char>(is)
, std::istream_iterator<char>() );
}

You definitely don't want to do this if you're concerned with
efficiency.

I see. I was expecting this. I suppose
using streambuf iterators wouldn't help
much with this?

At the very least, you should extract the underlying
streambuf using is.rdbuf(), and read into a char array using sgetn.

As this avoids creating/destroying any
sentries and all the formatting?

However, I suppose this goes through all
the sentries etc. for each and every char?
One other thing I was thinking about is
that 'operator>>' seems to be overloaded
for a stream buffer on the RHS. So should
this

std::stringstream ss;
is >> ss.rdbuf();
return ss.str();

I would have guessed that a good implementation would implement this
as I described above, but I checked Dinkumware and it does a
character-by-character extraction.

Thanks for checking. We are indeed using
Dinkumware on two platforms. So this would
not help much. I should probably ask about
this in MS's std lib newsgroup, as PJP and PB
are reading and posting there.

So I would use a char buffer.

I am not sure what you mean here. Can you
elaborate?

(In my first response, I thought you were mainly interested in avoiding
the final copy when you call ss.str())

Well, actually, I would need to istream
the content later anyway. However, first
I need the size of it. (The real task is
to parse the data, which is a rather
lengthy process. OTOH the raw data itself
usually is not very big. So I thought it
would be better to lose some performance
on copying to get the size, as this would
give me a real progress bar for visual
feedback to the users.)
Jonathan

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #7

Dietmar Kuehl

"Hendrik Schober" <Sp******@gmx.de> wrote:

What I'm doing right now is this:

std::string f(std::istream& is)
{
return std::string( std::istream_iterator<char>(is)
, std::istream_iterator<char>() );
}

This is not at all what you want to do, I guess: among other things, this will
strip all whitespace from the input before putting it into the string!

However, I suppose this goes through all
the sentries etc. for each and every char?

Yes, this goes through the sentries and the preparation etc. What you
probably want to do is this:

std::string f(std::istream& is) {
    return std::string( std::istreambuf_iterator<char>(is),
                        std::istreambuf_iterator<char>() );
}

This does not go through the sentries. However, for this to be efficient,
the library has either to implement the general segmented iterator
optimization or it has to special-case this particular use in some form.
My implementation has a special case (which is pretty close to the general
optimization but is not quite there) and this is the fastest method to
read a string, especially for a file with the "C" facet: in this case it
essentially amounts to a memcpy() from a memory mapped file to the string.

One other thing I was thinking about is
that 'operator>>' seems to be overloaded
for a stream buffer on the RHS. So should
this

std::stringstream ss;
is >> ss.rdbuf();
return ss.str();

I would expect this to be the fastest approach with typical implementations:
this may bypass certain internal buffers, etc. For buffered input streams
this should at the very least process blocks of characters from buffers
directly.

do what I think? And if so, can I expect
better performance from this compared to
copying the char myself?

Go measure... I would expect the 'rdbuf()' to be significantly faster than
processing individual characters. Here is something which should also be
faster than processing individual characters:

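enum { bufsize = 8192 };
char buf[bufsize];
std::string s;
// read() returns the stream, not a count; gcount() reports how many
// characters the last call delivered, including a final partial block.
while (is.read(buf, bufsize) || is.gcount() > 0)
    s.append(buf, static_cast<std::string::size_type>(is.gcount()));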

(this code is untested and I'm somewhat humble with respect to the string
interface...).
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Jul 22 '05 #8

Hendrik Schober

Dietmar Kuehl <di***********@yahoo.com> wrote:

"Hendrik Schober" <Sp******@gmx.de> wrote:
What I'm doing right now is this:

std::string f(std::istream& is)
{
return std::string( std::istream_iterator<char>(is)
, std::istream_iterator<char>() );
}

This is not at all what you want to do, I guess: among other things, this will
strip all whitespace from the input before putting it into the string!

Yes, I found this out by now. :o>

However, I suppose this goes through all
the sentries etc. for each and every char?

Yes, this goes through the sentries and the preparation etc. What you
probably want to do is this:

std::string f(std::istream& is) {
return std::string( std::istreambuf_iterator<char>(is),
std::istreambuf_iterator<char>() );
}

This does not go through the sentries.

This was the next thing I was about to try.

However, for this to be efficient,
the library has either to implement the general segmented iterator
optimization [...]

???

[...]
std::stringstream ss;
is >> ss.rdbuf();
return ss.str();
I would expect this to be the fastest approach with typical implementations:
this may bypass certain internal buffers, etc. For buffered input streams
this should at the very least process blocks of characters from buffers
directly.

Could I do this the other way around, too?

std::stringstream ss;
ss << is.rdbuf();
return ss.str();

And if so, is there anything different in
principle or is it just down to the
particular library?

[...]
Go measure...

The problem is, I need to find a way to do
this which most likely is fast on a couple
of platforms without being able to profile
it on each one.

I would expect the 'rdbuf()' to be significantly faster than
processing individual characters.

I see.

Here is something which should also be
faster than processing individual characters:

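enum { bufsize = 8192 };
char buf[bufsize];
std::string s;
// read() returns the stream, not a count; gcount() reports how many
// characters the last call delivered, including a final partial block.
while (is.read(buf, bufsize) || is.gcount() > 0)
    s.append(buf, static_cast<std::string::size_type>(is.gcount()));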

(this code is untested and I'm somewhat humble with respect to the string
interface...).

The good old char buf read functions. I
wonder why it is so hard to do something
efficiently without having to go back to
C-ish ways.

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #9

tom_usenet

On Wed, 25 Feb 2004 23:05:55 +0100, "Hendrik Schober"
<Sp******@gmx.de> wrote:

Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?

I've posted a few solutions to this in the past:

http://www.google.com/groups?selm=3d....easynet.co.uk

There are lots more ways, and the most efficient somewhat depends on
the library implementation in question.

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Jul 22 '05 #10

Hendrik Schober

tom_usenet <to********@hotmail.com> wrote:

[...]
I've posted a few solutions to this in the past:

http://www.google.com/groups?selm=3d....easynet.co.uk

I didn't think of seeking through a
stream to get its size! Of all the
reasons I wanted to do this I did
manage to eliminate all except that
I need the size of the data to be
read from the stream. Since you just
showed me how to get this, I won't
even need to read the whole thing
into a string anymore!

There are lots more ways, and the most efficient somewhat depends on
the library implementation in question.

Yes. What I wanted was a solution
that has good performance on most
platforms. However, I think I don't
need it anymore. :)
Tom

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #11

tom_usenet

On Thu, 26 Feb 2004 16:33:55 +0100, "Hendrik Schober"
<Sp******@gmx.de> wrote:

tom_usenet <to********@hotmail.com> wrote:
[...]
I've posted a few solutions to this in the past:

http://www.google.com/groups?selm=3d....easynet.co.uk

I didn't think of seeking through a
stream to get its size! Of all the
reasons I wanted to do this I did
manage to eliminate all except that
I need the size of the data to be
read from the stream. Since you just
showed me how to get this, I won't
even need to read the whole thing
into a string anymore!

There are a couple of provisos.

Firstly, opening the stream in binary mode is likely to give you a
better result (e.g. the number of bytes in the file) - text mode
sometimes has funny ideas about where a file ends on some OSes.

Secondly, it won't work for files whose length won't fit in a
std::streamoff (e.g. bigger than, say, 2GB).

Finally, don't forget you can just use a std::filebuf and cut out the
fstream entirely.

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Jul 22 '05 #12

Hendrik Schober

tom_usenet <to********@hotmail.com> wrote:

[...]
Firstly, opening the stream in binary mode is likely to give you a
better result (e.g. the number of bytes in the file) - text mode
sometimes has funny ideas about where a file ends on some OSes.

Is there anything worse to be expected than
the "\r\n" problem? As this is just for
progress indication for the users, accuracy
is not as important.

Secondly, it won't work for files whose length won't fit in a
std::streamoff (e.g. bigger than, say, 2GB).

Yes. But I wouldn't have thought of loading
these into a string anyway. :)

Finally, don't forget you can just use a std::filebuf and cut out the
fstream entirely.

How do I read a line from a streambuf?
Tom

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #13

Dietmar Kuehl

Hendrik Schober wrote:

Dietmar Kuehl <di***********@yahoo.com> wrote:
std::string f(std::istream& is) {
    return std::string( std::istreambuf_iterator<char>(is),
                        std::istreambuf_iterator<char>() );
}

However, for this to be efficient,
the library has either to implement the general segmented iterator
optimization [...]

Well, essentially, a streambuf iterator iterates over buffers of
characters. Sure, it is always the same buffer, but just envision each
fill of the buffer as a separate one. Now, each of these buffers can be
processed as a chunk making up a segment of the overall sequence.
Taking advantage of this view results in faster code because rather
than making two checks in each iteration, there is just one. Also, it
is possible to unroll the loop even further because the sizes of the
segments are known in advance, so a check is needed only for
something like every 100th character. Without this optimization,
processing the stream buffers directly will work more efficiently because
that processing does just this, only more naturally (at least, I would
expect it from most implementations).

The general principle can also be applied to other kinds of sequences
which are similarly segmented. 'std::deque's and hashes using lists
for each bucket come to mind.

Could I do this the other way around, too?

std::stringstream ss;

std::ostringstream ss;
ss << is.rdbuf();
return ss.str();

This is how I'm normally writing it. The direction should not really
matter and the same function should be used underneath.
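
Wrapped into a complete function, the idiom above might look like the following sketch. The name 'read_all_rdbuf' is an illustrative choice.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper wrapping the 'ss << is.rdbuf()' idiom.
std::string read_all_rdbuf(std::istream& is)
{
    std::ostringstream ss;
    // operator<<(streambuf*) copies characters until the source is exhausted.
    // If it inserts no characters at all (empty input), it sets failbit on ss,
    // which is harmless here because ss is discarded after str().
    ss << is.rdbuf();
    return ss.str();
}
```

One thing to keep in mind: because the copy bypasses the input stream's own member functions, the state flags of 'is' are not updated by this call.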

The problem is, I need to find a way to do
this which most likely is fast on a couple
of platforms without being able to profile
it on each one.

But you should get a general feeling which things work fast and which
don't by trying out a couple. Actually, I'm aware of only five
different libraries being in wider use:
- Dinkumware (eg. shipping with MSVC++)
- libstdc++ (shipping with gcc)
- Metrowerks' library shipping with their compiler
- RogueWave (used to ship eg. with Sun CC)
- STLport (a free drop-in library)

I'm unaware of any other standard C++ library shipping with a commercial
compiler (ObjectSpace dropped their library and mine was never shipping
with anything; is there any other reasonably complete standard library
implementation still in use?)

The good old char buf read functions. I
wonder why it is so hard to do something
efficiently without having to go back to
C-ish ways.

Well, the segmented iterator optimization requires quite a bit of
machinery to work. It gives a nice abstract interface to an efficient
implementation. Just, nobody does it because the library implementers are
kept busy with all kinds of other stuff and optimizations. The low-level
stuff is some wiring you can apply yourself...
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Jul 22 '05 #14

Hendrik Schober

Dietmar Kuehl <di***********@yahoo.com> wrote:

[...]
Well, essentially, a streambuf iterator [...]

Thanks for the enlightenment!

Could I do this the other way around, too?

std::stringstream ss;

std::ostringstream ss;
ss << is.rdbuf();
return ss.str();

This is how I'm normally writing it. The direction should not really
matter and the same function should be used underneath.

I see.

The problem is, I need to find a way to do
this which most likely is fast on a couple
of platforms without being able to profile
it on each one.

But you should get a general feeling which things work fast and which
don't by trying out a couple. Actually, I'm aware of only five
different libraries being in wider use:
- Dinkumware (eg. shipping with MSVC++)
- libstdc++ (shipping with gcc)
- Metrowerks' library shipping with their compiler
- RogueWave (used to ship eg. with Sun CC)
- STLport (a free drop in place library)

Yes, but then there are all the different
versions of these libraries. And once a
piece of code works, nobody will go into
it and check whether with the newest
version this or that could be optimized
using another technique...

I'm unaware of any other standard C++ library shipping with a commercial
compiler (ObjectSpace dropped their library and mine was never shipping
with anything;

Warum eigentlich? [Why is that?]

is there any other reasonably complete standard library
implementation still in use?)

The good old char buf read functions. I
wonder why it is so hard to do something
efficiently without having to go back to
C-ish ways.

Well, the segmented iterator optimization requires quite a bit of
machinery to work. It gives a nice abstract interface to an efficient
implementation. Just, nobody does it because the library implementers are
kept busy with all kinds of other stuff and optimizations. The low-level
stuff is some wiring you can apply yourself...

But I wonder whether it is a flaw in the
design if something like reading into a
string cannot easily be done fast with
the recommended approach.

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #15

Jonathan Turkanis

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...

tom_usenet <to********@hotmail.com> wrote:
[...]
I didn't think of seeking through a
stream to get its size! Of all the
reasons I wanted to do this I did
manage to eliminate all except that
I need the size of the data to be
read from the stream. Since you just
showed me how to get this, I won't
even need to read the whole thing
into a string anymore!

This is fine depending on the stream type. As I'm sure you know, an
arbitrary stream doesn't have to be arbitrarily-positional. If you
know that the streams you will be using are arbitrarily-positional,
you're all set.

You could try seeking, and then testing whether the result is a valid
stream position. If it's not, you could then use another method.
However, I'm not sure it's guaranteed that a stream will be in a valid
state after a failed seek.

Jonathan

Jul 22 '05 #16

Hendrik Schober

Jonathan Turkanis <te******@kangaroologic.com> wrote:

[...]
I didn't think of seeking through a
stream to get its size! Of all the
reasons I wanted to do this I did
manage to eliminate all except that
I need the size of the data to be
read from the stream. Since you just
showed me how to get this, I won't
even need to read the whole thing
into a string anymore!

This is fine depending on the stream type. As I'm sure you know, an
arbitrary stream doesn't have to be arbitrarily-positional. If you
know that the streams you will be using are arbitrarily-positional,
you're all set.

You could try seeking, and then testing whether the result is a valid
stream position. If it's not, you could then use another method.
However, I'm not sure it's guaranteed that a stream will be in a valid
state after a failed seek.

How do I detect a failed positioning?

Mhmm. Right now it will be file streams
and string streams only which I assume
to be positional. I think I will try
this and put an assert to be triggered
if anything fails.
Jonathan

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #17

Jonathan Turkanis

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...

Jonathan Turkanis <te******@kangaroologic.com> wrote:
[...]
I didn't think of seeking through a
stream to get its size! Of all the
reasons I wanted to do this I did
manage to eliminate all except that
I need the size of the data to be
read from the stream. Since you just
showed me how to get this, I won't
even need to read the whole thing
into a string anymore!

This is fine depending on the stream type. As I'm sure you know, an arbitrary stream doesn't have to be arbitrarily-positional. If you
know that the streams you will be using are arbitrarily-positional, you're all set.

You could try seeking, and then testing whether the result is a valid stream position. If it's not, you could then use another method.
However, I'm not sure it's guaranteed that a stream will be in a valid state after a failed seek.

How do I detect a failed positioning?

Test it against -1.

Jonathan

Jul 22 '05 #18

Hendrik Schober

Jonathan Turkanis <te******@kangaroologic.com> wrote:

[...]
Test it against -1.

Thanks!
Jonathan

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #19

Dietmar Kuehl

Hendrik Schober wrote:

Dietmar Kuehl <di***********@yahoo.com> wrote:
I'm unaware of any other standard C++ library shipping with a commercial
compiler (ObjectSpace dropped their library and mine was never shipping
with anything;

Warum eigentlich? [Why is that?]

I guess you chose German because you were interested more in my personal
reasons rather than in ObjectSpace's fate. Actually, these things are
somewhat related: at some point, ObjectSpace, in persona Lois Goldthwaite,
decided to take my help and incorporate my IOStreams and locales into
their product. I invested some time into fitting things together and
shortly afterwards ObjectSpace stopped their standard C++ library effort
and I decided to focus on other things for now. My standard library
implementation is still not complete and the bits missing would take
quite some effort while nobody seems to be that interested.

My latest effort in the direction of the standard library was the
implementation of property map based algorithms: this is what I want to
get into the next round of standardization. However, even this stuff is
currently mostly idle and not entirely finished... (a definite proof of
concept can be downloaded from
<http://www.dietmar-kuehl.de/cool/cool-20030106.tar.gz>; the major
omission is documentation...).
But I wonder whether it is a flaw in the
design if something like reading into a
string cannot easily be done fast with
the recommended approach.

There is some complexity involved. It is not that bad, actually, but it
has to go somewhere. It is just that nobody ventured to really do the
implementation for whatever bad reasons people had (eg. I'm too lazy
and I have other stuff to do). Actually, the whole generic programming
stuff is about doing optimizations in a centralized form - however, the
optimizations still have to be done. I would like to setup algorithms
in a form allowing easy experimenting with optimizations (and this is
how I tried to implement the algorithms) and have multiple people work
on the optimizations. One problem with this which I haven't resolved is
how to actually test the performance.
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Jul 22 '05 #20

Hendrik Schober

Dietmar Kuehl <di***********@yahoo.com> wrote:

Hendrik Schober wrote:
Dietmar Kuehl <di***********@yahoo.com> wrote:
I'm unaware of any other standard C++ library shipping with a commercial
compiler (ObjectSpace dropped their library and mine was never shipping
with anything;

Warum eigentlich? [Why is that?]

I guess you chose German because you were interested more in my personal
reasons rather than in ObjectSpace's fate. Actually, these things are
somewhat related: at some point [...]

I see. Thanks for answering this here.
[...](a definite proof of
concept can be downloaded from [...]

Um, I will look at this once we have passed this
milestone... :(

But I wonder whether it is a flaw in the
design if something like reading into a
string cannot easily be done fast with
the recommended approach.

There is some complexity involved. It is not that bad, actually, but it
has to go somewhere. It is just that nobody ventured to really do the
implementation for whatever bad reasons people had (eg. I'm too lazy
and I have other stuff to do). Actually, the whole generic programming
stuff is about doing optimizations in a centralized form - however, the
optimizations still have to be done. I would like to setup algorithms
in a form allowing easy experimenting with optimizations (and this is
how I tried to implement the algorithms) and have multiple people work
on the optimizations. One problem with this which I haven't resolved is
how to actually test the performance.

But this isn't what I mean. I think, much (if not
most) of what makes up the stream library's design
is driven by history. And things that seemed to
have been good back then might not seem as good
anymore today.
For example, stream buffers do two things: They
manage a buffer for buffering, and they manage the
IO to some specific device. IMO this violates the
design principle that every class should serve
exactly one purpose. The result is that it is
unnecessarily hard to adapt streams to an IO
device that I have read/write functions for. I
would expect to be able to derive from some class
(or maybe use template magic to avoid virtual
function calls) and implement its 'read()' and
'write()' functions to use the ones for my device.
Instead I have to create a stream buffer with all
the complicated mechanics involved.
And reading strings is just another example: When
the stream idea was born, IO was done using C-
strings. Consequently, the design was fit for
C-strings. Later, 'std::string' entered the picture
and some more functions/operators were added. But
the streams design wasn't adapted to fit strings.
Thus using C-strings is often easier/better than
using C++-string.
I think that's sad for a C++ IO lib.

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #21

Dietmar Kuehl

Hendrik Schober wrote:

For example, stream buffers do two things: They
manage a buffer for buffering, and they manage the
IO to some specific device.

Well, actually this is not really the correct view, IMO.
Stream buffers abstract from an "external representation".
Period. They do one thing. Since characterwise access to
an external representation is in almost all cases slow,
the stream buffer class offers the service of using a
buffer. Use of buffers generally improves performance of
derived stream buffers for a variety of reasons.

Also, the class 'std::basic_streambuf' just does one thing:
it provides buffering capability to derived classes.

IMO this violates the
design principle that every class should serve
exactly one purpose. The result is that it is
unnecessarily hard to adapt streams to an IO
device that I have read/write functions for. I
would expect to be able to derive from some class
(or maybe use template magic to avoid virtual
function calls) and implement its 'read()' and
'write()' functions to use the ones for my device.

What is wrong with this assumption for the general base of
the streaming library is that not all classes benefit from
buffering and that there are external representations which
use different abstractions than read and write functions.
For example, memory mapped files would only set up a buffer
without any read and write function at all. On the other
extreme, some interfaces are actually driven characterwise
(although this is indeed rare). The external representations
providing read and write style access are pretty common but
not the only ones supported: the abstraction for stream
buffers are conceptually infinite sequences of characters.

Instead I have to create a stream buffer with all
the complicated mechanics involved.

You need to implement two functions: 'overflow()' and
'underflow()'! Big deal! Especially as these are really
not that hard to implement if you obtain characters from a
'read()' style of function and dump them to a 'write()'
style one. The whole stream buffer has something like 20
lines of code.

And reading strings is just another example: When
the stream idea was born, IO was done using C-
strings. Consequently, the design was fit for
C-strings. Later, 'std::string' entered the picture
and some more functions/operators were added. But
the streams design wasn't adapted to fit strings.

??? There is an 'operator>>()' overload for reading words,
'std::getline()' for reading lines, and a constructor of
strings for reading whole sequences. This seems to be
pretty good. Admittedly, you need to watch out for
performance bottlenecks when reading whole sequences but
this is exactly the problem I talked about before. What
do you want instead?

Thus using C-strings is often easier/better than
using C++-string.

Strangely enough I'm using C-strings only very rarely
and most of the time I'm using them it is due to some
form of legacy. Note, that the use in a previous article
was not about use of strings: it was a use of a buffer.
This could have been a 'std::vector' instead but a fixed
sized built-in array was just appropriate.

There is indeed no direct interface for using strings as
buffers with IOStreams. But strings don't really make
good buffers. Fixed-size built-in arrays seem to be much
more appropriate in this case.
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Jul 22 '05 #22

Hendrik Schober

tom_usenet <to********@hotmail.com> wrote:

On Wed, 25 Feb 2004 23:05:55 +0100, "Hendrik Schober"
<Sp******@gmx.de> wrote:
Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?
I've posted a few solutions to this in the past:

http://www.google.com/groups?selm=3d....easynet.co.uk

There are lots more ways, and the most efficient somewhat depends on
the library implementation in question.

FTR, I just found another one:

const std::istream::char_type chEof = std::istream::traits_type::eof();
std::string f( std::istream& is )
{
std::string tmp;
std::getline( is, tmp, chEof );
return tmp;
}

Tom

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #23

Nick Hounsome

"Hendrik Schober" <Sp******@gmx.de> wrote in message
news:c1**********@news1.transmedia.de...

tom_usenet <to********@hotmail.com> wrote:
On Wed, 25 Feb 2004 23:05:55 +0100, "Hendrik Schober"
<Sp******@gmx.de> wrote:
Hi,

I have a 'std::istream' and need to read
its whole contents into a string. How can
I do this?
I've posted a few solutions to this in the past:

http://www.google.com/groups?selm=3d....easynet.co.uk

There are lots more ways, and the most efficient somewhat depends on
the library implementation in question.

FTR, I just found another one:

Sorry - this is wrong.

const std::istream::char_type chEof = std::istream::traits_type::eof();

Hopefully your compiler will warn you that this is a narrowing conversion.
eof() returns int_type, where int_type is required to hold
"all of the valid characters of char_type plus the end-of-file value eof()".

std::string f( std::istream& is )
{
std::string tmp;
std::getline( is, tmp, chEof );

This probably reads up to a character that has the same low-order bits as EOF.
If EOF is -1 then this will PROBABLY stop at a 0xff character ('ÿ' in
Latin-1) in the file.
The behaviour is actually undefined.

return tmp;
}

Tom

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #24

Hendrik Schober

Nick Hounsome <nh***@blueyonder.co.uk> wrote:

[...]
Sorry - this is wrong.
Yes, you're right. I just wanted to
post that I had found this out the
hard way... :(
[...]

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Jul 22 '05 #25
