making an istream from a char array


I'm working with two libraries. One, written
in old-school C, returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other, written in more modern C++,
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

Thanks,
John Salmon

[jsalmon@river c++]$ cat chararraytostream.cpp
#include <string>
#include <sstream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <unistd.h>   // getpid()
using namespace std;

char *getLotsOfBytes();
istream& streamParser(istream &s);
void linuxChkMem(const char *msg);

void withImplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        istringstream iss(chunk);
        linuxChkMem("After iss(p): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
    free(chunk);
    linuxChkMem("After free(p): ");
}

void withExplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        string s(chunk);
        linuxChkMem("After s(chunk): ");
        free(chunk);
        linuxChkMem("After free(p): ");
        istringstream iss(s);
        linuxChkMem("After iss(s): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
}

int main(int argc, char **argv){
    printf("with an implicit string constructor\n");
    withImplicitString();
    printf("\nwith an explicit string constructor\n");
    withExplicitString();
    return 0;
}

// On linux, tell us how much data space we're using
// in the VM.
void linuxChkMem(const char *msg){
    printf("%s", msg);
    fflush(stdout);
    char cmd[50];
    // snprintf cannot overrun cmd, unlike sprintf
    snprintf(cmd, sizeof cmd, "grep VmData /proc/%d/status", (int)getpid());
    system(cmd);
}

static const int SZ = 100*1024*1024;
// A rough approximation to getLotsOfBytes. In the
// real application, getLotsOfBytes has these characteristics:
// - it returns a malloced pointer to a NUL-terminated array of chars.
// - it is out of my control. E.g., I can't rewrite it in a way
// that might be more friendly to C++ streams.
char *getLotsOfBytes(){
    char *p = (char *)malloc(SZ);
    memset(p, ' ', SZ);                     // fill with blanks
    strcpy(p+SZ-50, "3.1415 2.718 1.414");  // three numbers and the terminating NUL
    return p;
}

// A rough approximation to streamParser. In the real
// application, streamParser takes a ref to an istream
// and does what it does. Again, I can't easily redefine
// the interface.
istream& streamParser(istream& s){
    double x, y, z;
    s >> x >> y >> z;
    printf("x: %f y: %f z: %f\n", x, y, z);
    return s;
}

[jsalmon@river c++]$ g++ -O3 chararraytostream.cpp
[jsalmon@river c++]$ a.out
with an implicit string constructor
Before getLotsOfBytes: VmData: 40 kB
After getLotsOfBytes():VmData: 102444 kB
After iss(p): VmData: 204848 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 102576 kB
After free(p): VmData: 172 kB

with an explicit string constructor
Before getLotsOfBytes: VmData: 172 kB
After getLotsOfBytes():VmData: 102576 kB
After s(chunk): VmData: 204980 kB
After free(p): VmData: 102576 kB
After iss(s): VmData: 204980 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 172 kB
[jsalmon@river c++]$

Dec 30 '06 #1
Hello John!

John Salmon wrote:
> My attempts to create an istringstream from the
> chunk of data all seem to at least double the
> amount of VM used.

std::istringstream takes a std::string. Creating this
std::string from a char array makes a copy, and that copy
is then copied again into the std::istringstream. For this purpose,
you probably don't want to use a std::istringstream. Instead,
you could use a simple homegrown stream buffer (see the code
below).

Good luck, Denise!
--- CUT HERE ---
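#include <algorithm>   // std::find
#include <istream>
#include <iostream>
#include <streambuf>
#include <string>
#include <string.h>

// Supplied by the C library in the real program (placeholder here).
char* get_huge_buffer_with_data();

// Read-only stream buffer over the existing bytes [b, e); nothing is copied.
struct membuf:
    std::streambuf
{
    membuf(char* b, char* e) { this->setg(b, b, e); }
};

int main()
{
    char* buffer = get_huge_buffer_with_data();
    membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
    std::istream in(&sbuf);
    for (std::string line; std::getline(in, line); )
        std::cout << "line: " << line << "\n";
}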
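
To make the copying concrete, here is a minimal sketch (not part of the original posts) that compares the two approaches. The helper names first_word_via_istringstream and first_word_via_membuf are made up for illustration; membuf is the same idea as above. Each helper builds its stream, then overwrites the first byte of the source buffer and reads one word. The istringstream should still see the old data because it holds its own copy, while the membuf-backed stream should see the change because it reads the caller's memory directly.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Read-only stream buffer over caller-owned memory; nothing is copied.
struct membuf : std::streambuf {
    membuf(char* b, char* e) {
        assert(b <= e);
        setg(b, b, e);  // get area is [b, e), next read position is b
    }
};

// Hypothetical helper: the stream owns a private copy of the characters.
std::string first_word_via_istringstream(char* buf) {
    std::istringstream iss(buf);  // temporary std::string, then the stream's copy
    buf[0] = 'X';                 // modify the original after the stream exists
    std::string word;
    iss >> word;
    return word;
}

// Hypothetical helper: the stream reads the caller's bytes in place.
std::string first_word_via_membuf(char* buf) {
    membuf sb(buf, buf + std::strlen(buf));
    std::istream in(&sb);
    buf[0] = 'X';                 // visible to the stream, since nothing was copied
    std::string word;
    in >> word;
    return word;
}
```

Calling both helpers on separate arrays holding "hello world" shows the difference: the first returns the original word, the second returns the word with its first letter overwritten. The flip side is that the buffer must outlive the stream and must not be freed while the stream is in use.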

Dec 30 '06 #2
John Salmon wrote:
> The chunk of data is very large.
> I'd like to feed the output of the first to
> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

The "without making a copy" part might be a little tricky with istringstream.

I'm no expert on C++ streams, but something like this might work.

#include <istream>
#include <streambuf>

// Acts as both the stream and its own read-only buffer over [begin, end).
class Xistream
    : public std::istream,
      public std::streambuf
{
public:
    Xistream( const char * begin, const char * end )
        : std::istream( this )
    {
        // setg() takes non-const pointers; the stream only reads through them.
        setg( const_cast<char *>(begin), const_cast<char *>(begin),
              const_cast<char *>(end) );
    }
};

#include <iostream>

int main()
{
    const char xx[] = "1 22 33";

    Xistream xi( xx, xx + sizeof(xx) - 1 );  // exclude the trailing NUL

    int i;
    xi >> i;

    std::cout << i << "\n";

    xi >> i;

    std::cout << i << "\n";
}
Dec 30 '06 #3
>>>>> "Denise" == Denise Kleingeist <de***************@googlemail.com> writes:

Denise> you could use a simple homegrown stream buffer (code see
Denise> below).

Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));

Thanks! This is exactly what I needed.

One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

Cheers,
John Salmon
Dec 30 '06 #4
Hello John!

John Salmon wrote:
> Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
>
> One question - what's the point of the std::find()?
>
> I don't see how std::find(buffer, buffer+strlen(buffer), 0);
> could ever be different from buffer+strlen(buffer)??

You are right: it is a leftover from a discarded attempt to use
std::find() instead of strlen()! Just use buffer + strlen(buffer)
instead.

Sorry for any confusion caused, Denise!
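
For reference, the corrected approach can be sketched end to end as follows. This is only an illustration under assumptions: makeChunk stands in for the C library call (a small malloc'ed, NUL-terminated buffer), sumThree stands in for the real parser, and parseChunkInPlace is a made-up name for the glue code. The end pointer is simply chunk + strlen(chunk), with no std::find.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <istream>
#include <streambuf>

// Read-only stream buffer over caller-owned memory; nothing is copied.
struct membuf : std::streambuf {
    membuf(char* begin, char* end) {
        assert(begin <= end);
        setg(begin, begin, end);
    }
};

// Stand-in for the real parser: it only needs a std::istream&.
double sumThree(std::istream& s) {
    double x = 0, y = 0, z = 0;
    s >> x >> y >> z;
    return x + y + z;
}

// Stand-in for the C library: blanks followed by three numbers and a NUL.
char* makeChunk() {
    const std::size_t size = 64;
    char* p = static_cast<char*>(std::malloc(size));
    assert(p != nullptr);
    std::memset(p, ' ', size);
    std::strcpy(p + size - 50, "3.1415 2.718 1.414");
    return p;
}

// Wrap the existing bytes in an istream and hand them to the parser.
double parseChunkInPlace(char* chunk) {
    membuf sbuf(chunk, chunk + std::strlen(chunk));  // end = position of the NUL
    std::istream in(&sbuf);
    return sumThree(in);
}
```

A caller would obtain the chunk, pass it to parseChunkInPlace, and free it only after the stream is gone. Note that strlen still walks the whole buffer once to find the end, but it allocates nothing.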

Dec 30 '06 #5
"John Salmon" <js*****@thesalmons.org> wrote in message
news:m3************@river.fishnet...
> Is there any way to get "inside"
> the istringstream and tell it to use the 'chunk'
> directly, rather than insisting on making a copy?

See the header <strstream>. It does exactly what you want,
and it's part of the C++ Standard (albeit a bit
old-fashioned).

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
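
As a rough sketch of the <strstream> route (not code from the original posts): std::istrstream can be constructed directly from a NUL-terminated char array and reads it in place. The names sumThree and sumFromCString are made up for illustration. Keep in mind that the header is deprecated, so compilers typically warn about it, and as far as I know it has been removed from the standard in C++26, so this assumes an earlier language mode.

```cpp
#include <cassert>
#include <istream>
#include <strstream>  // deprecated header; expect a compiler warning

// Stand-in for the real parser: it only needs a std::istream&.
double sumThree(std::istream& s) {
    double x = 0, y = 0, z = 0;
    s >> x >> y >> z;
    return x + y + z;
}

// Hypothetical glue: read a NUL-terminated array through an istrstream.
double sumFromCString(const char* chunk) {
    assert(chunk != nullptr);
    std::istrstream in(chunk);  // uses the array directly, up to the NUL
    return sumThree(in);
}
```

Because std::istrstream is a std::istream, it can be passed to any parser that takes an istream reference, with no custom stream buffer class.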
Dec 30 '06 #6
>>>>> "PJ" == P J Plauger <pj*@dinkumware.com> writes:

PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

Cheers,
John Salmon
Dec 30 '06 #7
"John Salmon" <js*****@thesalmons.org> wrote in message
news:m3************@river.fishnet...
> Other than simplicity and possible compatibility with future
> standards, is there any reason to prefer one approach over the
> other?

You should prefer strstream because:

1) it's exactly what you need

2) it's still part of the C++ Standard

3) there's no reason to believe it'll become nonstandard anytime
soon, despite the dire warnings

4) even if it does officially go away, there's not a sane vendor
who'll stop supporting it for the next decade

So what the hell.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Dec 30 '06 #8
