iostream and memory-mapped file

Hi there,
I am looking for the fastest way to load a BIG string and parse it in a
given format. I have an extern function which returns a very large
(char *) string. Now I am going to parse it with an iterator as follows:

char *str = return_a_big_size_str();
istringstream ss(string(str), istringstream::in);
istreambuf_iterator<char> bit(ss), eit;
parsing(bit, eit);

I found the code shown above to be very inefficient because of the size
of str.

BTW, I also saved the whole string to a file, say str.txt, and then
loaded the file with ifstream:

std::ifstream input("str.txt");
std::istreambuf_iterator<char> bit(input), eit;
parsing(bit, eit);

I can't believe that the latter program is faster than the former.
Anyway, I think memory-mapped IO may be a better choice. However, I
have no idea how to associate a memory-mapped file with an ifstream.

Feb 21 '06 #1
It's slow because you are making a lot of copies.

Is your parser templatized to accept any kind of char iterator? Then it
would be as easy as parsing(str, str + len), with len being the length
of the string. No copying required.
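
To make that concrete, here is a minimal sketch. It assumes parsing() is a function template over an iterator pair; the newline-counting body is only a hypothetical stand-in for the real parser, and parse_in_place / parse_via_stream are names made up for the comparison.

```cpp
#include <cstddef>
#include <cstring>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical stand-in for the poster's parsing(): counts '\n'
// characters. Any input iterator over char will do.
template <typename Iter>
std::size_t parsing(Iter first, Iter last)
{
    std::size_t lines = 0;
    for (; first != last; ++first) {
        if (*first == '\n') ++lines;
    }
    return lines;
}

// Zero-copy path: the char pointer itself serves as the iterator.
inline std::size_t parse_in_place(char const* str)
{
    std::size_t const len = std::strlen(str);
    return parsing(str, str + len);
}

// The original approach, for comparison: the data is copied into a
// std::string and then again into the stringstream's own buffer.
inline std::size_t parse_via_stream(char const* str)
{
    std::istringstream ss{std::string(str)};
    std::istreambuf_iterator<char> bit(ss), eit;
    return parsing(bit, eit);
}
```

Both functions give the same result; the first one just never touches the heap. If the length is already known, pass it in and skip the strlen() as well.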

Feb 21 '06 #2
TB
wa***@wakun.com wrote:
> Hi there,
> I am looking for the fastest way to load a BIG string and parse it in a
> given format. I have an extern function which returns a very large
> (char *) string. Now I am going to parse it with an iterator as follows:

IO is slow, accept it.

> char *str = return_a_big_size_str();
> istringstream ss(string(str), istringstream::in);
> istreambuf_iterator<char> bit(ss), eit;
> parsing(bit, eit);
>
> I found the code shown above to be very inefficient because of the size
> of str.

You could always write your own iterator:

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <stdexcept>

// Input iterator over a NUL-terminated C string. A null pointer serves
// as the end-of-sequence marker, so a default-constructed iterator is
// the "end" iterator.
class cstringiterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef char                    value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef char const *            pointer;
    typedef char                    reference;

    // An empty string compares equal to the end iterator right away.
    cstringiterator(char const * cstring = 0)
        : d_cstring(cstring && *cstring ? cstring : 0) { }

    value_type operator*() const {
        if (!d_cstring) throw std::runtime_error("Access Denied");
        return *d_cstring;
    }
    cstringiterator & operator++() {
        // Become the end iterator once the terminating NUL is reached.
        if (d_cstring && !*++d_cstring) {
            d_cstring = 0;
        }
        return *this;
    }
    cstringiterator operator++(int) {
        cstringiterator c(*this);
        ++*this;
        return c;
    }
    bool operator==(cstringiterator const & csi) const {
        return d_cstring == csi.d_cstring;
    }
    bool operator!=(cstringiterator const & csi) const {
        return d_cstring != csi.d_cstring;
    }

private:
    char const * d_cstring;
};

int main()
{
    char const * c = "apa";
    std::copy(cstringiterator(c), cstringiterator(),
              std::ostream_iterator<char>(std::cout));
    return 0;
}

> BTW, I also saved the whole string to a file, say str.txt, and then
> loaded the file with ifstream:
>
> std::ifstream input("str.txt");
> std::istreambuf_iterator<char> bit(input), eit;
> parsing(bit, eit);

Use an iterator that keeps an internal buffer and only reads ahead when
needed, overwriting old buffers and allocating new ones as required,
unless you really must have access to the entire string at all times.
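
One way to read that advice is to process the input in fixed-size chunks, so that only one chunk is in memory at any moment. The sketch below is an illustration under that assumption, not a full buffering iterator: count_newlines_chunked is a made-up name, and the newline counting stands in for the real parsing work. <sstream> is included only so the usage example compiles.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: scans a stream while holding at most
// chunk_size bytes in memory, reusing the same buffer for every read.
inline std::size_t count_newlines_chunked(std::istream& in,
                                          std::size_t chunk_size = 64 * 1024)
{
    std::vector<char> chunk(chunk_size);
    std::size_t lines = 0;
    for (;;) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        std::streamsize const got = in.gcount();
        if (got <= 0) break;  // nothing more to read
        for (std::streamsize i = 0; i < got; ++i) {
            if (chunk[static_cast<std::size_t>(i)] == '\n') ++lines;
        }
    }
    return lines;
}
```

Usage would look like std::ifstream input("str.txt"); count_newlines_chunked(input); and any other std::istream, for example a std::istringstream, works the same way. A real parser has to cope with tokens that straddle a chunk boundary, which this sketch sidesteps by looking at single characters only.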

> I can't believe that the latter program is faster than the former.
> Anyway, I think memory-mapped IO may be a better choice. However, I
> have no idea how to associate a memory-mapped file with an ifstream.


Memory-mapping a file is rather platform specific, with each system
having its own set of native API calls. Derive a class from
std::basic_filebuf that neatly handles it all.
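
For illustration, here is a POSIX-only sketch (Linux, BSD, macOS; Windows would need CreateFileMapping/MapViewOfFile instead). Note that it takes a simpler route than deriving a stream buffer: it exposes the mapped bytes as a plain pointer range, which a templated parsing() could consume directly. The class name mapped_file is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

// Hypothetical RAII wrapper around a read-only memory mapping.
class mapped_file {
public:
    explicit mapped_file(char const* path)
    {
        int const fd = ::open(path, O_RDONLY);
        if (fd == -1) throw std::runtime_error("open failed");

        struct stat st;
        if (::fstat(fd, &st) == -1) {
            ::close(fd);
            throw std::runtime_error("fstat failed");
        }
        size_ = static_cast<std::size_t>(st.st_size);

        // mmap rejects a length of zero, so an empty file stays unmapped.
        if (size_ != 0) {
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("mmap failed");
            }
            data_ = static_cast<char const*>(p);
        }
        ::close(fd);  // the mapping remains valid after the descriptor is closed
    }

    ~mapped_file()
    {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }

    mapped_file(mapped_file const&) = delete;
    mapped_file& operator=(mapped_file const&) = delete;

    char const* begin() const { return data_; }
    char const* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

private:
    char const* data_ = nullptr;
    std::size_t size_ = 0;
};
```

With this in place the call would be along the lines of mapped_file m("str.txt"); parsing(m.begin(), m.end()); so the operating system pages the file in on demand and nothing is copied into a stream buffer. Keep in mind that the mapped data is not NUL-terminated, so always work with the begin/end pair.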

--
TB @ SWEDEN
Feb 21 '06 #3
wa***@wakun.com wrote:
> char *str = return_a_big_size_str();
> istringstream ss(string(str), istringstream::in);

The above line creates at least two copies of the string, all of which
exist at the same time. This is likely to cause swapping on your
system (at least if the strings are really rather large), which is a
tremendous performance hit.

> istreambuf_iterator<char> bit(ss), eit;
> parsing(bit, eit);


Hold it! You are parsing your string using stream *buffer* iterators,
i.e. you are not taking advantage of the formatting facilities of
streams at all? Why don't you simply pass pointers as the iterators
to the 'parsing()' function (which, of course, should be a function
template)? Assuming, however, that 'parsing()' is not a function
template, you still have the option to create a suitable stream buffer
which is used just for the situation described:

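#include <cstring>    // std::strlen
#include <iterator>   // std::istreambuf_iterator
#include <streambuf>  // std::streambuf

// Read-only stream buffer over an existing NUL-terminated string.
// The get area points straight at the caller's memory; nothing is copied.
struct membuf : std::streambuf
{
    membuf(char* str) { this->setg(str, str, str + std::strlen(str)); }
};

membuf buffer(str);
std::istreambuf_iterator<char> bit(&buffer), eit;
// ...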
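
If the formatting facilities of streams are wanted after all, the same kind of buffer can back a std::istream. The following is only a sketch: this membuf variant takes an explicit length (handy when the size is already known), and sum_ints is a hypothetical helper used to show formatted extraction.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>

// Variant of the buffer above with an explicit length, so the string
// does not have to be scanned again when its size is already known.
struct membuf : std::streambuf
{
    membuf(char* first, std::size_t len)
    {
        this->setg(first, first, first + len);
    }
};

// Hypothetical helper: sums whitespace-separated integers read
// directly out of the caller's memory via operator>>.
inline long sum_ints(char* str)
{
    membuf buffer(str, std::strlen(str));
    std::istream in(&buffer);  // the stream does not own or copy the data
    long total = 0;
    for (long v; in >> v; ) {
        total += v;
    }
    return total;
}
```

The buffer must outlive the stream, and the underlying memory must stay valid and unmodified while either is in use.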
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Feb 23 '06 #4
