output to a file (stream performance)
Hi,
I'm a stream virgin and am attempting to output strings to a file. My
approach is to write the strings to a 'stringstream' first and, only when
complete, write the stringstream to the file (ofstream).
The process works fine but appears to be rather slow. For example,
outputting about 2 MB of data takes a couple of minutes (most of the time
appears to be spent writing to the stringstream), and as I'm creating several
hundred files in one run the whole run can take hours. Oh, the files
are on my hard drive, so there are no network performance issues.
Is there anything basic I'm doing wrong (see code below), or does anyone have
any suggestions to improve this performance?
Thanks in advance
Lee
stringstream m_outputFileReference;
ofstream m_outputFile;
// initialisation for each file produced
m_outputFile.open(fullPathName, fstream::out);
m_outputFileReference.str("");
m_outputFileReference.clear();
// output data using the format below
m_outputFileReference << CC_NUM_START_OPEN << tempStr << CC_NUM_START_CLOSE;
// when complete, send the stringstream to the file
m_outputFile << m_outputFileReference.str();
m_outputFile.close();
re: output to a file (stream performance)
why not just:
ofstream m_outputFile(fullPathName, fstream::out);
m_outputFile << CC_NUM_START_OPEN << tempStr << CC_NUM_START_CLOSE;
Regards,
Ben
re: output to a file (stream performance)
Lee wrote:
> My
> approach is to write the string initially to a 'stringstream' and only
> when complete write the stringstream to the file (ofstream).
Why though? You could immediately write to the file.
> The process works fine however appears to be rather slow.
It probably depends on the implementation of the stream buffer
underlying the string stream: some implementations stick to the
[original] words in the standard, which mandate that the buffer
in a string stream shall increase by just one position, and/or
they implement the buffer to grow by some fixed amount, e.g.
128 characters (yes, I have seen this in the actual code of a
commercial 'basic_stringbuf' implementation). This will cause
the string stream to spend most of its time copying
increasingly larger chunks of memory around.
The obvious work-around is to avoid this by using a file stream
immediately and just streaming the data there. If this is not
an option for whatever reason, the best bet is to *not* use a
string stream but rather a stream based on a simple handcrafted
stream buffer which simply extends the internal buffer by some
factor, e.g. doubling the size whenever the buffer runs full.
I think I have posted the corresponding code in the past but it
is not really hard to create anyway (just something like 20 lines
of code).
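To see why the growth policy matters, here is a minimal sketch that only counts how many characters get copied while a buffer grows to a given size. The helper name 'bytes_copied' and its parameters are made up for illustration, and the model assumes the buffer is full each time it is reallocated; real 'basic_stringbuf' implementations differ in detail.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the characters copied by reallocations while a
// buffer grows from 'initial' to at least 'target' characters.
// If 'factor' is 0 the capacity grows by a fixed 'increment'; otherwise the
// capacity is multiplied by 'factor' on every reallocation.
unsigned long long bytes_copied(std::size_t target, std::size_t initial,
                                std::size_t increment, std::size_t factor)
{
    assert(initial > 0 && (factor > 1 || (factor == 0 && increment > 0)));
    unsigned long long copied = 0;
    std::size_t capacity = initial;
    while (capacity < target)
    {
        copied += capacity;  // the full old buffer is copied into the new one
        capacity = factor ? capacity * factor : capacity + increment;
    }
    return copied;
}
```

Under this model, growing to 2 MB in 128-character steps copies roughly 17 billion characters in total, while doubling from 128 characters copies about 2 million.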
> stringstream m_outputFileReference;
There is no good reason to use a 'std::stringstream' in this
situation anyway: you want to use a 'std::ostringstream'. However,
this change will probably not remove your problem.
> ofstream m_outputFile ;
>
> //initialisation for each file produced
> m_outputFile.open(fullPathName, fstream::out) ;
> m_outputFileReference.str("") ;
> m_outputFileReference.clear() ;
You don't need to perform the above two operations: after
construction, the string stream is empty and in a 'good()'
state.
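As a small illustration of that point, the sketch below builds the text for one file in a stream that is constructed locally, so there is nothing to reset. The function name 'build_record' and the '<num>' tags are placeholders, not names from the original code; a stream that is kept as a member and reused across files would still need resetting between files.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds one record in a freshly constructed stream,
// so no str("") / clear() calls are needed beforehand.
std::string build_record(const std::string& value)
{
    std::ostringstream out;                    // starts out empty and good()
    assert(out.good() && out.str().empty());
    out << "<num>" << value << "</num>";       // placeholder tags
    return out.str();
}
```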
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
re: output to a file (stream performance)
I tried this originally but when the files were across a network the
performance was even worse.
Lee
"benben" <benhongh@yahoo.com.au> wrote in message
news:440eab16$0$1052$afc38c87@news.optusnet.com.au ...
> why not just:
>
> ofstream m_outputFile(fullPathName, fstream::out);
> m_outputFile << CC_NUM_START_OPEN << tempStr << CC_NUM_START_CLOSE;
re: output to a file (stream performance)
> I tried this originally but when the files were across a network the
> performance was even worse.
It looks like the ofstream buffer isn't generous enough, so it causes
lots of network traffic, if I am not mistaken.
If portability is not an issue, you could try platform-specific file support,
which tends to offer more optimization features.
Using a stringstream as a buffer is a very odd solution.
Regards,
Ben
re: output to a file (stream performance)
Many thanks for the reply. I tried using the file stream directly but when
the files were over a network the performance was even worse.
Using a hand-crafted stream buffer sounds good. I'm not too sure how
exactly: can this be achieved by creating a new class based upon streambuf
and utilising the 'setbuf' function to control the buffer size?
thanks
Lee
"Dietmar Kuehl" <dietmar_kuehl@yahoo.com> wrote in message
news:477r9mFe9ppgU1@individual.net...
> The obvious work-around is to avoid this by using a file stream
> immediately and just streaming the data there. If this is not
> an option for whatever reason, the best bet is to *not* use a
> string stream but rather a stream based on a simple handcrafted
> stream buffer which simply extends the internal buffer by some
> factor e.g. duplicating the size whenever the buffer runs full.
> I think I have posted the corresponding code in the past but it
> is not really hard to create anyway (just something like 20 lines
> of code).
re: output to a file (stream performance)
Lee wrote:
> Using a hand crafted stream buffer sounds good. I'm not too sure how
> exactly - can this be achieved by creating a new class based upon
> streambuf and utilising the 'setbuf' function to control the buffer size.
'setbuf()' is the wrong tool: it has essentially no useful guarantees.
The only guarantee it has is that 'setbuf(0, 0)' makes a file
stream unbuffered. You might be able to set a buffer size
suiting your needs for file streams, but this is implementation-specific.
To create a useful surrogate for a string stream, you would derive a
class from 'std::streambuf' and essentially just override the
'overflow()' method to make more room. Essentially the code for
the stream buffer would look like this (note: the code is untested,
not even compiled):
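For completeness, this is roughly what asking a file stream for a larger buffer looks like. It is only a sketch: the helper name 'write_with_big_buffer' is hypothetical, and whether the library honours the 'pubsetbuf()' request at all is implementation-specific, as noted above.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes 'text' to 'path' after asking the file buffer
// to use a larger array. The library is free to ignore the request.
bool write_with_big_buffer(const char* path, const std::string& text,
                           std::size_t buffer_size)
{
    assert(buffer_size > 0);
    std::vector<char> buffer(buffer_size);  // declared first, so it outlives 'file'
    std::ofstream file;
    // Make the request before opening the file; some implementations ignore
    // the call once the file is open.
    file.rdbuf()->pubsetbuf(&buffer[0],
                            static_cast<std::streamsize>(buffer.size()));
    file.open(path, std::ios::out | std::ios::binary);
    if (!file)
        return false;
    file << text;
    file.close();                           // flush while 'buffer' is still alive
    return !file.fail();
}
```

Because nothing is guaranteed, the effect of a call such as write_with_big_buffer(fullPathName, data, 1 << 20) would have to be measured on the actual platform.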
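#include <algorithm>  // std::copy, std::swap
#include <cstddef>    // std::ptrdiff_t
#include <streambuf>
#include <string>     // std::char_traits

class mystringbuf:
    public std::streambuf  // public, so the buffer can be handed to a std::ostream
{
public:
    enum { initial = 1024 };
    mystringbuf(): m_buffer(new char[initial])
    { this->setp(this->m_buffer, this->m_buffer + initial); }
    ~mystringbuf() { delete[] this->m_buffer; }
private:
    // not copyable: the class owns a raw buffer
    mystringbuf(mystringbuf const&);
    mystringbuf& operator=(mystringbuf const&);

    int_type overflow(int_type c)
    {
        // increase the buffer
        if (this->pptr() == this->epptr())
        {
            std::ptrdiff_t size = this->pptr() - this->pbase();
            char* tmp = new char[2 * size];
            std::copy(this->pbase(), this->pptr(), tmp);
            this->setp(tmp, tmp + 2 * size);      // resets pptr() to the start
            this->pbump(static_cast<int>(size));  // restore the write position
            std::swap(tmp, this->m_buffer);
            delete[] tmp;                         // tmp now holds the old buffer
        }
        // put the character into the buffer
        if (c != std::char_traits<char>::eof())
        {
            *this->pptr() = std::char_traits<char>::to_char_type(c);
            this->pbump(1);
        }
        // signal success
        return std::char_traits<char>::not_eof(c);
    }
    char* m_buffer;
};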
You would, of course, still need some method to access the buffer
but this can be anything to your liking. I would probably provide
a pair of iterators and/or a pointer to the start plus the current
size (this would be useful to pass it to 'sputn()' of the file
stream's 'rdbuf()').
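One possible shape for such an accessor is sketched below. It is a separate, vector-based variant of the growing buffer, written only to illustrate the idea; the names 'growing_buf', 'data()', 'size()' and 'write_to()' are assumptions, not part of the code above. Note that 'pbump()' takes an 'int', so the sketch is limited to buffers that fit in one.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Sketch of a growing output buffer with accessors (illustrative names).
class growing_buf : public std::streambuf
{
public:
    growing_buf() : m_buffer(1024)
    { setp(&m_buffer[0], &m_buffer[0] + m_buffer.size()); }
    growing_buf(const growing_buf&) = delete;             // owns its storage
    growing_buf& operator=(const growing_buf&) = delete;

    // Pointer to the start of the written data plus its current size.
    const char* data() const { return pbase(); }
    std::size_t size() const { return static_cast<std::size_t>(pptr() - pbase()); }

    // Hands everything written so far to another stream buffer in one call.
    bool write_to(std::streambuf& target) const
    {
        std::streamsize n = static_cast<std::streamsize>(size());
        return target.sputn(data(), n) == n;
    }

private:
    int_type overflow(int_type c)
    {
        std::size_t used = size();
        m_buffer.resize(m_buffer.size() * 2);             // double the capacity
        setp(&m_buffer[0], &m_buffer[0] + m_buffer.size());
        pbump(static_cast<int>(used));                    // restore write position
        if (!traits_type::eq_int_type(c, traits_type::eof()))
        {
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

    std::vector<char> m_buffer;
};
```

With an open 'std::ofstream file', the accumulated data could then be sent in a single call via buf.write_to(*file.rdbuf()).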
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
re: output to a file (stream performance)
Lee wrote:
> Hi,
>
> I'm a stream virgin and am attempting to output strings to a file. My
> approach is to write the string initially to a 'stringstream' and only when
> complete write the stringstream to the file (ofstream).
[snip]
> Is there anything basic I'm doing wrong (see code below) or does anyone have
> any suggestions to improve this performance.
[snip]
http://groups.google.com/group/perfo...73f4d1a05cfbd1 contains
various test suites that measure the comparative performance of "Reading file
into string".
Perhaps it is worth building similar test suites to measure the comparative
performance of "Writing string to file", for instance.
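A starting point for such a measurement might look like the sketch below. It times only the stage the original poster suspected, appending to a string stream; the function name 'fill_stringstream' and its parameters are hypothetical, and a real test suite would add a direct-to-file variant and repeat each run several times.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical micro-benchmark: appends 'count' copies of 'piece' to a
// std::ostringstream and reports the elapsed time through 'seconds'.
std::string fill_stringstream(const std::string& piece, std::size_t count,
                              double& seconds)
{
    assert(!piece.empty());
    std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    std::ostringstream out;
    for (std::size_t i = 0; i != count; ++i)
        out << piece;
    std::string result = out.str();
    seconds = std::chrono::duration<double>(
                  std::chrono::steady_clock::now() - start).count();
    return result;
}
```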
---
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn