
filecopy with std::copy()

Thomas J. Clancy
#1: Jul 19 '05
I was wondering if anyone knew of a way to use std::copy() and
istream_iterator<>/ostream_iterator<> to write a file copy function that is
quick and efficient.

Doing this messes up the file because it seems to ignore '\n':

std::ifstream in("somefile");
std::ofstream out("someOtherFile");

std::copy(std::istream_iterator<unsigned char>(in),
          std::istream_iterator<unsigned char>(),
          std::ostream_iterator<unsigned char>(out));

Now, I figured out how to do it correctly but it is dog slow. I was
wondering if anyone knew how to do this in an elegant manner?

thomas j. clancy



Ivan Vecerina
#2: Jul 19 '05

Hi Thomas,
"Thomas J. Clancy" <tjclancy@comcast.net> wrote in message
news:Rq-cnUGo6qy0PMeiXTWJhg@comcast.com...
> I was wondering if anyone knew of a way to use std::copy() and
> istream_iterator<>/ostream_iterator<> to write a file copy function that is
> quick and efficient.
...
> Now, I figured out how to do it correctly but it is dog slow. I was
> wondering if anyone knew how to do this in an elegant manner?

Unless you insist on using std::copy, the elegant and efficient
manner to copy an entire file (or stream):
dstStream << srcStream.rdbuf();
A C++ implementation should be able to ultimately optimize this
operation (but performance may vary...).

hth,
Ivan
--
http://www.post1.com/~ivec <> Ivan Vecerina


Thomas J. Clancy
#3: Jul 19 '05

"Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
news:3f5ad280$1@news.swissonline.ch...
> Hi Thomas,
> "Thomas J. Clancy" <tjclancy@comcast.net> wrote in message
> news:Rq-cnUGo6qy0PMeiXTWJhg@comcast.com...
> > I was wondering if anyone knew of a way to use std::copy() and
> > istream_iterator<>/ostream_iterator<> to write a file copy function that is
> > quick and efficient.
> ...
> > Now, I figured out how to do it correctly but it is dog slow. I was
> > wondering if anyone knew how to do this in an elegant manner?
>
> Unless you insist on using std::copy, the elegant and efficient
> manner to copy an entire file (or stream):
> dstStream << srcStream.rdbuf();
> A C++ implementation should be able to ultimately optimize this
> operation (but performance may vary...).


Elegant, yes... this I already knew about, but boy is it
sloooooooooooooooowwwwww.... I came up with a different solution using the
std::copy and a type (class) that contains a buffer of chars and uses
stream::read() and stream::write() within the input stream operator (>>) and
the output stream operator (<<), respectively. And man does it scream.

Anyway, I was just wondering if there were alternatives to creating this
sort of thing using or extending the stream stuff.


> hth,
> Ivan
> --
> http://www.post1.com/~ivec <> Ivan Vecerina


Josh Sebastian
#4: Jul 19 '05

On Sun, 07 Sep 2003 09:44:26 -0400, Thomas J. Clancy wrote:

> "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
> news:3f5ad280$1@news.swissonline.ch...
>
>> Unless you insist on using std::copy, the elegant and efficient
>> manner to copy an entire file (or stream):
>> dstStream << srcStream.rdbuf();
>> A C++ implementation should be able to ultimately optimize this
>> operation (but performance may vary...).
>
> Elegant, yes... this I already knew about, but boy is it
> sloooooooooooooooowwwwww....

Nothing using IOStreams is going to be faster. File copies are best left
to OS routines.

Josh
Josh Sebastian
#5: Jul 19 '05

On Sun, 07 Sep 2003 10:11:13 -0400, Thomas J. Clancy wrote:

> Ummm... the rest of my previous reply talks about what I did to make it
> much, much faster than the solution you mentioned, so what do you mean by
> your statement above?

It was actually faster using a copy than rdbuf? That's a messed-up
IOStreams implementation. :-}
Josh Sebastian
#6: Jul 19 '05

On Sun, 07 Sep 2003 11:05:28 -0400, Thomas J. Clancy wrote:

> Not at all... when you use the output iterator of rdbuf(), I believe that it
> is doing it byte by byte and not in chunks. At least this is the behaviour
> I am seeing with VC7.1's implementation, which they get from Dinkumware, I
> believe. Now I could try this using STLPort.

It shouldn't be; there should be buffering done both by your OS and by
IOStreams. For example:

curien@balar:~/prog$ uname -a
Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
curien@balar:~/prog$ cat blah.cpp
#include <fstream>
#include <ios>

int main() {
    std::ifstream infile("test.dat", std::ios_base::binary);
    std::ofstream outfile("test~.dat", std::ios_base::binary);

    outfile << infile.rdbuf();
}
curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
51200+0 records in
51200+0 records out
52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)
curien@balar:~/prog$ g++ -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.2 20030812 (Debian prerelease)
curien@balar:~/prog$ g++ -ansi -pedantic -W -Wall -O2 blah.cpp
curien@balar:~/prog$ time ./a.out

real 0m0.619s
user 0m0.030s
sys 0m0.540s

Josh
Thomas J. Clancy
#7: Jul 19 '05


"Josh Sebastian" <curien@cox.net> wrote in message
news:pan.2003.09.07.16.12.24.353080@cox.net...
> On Sun, 07 Sep 2003 11:05:28 -0400, Thomas J. Clancy wrote:
>
> > Not at all... when you use the output iterator of rdbuf(), I believe that it
> > is doing it byte by byte and not in chunks. At least this is the behaviour
> > I am seeing with VC7.1's implementation, which they get from Dinkumware, I
> > believe. Now I could try this using STLPort.
>
> It shouldn't be, there should be buffering done both by your OS and by
> IOStreams. For example
>
> curien@balar:~/prog$ uname -a
> Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
> curien@balar:~/prog$ cat blah.cpp
> #include <fstream>
> #include <ios>
>
> int main() {
>     std::ifstream infile("test.dat", std::ios_base::binary);
>     std::ofstream outfile("test~.dat", std::ios_base::binary);
>
>     outfile << infile.rdbuf();
> }
> curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
> 51200+0 records in
> 51200+0 records out
> 52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)

Hey man, I have the numbers, too, and believe me, they suck. I wonder if
Microsoft is pulling a fast one? :-)

> curien@balar:~/prog$ g++ -v
> Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
> Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
> Thread model: posix
> gcc version 3.3.2 20030812 (Debian prerelease)
> curien@balar:~/prog$ g++ -ansi -pedantic -W -Wall -O2 blah.cpp
> curien@balar:~/prog$ time ./a.out
>
> real 0m0.619s
> user 0m0.030s
> sys 0m0.540s
>
> Josh


Thomas J. Clancy
#8: Jul 19 '05

> It shouldn't be, there should be buffering done both by your OS and by
> IOStreams. For example

My bad, you're right. Under Microsoft, if you build this little application
in debug, it is dog slow. I thought I had been building in release mode.
Once I set it to release mode and rebuilt, the thing flew! Thanks for the
information on this.

Tom


Kevin Goodsell
#9: Jul 19 '05

Thomas J. Clancy wrote:

> "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
> news:3f5ad280$1@news.swissonline.ch...
>>
>>Unless you insist on using std::copy, the elegant and efficient
>>manner to copy an entire file (or stream):
>> dstStream << srcStream.rdbuf();
>>A C++ implementation should be able to ultimately optimize this
>>operation (but performance may vary...).
>
> Elegant, yes... this I already knew about, but boy is it
> sloooooooooooooooowwwwww....

Are you using Visual C++ 5 or 6 by chance? There's a known bug in the
iostream library that causes buffering to be wrongly disabled in file
streams that are opened by name. That could account for this being slow,
I think. Check here:

http://www.dinkumware.com/vc_fixes.html

> I came up with a different solution using the
> std::copy and a type (class) that contains a buffer of chars and uses
> stream::read() and stream::write() within the input stream operator (>>) and
> the output stream operator (<<), respectively. And man does it scream.

Buffering should be automatic, making this unnecessary. I guess that it
should be possible to make the solution using standard stream classes
perform just as well or better than this solution, but I wouldn't know
exactly how to do it.

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Thomas J. Clancy
#10: Jul 19 '05


"Kevin Goodsell" <usenet1.spamfree.fusion@neverbox.com> wrote in message
news:B0L6b.2685$PE6.2267@newsread3.news.pas.earthlink.net...
> Thomas J. Clancy wrote:
>
> > "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
> > news:3f5ad280$1@news.swissonline.ch...
>
> Buffering should be automatic, making this unnecessary. I guess that it
> should be possible to make the solution using standard stream classes
> perform just as well or better than this solution, but I wouldn't know
> exactly how to do it.

Here was my solution before I realized that using stream::rdbuf() worked
well while NOT in debug mode using VC++7.1 (.NET 2003).

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>

namespace tjc_std
{

/**
 * A block buffer type that can be used with std::copy() and
 * istream_iterators without having to write a special form of copy
 * or an istream_iterator.
 */
class ByteBlock
{
public:
    ByteBlock()
        : m_bytesRead(0),
          m_fileSize(-1),
          m_totalRead(0)
    {
    }

private:
    unsigned char m_block[10240];
    int m_bytesRead;    // bytes held in m_block after the last read
    long m_fileSize;    // -1 until the stream has been measured
    long m_totalRead;   // running total across all reads
    friend std::istream& operator >> (std::istream& stream, ByteBlock& block);
    friend std::ostream& operator << (std::ostream& stream, const ByteBlock& block);
};

std::istream& operator >> (std::istream& stream, ByteBlock& block)
{
    // First call only: find the stream length by seeking to the end.
    if (block.m_fileSize == -1)
    {
        stream.seekg(0, std::ios::end);
        block.m_fileSize = static_cast<long>(stream.tellg());
        stream.seekg(0, std::ios::beg);
    }
    std::size_t leftToRead = block.m_fileSize - block.m_totalRead;
    if (leftToRead)
    {
        stream.read((char*)block.m_block, std::min(sizeof(block.m_block), leftToRead));
        block.m_bytesRead = static_cast<int>(stream.gcount());
        block.m_totalRead += block.m_bytesRead;
    }
    else
    {
        // Nothing left: fail the stream so the istream_iterator compares
        // equal to the end-of-stream iterator and std::copy() stops.
        stream.setstate(std::ios_base::eofbit | std::ios_base::badbit);
    }
    return stream;
}

std::ostream& operator << (std::ostream& stream, const ByteBlock& block)
{
    stream.write((const char*)block.m_block, block.m_bytesRead);
    return stream;
}

} // namespace tjc_std

void blockCopyFile(const char* source, const char* dest)
{
    std::ifstream in(source, std::ios::in | std::ios::binary);
    std::ofstream out(dest, std::ios::out | std::ios::binary);
    std::copy(std::istream_iterator<tjc_std::ByteBlock>(in),
              std::istream_iterator<tjc_std::ByteBlock>(),
              std::ostream_iterator<tjc_std::ByteBlock>(out));
}

Yes, this was a naive approach, but it worked quickly, and in fact for some
reason it still seemed to work slightly faster than:

out << in.rdbuf();

I don't know why that would be, especially given what I've recently read and
what I've been told by others here in this newsgroup. But hey, I just need
a way to copy files without relying on the OS, so both of these ideas seem
to work just fine.

Tom

