473,405 Members | 2,310 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,405 software developers and data experts.

Canonical way to copy an array

What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).

Regardless of whether the array consists of POD's, we could use a loop such
as:

#include <cassert>

template<class T>
void CopyArray(T *pdest, T const *psource, T const *const psover)
{
assert(pdest);
assert(psource);
assert(psover); /* Contraversey here! */
assert(psover psource);

do
{
*pdest++ = *psource++;
} while(psource != psover);
}

#define ARRLEN(arr) (sizeof(arr)/sizeof*(arr))

#define POVER(arr) ((arr) + ARRLEN((arr)))

int main()
{
int array1[5] = {1,2,3,4,5};

int array2[5];

CopyArray(array2,array1,POVER(array1));
}

However, I presume that there's a Standard Library function for doing this?
What's the most efficient method?

Also, how do you go about copy-constructing an array? Seems to me that
you'd have to use "aligned_storage", and then do a loop which uses
"placement new". I wonder why the following is disallowed:

int main()
{
int arr1[] = {1,2,3,4,5};

int arr2[] = arr1;
}

--

Frederick Gotham
Aug 18 '06 #1
15 3797
Frederick Gotham wrote:
What's the canonical way to copy an array in C++?
std::copy.
Aug 18 '06 #2
Frederick Gotham schrieb:
What's the canonical way to copy an array in C++?
[loop that copies elements]
However, I presume that there's a Standard Library function for doing this?
std::copy
What's the most efficient method?
Not defined by the standard.

You could use std::copy and when measured, that some way is faster for a
specific type, you could overload/specialize std::copy using the "faster"
method.

--
Thomas
Aug 18 '06 #3
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:26*******************@news.indigo.ie...
What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).
I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.

Philip

Aug 18 '06 #4
On 2006-08-18 07:01:18 -0400, Frederick Gotham <fg*******@SPAM.comsaid:
What's the canonical way to copy an array in C++?
std::copy
If we're copying a POD, we can use memcpy
On a decent implementation, std::copy optimizes away to a call to
memcpy when appropriate (I know that GCC's implementation does this)
(but there could be a more efficient alternative if we know that the
blocks are suitably aligned).
Again, on a decent implementation, I doubt it. memcpy is likely to be
highly optimized (i.e. to take advantage of copying large, aligned
blocks, etc.)
Regardless of whether the array consists of POD's, we could use a loop such as:
[snip CopyArray implementation]

This is essentially what std::copy does when it's parameters are
pointers to non-POD objects. An example implementation of std::copy:

template< typename InputIterator, typename OutputIterator>
OutputIterator copy(InputIterator first, InputIterator last,
OutputIterator result)
{
while(first != last)
{
*result++ = *first++
}
return result;
}
However, I presume that there's a Standard Library function for doing
this? What's the most efficient method?

Also, how do you go about copy-constructing an array? Seems to me that
you'd have to use "aligned_storage", and then do a loop which uses
"placement new". I wonder why the following is disallowed:

int main()
{
int arr1[] = {1,2,3,4,5};

int arr2[] = arr1;
}
It's probably disallowed for no reason other than "that's the way C did it".

--
Clark S. Cox, III
cl*******@gmail.com

Aug 18 '06 #5
Philip Potter posted:
>If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).

I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.
There would be the runtime overhead of checking whether the addresses are
suitably aligned, for instance:

(I've only written the following code in the last half-hour, so it's likely
to contain errors:)

#include <cstddef>
#include <limits>

using std::size_t;
using std::numeric_limits;

#define alignof(T) sizeof(T)
typedef long unsigned IntP; /* Integer type to store memory address */
#define STATIC_ASSERT(x) /* Nothing*/

template<class T>
void SpecialisedCopy(T *pdest,T const *psrc,size_t num)
{
while(num--) *pdest++ = *psrc++;
}

void *memcpy(void *const pdest,void const *const psrc,size_t const num)
{
STATIC_ASSERT(!numeric_limits<unsigned>::traps);
STATIC_ASSERT(!numeric_limits<unsigned short>::traps);

unsigned const dist_dest = (IntP)pdest % alignof(int);
unsigned const dist_src = (IntP)psrc % alignof(int);

if(dist_dest != dist_src)
return SpecialisedCopy<char unsigned>(
static_cast<char unsigned*>(pdest),
static_cast<char unsigned const*>(psrc), num), pdest;
switch(dist_dest)
{
case 0:
return SpecialisedCopy<unsigned>(
static_cast<unsigned*>(pdest),
static_cast<unsigned const*>(psrc), num/sizeof(int)), pdest;

case alignof(short):
return SpecialisedCopy<short unsigned>(
static_cast<short unsigned*>(pdest),
static_cast<short unsigned const*>(psrc), num/sizeof(short)),
pdest;

default:
return SpecialisedCopy<char unsigned>(
static_cast<char unsigned*>(pdest),
static_cast<char unsigned const*>(psrc), num), pdest;
}
}

int main()
{
int iarray1[16] = {},iarray2[16] = {};
short sarray1[16] = {},sarray2[16] = {};
char carray1[16] = {},carray2[16] = {};
float farray1[16] = {},farray2[16] = {};
double darray1[16] = {},darray2[16] = {};

memcpy(iarray2,iarray1,sizeof iarray2);
memcpy(sarray2,sarray1,sizeof sarray2);
memcpy(carray2,carray1,sizeof carray2);
memcpy(farray2,farray1,sizeof farray2);
memcpy(darray2,darray1,sizeof darray2);
}

--

Frederick Gotham
Aug 18 '06 #6
In article <ec**********@cliff.xsj.xilinx.com>,
Philip Potter <ph***********@xilinx.comwrote:
>"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:26*******************@news.indigo.ie...
>What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).

I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.
Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?
--
Greg Comeau / 20 years of Comeauity! Intel Mac Port now in alpha!
Comeau C/C++ ONLINE == http://www.comeaucomputing.com/tryitout
World Class Compilers: Breathtaking C++, Amazing C99, Fabulous C90.
Comeau C/C++ with Dinkumware's Libraries... Have you tried it?
Aug 18 '06 #7
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:7T*******************@news.indigo.ie...
Philip Potter posted:
I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.

There would be the runtime overhead of checking whether the addresses are
suitably aligned
Oh noes! A bitwise-and, a check-for-zero, and a conditional branch of
overhead, in a function that typically copies large arrays of memory! The
extra gains you can get for using memcpy(), which will have been written
carefully, under controlled conditions, reviewed, rewritten, and tested
until it is as fast as is humanly possible, far outweigh this
"inefficiency".

Also note that in order to avoid these checks for alignment, you need to add
explicit function calls to aligned_memcpy() or equally_unaligned_memcpy() or
unequally_unaligned_memcpy(), none of which are substitutable for each
other. This makes your code much harder to maintain. (I don't really want to
keep track of alignments of all my arrays in order to use the right copy
function.)

See below for how memcpy() often beats hand-coded memory copying routines.
>template<class T>
void SpecialisedCopy(T *pdest,T const *psrc,size_t num)
{
while(num--) *pdest++ = *psrc++;
}
You're relying greatly on your compiler here to unroll this loop well.
(Quite possibly your reliance is well-placed; I don't know.) Often memcpy()
will copy in blocks of 8 or 16 (using 8 or 16 registers) for the inner loop,
reducing problems of memory latency greatly. Of course it needs prologue and
epilogue code to handle the beginning and end of the array.
if(dist_dest != dist_src)
return SpecialisedCopy<char unsigned>(
static_cast<char unsigned*>(pdest),
static_cast<char unsigned const*>(psrc), num), pdest;
If gcd(dist_dest-dist_src,sizeof(machine_register_type)) 1
(for example, if the arrays are misaligned by 16 bits on a machine with
32-bit registers), you can still copy shorts and be faster than copying
chars.
switch(dist_dest)
{
case 0:
return SpecialisedCopy<unsigned>(
static_cast<unsigned*>(pdest),
static_cast<unsigned const*>(psrc), num/sizeof(int)), pdest;
Assumes that num % (sizeof(unsigned)) == 0.
>
case alignof(short):
return SpecialisedCopy<short unsigned>(
static_cast<short unsigned*>(pdest),
static_cast<short unsigned const*>(psrc), num/sizeof(short)),
Assumes that num % (sizeof(unsigned short)) == 0.

Also, you'd get faster results by copying ONE short, then copying the rest
using unsigned (as you did above), then at the end copying shorts or chars
to make up the rest.
>pdest;

default:
return SpecialisedCopy<char unsigned>(
static_cast<char unsigned*>(pdest),
static_cast<char unsigned const*>(psrc), num), pdest;
}
Similarly here, you can get better results by copying chars until the arrays
are aligned, then using unsigned.

Also note that your code becomes less efficient still if
sizeof(unsigned)!=sizeof(short)*2, and sizeof(short)!=sizeof(char)*2. (On a
machine with 64-bit integer registers, you may need to consider copying 64,
32, 16 or 8 bits at a time).

Philip

Aug 18 '06 #8
"Greg Comeau" <co****@panix.comwrote in message
news:ec**********@panix2.panix.com...
In article <ec**********@cliff.xsj.xilinx.com>,
Philip Potter <ph***********@xilinx.comwrote:
I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.

Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?
I can't give concrete examples; I just know the theory, not who has put it
into practice. Of course, the particular "grubby details" I refer to are
things such as:

* The size of an integer register
* The number of available registers during a function call
* The size of any larger registers that may actually be faster for the
job[1]
* The latency of a cache access (since we're accessing contiguous memory,
assume cache hits are very likely) for aligned and unaligned (if possible)
fetches
* Whether the machine has any specialized hardware for memcpy()ing!

And the uses they get put to are:
* How deeply the loop can be unrolled (constrained by the number of
registers available)
* How much data can be fetched from memory at once (constrained by the size
of a machine register times the available registers)
* How deeply the loop needs to be unrolled to eliminate memory latency
(constrained by... memory latency).
* Whether the specialized hardware can do it faster!

It would surprise me if many serious memcpy() implementations didn't take
advantage of these sorts of things.

Philip

[1] I have heard that the Pentium's MMX registers are better for copying
memory than its integer registers.

Aug 18 '06 #9
In article <26*******************@news.indigo.ie>, fg*******@SPAM.com
says...

[ ... ]
Also, how do you go about copy-constructing an array?
You don't. Copy-constructing requires a copy-contructor, and built-in
arrays don't have them.

You can copy-construct a vector, or you can construct a vector as a copy
of the data in an array.
Seems to me that
you'd have to use "aligned_storage", and then do a loop which uses
"placement new".
I believe that's pretty much how vector's copy ctor is usually
implemented. Constructing a vector from an array probably does pretty
much the same thing, though I haven't checked to be sure.

If you want something that doesn't use dynamic storage, you might
consider TR1's array class.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Aug 18 '06 #10
In article <ec**********@panix2.panix.com>, co****@panix.com says...

[ ... ]
Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?
I'm not sure I've ever seen one that did.

OTOH, I've seen some compilers that were good at recognizing and
optimizing block copies. Just for one obvious example, for quite a while
now, Visual C++ has been able to recognize a loop like:

for (int i=0; i<size; i++)
whatever[i] = whatever2[i];

and turn it into code like:

mov ecx, size
lea esi, dword ptr whatever2
lea edi, dword ptr whatever
rep movsd

In theory, you can do a little better than this using SSE2 instructions,
in but in fact it all comes out about the same unless (nearly) all of
the data is in the cache to start with -- otherwise, the memory bus is
the bottleneck.

I suppose it's possible that MS' current memmove and/or memcpy do a bit
more though -- normally they just forward to RtlMemoryMove and
RtlMemoryCopy respectively, and I haven't looked through either of them
recently to figure out how fancy they get. The last time I looked, they
weren't particularly impressive, but that was quite a while ago (around
NT 4 or so).

--
Later,
Jerry.

The universe is a figment of its own imagination.
Aug 18 '06 #11
Philip Potter posted:
(On a machine with 64-bit integer registers, you may need to consider
copying 64, 32, 16 or 8 bits at a time).

One thing which I considerd was the following (but I don't think it's
strictly allowed). It will only work if the array length is known at
compile time. (Alternatively, you could get it to work with a runtime
length, but it would involve thousands of template specialisations and a
switch statement.)

#include <cstddef>

using std::size_t;

template<class T, size_t len>
void CopyArray(T (&dst)[len], T const (&src)[len])
{
struct Copiable {
T arr[len];
};

reinterpret_cast<Copiable&>(dst) =
reinterpret_cast<Copiable const &>(src);
}

int main()
{
int iarray1[16] = {},iarray2[16] = {};
short sarray1[16] = {},sarray2[16] = {};
char carray1[16] = {},carray2[16] = {};
float farray1[16] = {},farray2[16] = {};
double darray1[16] = {},darray2[16] = {};

CopyArray(iarray2,iarray1);
CopyArray(sarray2,sarray1);
CopyArray(carray2,carray1);
CopyArray(farray2,farray1);
CopyArray(darray2,darray1);
}

Do you think could this be more efficient than memcpy?

--

Frederick Gotham
Aug 18 '06 #12
Frederick Gotham wrote:
Philip Potter posted:
>(On a machine with 64-bit integer registers, you may need to consider
copying 64, 32, 16 or 8 bits at a time).


One thing which I considerd was the following (but I don't think it's
strictly allowed). It will only work if the array length is known at
compile time. (Alternatively, you could get it to work with a runtime
length, but it would involve thousands of template specialisations and a
switch statement.)

#include <cstddef>

using std::size_t;

template<class T, size_t len>
void CopyArray(T (&dst)[len], T const (&src)[len])
{
struct Copiable {
T arr[len];
};

reinterpret_cast<Copiable&>(dst) =
reinterpret_cast<Copiable const &>(src);
}

int main()
{
int iarray1[16] = {},iarray2[16] = {};
short sarray1[16] = {},sarray2[16] = {};
char carray1[16] = {},carray2[16] = {};
float farray1[16] = {},farray2[16] = {};
double darray1[16] = {},darray2[16] = {};

CopyArray(iarray2,iarray1);
CopyArray(sarray2,sarray1);
CopyArray(carray2,carray1);
CopyArray(farray2,farray1);
CopyArray(darray2,darray1);
}

Do you think could this be more efficient than memcpy?
As with any question of efficiency, the answer is a definite "Maybe,
maybe not". There's no way to say in the general case; it may be faster
on one platform, slower on another, or it may actually be a memcpy call
under the hood.
--
Clark S. Cox III
cl*******@gmail.com
Aug 18 '06 #13
Greg Comeau wrote:
In article <ec**********@cliff.xsj.xilinx.com>,
Philip Potter <ph***********@xilinx.comwrote:
>"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:26*******************@news.indigo.ie...
>>What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).
I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.

Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?
I know that Apple's implementation of memcpy/memmove for MacOSX takes
advantage of several grubby details and actually has a few separate
internal implementations (for different alignments, for copies less than
32 bytes, etc.)

--
Clark S. Cox III
cl*******@gmail.com
Aug 18 '06 #14
Greg Comeau wrote:
In article <ec**********@cliff.xsj.xilinx.com>,
Philip Potter <ph***********@xilinx.comwrote:
>>"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:26*******************@news.indigo.ie...
>>>What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).

I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.


Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?
A simple example is here:

http://cvs.opensolaris.org/source/xr...en/memcpy.s#40

--
Ian Collins.
Aug 20 '06 #15

Clark S. Cox III wrote:
Greg Comeau wrote:
In article <ec**********@cliff.xsj.xilinx.com>,
Philip Potter <ph***********@xilinx.comwrote:
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:26*******************@news.indigo.ie...
What's the canonical way to copy an array in C++?

If we're copying a POD, we can use memcpy (but there could be a more
efficient alternative if we know that the blocks are suitably aligned).
I doubt it. memcpy() normally knows all sorts of grubby details about
properties of memory addresses; if the addresses are suitably aligned,
memcpy() will normally use the more efficient implementation anyway.
Maybe I've been asleep at the wheel, but I'm curious which
implementations of memcpy() knows all the grubby details
and actually uses them?

I know that Apple's implementation of memcpy/memmove for MacOSX takes
advantage of several grubby details and actually has a few separate
internal implementations (for different alignments, for copies less than
32 bytes, etc.)
The PowerPC implementation of memcpy on Mac OS X uses floating point
registers to copy 64 bits of memory with each operation.

Greg

Aug 20 '06 #16

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

30
by: franky.backeljauw | last post by:
Hello, I am wondering which of these two methods is the fastest: std::copy, which is included in the standard library, or a manually written pointer copy? Do any of you have any experience with...
2
by: Birt | last post by:
My understanding about defining your own copy and assignment constructors is whenever there is a member of pointer type. Is this true or is there any exception to this"rule"? How about when you...
7
by: Richard Forester | last post by:
Hello. I need some help understanding what goes on when an array is copied. I create 2 arrays and copy one to the other: int pins = {9, 3, 7, 2}; int copy = new int; for (int i = 0; i !=...
4
by: fperfect13 | last post by:
Hi, I wanted to perform a deep copy of an array. Searching on google I ran into different opinions : C# Interview Questions (http://blogs.wwwcoder.com/tsvmadhav/archive/2005/04/08/2882.aspx)...
1
by: Juan R. | last post by:
Introduction I am developing the CanonML language (version 1.0) as a way to generate, store, and publish canonical science documents on the Internet. This language will be the basis for the next...
1
by: Juan R. | last post by:
The initial CanonMath program presented here http://canonicalscience.blogspot.com/2006/02/choosing-notationsyntax-for-canonmath.html] was discussed with several specialists, including father of...
7
by: Keith Thompson | last post by:
The comp.lang.c-recommended way to invoke malloc() is, of course: some_type *ptr; ptr = malloc(count * sizeof *ptr); But what if (in C99 only) ptr is a pointer to a VLA (variable-length...
1
by: zzz | last post by:
Hi all, I was recently reading the book "Write Great code by ryndall Hyde" in this in chapter 8 the following are given. given n input variables there are two raised to two raised to n unique...
2
by: Jeffrey Walton | last post by:
Hi All, BMP Strings are a subset of Universal Strings.The BMP string uses approximately 65,000 code points from Universal String encoding. BMP Strings: ISO/IEC 10646, 2-octet canonical form,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.