memcpy() vs. for() performance

#include <string.h>

#define SIZE 100
#define USE_MEMCPY

int main(void)
{
    char a[SIZE];
    char b[SIZE];
    size_t n; /* size_t matches the type of sizeof(a) */

    /* code 'filling' a[] */

#ifdef USE_MEMCPY
    memcpy(b, a, sizeof(a));
#else
    for (n = 0; n < sizeof(a); n++)
    {
        b[n] = a[n];
    }
#endif
    return 0;
}

/*
Any (general) ideas about when (depending on SIZE) to use
memcpy(), and when to use for()?

<OT>
Any remarks about this issue using GCC, or the Sun compiler,
are welcome.
</OT>
*/
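
One rough way to answer this for a given SIZE is to time both variants on the target compiler. The following is only a sketch: the names copy_loop, copy_memcpy and time_copy are made up for illustration, clock() has coarse resolution, and an optimizing compiler may remove repeated copies whose result is never read, so any numbers it produces should be treated with suspicion.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Byte-by-byte copy, equivalent to the for() branch of the question. */
static void copy_loop(char *dst, const char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Thin wrapper so both variants share one function-pointer type. */
static void copy_memcpy(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Returns the CPU time, in seconds, spent on `reps` copies of n bytes. */
static double time_copy(void (*copy)(char *, const char *, size_t),
                        char *dst, const char *src, size_t n, long reps)
{
    clock_t start = clock();
    long r;
    for (r = 0; r < reps; r++)
        copy(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(copy_loop, b, a, sizeof a, reps) and time_copy(copy_memcpy, b, a, sizeof a, reps) for several values of SIZE, and at the optimization level actually used, gives a first impression of where (if anywhere) the two diverge.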

Nov 14 '05
luc wastiaux <du*******@airpost.net> wrote:
Thomas Matthews wrote:
I've written my own memcpy function which uses the
processor's specialized instructions. However,
it has a minimum overhead. The threshold between
using memcpy for large areas vs. the DMA device
is very close (on my platform).


Out of curiosity, how do you instruct your processor to use DMA in your
custom memcpy function ?


In ISO C, you don't. It all depends on the architecture, and therefore
will differ between, say, an Intel machine and a Sparc.

Richard
Nov 14 '05 #11
luc wastiaux wrote:
Thomas Matthews wrote:
I've written my own memcpy function which uses the
processor's specialized instructions. However,
it has a minimum overhead. The threshold between
using memcpy for large areas vs. the DMA device
is very close (on my platform).

Out of curiosity, how do you instruct your processor to use DMA in your
custom memcpy function ?

I use assembly language. The DMA is not a part of the processor,
but a component on the platform. The DMA has a setup overhead,
so it should only be used for large or automated transfers.
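
The trade-off described here (fixed setup cost, so DMA only pays off above some size) can be sketched as a simple dispatcher. Everything specific below is an assumption: DMA_THRESHOLD is an invented number, and dma_copy_stub merely stands in for a platform driver, since real DMA programming is not expressible in portable C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-over point; the real value must be measured per platform. */
#define DMA_THRESHOLD 4096u

/* Stand-in for a platform DMA routine. A real one would program the
   controller; this one just copies so the sketch stays portable. */
static void dma_copy_stub(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Returns 1 if the (stubbed) DMA path was taken, 0 for plain memcpy. */
static int copy_dispatch(void *dst, const void *src, size_t n)
{
    if (n >= DMA_THRESHOLD) {
        dma_copy_stub(dst, src, n);
        return 1;
    }
    memcpy(dst, src, n);
    return 0;
}
```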

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #12
Arthur J. O'Dwyer wrote:
Case <no@no.no> writes:

A well implemented memcpy() can use many tricks to accelerate its
operation.


Agreed and agreed. I use 'memcpy' any time I can guarantee it
will be safe, which in C is all the time, as far as I can recall.


Aren't there issues with memcpy and overlapping memory locations?

In the following program, isn't the call to memcpy an error?

#include <stdio.h>
#include <string.h>

int main()
{
    int x[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int *to = x;
    int *from = &x[1];

    memcpy(to, from, sizeof x - sizeof *x); /* UB ? */

    return 0;
}

Nov 14 '05 #13
In <Pi**********************************@unix49.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:

On Wed, 30 Jun 2004, Arthur J. O'Dwyer wrote:

One more reason to prefer whichever alternative is the more readable
(in this case, the alternative that doesn't involve a function call
to do a one-line task :) .


And to clarify: I mean the function call 'foo', not the function
call 'memcpy'. 'memcpy' is good. 'foo' itself is unnecessary and
ought to be removed. :)
Okay, I think that's clearer.


Indeed. foo() was introduced for the sole reason of having a minimal
translation unit ;-)

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #14
In <Pi**********************************@unix49.andrew.cmu.edu> "Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> writes:
Unfortunately for your example, "The Dev Team Thinks Of Everything"
in GCC, too:

% cat test.c
#include <string.h>

void foo(int *p, int *q)
{
    memcpy(q, p, 2 * sizeof *p);
}
% gcc -O2 -S test.c
% cat test2.c
#include <string.h>

void foo(int *p, int *q)
{
    int i;
    for (i=0; i < 2; ++i)
        q[i] = p[i];
}
% gcc -O2 -S test2.c
% diff test.s test2.s
1c1
< .file "test.c"
---
> .file "test2.c"
%


Which shows that the memcpy version is still at least as good as the
for loop ;-)
One more reason to prefer whichever alternative is the more readable
(in this case, the alternative that doesn't involve a function call
to do a one-line task :) .


To me, the memcpy alternative is more readable than the other: it
consists of a single, very simple, idiomatic even (for objects that can't
be directly assigned) function call. Which I wouldn't hide behind a
function in real C code: either use as such, inline, or hidden behind
a macro.
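
For the "hidden behind a macro" option, something along these lines would do. COPY_ARRAY is a hypothetical name, and the macro assumes that both arguments are genuine arrays of the same type; handed a pointer, sizeof would give the pointer's size and the copy would silently be too short.

```c
#include <string.h>

/* Hypothetical macro: copies one array object into another of the same
   type. Both arguments must be real arrays (not pointers), so that
   sizeof (dst) is the size of the whole object. */
#define COPY_ARRAY(dst, src) memcpy((dst), (src), sizeof (dst))
```

With int p[2] and int q[2] in scope, COPY_ARRAY(q, p); does the same job as the body of foo() above without the extra function.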

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #15
In <hoCEc.914888$Pk3.851808@pd7tw1no> Edmund Bacon <eb****@SpamMeNot.onesystem.com> writes:
Arthur J. O'Dwyer wrote:
Case <no@no.no> writes:

A well implemented memcpy() can use many tricks to accelerate its
operation.
Agreed and agreed. I use 'memcpy' any time I can guarantee it
will be safe, which in C is all the time, as far as I can recall.


Aren't there issues with memcpy and overlapping memory locations?


Yes, there are.
In the following program, isn't the call to memcpy an error?

#include <stdio.h>
#include <string.h>

int main()
{
    int x[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int *to = x;
    int *from = &x[1];

    memcpy(to, from, sizeof x - sizeof *x); /* UB ? */

    return 0;
}


Use memmove() in such cases. It has well defined behaviour for
overlapping memory blocks. Depending on the nature of the overlap,
it will either perform an ordinary memcpy() or a copy in the opposite
direction.
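
To make the "copy in the opposite direction" point concrete, here is a minimal sketch of how such a routine can pick its direction. It is not how any particular library implements memmove(); sketch_memmove is an invented name, and comparing the addresses through uintptr_t assumes the usual flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: copies n bytes and tolerates overlap by choosing the
   copy direction from the relative position of the two blocks. */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination starts below the source: copy forwards, so each
           source byte is read before it can be overwritten. */
        for (i = 0; i < n; i++)
            d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination starts above the source: copy backwards. */
        for (i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    /* d == s: nothing to do. */
    return dst;
}
```

Applied to the example above (to = x, from = &x[1]), the forward branch is taken and x ends up shifted left by one element, with the last element unchanged.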

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #16
In <2w*****************@newssvr32.news.prodigy.com> Thomas Matthews <Th****************************@sbcglobal.net> writes:
luc wastiaux wrote:
Thomas Matthews wrote:
I've written my own memcpy function which uses the
processor's specialized instructions. However,
it has a minimum overhead. The threshold between
using memcpy for large areas vs. the DMA device
is very close (on my platform).


Out of curiosity, how do you instruct your processor to use DMA in your
custom memcpy function ?

I use assembly language. The DMA is not a part of the processor,
but a component on the platform. The DMA has a setup overhead,
so it should only be used for large or automated transfers.


By "automated" I guess you mean "asynchronous to the program execution".
Which has obvious advantages and disadvantages.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #17
Edmund Bacon wrote:
Arthur J. O'Dwyer wrote:

Case <no@no.no> writes:

A well implemented memcpy() can use many tricks to accelerate its
operation.


Agreed and agreed. I use 'memcpy' any time I can guarantee it
will be safe, which in C is all the time, as far as I can recall.

Aren't there issues with memcpy and overlapping memory locations?

In the following program, isn't the call to memcpy an error?
[snip example with overlapping source and destination]


Yes: The behavior of memcpy() is not defined if the
source and destination objects overlap. If that's a
possibility, use memmove() instead.

--
Er*********@sun.com

Nov 14 '05 #18
"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> wrote:

ALWAYS use memcpy(), NEVER use for loops, unless you have empirical
evidence that your memcpy() is very poorly implemented.

A well implemented memcpy() can use many tricks to accelerate its
operation.


Agreed and agreed. I use 'memcpy' any time I can guarantee it
will be safe, which in C is all the time, as far as I can recall.
Of course, I don't write many programs in which "copy a chunk of
memory from A to B" is much of a bottleneck... :)


I have a slight aversion to memcpy, because of one compiler I had to
use, which would copy 65535 bytes if you called it with a third
argument of 0. (I think this is not standard-conforming, but
unfortunately the real world rears its ugly head sometimes).

FWIW this was Hitech C for the Z80 (and I guess the problem came
about because the Z80's block-move instruction does this if you
pass 0 as the length (it decrements and then checks the zero flag),
and the implementers must have not been aware of this behaviour).
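
The mechanism is easy to reproduce in plain C: a 16-bit counter that is decremented before it is tested wraps around when it starts at 0, so the loop runs on the order of 64K times instead of not at all. The sketch below only models that counting behaviour (it is not Z80 code), and guarded_memcpy is a hypothetical wrapper one might use with such a library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts the iterations of a loop that decrements a 16-bit counter and
   only then tests it for zero. Illustration only. */
static unsigned long dec_then_test_iterations(uint16_t count)
{
    unsigned long iterations = 0;
    do {
        iterations++;   /* one byte would be copied here */
        count--;        /* a count of 0 wraps to 65535 */
    } while (count != 0);
    return iterations;
}

/* Hypothetical wrapper: skip the library call entirely when n == 0. */
static void *guarded_memcpy(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    return memcpy(dst, src, n);
}
```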
Nov 14 '05 #19
Dan Pop wrote:
I've written my own memcpy function which uses the
processor's specialized instructions. However,
it has a minimum overhead. The threshold between
using memcpy for large areas vs. the DMA device
is very close (on my platform).

Out of curiosity, how do you instruct your processor to use DMA in your
custom memcpy function ?


I use assembly language. The DMA is not a part of the processor,
but a component on the platform. The DMA has a setup overhead,
so it should only be used for large or automated transfers.

By "automated" I guess you mean "asynchronous to the program execution".
Which has obvious advantages and disadvantages.


But how do you know when the transfer is complete then ? I assume that
even in synchronous mode, using DMA for large transfers can be beneficial.

--
luc wastiaux
Nov 14 '05 #20
