The performance of all kinds of C operations

Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? E.g., how fast is "add"
compared with "multiplication" on a typical machine.

Thanks!

--
B. Y.

Apr 29 '06 #1
mrby wrote:
Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? E.g., how fast is "add"
compared with "multiplication" on a typical machine.


Pages with this sort of information are scattered about
the Web. Most that I've seen have been highly system-specific,
whether they say they are or not. One page I saw made a fairly
serious effort to assign "costs" to various C constructs, but
it seemed rooted in the days when optimizers were less radical
and when the speed disparity between CPU and memory was not so
enormous.

Nowadays it is very nearly meaningless to ask whether an
addition is slower or faster than a multiplication, even if the
data types are specified. If the operands of the addition are
in memory while those of the multiplication are in registers,
the multiplication will likely finish far sooner. A division
will likely finish far sooner; even taking the square root of
a register-resident value will likely be quicker than performing
an addition that incurs two memory references.
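
As a rough illustration of the point about memory, here is a minimal sketch. The function names and the flat-array layout are assumptions made for this example, not something from the thread. Both functions perform exactly the same additions; only the order in which memory is touched differs. On many cached machines the column-order walk over a large grid runs noticeably slower, though by how much depends entirely on the system.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows-by-cols grid stored row by row, visiting cells in storage order. */
long sum_row_order(const int *cells, size_t rows, size_t cols)
{
    assert(cells != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += cells[r * cols + c];   /* consecutive addresses */
    return total;
}

/* Same additions, but each step jumps cols elements ahead in memory. */
long sum_column_order(const int *cells, size_t rows, size_t cols)
{
    assert(cells != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += cells[r * cols + c];   /* stride of cols elements */
    return total;
}
```

The two results are always equal, so any difference in running time comes from the memory system and not from the arithmetic.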

You can probably get an answer by ignoring the effects of
memory, of the multiple levels of cache, and of pipelining --
but the answer you get by ignoring reality is not likely to be
very helpful. It's like noticing that jet airplanes are faster
than bicycles, and therefore choosing an airplane for a one-mile
journey.

In all but a tiny and steadily diminishing fraction of cases,
micro-optimization is a waste of time and effort. Choose a good
algorithm without worrying about whether it uses multiplications
or additions, pointer arithmetic or array indexing. Then code it
as clearly and robustly as you can. Go no further unless and
until you have measured the resulting performance and found it
inadequate.
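
Measuring can be as simple as the sketch below. The workload sum_array and the helper time_sum are hypothetical stand-ins for whatever code is under suspicion. clock() from <time.h> reports processor time and its resolution varies between systems, so treat the numbers as rough and repeat the run several times.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: stands in for the code being measured. */
long sum_array(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Runs the workload `repeats` times, stores the last result in *result,
   and returns the processor time consumed, in seconds. */
double time_sum(const int *data, size_t n, int repeats, long *result)
{
    assert(data != NULL && result != NULL && repeats > 0);
    /* Each result is written to a volatile object so the work is not
       removed outright; an optimizer may still simplify the loop. */
    volatile long sink = 0;
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        sink = sum_array(data, n);
    clock_t stop = clock();
    *result = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Only if the figure that comes back is too large for the job at hand is there any reason to start rearranging the arithmetic.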

--
Eric Sosman
es*****@acm-dot-org.invalid
Apr 29 '06 #2
mrby wrote:
Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? E.g., how fast is "add"
compared with "multiplication" on a typical machine.


Well, things are not that simple, but I have an old page with this kind
of information that still stands up:

http://www.pobox.com/~qed/optimize.html

where I stated that: "Arithmetic operation performance is ordered
roughly by: transcendental functions, square root, modulo, divide,
multiply, add/subtract/multiply by power of 2/divide by power of
2/modulo by power of 2." I guess I should have added addition,
subtraction and logic operations at the end. Anyhow, this list is
still largely true on pretty much all architectures. The reason why
all architectures seem to perform so similarly on these operations is
that the best or near best techniques for building logic that perform
all those operations in hardware are generally well known. This is a
reality worth noting.
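
The "by power of 2" entries at the cheap end of that list correspond to shift and mask forms. A small sketch, with made-up function names, restricted to unsigned operands (for negative signed values, >> and & do not in general give the same results as / and %):

```c
#include <assert.h>

unsigned times8(unsigned x) { return x << 3; }   /* same value as x * 8u */
unsigned div8(unsigned x)   { return x >> 3; }   /* same value as x / 8u */
unsigned mod8(unsigned x)   { return x & 7u; }   /* same value as x % 8u */

/* Aborts if any of the three identities fails for x. */
void check_power_of_two_forms(unsigned x)
{
    assert(times8(x) == x * 8u);
    assert(div8(x) == x / 8u);
    assert(mod8(x) == x % 8u);
}
```

Many compilers make these substitutions on their own when the constant is known, so writing the shift by hand may not change the generated code at all.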

However, it should be noted that over time, memory bandwidth has not
kept pace with modern CPU speeds. Because of this, operations even as
slow as transcendental functions are now no longer slower than first
time uncached memory accesses. So if you think of memory access as an
"operation", this would be the one major one which has changed its
relative performance over time (by getting slower).

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Apr 29 '06 #3
On 29 Apr 2006 05:55:30 -0700, "mrby" <bi******@gmail.com> wrote in
comp.lang.c:
Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? E.g., how fast is "add"
compared with "multiplication" on a typical machine.

Thanks!


The real reason that your question is meaningless is that there is no
such thing as a "typical machine", as far as C is concerned. This is
not just theoretical; there is a vast difference between 8-bit
microcontrollers such as an 8051 or AVR on the one hand, and 64-bit
desktop processors like Pentium or PowerPC on the other.

I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.
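
In C, the multiply-accumulate pattern usually shows up as a loop like the one sketched below. The function names are invented for illustration, and whether a particular compiler turns the loop body into a single MAC instruction depends on the toolchain and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dot product: one multiply-accumulate per element.
   The caller must keep the running sum within int32_t range. */
int32_t dot_product(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];
    return acc;
}

/* Sum of squares: square each element and add it to the accumulator. */
int32_t sum_of_squares(const int16_t *a, size_t n)
{
    assert(a != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * a[i];
    return acc;
}
```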

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Apr 29 '06 #4
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
....
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck
Jun 15 '06 #5
Ronald Bruck wrote:
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
...
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck

So we are to assume that these one cycle quotations apply to peak
throughput, not total latency?
Jun 15 '06 #6
On Thu, 15 Jun 2006 18:29:20 GMT, Tim Prince
<ti***********@sbcglobal.net> wrote:
Ronald Bruck wrote:
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
...
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck

So we are to assume that these one cycle quotations apply to peak
throughput, not total latency?


<OT>
I am using a TI F2812, a similar processor (I believe) to the one Jack
is referring to. The "one op per clock cycle" can be sustained as long
as the code & operands reside in the CPU internal RAM.

The throughput is reduced drastically when accessing data in internal
FLASH memory, or external memory (of any kind), since that requires
inserting a few wait states on each memory access cycle.

But still, as long as the location of the data is similar, the time
required to perform an addition, multiplication or MAC operation
remains identical.
</OT>
Jun 15 '06 #7
From: websn...@gmail.com (Paul Hsieh)
Date: Sat, Apr 29 2006 3:57 pm
mrby wrote:
Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? E.g., how fast is "add"
compared with "multiplication" on a typical machine.


Well, things are not that simple, but I have an old page with this kind
of information that still stands up:

http://www.pobox.com/~qed/optimize.html


I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why
"x = y << 3" would be faster than "x = y * 8"? Or why
"if( ((a-b)|(c-d)|(e-f))==0 )" would be faster than
"if( a==b && c==d && e==f )"?
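
For reference, the two forms of the comparison can be written side by side as below. This is only a sketch with invented names, using unsigned operands so that the subtractions cannot overflow. One plausible reading, not confirmed anywhere in this thread, is that the && form may compile to as many as three conditional branches because of short-circuit evaluation, while the OR form needs a single test at the end. Whether that makes any measurable difference depends on the compiler and the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Straightforward form: && short-circuits, so later pairs may be skipped. */
bool pairs_equal_branching(unsigned a, unsigned b, unsigned c,
                           unsigned d, unsigned e, unsigned f)
{
    return a == b && c == d && e == f;
}

/* Each difference is zero exactly when its pair is equal, so the OR of
   the three differences is zero exactly when all three pairs are equal. */
bool pairs_equal_single_test(unsigned a, unsigned b, unsigned c,
                             unsigned d, unsigned e, unsigned f)
{
    return ((a - b) | (c - d) | (e - f)) == 0;
}

/* Aborts if the two forms ever disagree for the given operands. */
void check_pairs_agree(unsigned a, unsigned b, unsigned c,
                       unsigned d, unsigned e, unsigned f)
{
    assert(pairs_equal_branching(a, b, c, d, e, f) ==
           pairs_equal_single_test(a, b, c, d, e, f));
}
```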

Spiros Bousbouras

Jun 15 '06 #8
An off-topic answer to an off-topic question; I shall keep it short:

In article <15**********************@math.usc.edu>
Ronald Bruck <br***@math.usc.edu> wrote:
... do you know how the DSP accomplishes [single-cycle multiply / MAC]?
IIRC the complexity of multiplication is higher than that of addition
(perhaps the DSP parallelizes the operation better?).


Look up "Booth multiplier".
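
For anyone who wants the idea without looking it up, here is a software sketch of radix-2 Booth recoding for 16-bit signed operands. The function name is invented, and the sketch only shows the recoding rule: a run of one bits in the multiplier costs one subtraction where the run starts and one addition where it ends. Hardware multipliers typically use a higher-radix variant with the partial products added in parallel, which this loop does not attempt to model.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 16-bit signed values using radix-2 Booth recoding. */
int32_t booth_multiply(int16_t multiplicand, int16_t multiplier)
{
    uint16_t bits = (uint16_t)multiplier;  /* two's-complement bit pattern */
    int64_t shifted = multiplicand;        /* multiplicand times 2^i */
    int64_t product = 0;
    unsigned prev = 0;                     /* implicit bit to the right of bit 0 */

    for (int i = 0; i < 16; i++) {
        unsigned cur = (bits >> i) & 1u;
        if (cur == 0 && prev == 1)
            product += shifted;            /* a run of ones ended just below bit i */
        else if (cur == 1 && prev == 0)
            product -= shifted;            /* a run of ones starts at bit i */
        prev = cur;
        shifted += shifted;                /* double, ready for the next bit */
    }
    assert(product >= INT32_MIN && product <= INT32_MAX);
    return (int32_t)product;
}
```

A run that reaches the sign bit is never closed by an addition, which is what gives the top bit its negative weight and makes the same loop work for negative multipliers.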
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Jun 15 '06 #9
On Thu, 15 Jun 2006 18:29:20 GMT, Tim Prince <ti***********@sbcglobal.net> wrote:
VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this?


An 8x8 multiply can be accomplished in one lookup into a 64K-word table.
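
A direct lookup of this kind can be sketched as follows. The sizes are an assumption of this sketch rather than anything stated in the thread: a 64K-entry table has one slot for each pair of 8-bit operands, whereas one slot for each pair of 16-bit operands would take 2^32 entries. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint16_t product_table[65536];   /* one entry per (a, b) pair of bytes */
static int table_ready = 0;

/* Fill the table once; index is a in the high byte, b in the low byte. */
void build_product_table(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            product_table[(a << 8) | b] = (uint16_t)(a * b);
    table_ready = 1;
}

/* Multiply two 8-bit values with a single table read. */
uint16_t table_multiply(uint8_t a, uint8_t b)
{
    assert(table_ready);
    return product_table[((unsigned)a << 8) | b];
}
```

On a machine with caches, that single table read is itself a memory access, so it is not necessarily cheaper than a hardware multiply.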
--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Jun 15 '06 #10
