By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,831 Members | 1,031 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,831 IT Pros & Developers. It's quick & easy.

The performance of all kinds of C operations

P: n/a
Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.

Thanks!

--
B. Y.

Apr 29 '06 #1
Share this Question
Share on Google+
36 Replies


P: n/a
mrby wrote:
Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.


Pages with this sort of information are scattered about
the Web. Most that I've seen have been highly system-specific,
whether they say they are or not. One page I saw made a fairly
serious effort to assign "costs" to various C constructs, but
it seemed rooted in the days when optimizers were less radical
and when the speed disparity between CPU and memory was not so
enormous.

Nowadays it is very nearly meaningless to ask whether an
addition is slower or faster than a multiplication, even if the
data types are specified. If the operands of the addition are
in memory while those of the multiplication are in registers,
the multiplication will likely finish far sooner. A division
will likely finish far sooner; even taking the square root of
a register-resident value will likely be quicker than performing
an addition that incurs two memory references.

You can probably get an answer by ignoring the effects of
memory, of the multiple levels of cache, and of pipelining --
but the answer you get by ignoring reality is not likely to be
very helpful. It's like noticing that jet airplanes are faster
than bicycles, and therefore choosing an airplane for a one-mile
journey.

In all but a tiny and steadily diminishing fraction of cases,
micro-optimization is a waste of time and effort. Choose a good
algorithm without worrying about whether it uses multiplications
or additions, pointer arithmetic or array indexing. Then code it
as clearly and robustly as you can. Go no further unless and
until you have measured the resulting performance and found it
inadequate.

--
Eric Sosman
es*****@acm-dot-org.invalid
Apr 29 '06 #2

P: n/a
mrby wrote:
Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.


Well, things are not that simple, but I have an old page with this kind
of information that still stands up:

http://www.pobox.com/~qed/optimize.html

where I stated that: "Arithmetic operation performance is ordered
roughly by: transcendental functions, square root, modulo, divide,
multiply, add/subtract/mutiply by power of 2/divide by power of
2/modulo by power of 2." I guess I should have added addition,
subtraction and logic operations at the end. Anyhow, this list is
still largely true on pretty much all architectures. The reason why
all architectures seem to perform so similarly on these operations is
that the best or near best techniques for building logic that perform
all those operations in hardware are generally well known. This is a
reality worth noting.

However, it should be noted that over time, memory bandwidth has not
kept pace with modern CPU speeds. Because of this, operations even as
slow as trascendental functions are now no longer slower than first
time uncached memory accesses. So if you think of memory access as an
"operation", this would be the one major one which has changed its
relative performance over time (by getting slower).

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Apr 29 '06 #3

P: n/a
On 29 Apr 2006 05:55:30 -0700, "mrby" <bi******@gmail.com> wrote in
comp.lang.c:
Hi,

Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.

Thanks!


The real reason that your question is meaningless is that there is no
such thing as a "typical machine", as far as C is concerned. This is
not just theoretical, there is a vast difference between 8-bit
microcontrollers such as an 8051 or AVR on the one hand, and 64-bit
desk top processors like Pentium or PowerPC on the other.

I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Apr 29 '06 #4

P: n/a
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
....
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck
Jun 15 '06 #5

P: n/a
Ronald Bruck wrote:
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
...
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck

So we are to assume that these one cycle quotations apply to peak
throughput, not total latency?
Jun 15 '06 #6

P: n/a
On Thu, 15 Jun 2006 18:29:20 GMT, Tim Prince
<ti***********@sbcglobal.net> wrote:
Ronald Bruck wrote:
In article <mr********************************@4ax.com>, Jack Klein
<ja*******@spamcop.net> wrote:
...
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle. And
it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and add it
to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this? IIRC the complexity of multiplication is higher
than that of addition (perhaps the DSP parallelizes the operation
better?).

--Ron Bruck

So we are to assume that these one cycle quotations apply to peak
throughput, not total latency?


<OT>
I am using a TI F2812, a similar processor (I believe) to the one Jack
is referring to. The "one op per clock cycle" can be sustained as long
as the code & operands reside in the CPU internal RAM.

The throughput is reduced drastically when accessing data in internal
FLASH memory, or external memory (of any kind,) since that requires
inserting a few wait states on each memory access cycle.

But still, as long as the location of the data is similar, the time
required to perform an addition, multiplication or MAC operation
remains identical.
</OT>
Jun 15 '06 #7

P: n/a
From: websn...@gmail.com (Paul Hsieh)
Date: Sat, Apr 29 2006 3:57 pm
mrby wrote:
Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.


Well, things are not that simple, but I have an old page with this kind
of information that still stands up:

http://www.pobox.com/~qed/optimize.html


I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? Or why would
"if( ((a-b)|(c-d)|(e-f))==0 )" be faster than "if( a==b &&c==d &&e==f
)" ?

Spiros Bousbouras

Jun 15 '06 #8

P: n/a
An off-topic answer to an off-topic question; I shall keep it short:

In article <15**********************@math.usc.edu>
Ronald Bruck <br***@math.usc.edu> wrote:
... do you know how the DSP accomplishes [single-cycle multiply / MAC]?
IIRC the complexity of multiplication is higher than that of addition
(perhaps the DSP parallelizes the operation better?).


Look up "Booth multiplier".
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Jun 15 '06 #9

P: n/a
On Thu, 15 Jun 2006 18:29:20 GMT, Tim Prince <ti***********@sbcglobal.net> wrote:
VERY interesting. While this is irrelevant to C, do you know how the
DSP accomplishes this?


A 16x16 multiply can be accomplished in one lookup into a 64K-word table.
--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Jun 15 '06 #10

P: n/a
sp****@gmail.com wrote:
From: websn...@gmail.com (Paul Hsieh)
Date: Sat, Apr 29 2006 3:57 pm
mrby wrote:
Does anyone know of any link which describes the (relative)
performance of all kinds of C operations? e.g: how fast is "add"
comparing with "multiplication" on a typical machine.


Well, things are not that simple, but I have an old page with this kind
of information that still stands up:

http://www.pobox.com/~qed/optimize.html


I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? Or why would
"if( ((a-b)|(c-d)|(e-f))==0 )" be faster than "if( a==b &&c==d &&e==f
)" ?


It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference.

The idea for x = y << 3, is to let the compiler use its faster integer
shift operations (a clock or two) instead of its slower multiplier
(usually upwards of 5 clocks). The x86 includes a *8 operation with
its address units, but its still usually costs an extra cycle of
latency between the address and integer units (so its effectively two
clocks). This is probably the easiest case where a good modern
compiler will make the transformation for you, and you probably don't
need to worry about it.

The second case is a bit more subtle. First you should see that the
operations are the same. Originally it was thought that short cutting
was better because it performed fewer actual ALU operations on average.
However, with the advent of super-scalar and out of order execution
CPUs, it turns out that ALU speed has gone way up relative to branch
control (which at *best* has stayed the same, but sometimes gets much
worse due to mispredictions). So this calculation: (a-b)|(c-d)|(e-f)
leverages everything that modern CPUs have been geared towards
calculating with maximum performance, leaving only one conditional
operation: if ( (...) == 0). Whereas if (a==b && c==d && e==f) is the
equivalent of if (a==b) if (c==d) if (e==f), which is 3 conditional
operations.

The shortcut method *may* be faster if typically a != b, with a high
degree of probability. The problem is that it becomes associated with
a very fast control transfer. That is to say, it decreases the "meat"
code versus the branch code. Many modern processors are not well
designed for that kind of behaviour -- more geared towards the
assumption that there is some minimum number of ALU operations per
branch. So even in this case, the CPU may not benefit from the best
case of operation short cutting.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 15 '06 #11

P: n/a
Tim Prince wrote:
Ronald Bruck wrote:
<ja*******@spamcop.net> wrote:
...
I do a lot of work these days with a DSP where multiplication and
addition take exactly the same amount of time, one clock cycle.
And it can also do MAC, multiply two numbers and add them to an
accumulator, in one clock cycle. Or it can square a number and
add it to an accumulator, all in one clock cycle.


VERY interesting. While this is irrelevant to C, do you know how
the DSP accomplishes this? IIRC the complexity of multiplication
is higher than that of addition (perhaps the DSP parallelizes the
operation better?).


So we are to assume that these one cycle quotations apply to peak
throughput, not total latency?


If you are willing to throw enough gates, or lookup table space, at
it the multiplication becomes a one clock operation.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
Jun 15 '06 #12

P: n/a
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]


It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?

Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 16 '06 #13

P: n/a

Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]
It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


<Raises hand>
I'd say they are equivalent if y is integral , non-negative and the
result of
the operation won't exceed its type's range. If any of these conditions
are
violated you can't be sure.

Is this all ?
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"


A thoughtful beginner will try to understand why the expressions are
equivalent rather than blindly replace the "slow" one by the "fast"
one.
If she's not sure she will ask and learn something in the process.

Jun 16 '06 #14

P: n/a
Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]
It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


Whatever dude, the real world operates in 2s complement.
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"


Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. Its for beginners who care
about real programming, not beginners who want to turn into C pedants.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 16 '06 #15

P: n/a
we******@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:I didn't read your whole page but had a look at the table in the
>section "Strictly for beginners". Can you explain why would
>"x = y << 3" be faster than "x = y * 8" ? [...]
I was using a compiler in the '80s that was smart enough to replace * 8
with << 3. Stuff like this in the source just obscures the intent.
Don't
encourage beginners to do this.
It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


Whatever dude, the real world operates in 2s complement.
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"


Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. [It's] for beginners who care
about real programming, not beginners who want to turn into C pedants.


perhaps the page should have been labelled,
"For Beginners Who Never Want To Aspire To Excellence",
then
--
Nick Keighley

Quantum Boggum Sort:
Q1. use a source of quantum noise (eg. radioactive decay) to
randomly permutate an array.
Q2. if the array is not ordered, destroy the universe (*)
Q3. if you reached this step your universe has sorted the array
in O(n) time.
(*) [100] this is left as a exercise

Jun 16 '06 #16

P: n/a
we******@gmail.com wrote:
Eric Sosman wrote:

.... snip ...

Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


Whatever dude, the real world operates in 2s complement.
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"


Well its not a page about C pedantry, so that sort of mind set
didn't make it into my thinking when I wrote it. Its for
beginners who care about real programming, not beginners who
want to turn into C pedants.


Which is exactly the sort of sloppy thinking that leads to
Microsoft software. It can also kill and maim human and other
beings. People make mistakes, but apparently you just don't care.

--
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
Jun 16 '06 #17

P: n/a
we******@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]

It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?

Whatever dude, the real world operates in 2s complement.


Well, you've spotted one of the non-equivalences.
Care to try for any of the others?
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"

Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. Its for beginners who care
about real programming, not beginners who want to turn into C pedants.


"Real programming" is not about trying to outguess the
compiler. It is in fact possible for a programmer to outguess
a compiler, at least some of the time, but I've never met or
even heard of anyone who could outguess multiple compilers on
multiple systems.

Back in the 1960s the sort of stuff you recommend was not
only defensible, but necessary. I myself used every sneaky
trick I could, because in them thar days computer time was
scarce and expensive, while people time was plentiful and
cheap. It made economic sense to expend people time to save
on computer time.

But now? Computer time is cheap. Cheap, cheap, cheap.
And plentiful, too. No, "plentiful" doesn't even begin to
capture the situation: computer time is not only sloshing in
the bilge, it's overflowing the gunwales. Meanwhile, it's the
time of skilled people that's scarce and expensive -- and the
person who still insists on spending precious people time to
make the glut of computer time microscopically larger ... "The
only constant is change," yet some people seem unable to shake
the habits formed half a century ago.

Let's see: `y << 3' takes one more keystroke than `y * 8'.
Let's say it takes you one-tenth of a second to strike that
key. How many times must you evaluate the speeded-up expression
(if it is in fact speeded up) just to recoup your 0.1 second?

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 16 '06 #18

P: n/a
Eric Sosman wrote:
we******@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:
>I didn't read your whole page but had a look at the table in the
>section "Strictly for beginners". Can you explain why would
>"x = y << 3" be faster than "x = y * 8" ? [...]

It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]

Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?
Whatever dude, the real world operates in 2s complement.


Well, you've spotted one of the non-equivalences.
Care to try for any of the others?


No -- why would I do that?
Thought exercise #2: Since some of the answers to #1
are at least moderately subtle, is it a good idea to recommend
this substitution in a section titled "For Beginners Only?"


Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. Its for beginners who care
about real programming, not beginners who want to turn into C pedants.


"Real programming" is not about trying to outguess the
compiler.


Well, its always possible to out-*know* the compiler. You can, of
course, decide not to know, and assume the world is perfectly ISO C,
and that there is nothing to know outside of that. I am curious as to
whether or not you've ever encountered a compiler with a bug in it?
[...] It is in fact possible for a programmer to outguess
a compiler, at least some of the time, but I've never met or
even heard of anyone who could outguess multiple compilers on
multiple systems.
What? I'm not sure how to even respond to that.

I mean, I do that all the time; I've got webpages that *show* this kind
of thing. Anyone who does any serious cross-platform development
probably does this. Perhaps Chris Torek would like to chime in on how
VxWorks is developed.

My real world includes doing exactly that, as just a standard activity.
Your claims about who you've met or heard of relative to this, seem
curious to say the least. I mean if you were CBF or Keith, then fine,
but you work for Sun. Your company makes a thing called "Solaris", and
it runs really well on multiple platforms. You think they were just
hoping the compilers were good enough?
Back in the 1960s the sort of stuff you recommend was not
only defensible, but necessary. I myself used every sneaky
trick I could, because in them thar days computer time was
scarce and expensive, while people time was plentiful and
cheap. It made economic sense to expend people time to save
on computer time.

But now? Computer time is cheap. Cheap, cheap, cheap.
And plentiful, too. No, "plentiful" doesn't even begin to
capture the situation: computer time is not only sloshing in
the bilge, it's overflowing the gunwales. Meanwhile, it's the
time of skilled people that's scarce and expensive -- and the
person who still insists on spending precious people time to
make the glut of computer time microscopically larger ... "The
only constant is change," yet some people seem unable to shake
the habits formed half a century ago.
And that's why you are a Java/Ruby/Python programmer right?
Let's see: `y << 3' takes one more keystroke than `y * 8'.
Well, I'm an old-school math-oriented based person. Knowing that
something is a power of 2 (like a multiplying factor) still fires
neurons in my head, even if it doesn't for you. I also like putting
things like this: /* ... */ in my code too; they take lots of
keystrokes for no gain at all.
Let's say it takes you one-tenth of a second to strike that
key. How many times must you evaluate the speeded-up expression
(if it is in fact speeded up) just to recoup your 0.1 second?


Oh ... well I don't know, maybe billions. But I suppose if n is always
less than a billion you must be right.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 16 '06 #19

P: n/a
Nick Keighley wrote:
we******@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
> sp****@gmail.com wrote:
>>I didn't read your whole page but had a look at the table in the
>>section "Strictly for beginners". Can you explain why would
>>"x = y << 3" be faster than "x = y * 8" ? [...]
I was using a compiler in the '80s that was smart enough to replace * 8
with << 3. Stuff like this in the source just obscures the intent.


Explain how it obscures the intent.
Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. [It's] for beginners who care
about real programming, not beginners who want to turn into C pedants.


perhaps the page should have been labelled,
"For Beginners Who Never Want To Aspire To Excellence",
then


Do you issue these "excellence" badges yourself?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 16 '06 #20

P: n/a
we******@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
Eric Sosman wrote:

we******@gmail.com wrote:

>sp****@gmail.com wrote:
>
>>I didn't read your whole page but had a look at the table in the
>>section "Strictly for beginners". Can you explain why would
>>"x = y << 3" be faster than "x = y * 8" ? [...]
>
>It depends on your compiler. However, the assumption is that you
>aren't in control of the quality of your compiler, and hence, it may
>not perform the relevant transformation for you. Certainly for some
>compilers it may make no difference. [...]

Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?

Whatever dude, the real world operates in 2s complement.


Well, you've spotted one of the non-equivalences.
Care to try for any of the others?

No -- why would I do that?


Um, er, to throw back the night of ignorance with the
brilliant lamp of learning?

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 17 '06 #21

P: n/a
we******@gmail.com wrote:
Nick Keighley wrote:
we******@gmail.com wrote:
Eric Sosman wrote:
> we******@gmail.com wrote:
> > sp****@gmail.com wrote: > >>I didn't read your whole page but had a look at the table in the
> >>section "Strictly for beginners". Can you explain why would
> >>"x = y << 3" be faster than "x = y * 8" ? [...]


I was using a compiler in the '80s that was smart enough to replace * 8
with << 3. Stuff like this in the source just obscures the intent.


Explain how it obscures the intent.


obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).

What about *10 would you replace that with two shifts and an add?

Well its not a page about C pedantry, so that sort of mind set didn't
make it into my thinking when I wrote it. [It's] for beginners who care
about real programming, not beginners who want to turn into C pedants.


perhaps the page should have been labelled,
"For Beginners Who Never Want To Aspire To Excellence",
then


Do you issue these "excellence" badges yourself?


sure, my fees are very reasonable
:-)
--
Nick Keighley

A ruby trembled. Two tourmaline nets failed to rectify the laser beam.
A diamond noted the error. Both the error and the correction went into
the general computer.
Corwainer Smith "The Dead Lady of Clown Town"

Jun 17 '06 #22

P: n/a
Nick Keighley wrote:
we******@gmail.com wrote:
Nick Keighley wrote:
we******@gmail.com wrote:
> Eric Sosman wrote:
> > we******@gmail.com wrote:
> > > sp****@gmail.com wrote:
> > >>I didn't read your whole page but had a look at the table in the
> > >>section "Strictly for beginners". Can you explain why would
> > >>"x = y << 3" be faster than "x = y * 8" ? [...]

I was using a compiler in the '80s that was smart enough to replace * 8
with << 3. Stuff like this in the source just obscures the intent.
Explain how it obscures the intent.


obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).


If shifting by 3 is the *same* as multiplying by 8, then how is
shifting by 8 "doing something else" other than multiplying by 8? Tell
me, why do you think C includes a << operation?
What about *10 would you replace that with two shifts and an add?


That's a different question. Doing so involves a dependency chain of
operations, and there are alternative platform specific tricks for
this. So it doesn't belong in the "beginners section".

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 17 '06 #23

P: n/a
Nick Keighley wrote:
we******@gmail.com wrote:
Nick Keighley wrote:
.... snip ...

I was using a compiler in the '80s that was smart enough to
replace * 8 with << 3. Stuff like this in the source just obscures the intent.

Explain how it obscures the intent.


obviously we are not on the same wavelength. If I want to multiply
by 8 I write *8. Why would I do anything else? (unless measurement
showed <<3 was faster *and* this line of code needed to be faster).

What about *10 would you replace that with two shifts and an add?

In general, replacing an understandable "multiply by 8" with an
obscure phrase meaning "bodily shift the (defined elsewhere) bits
of the value to the left 3 places resulting in possibly undefined
results", would be considered obscuring. In fact many would
consider it extremely sloppy programming. To do so requires
several things:

1. Proof that it improves performance on this system.
2. Proof that it will be valid for all possible operands.
3. Proof that the performance improvement (1) is significant.

otherwise the decision to make any such transformations is much
better left to the compiler optimizer.
From the standard, 6.5.7


[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1*2^E2, reduced
modulo one more than the maximum value representable in the
result type. If E1 has a signed type and nonnegative value,
and E1*2^E2 is representable in the result type then that is
the resulting value; otherwise, the behavior is undefined.

--
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
Jun 17 '06 #24

P: n/a
we******@gmail.com writes:
Nick Keighley wrote:
we******@gmail.com wrote:
> Nick Keighley wrote:
> > we******@gmail.com wrote:
> > > Eric Sosman wrote:
> > > > we******@gmail.com wrote:
> > > > > sp****@gmail.com wrote:
> > > > >>I didn't read your whole page but had a look at the table in the
> > > > >>section "Strictly for beginners". Can you explain why would
> > > > >>"x = y << 3" be faster than "x = y * 8" ? [...]
> >
> > I was using a compiler in the '80s that was smart enough to replace * 8
> > with << 3. Stuff like this in the source just obscures the intent.
>
> Explain how it obscures the intent.


obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).


If shifting by 3 is the *same* as multiplying by 8, then how is
shifting by 8 "doing something else" other than multiplying by 8? Tell
me, why do you think C includes a << operation?


For bitwise shifting, of course. Why do you think C includes a "*"
operation? Even if they happen to be mathematically equivalent in
certain cases, they're *conceptually* very different.

If I want something to be 8 times as big, I'll multiply it by 8.
(Tomorrow I might decide I want it to be 9 times as big.)

If I want to shift the bits of an object left by 3 positions I'll use
"<< 3". If shifting is really what I want to do here, I might decide
tomorrow that I want to shift by 4 bits; it's unlikely I'll decide I
want to multiply by 9.

I'll use the multiplication if it expresses what I want to do. If a
shift happens to be more efficient, I'll trust the compile to generate
a shift instruction -- or not to if it can't prove that it's really
equivalent, or if a multiplication is just as fast as a shift.

What if y is negative? (Answer: undefined behavior.) What if y is
floating-point? (Answer: you can't apply "<<" to floating-point
values.)

Finally, even assuming that y is unsigned or positive, what happens if
I replace this:
x = y * 8 + z;
by this:
x = y << 3 + z;
? The answer depends on the relative precedence of the "*", "+", and
"<<" operators. I don't know about you, but I had to look it up; even
if you have all the operator precedences memorized, the next person
who reads your code probably hasn't.

Now, please explain how
x = y << 3;
is *better* than
x = y * 8;
particularly for beginners. (Answer: it isn't.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 17 '06 #25

P: n/a
we******@gmail.com wrote:
Nick Keighley wrote:
we******@gmail.com wrote:
Nick Keighley wrote:

we******@gmail.com wrote:

>Eric Sosman wrote:
>
>>we******@gmail.com wrote:
>>
>>>sp****@gmail.com wrote:
>>>
>>>>I didn't read your whole page but had a look at the table in the
>>>>section "Strictly for beginners". Can you explain why would
>>>>"x = y << 3" be faster than "x = y * 8" ? [...]

I was using a compiler in the '80s that was smart enough to replace * 8
with << 3. Stuff like this in the source just obscures the intent.

Explain how it obscures the intent.


obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).

If shifting by 3 is the *same* as multiplying by 8, then how is
shifting by 8 "doing something else" other than multiplying by 8? Tell
me, why do you think C includes a << operation?

To left shift?

Can you name a compiler that won't optimise a multiply by power of 2 to
a shift?

Unnecessary optimisation is even worse than premature optimisation.

--
Ian Collins.
Jun 17 '06 #26

P: n/a
Keith Thompson a écrit :
we******@gmail.com writes:
Nick Keighley wrote:
we******@gmail.com wrote:

Nick Keighley wrote:

>we******@gmail.com wrote:
>
>>Eric Sosman wrote:
>>
>>>we******@gmail.com wrote:
>>>
>>>>sp****@gmail.com wrote:
>>>>
>>>>>I didn't read your whole page but had a look at the table in the
>>>>>section "Strictly for beginners". Can you explain why would
>>>>>"x = y << 3" be faster than "x = y * 8" ? [...]
>
>I was using a compiler in the '80s that was smart enough to replace * 8
>with << 3. Stuff like this in the source just obscures the intent.

Explain how it obscures the intent.

obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).


If shifting by 3 is the *same* as multiplying by 8, then how is
shifting by 8 "doing something else" other than multiplying by 8? Tell
me, why do you think C includes a << operation?

For bitwise shifting, of course. Why do you think C includes a "*"
operation? Even if they happen to be mathematically equivalent in
certain cases, they're *conceptually* very different.

If I want something to be 8 times as big, I'll multiply it by 8.
(Tomorrow I might decide I want it to be 9 times as big.)

If I want to shift the bits of an object left by 3 positions I'll use
"<< 3". If shifting is really what I want to do here, I might decide
tomorrow that I want to shift by 4 bits; it's unlikely I'll decide I
want to multiply by 9.

I'll use the multiplication if it expresses what I want to do. If a
shift happens to be more efficient, I'll trust the compile to generate
a shift instruction -- or not to if it can't prove that it's really
equivalent, or if a multiplication is just as fast as a shift.

What if y is negative? (Answer: undefined behavior.) What if y is
floating-point? (Answer: you can't apply "<<" to floating-point
values.)

Finally, even assuming that y is unsigned or positive, what happens if
I replace this:
x = y * 8 + z;
by this:
x = y << 3 + z;
? The answer depends on the relative precedence of the "*", "+", and
"<<" operators. I don't know about you, but I had to look it up; even
if you have all the operator precedences memorized, the next person
who reads your code probably hasn't.

Now, please explain how
x = y << 3;
is *better* than
x = y * 8;
particularly for beginners. (Answer: it isn't.)


I agree with Mr Thomson here.

1) In modern Pentium processors, the (very clever) Intel designers
took the barrel shifter (that was there since the 8086) away.
This means that a <<= 3 is SLOWER than a *= 8.

Modern processors are highly complex beasts, and unless you
study the subject matter you risk to do the wrong decisions.

2) All C compilers will do the optimization a *= 8; with a shift
if a is unsigned, AND the processor warrants it. Lcc-win32
does this always since I was too lazy to modify the compiler
to follow that brain-dead Intel decision. After measuring
the actual code I could not find any difference even after
several billion tries, so I think... well forget it :-)

3) Most of this micro optimizations will not speed up a program.
A better algorithm, and above all a better DATA LAYOUT will
improve the program.

In modern workstation processors, the speed of main memory is VERY
slow compared to the speed of the processor, almost a factor of
20 or so. So you risk to do a lot of wait states when waiting for
memory to bring the data into the L1 cache. Optimizing data layout
so that locality of access is preserved (i.e. maintaining the data
in the L1 cache) is MUCH more important than caring about shifting
data or multiplying by 8. Those "optimization" are from the days
of the PDP11 or the 80286 maybe, but not in the modern processors
of today.

jacob
Jun 17 '06 #27

P: n/a
jacob navia wrote:
I agree with Mr Thomson here.

1) In modern Pentium processors, the (very clever) Intel designers
took the barrel shifter (that was there since the 8086) away.
This means that a <<= 3 is SLOWER than a *= 8.
You mean the abortifact called the Pentium 4? First of all, its not
*SLOWER*, its about the same speed (the hardware operations I mean).
Second of all, the Pentium 4 is now officially obsolete, and the "slow
shifter" decision has been recinded. The new Israeli designed cores
from Intel have gone back to fast shifting. Implicit in my standing
behind my recommendation of continuing to use left shift instead of
multiplication when you can was a prediction that this would happen.
(It appears, I was right.) And finally, what do you say about AMD's
Opteron cores? They include 3 parallel single cycle integer shifters.
Modern processors are highly complex beasts, and unless you
study the subject matter you risk to do the wrong decisions.
You know who you are talking to right?
2) All C compilers will do the optimization a *= 8; with a shift
if a is unsigned, AND the processor warrants it. Lcc-win32
does this always since I was too lazy to modify the compiler
to follow that brain-dead Intel decision. After measuring
the actual code I could not find any difference even after
several billion tries, so I think... well forget it :-)
On the P4? Like I said, the reason why you measured no difference, is
because there is no difference. However, for Opterons, its still 0.33
- 1.0 clocks versus 3 clocks.
3) Most of this micro optimizations will not speed up a program.
A better algorithm, and above all a better DATA LAYOUT will
improve the program.
This sounds like a blanket statement, that is inherently inconsistent.
Once you *have* this better algorithm, are you still sure that
micro-optimizations wont work? This is a classic line typically spoken
by people who are actually bad at both strategies.
In modern workstation processors, the speed of main memory is VERY
slow compared to the speed of the processor, almost a factor of
20 or so.
First of all, you have to normalize your apples versus oranges
comparison, and second of all the current factor is more like 200.
However, CPU caches and caching strategies have also gotten a lot
better.
[...] So you risk to do a lot of wait states when waiting for
memory to bring the data into the L1 cache. Optimizing data layout
so that locality of access is preserved (i.e. maintaining the data
in the L1 cache) is MUCH more important than caring about shifting
data or multiplying by 8. Those "optimization" are from the days
of the PDP11 or the 80286 maybe, but not in the modern processors
of today.


Perhaps the days in which the *average* programmer was capable of
seeing the difference have gone. However, the benefits and
opportunities have not gone away.

Real time speech recognition and translation is still quite taxing on
processor, as is real time facial recognition and simpler things like
video compression. For a directory of very large jpeg images
Microsoft's image view is unable to keep up with holding down the left
arrow key. You still think optimization is unimportant?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 18 '06 #28

P: n/a
Keith Thompson wrote:
we******@gmail.com writes:
Nick Keighley wrote:
we******@gmail.com wrote:
> Nick Keighley wrote:
> > we******@gmail.com wrote:
> > > Eric Sosman wrote:
> > > > we******@gmail.com wrote:
> > > > > sp****@gmail.com wrote:
> > > > >>I didn't read your whole page but had a look at the table in the
> > > > >>section "Strictly for beginners". Can you explain why would
> > > > >>"x = y << 3" be faster than "x = y * 8" ? [...]
> >
> > I was using a compiler in the '80s that was smart enough to replace * 8
> > with << 3. Stuff like this in the source just obscures the intent.
>
> Explain how it obscures the intent.

obviously we are not on the same wavelength. If I want to multiply by 8
I write *8. Why would I do anything else? (unless measurement showed
<<3 was faster *and* this line of code needed to be faster).
If shifting by 3 is the *same* as multiplying by 8, then how is
shifting by 8 "doing something else" other than multiplying by 8? Tell
me, why do you think C includes a << operation?


For bitwise shifting, of course. Why do you think C includes a "*"
operation? Even if they happen to be mathematically equivalent in
certain cases, they're *conceptually* very different.


Ignoring pedantry -- no they are *NOT*. This is like your belief that
a == 0 is conceptually different from 0 == a. If they are the same,
then they are the same. They don't gain a difference, conceptual or
otherwise, because they have a syntactical difference.
If I want something to be 8 times as big, I'll multiply it by 8.
(Tomorrow I might decide I want it to be 9 times as big.)

If I want to shift the bits of an object left by 3 positions I'll use
"<< 3".
So if I want to get a tan, should I sit out in the sun or should I go
to a tanning machine? Afterall its a *TAN* -- shouldn't I *USE* the
sun? But the tanning machine *ALSO* gives me a tan ... how do I
decide?

If you see this:

(x << c) * 8, or (x * 8) << c, or (x * 8) & 3

Does your decision still make as much sense?

If the tanning salon is around the corner and I live in Seattle, then
doesn't that help me decide?

Now, if I am concerned about performance, and I know that some
compilers might not catch the *8 -> << 3 transformation, then doesn't
*that* say something about the decision I make?
[...] If shifting is really what I want to do here, I might decide
tomorrow that I want to shift by 4 bits; it's unlikely I'll decide I
want to multiply by 9.

I'll use the multiplication if it expresses what I want to do. If a
shift happens to be more efficient, I'll trust the compile to generate
a shift instruction -- or not to if it can't prove that it's really
equivalent, or if a multiplication is just as fast as a shift.

What if y is negative? (Answer: undefined behavior.)
Not on a serious machine. That the standard chooses to take this
position is not surprising, but not really relevant either. Here's a
challenge for you: Name 3 platforms with a total install base > 1000
independent machines where this is an issue.
[...] What if y is floating-point? (Answer: you can't apply "<<" to floating-point
values.)
Or what if y is a string? That's pretty deep.
Finally, even assuming that y is unsigned or positive, what happens if
I replace this:
x = y * 8 + z;
by this:
x = y << 3 + z;
? The answer depends on the relative precedence of the "*", "+", and
"<<" operators. I don't know about you, but I had to look it up;
That's because you left off the parentheses. C's order of precidence
is stupid, and leaving off the parentheses should be a violation of
every coding guideline in existence. You can't use ioccc techniques to
defend your inability to see an equivalence.
[...] even
if you have all the operator precedences memorized, the next person
who reads your code probably hasn't.

Now, please explain how
x = y << 3;
is *better* than
x = y * 8;
particularly for beginners. (Answer: it isn't.)


Why would I do that? You're not about to pull a CBF here, and drag my
platform-dependent not-C language focused webpages into here and claim
I am teaching C with them are you?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jun 18 '06 #29

P: n/a
we******@gmail.com a écrit :
jacob navia wrote:
I agree with Mr Thomson here.

1) In modern Pentium processors, the (very clever) Intel designers
took the barrel shifter (that was there since the 8086) away.
This means that a <<= 3 is SLOWER than a *= 8.

You mean the abortifact called the Pentium 4? First of all, its not
*SLOWER*, its about the same speed (the hardware operations I mean).
Second of all, the Pentium 4 is now officially obsolete, and the "slow
shifter" decision has been recinded.


I thought it would arrive anyway, so it is good news.

The new Israeli designed cores
from Intel have gone back to fast shifting. Implicit in my standing
behind my recommendation of continuing to use left shift instead of
multiplication when you can was a prediction that this would happen.
(It appears, I was right.) And finally, what do you say about AMD's
Opteron cores? They include 3 parallel single cycle integer shifters.

Modern processors are highly complex beasts, and unless you
study the subject matter you risk to do the wrong decisions.

You know who you are talking to right?


Excuse me sir. Should I speak to you as "your holiness" maybe ? :-)
2) All C compilers will do the optimization a *= 8; with a shift
if a is unsigned, AND the processor warrants it. Lcc-win32
does this always since I was too lazy to modify the compiler
to follow that brain-dead Intel decision. After measuring
the actual code I could not find any difference even after
several billion tries, so I think... well forget it :-)

On the P4? Like I said, the reason why you measured no difference, is
because there is no difference. However, for Opterons, its still 0.33
- 1.0 clocks versus 3 clocks.

3) Most of this micro optimizations will not speed up a program.
A better algorithm, and above all a better DATA LAYOUT will
improve the program.

This sounds like a blanket statement, that is inherently inconsistent.
Once you *have* this better algorithm, are you still sure that
micro-optimizations wont work? This is a classic line typically spoken
by people who are actually bad at both strategies.


Thanks a lot for this compliment your holiness...

But may I add a small (final) comment?

You have the bad habit (that many people here share with you, relax)
of always expressing your views in the most polemic form, so your
"opponents" (that actually are people that maybe disagree with you
in some detail point, nothing more) are depicted as ignorants,
stupid know nothing folks.

Let it be, as Lennon would say. You need this form of discussion,
it is good for your ego... Let it be.

Personally, I respect people and their views, so it is impossible
for me to communicate correctly with someone that has this
"I am always the best" attitude. Sorry.
Jun 18 '06 #30

P: n/a
we******@gmail.com wrote:
Keith Thompson wrote:

.... snip ...

Now, please explain how
x = y << 3;
is *better* than
x = y * 8;
particularly for beginners. (Answer: it isn't.)


Why would I do that? You're not about to pull a CBF here, and
drag my platform-dependent not-C language focused webpages into
here and claim I am teaching C with them are you?


Apart from one instance when I downloaded your alleged better
string library and made some comments after a cursory scan, my
criticisms have been based on material and practices you presented
here. You don't teach C, you tend to ignore it, and to become
hyper-sensitive when the failings are pointed out. Using << for
multiplication is just one such example. Assuming ints larger than
16 bits is another.

--
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
Jun 18 '06 #31

P: n/a
we******@gmail.com writes:
Keith Thompson wrote:
we******@gmail.com writes:
> Nick Keighley wrote:
>> we******@gmail.com wrote:
>> > Nick Keighley wrote:
>> > > we******@gmail.com wrote:
>> > > > Eric Sosman wrote:
>> > > > > we******@gmail.com wrote:
>> > > > > > sp****@gmail.com wrote:
>> > > > > >>I didn't read your whole page but had a look at the
>> > > > > >>table in the section "Strictly for beginners". Can you
>> > > > > >>explain why would "x = y << 3" be faster than "x = y *
>> > > > > >>8" ? [...]
>> > >
>> > > I was using a compiler in the '80s that was smart enough to
>> > > replace * 8 with << 3. Stuff like this in the source just
>> > > obscures the intent.
>> >
>> > Explain how it obscures the intent.
>>
>> obviously we are not on the same wavelength. If I want to multiply by 8
>> I write *8. Why would I do anything else? (unless measurement showed
>> <<3 was faster *and* this line of code needed to be faster).
>
> If shifting by 3 is the *same* as multiplying by 8, then how is
> shifting by 8 "doing something else" other than multiplying by 8? Tell
> me, why do you think C includes a << operation?
For bitwise shifting, of course. Why do you think C includes a "*"
operation? Even if they happen to be mathematically equivalent in
certain cases, they're *conceptually* very different.


Ignoring pedantry -- no they are *NOT*. This is like your belief that
a == 0 is conceptually different from 0 == a. If they are the same,
then they are the same. They don't gain a difference, conceptual or
otherwise, because they have a syntactical difference.


Nonsense, but I see there's no point in arguing further.

[tanning analogy snipped]
Now, if I am concerned about performance, and I know that some
compilers might not catch the *8 -> << 3 transformation, then doesn't
*that* say something about the decision I make?


Not a whole not, no. I *might* consider polluting my source code with
a "*8 --> <<3" transformation *if and only if* (a) I know that the
compiler doesn't do the transformation itself, (b) I know that the
transformation will actually improve performance, and (c) I know that
the performance increase is actually significint (because I've
measured it). I can imagine this being sensible on an embedded system
with a defective compiler. It's hardly appropriate for something
that's intended for beginners only.

I might include this in a beginners' tutorial as an example of the
kind of thing programmers commonly had to do in the past, but that's
now far less useful than writing clear code.
[...] If shifting is really what I want to do here, I might decide
tomorrow that I want to shift by 4 bits; it's unlikely I'll decide I
want to multiply by 9.

I'll use the multiplication if it expresses what I want to do. If a
shift happens to be more efficient, I'll trust the compile to generate
a shift instruction -- or not to if it can't prove that it's really
equivalent, or if a multiplication is just as fast as a shift.

What if y is negative? (Answer: undefined behavior.)


Not on a serious machine. That the standard chooses to take this
position is not surprising, but not really relevant either. Here's a
challenge for you: Name 3 platforms with a total install base > 1000
independent machines where this is an issue.


I didn't say it misbehaves on any particular machine. I said it
invokes undefined behavior. Get a copy of any draft of the C standard
and look up the definition of that phrase.

In answer to your challenge, I don't know. It's possible that there
are no such machines, but I'd much rather write "x * 8" than spend the
time convincing myself that "x << 3" is equivalent on all possible
present an future platforms on which my code might run.

Here's a challenge for you. Name 3 platforms with a total install
base greater than 1000 independent machines where:

(a) The compiler does not transform "x * 8" to the machine code
equivalent of "x << 3" in cases where it's provably equivalent
when the compiler is invoked with common optimization options, *and*
(b) The transformation, if it were performed, would make the code
faster.

This excludes platforms where the compiler doesn't perform the
transformation by default, but does with, say, a "-O" option (if
you're that concerned about performance, you're obviously going to
crank up the optimization level anyway). It also excludes platforms
where "x << 3" is no faster than "x * 8". It particularly excludes
platforms where "x << 3" might be *slower* than "x * 8" (where your
source-level transformation could actually hurt performance, unless
the compiler is smart enough to undo it for you).

[...]
Finally, even assuming that y is unsigned or positive, what happens if
I replace this:
x = y * 8 + z;
by this:
x = y << 3 + z;
? The answer depends on the relative precedence of the "*", "+", and
"<<" operators. I don't know about you, but I had to look it up;


That's because you left off the parentheses. C's order of precidence
is stupid, and leaving off the parentheses should be a violation of
every coding guideline in existence. You can't use ioccc techniques to
defend your inability to see an equivalence.


Would you insist in parentheses in "x = (y * 8) + z;"?
[...] even
if you have all the operator precedences memorized, the next person
who reads your code probably hasn't.

Now, please explain how
x = y << 3;
is *better* than
x = y * 8;
particularly for beginners. (Answer: it isn't.)


Why would I do that? You're not about to pull a CBF here, and drag my
platform-dependent not-C language focused webpages into here and claim
I am teaching C with them are you?


I haven't even looked at your web pages. I'm responding only to
what's been posted here.

You're defending the benefits of writing "x = y << 3;" rather than
"x = y * 8;", and you're doing so in this newsgroup.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 18 '06 #32

P: n/a
<we******@gmail.com> wrote in message
news:11**********************@p79g2000cwp.googlegr oups.com...
Eric Sosman wrote:
we******@gmail.com wrote:
> Eric Sosman wrote:
>> Thought exercise #1: For what `y' are the expressions
>>`y * 8' and `y << 3' *not* equivalent?
>
> Whatever dude, the real world operates in 2s complement.
Well, you've spotted one of the non-equivalences.
Care to try for any of the others?


No -- why would I do that?


Because it's helpful to know when your "optimization" isn't valid, otherwise
you're going to insert bugs into your code without knowing it.
"Real programming" is not about trying to outguess the
compiler.


Well, its always possible to out-*know* the compiler. You can, of
course, decide not to know, and assume the world is perfectly ISO C,
and that there is nothing to know outside of that. I am curious as to
whether or not you've ever encountered a compiler with a bug in it?


We all have, but I'd say with certainty that all of us have written far more
bugs in our own code than we've found in our compilers.

If your compiler is not smart enough to do strength reduction when it's
safe, you need a new compiler, period. This sort of "coding trick" makes
code harder to understand, introduces subtle bugs, and worst of all provides
zero performance benefit even when they're correct.
[...] It is in fact possible for a programmer to outguess
a compiler, at least some of the time, but I've never met or
even heard of anyone who could outguess multiple compilers on
multiple systems.


What? I'm not sure how to even respond to that.

I mean, I do that all the time; I've got webpages that *show* this kind
of thing. Anyone who does any serious cross-platform development
probably does this.


Anyone who does serious cross-platform compatibility writes as much of the
code in a portable fashion as possible and lets the compiler sort it out for
each platform. Any code which is not possible to make portable is
segregated by platform and tuned for the specific implementation in use, but
that doesn't matter since it's not _intended_ to be portable.

If the compiler doesn't do a good enough job optimizing the portable code
for each platform, you complain to the compiler vendor, not try to outsmart
it. Trying to outsmart the compiler costs far more in future code
maintenance than any possible benefit you could get from microoptimizations.
My real world includes doing exactly that, as just a standard activity.
Your claims about who you've met or heard of relative to this, seem
curious to say the least. I mean if you were CBF or Keith, then fine,
but you work for Sun. Your company makes a thing called "Solaris", and
it runs really well on multiple platforms. You think they were just
hoping the compilers were good enough?
Note that Sun uses their own compilers, so if they run into sections of code
that aren't optimized well, they can make the compiler better. That's a
far, far better use of coder time (which is an expense) than hacking user
code to cope with bad compilers because it's a force multiplier.
Let's see: `y << 3' takes one more keystroke than `y * 8'.


Well, I'm an old-school math-oriented based person. Knowing that
something is a power of 2 (like a multiplying factor) still fires
neurons in my head, even if it doesn't for you.


I'm sure all of us recognize that 8 == 2^3. But we also recognize we're in
the 21st century and the compilers we use do strength reduction, so we're
more concerned with expressing to the people who will maintain our code
later on what the intent was than we are with proving we're clever.
I also like putting things like this: /* ... */ in my code too; they take
lots of keystrokes for no gain at all.


If you think useful comments provide "no gain at all", that says more about
your competence as a coder than anything else in this entire thread.

If you're specifically referring to a comment containing only an ellipsis,
you're right, that's just a complete waste of time, and you should be
embarrassed to admit you do that. That's worse than no comments at all.
Let's say it takes you one-tenth of a second to strike that
key. How many times must you evaluate the speeded-up expression
(if it is in fact speeded up) just to recoup your 0.1 second?


Oh ... well I don't know, maybe billions. But I suppose if n is always
less than a billion you must be right.


Given that it will provide zero speedup on any system you're likely to use,
any case where N is less than infinity proves him right. Not to mention all
the extra time you spent proving the change is safe that you'll need to
recoup. If you want to multiply by 8, just multiply by 8 and move on.
Profiling will tell you if that's not the best choice, but the decision
after that is _not_ whether to shift by 3 but which compiler to switch to.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin
--
Posted via a free Usenet account from http://www.teranews.com

Jun 18 '06 #33

P: n/a

Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]


It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


So what's a complete answer to this ?

Jun 18 '06 #34

P: n/a
sp****@gmail.com wrote:
Eric Sosman wrote:

we******@gmail.com wrote:

sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]

It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it may
not perform the relevant transformation for you. Certainly for some
compilers it may make no difference. [...]
Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?

So what's a complete answer to this ?


As far as I know, your answer from last Friday:
I'd say they are equivalent if y is integral , non-negative
and the result of the operation won't exceed its type's range.


.... is correct. There's a tiny bit of fuzz on the phrase "its
type's range" because of integer promotions, and there's a tiny
bit of fuzz about whether the result "exceeds" the range of an
unsigned operand, but We All Know What You Meant.

--
Eric Sosman
es*****@acm-dot-org.invalid
Jun 19 '06 #35

P: n/a
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
This excludes platforms where the compiler doesn't perform the
transformation by default, but does with, say, a "-O" option (if
you're that concerned about performance, you're obviously going to
crank up the optimization level anyway).


I'm not sure that your conclusion actually follows from your premise.

If you're going to add source-level "optimizations" to your code without
verifying that the the language guarantees that they'll ALWAYS do EXACTLY
what you expect them to, sooner or later (probably sooner) you'll start
noticing that your code is so fragile that the correctness-preserving
optimizations the compiler knows how to do will in fact not preserve
the appearance-of-correctness that your code has. When this happens,
the easiest solution (in the short term at least) is to stop asking the
compiler to optimize.

At this point, the type of programmer who actually does this sort of thing
is likely to conclude that these optimizations are in fact necessary
(because the code is slow without them, and they're replacing "slow"
operations with "faster" ones, so the new code will be faster[1]), rather
than taking the time to clean up the code so that the compiler optimizer
(which can be assumed to be better than the programmer - this may not
be universally true, but the number of cases where it isn't is small
enough to be ignored in discussions making any pretense of generality)
can be invoked.
dave

[1] This reasoning is likely to be at least as valid as the conclusion
it's supporting.

--
Dave Vandervies dj******@csclub.uwaterloo.ca
A reminder, then, that when you think someone's being unprofessional,
the first thing you should do is blame spammers. Realistically, they're
more likely to be at fault than anyone else is. --seebs
Jun 19 '06 #36

P: n/a
sp****@gmail.com wrote:
Eric Sosman wrote:
we******@gmail.com wrote:
sp****@gmail.com wrote:

I didn't read your whole page but had a look at the table in the
section "Strictly for beginners". Can you explain why would
"x = y << 3" be faster than "x = y * 8" ? [...]

It depends on your compiler. However, the assumption is that you
aren't in control of the quality of your compiler, and hence, it
may not perform the relevant transformation for you. Certainly
for some compilers it may make no difference. [...]


Thought exercise #1: For what `y' are the expressions
`y * 8' and `y << 3' *not* equivalent?


So what's a complete answer to this ?


Someone will probably come up with something more. If the type of
y is int, then whenever (y < 0), and whenever (y > (INT_MAX >> 1))

--
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
Jun 19 '06 #37

This discussion thread is closed

Replies have been disabled for this discussion.