473,320 Members | 2,097 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

optimizing the switch statement in Duff's Device (casting a label, label abuse)

Here's a reminder of duff's device:

/*************************************/
#include <stdio.h>

#define STEP 8
#define MAX_LEN STEP*4+1
#define SOURCE_LEN 28
int main(void)
{
char source[SOURCE_LEN] =
"abcdefghijklmnopqrstuvwxyz0";
char destination[MAX_LEN] = {0};

char *src_ptr;
char *dest_ptr;
char *lim;

int user_input;

printf("Enter a number between incl. 1 and %d\n"
"To quit enter: 0<ENTER>\n", SOURCE_LEN-1);

while ((scanf("%d", &user_input) 0)
&& (user_input 0)
&& (user_input < SOURCE_LEN)) {

src_ptr = source;
lim = source + user_input;
dest_ptr = destination;

switch (user_input % 8) {
case 0: do {*dest_ptr++ = *src_ptr++;
case 7: *dest_ptr++ = *src_ptr++;
case 6: *dest_ptr++ = *src_ptr++;
case 5: *dest_ptr++ = *src_ptr++;
case 4: *dest_ptr++ = *src_ptr++;
case 3: *dest_ptr++ = *src_ptr++;
case 2: *dest_ptr++ = *src_ptr++;
case 1: *dest_ptr++ = *src_ptr++;
} while (src_ptr < lim);
}

#if 0 /*** alternative ***/
switch (user_input % 8) {
do {*dest_ptr++ = *src_ptr++;
case 7: *dest_ptr++ = *src_ptr++;
case 6: *dest_ptr++ = *src_ptr++;
case 5: *dest_ptr++ = *src_ptr++;
case 4: *dest_ptr++ = *src_ptr++;
case 3: *dest_ptr++ = *src_ptr++;
case 2: *dest_ptr++ = *src_ptr++;
case 1: *dest_ptr++ = *src_ptr++;
case 0: ; } while (src_ptr < lim);
}
#endif

*dest_ptr = '\0';
puts(destination);
}
return 0;
}
/*************************************/
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.

Maby there is a better way of doing the switch statement: In
particular - the argument to the switch evaluates to an integer

between 0 and 7 (incl.) and now the question is: could we not use this
number as an offset to immediately jump the code to

the desired location.

goto la_0 - (user_input % 8)
// * sizeof("*dest_ptr++ = *src_ptr++;"instruction)
;
do {*dest_ptr++ = *src_ptr++;
la_7: *dest_ptr++ = *src_ptr++;
la_6: *dest_ptr++ = *src_ptr++;
la_5: *dest_ptr++ = *src_ptr++;
la_4: *dest_ptr++ = *src_ptr++;
la_3: *dest_ptr++ = *src_ptr++;
la_2: *dest_ptr++ = *src_ptr++;
la_1: *dest_ptr++ = *src_ptr++;
la_0: ; } while (src_ptr < lim);
The code snippet above shows the idea:
we jump to an address relative to out pivot-label la_0.

Example
If (user_input % 8) == 1, we
want to jump to "la_0 minus a few instructions" to land up at la_1.
Can something like this be done?
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?

Of course, there are issues involved:
What if the compiler thinks it's "clever" :) ?
what does the compiler do with the empty ; located at la_0 (of course
we do not want a nop)?

Thanks a thousand... I'm getting a cold duff from the fridge!

Kind regards,
Albert

Oct 10 '07 #1
10 2404
On Oct 10, 5:25 pm, anon.a...@gmail.com wrote:
Here's a reminder of duff's device:

/*************************************/
#include <stdio.h>

#define STEP 8
#define MAX_LEN STEP*4+1
#define SOURCE_LEN 28

int main(void)
{
char source[SOURCE_LEN] =
"abcdefghijklmnopqrstuvwxyz0";
char destination[MAX_LEN] = {0};

char *src_ptr;
char *dest_ptr;
char *lim;

int user_input;

printf("Enter a number between incl. 1 and %d\n"
"To quit enter: 0<ENTER>\n", SOURCE_LEN-1);

while ((scanf("%d", &user_input) 0)
&& (user_input 0)
&& (user_input < SOURCE_LEN)) {

src_ptr = source;
lim = source + user_input;
dest_ptr = destination;

switch (user_input % 8) {
case 0: do {*dest_ptr++ = *src_ptr++;
case 7: *dest_ptr++ = *src_ptr++;
case 6: *dest_ptr++ = *src_ptr++;
case 5: *dest_ptr++ = *src_ptr++;
case 4: *dest_ptr++ = *src_ptr++;
case 3: *dest_ptr++ = *src_ptr++;
case 2: *dest_ptr++ = *src_ptr++;
case 1: *dest_ptr++ = *src_ptr++;
} while (src_ptr < lim);
}

#if 0 /*** alternative ***/
switch (user_input % 8) {
do {*dest_ptr++ = *src_ptr++;
case 7: *dest_ptr++ = *src_ptr++;
case 6: *dest_ptr++ = *src_ptr++;
case 5: *dest_ptr++ = *src_ptr++;
case 4: *dest_ptr++ = *src_ptr++;
case 3: *dest_ptr++ = *src_ptr++;
case 2: *dest_ptr++ = *src_ptr++;
case 1: *dest_ptr++ = *src_ptr++;
case 0: ; } while (src_ptr < lim);
}
#endif

*dest_ptr = '\0';
puts(destination);
}
return 0;

}

/*************************************/

The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.
Really? What compiler are you using as a matter of interest? It must
be a pretty poor quality one.
Maby there is a better way of doing the switch statement: In
particular - the argument to the switch evaluates to an integer
between 0 and 7 (incl.) and now the question is: could we not use this
number as an offset to immediately jump the code to
the desired location.
The compiler could obviously do it, that's what tools are for. I'm
interested to know what compiler you've got which doesn't do what you
say for the code given above.
goto la_0 - (user_input % 8)
// * sizeof("*dest_ptr++ = *src_ptr++;"instruction)
;

do {*dest_ptr++ = *src_ptr++;
la_7: *dest_ptr++ = *src_ptr++;
la_6: *dest_ptr++ = *src_ptr++;
la_5: *dest_ptr++ = *src_ptr++;
la_4: *dest_ptr++ = *src_ptr++;
la_3: *dest_ptr++ = *src_ptr++;
la_2: *dest_ptr++ = *src_ptr++;
la_1: *dest_ptr++ = *src_ptr++;
la_0: ; } while (src_ptr < lim);

The code snippet above shows the idea:
we jump to an address relative to out pivot-label la_0.

Example
If (user_input % 8) == 1, we
want to jump to "la_0 minus a few instructions" to land up at la_1.

Can something like this be done?
Not easily or efficiently in C.
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?
Use Duff's device and a decent compiler.
Of course, there are issues involved:
What if the compiler thinks it's "clever" :) ?
what does the compiler do with the empty ; located at la_0 (of course
we do not want a nop)?

Thanks a thousand... I'm getting a cold duff from the fridge!
I'd track down a decent compiler which does the sensible thing with
Duff's device before spending time trying to hack up some alternative.
An old gcc on x86 does the obvious thing correctly, for example.

Oct 10 '07 #2
an*******@gmail.com writes:
Here's a reminder of duff's device:
<snip>
switch (user_input % 8) {
case 0: do {*dest_ptr++ = *src_ptr++;
case 7: *dest_ptr++ = *src_ptr++;
case 6: *dest_ptr++ = *src_ptr++;
case 5: *dest_ptr++ = *src_ptr++;
case 4: *dest_ptr++ = *src_ptr++;
case 3: *dest_ptr++ = *src_ptr++;
case 2: *dest_ptr++ = *src_ptr++;
case 1: *dest_ptr++ = *src_ptr++;
} while (src_ptr < lim);
}
<snip>
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.

Maby there is a better way of doing the switch statement: In
particular - the argument to the switch evaluates to an integer

between 0 and 7 (incl.) and now the question is: could we not use this
number as an offset to immediately jump the code to
Yes. Most compilers will do this for you simply by giving it the
above code. A compiler might choose to use repeated compare
operations, but it seems like any unlikely choice in this case. If
yours does, I'd look to change it rather than venture into
non-standard "computed goto" territory (not that I am saying any
compilers support it, but if they did, I would avoid it).

--
Ben.
Oct 10 '07 #3
On Wed, 10 Oct 2007 09:25:41 -0700, anon.asdf wrote:

<snip>
>
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.
<snip>
>
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?

Of course, there are issues involved:
What if the compiler thinks it's "clever" :) ?
what does the compiler do with the empty ; located at la_0 (of course
we do not want a nop)?

Thanks a thousand... I'm getting a cold duff from the fridge!

Kind regards,
Albert
You shouldn't assume that a switch is always implemented by a
sequence of comparisons; it might well be implemented as a jump table.

What you seek can be done with gcc, using the labels as values
extension, (you can store the labels in an array L say, and goto
L[user_output%8]) but that isn't (standard) C.


Oct 10 '07 #4
On Oct 10, 7:12 pm, Duncan Muirhead <dm...@csl.co.ukwrote:
On Wed, 10 Oct 2007 09:25:41 -0700, anon.asdf wrote:

<snip>
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.

<snip>
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?
Of course, there are issues involved:
What if the compiler thinks it's "clever" :) ?
what does the compiler do with the empty ; located at la_0 (of course
we do not want a nop)?
Thanks a thousand... I'm getting a cold duff from the fridge!
Kind regards,
Albert

You shouldn't assume that a switch is always implemented by a
sequence of comparisons; it might well be implemented as a jump table.

Ah great! Thanks for the info.
(I was mislead by reading http://www.geekronomicon.com/?q=node/68#switch).

What you seek can be done with gcc, using the labels as values
extension, (you can store the labels in an array L say, and goto
L[user_output%8]) but that isn't (standard) C.
I'm using gcc, so I'll keep that in mind in case I need to do
something more exotic than a switch, one day.
Kind regards,
Albert

Oct 10 '07 #5
The whole thing is completely stupid.

1. If you want to make a loop fast, use a good compiler and tell it to
unroll the loop. This has many advantages: The compiler can make a
good decision how often to unroll the loop. It will know where using
more unrolling is just pointless. It knows exactly what is the best
way to divide things up between partial and complete iterations. Using
Duff's device, the compiler has to figure out what is happening and
will most likely produce suboptimal code.

For example, on a typical x86 system, with eight times unrolling,
there would be a loop looking like dst[0]=src[0];dst[1]=src[1];...;dst
+=8;src+=8; Only two operations incrementing pointers. Duff's device
has a label after each *dst++=*src++; so it needs eight separate
increments for src and dst. It would probably be possible to detect
Duff's device and transform it back to a straightforward loop, but
that would just be a waste of compiler developer's time.

Now consider what happens if you have an auto-vectorising compiler. A
straightforward loop could be transformed to vector operations and
become ten times faster. But seeing Duff's device, the compiler will
just shake his head in disgust and run away.

However, in this case you could have got ten times faster code by just
calling memcpy (). What are the advantages? I'll give you one example:
If you compiled code for an earlier version of MacOS X running on an
ancient G3 processor, memcpy () would execute hand-optimised assembler
code running at maximum speed for that processor. Take the same
executable (NOT recompiled) and run it on last years G5 processor, and
it executes completely different code, optimised for a G5 processor.
And now you take the same executable and run it on a Macintosh with an
Intel processor. While Duff's device originally generated awful code
for a PowerPC G3, which is now automatically translated to even worse
x86 code, the memcpy () call actually executes hand-optimised x86
assembler code!

Summary: You can use an error-prone, unreadable construct and feel
clever about it, while wasting your time and producing slow code, or
you can leave the hard work to the compiler, save lots of coding and
debugging time, and get code that executes a lot faster. Tough
choice.
Oct 10 '07 #6
On Oct 10, 6:14 pm, Ben Bacarisse <ben.use...@bsb.me.ukwrote:
Yes. Most compilers will do this for you simply by giving it the
above code. A compiler might choose to use repeated compare
operations, but it seems like any unlikely choice in this case. If
yours does, I'd look to change it rather than venture into
non-standard "computed goto" territory (not that I am saying any
compilers support it, but if they did, I would avoid it).
For small numbers of consecutive values, it is much easier to do say
four compares and eight conditional branches. Compare to 1, branch if
less, branch if equal, compare to 3, branch if less, branch if equal,
Compare to 5 etc. etc. n/4 compares and n/2 branches on average.
Computed branches usually are much worse for branch prediction
hardware. For large numbers, use a binary branch strategy with log(n)
compares and branches in every case.

Oct 10 '07 #7
christian.bau wrote On 10/10/07 16:21,:
The whole thing is completely stupid.
Yes to that part. However ...
[...] Duff's device
has a label after each *dst++=*src++; so it needs eight separate
increments for src and dst. [...]
[...] in this case you could have got ten times faster code by just
calling memcpy ().
.... The O.P.'s code could (and should) be replaced by a
call to memcpy(), but Duff's device could not be. See,
for example,

http://en.wikipedia.org/wiki/Duff's_device

.... and let's give no more guff to Duff!

--
Er*********@sun.com
Oct 10 '07 #8
On Oct 10, 7:12 pm, Duncan Muirhead <dm...@csl.co.ukwrote:
On Wed, 10 Oct 2007 09:25:41 -0700, anon.asdf wrote:

<snip>
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.

<snip>
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?
Of course, there are issues involved:
What if the compiler thinks it's "clever" :) ?
what does the compiler do with the empty ; located at la_0 (of course
we do not want a nop)?
Thanks a thousand... I'm getting a cold duff from the fridge!
Kind regards,
Albert

You shouldn't assume that a switch is always implemented by a
sequence of comparisons; it might well be implemented as a jump table.

What you seek can be done with gcc, using the labels as values
extension, (you can store the labels in an array L say, and goto
L[user_output%8]) but that isn't (standard) C.
Hi!

Here's a code example using the non-standard,
but cool gcc extensions!

I cannot understand why this is not part of the C standard: it's
really powerful!!

{
//deliberately provocative statement
Ah who cares, gcc is the defacto standard anyway! // ;)
}

Regards,
Albert

/***************************************/

#include <stdio.h>

#define STEP 8
#define MAX_LEN STEP*4+1
#define SOURCE_LEN 28

int main(void)
{
char source[SOURCE_LEN] =
"abcdefghijklmnopqrstuvwxyz0";
char destination[MAX_LEN] = {0};

char *src_ptr;
char *dest_ptr;
char *lim;

void *jmp;

int user_input;

printf("Enter a number between incl. 1 and %d\n"
"To quit enter: 0<ENTER>\n", SOURCE_LEN-1);

while ((scanf("%d", &user_input) 0)
&& (user_input 0)
&& (user_input < SOURCE_LEN)) {

src_ptr = source;
lim = source + user_input;
dest_ptr = destination;

goto *(&&la_0 -
(user_input % 8) * (&&la_0 - && la_1));

do { *dest_ptr++ = *src_ptr++;
la_7: *dest_ptr++ = *src_ptr++;
la_6: *dest_ptr++ = *src_ptr++;
la_5: *dest_ptr++ = *src_ptr++;
la_4: *dest_ptr++ = *src_ptr++;
la_3: *dest_ptr++ = *src_ptr++;
la_2: *dest_ptr++ = *src_ptr++;
la_1: *dest_ptr++ = *src_ptr++;
la_0: ; } while (src_ptr < lim);

*dest_ptr = '\0';
puts(destination);
}
return 0;

}

Oct 11 '07 #9
an*******@gmail.com writes:
On Oct 10, 7:12 pm, Duncan Muirhead <dm...@csl.co.ukwrote:
>On Wed, 10 Oct 2007 09:25:41 -0700, anon.asdf wrote:

<snip>
The switch itself scales horribly. At worst we encounter 8 compare-
operations until we reach the desired case.

<snip>
Can we get a label's address and subtract (or add) an integer to it?
Or are there other alternatives that don't need a label...?
<snip>
>You shouldn't assume that a switch is always implemented by a
sequence of comparisons; it might well be implemented as a jump table.

What you seek can be done with gcc, using the labels as values
extension, (you can store the labels in an array L say, and goto
L[user_output%8]) but that isn't (standard) C.

Here's a code example using the non-standard,
but cool gcc extensions!
The GCC people are indeed "cool" -- you should take their advice:

"The switch statement is cleaner, so use that rather than an array
[of labels] unless the problem does not fit a switch statement very
well."
I cannot understand why this is not part of the C standard: it's
really powerful!!
I can't understand why you would obfuscate your code like this. What
is wrong with memcpy?

<snip>
goto *(&&la_0 -
(user_input % 8) * (&&la_0 - && la_1));
Way to go! You've taken a non-standard, non-portable, extension and
used it in such a way as to rely on assumptions that even gcc does not
guarantee. You must be aiming for minimum portability!
do { *dest_ptr++ = *src_ptr++;
la_7: *dest_ptr++ = *src_ptr++;
la_6: *dest_ptr++ = *src_ptr++;
la_5: *dest_ptr++ = *src_ptr++;
la_4: *dest_ptr++ = *src_ptr++;
la_3: *dest_ptr++ = *src_ptr++;
la_2: *dest_ptr++ = *src_ptr++;
la_1: *dest_ptr++ = *src_ptr++;
la_0: ; } while (src_ptr < lim);
As someone helpfully pointed out (in a reply to a post of mine) jump
tables are not even guaranteed to be the fastest was to implement a
switch.

Anyway, in this case you need memcpy.

--
Ben.
Oct 11 '07 #10
Duncan Muirhead <dm***@csl.co.ukwrote:
>
You shouldn't assume that a switch is always implemented by a
sequence of comparisons; it might well be implemented as a jump table.
Indeed. Even Ritchie's original PDP-11 compiler from 30 years ago had
three or four different strategies (decision tree, jump table, hash
function, etc.) for implementing a switch depending on the number and
distribution of the case values.

-Larry Jones

I take it there's no qualifying exam to be a Dad. -- Calvin
Oct 11 '07 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
by: Dave | last post by:
Hello all, Consider the case of an if with many else if clauses vs. a switch with many cases. In thinking about how such control constructs would likely be implemented, it would seem that...
25
by: Ian Tuomi | last post by:
Is it possible to go through multiple cases in one case like this? how? switch(foo) { case 1 or 2: dosomething(); break; //dosomething if foo is either 1 or 2 case 3: dosomethingelse();...
7
by: Colin King | last post by:
Amusingly, one can use a while(0) statement to allow one to perform a switch statement without breaks. The while (0) enables the continue statements to break out of the switch. Ugly and...
22
by: Technoid | last post by:
Is it possible to have a conditional if structure nested inside a conditional switch structure? switch(freq) { case 1: CASENAME if (variable==1) { do some code }
6
by: peter_k | last post by:
Hi, Last time i'm interested in optimizing small c programs. On my studies we are sending the programs using the web interface to online judge. The person who will wrote the faster program get...
11
by: Hallvard B Furuseth | last post by:
I've been wondering sometimes: Anyone know why Duff's device is usually written like this: void duff(const char *str, int len) { int n = (len + 7) / 8; switch (len % 8) { case 0: do{...
22
by: John | last post by:
Hi Folks, I'm experimenting a little with creating a custom CEdit control so that I can decide on what the user is allowed to type into the control. I started off only allowing floating point...
12
by: gopala | last post by:
Hey people, Can somebody explain me how this fragment works ? I came across this question in a competition and was really blank about this at the time. Actually I traced the program later but I...
1
by: parag_paul | last post by:
void duff(register char *to, register char *from, register int count) { register int n=(count+7)/8; switch(count%8){ case 0: do{ *to++ = *from++; case 7: *to++ = *from++; case 6: *to++ =...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
0
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.