Hi,
I don't understand why this could happen?
The Code 1 will output `fff9'
and the Code 2 will output `1'
How could the `mod 8' not have effect?
/* Code 1 */
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
unsigned short a, b, c;
a = 0;
b = 7;
c = (a-b)%8;
printf("%x\n", c);
return 0;
}
/* Code 2 */
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
unsigned short a, b, c;
a = 0;
b = 7;
c = (a-b)%8U;
printf("%x\n", c);
return 0;
}
In article <11**********************@z14g2000cwz.googlegroups .com>,
Hanzac Chen <ha****@gmail.com> wrote:
> /* Code 1 */
> #include <stdio.h>
> #include <stdlib.h>
Note: you do not actually use stdlib.h in any of the code you show.
stdlib.h is, though, the source of EXIT_SUCCESS which you could be
using as your return code instead of using the magic number 0.
> int main(int argc, char *argv[])
> {
> unsigned short a, b, c;
> The Code 1 will output `fff9'
Not if unsigned short happens to be a different size than on the machine
you happened to test it on.
> a = 0;
> b = 7;
> c = (a-b)%8;
> printf("%x\n", c);
A printf %x format requires an unsigned int, not an unsigned short.
There's probably some default argument promotion going on, but
that's going to affect the result.
> return 0;
> }
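For what it's worth, the mismatch between %x and an unsigned short argument can be avoided by converting explicitly before printing. A minimal sketch, assuming any hosted C implementation; format_ushort_hex is a name made up for this illustration and is not part of the posted code:

```c
#include <stdio.h>

/* Hypothetical helper: writes value as lowercase hex into buf and
   returns the number of characters written. buf must be large enough
   for the digits plus the terminating null character.
   The cast makes the argument an unsigned int, which is exactly what
   %x expects, whatever the default argument promotions would do. */
static int format_ushort_hex(char *buf, unsigned short value)
{
    return sprintf(buf, "%x", (unsigned int)value);
}
```

The same cast works directly in the posted program, i.e. printf("%x\n", (unsigned int)c). The h length modifier (%hx) is another option.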
--
If you lie to the compiler, it will get its revenge. -- Eric Sosman
ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) writes:
> In article <11**********************@z14g2000cwz.googlegroups .com>,
> Hanzac Chen <ha****@gmail.com> wrote:
>> /* Code 1 */
>> #include <stdio.h>
>> #include <stdlib.h>
> Note: you do not actually use stdlib.h in any of the code you show.
> stdlib.h is, though, the source of EXIT_SUCCESS which you could be
> using as your return code instead of using the magic number 0.
But 0 is the least magical of all numbers, and the standard guarantees
that "return 0;" in main() returns a status indicating success.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
On 19 Oct 2005 19:49:04 -0700, "Hanzac Chen" <ha****@gmail.com> wrote
in comp.lang.c:
In addition to the things that Walter Roberson correctly mentioned...
> Hi,
> I don't understand why this could happen?
> The Code 1 will output `fff9'
> and the Code 2 will output `1'
> How could the `mod 8' not have effect?
> /* Code 1 */
> #include <stdio.h>
> #include <stdlib.h>
> int main(int argc, char *argv[])
> {
> unsigned short a, b, c;
> a = 0;
> b = 7;
> c = (a-b)%8;
If signed int on your platform can hold all the values of an unsigned
short, 'a' and 'b' will be promoted to signed int for the subtraction.
Equivalent to:
unsigned short a, b, c;
signed int sia, sib;
a = 0;
b = 7;
sia = a;
sib = b;
c = (sia - sib)%8;
The expression inside the parentheses evaluates to (int)-7. One of
the two possible correct values for -7 % 8 allowed prior to C99 is -7.
When you assign (int)-7 to an unsigned short, the behavior of unsigned
types with out of range values occurs. (int)-7 is converted to
USHRT_MAX - 7. For a 16-bit unsigned short, this is 0xffff - 7, which
equals 0xfff9.
> printf("%x\n", c);
> return 0;
> }
> /* Code 2 */
> #include <stdio.h>
> #include <stdlib.h>
> int main(int argc, char *argv[])
> {
> unsigned short a, b, c;
> a = 0;
> b = 7;
> c = (a-b)%8U;
The same thing happens here, the subtraction yields a value of
(int)-7. Since the other operand of the % operator has the type
unsigned int, however, the value (int)-7 is promoted to (unsigned
int)-7 before the operation is performed.
> printf("%x\n", c);
> return 0;
> }
When dealing with promotions of unsigned integer types to higher
ranking integer types, these promotions sometimes result in signed
values of the higher types.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
> But 0 is the least magical of all numbers
The ancient Greek mathematicians (e.g., Pythagoras) would
not have agreed at all -- they couldn't grasp the existence of 0.
0 has so many unusual properties that it is one of the -most-
magical of finite numbers.
In every other context in C, 0 represents "false" and non-zero
represents "true", but to continue that on to the exit value
would suppose that the exit value is posing the question
"Did this program fail", and 0 is answering that in the negative,
"No, it is false that the program failed". That's a pretty magical
interpretation.
--
All is vanity. -- Ecclesiastes
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:dj**********@canopus.cc.umanitoba.ca...
> In article <ln************@nuthaus.mib.org>,
> Keith Thompson <ks***@mib.org> wrote:
>> But 0 is the least magical of all numbers
> The ancient Greek mathematicians (e.g., Pythagoras) would not have agreed at all -- they couldn't grasp the existence of 0.
> 0 has so many unusual properties that it is one of the -most- magical of finite numbers.
> In every other context in C, 0 represents "false" and non-zero represents "true",
Not all contexts. It often means literally the value
zero, or some function could give it some special meaning as
a return value (e.g. 'not found'), etc.
> but to continue that on to the exit value would suppose that the exit value is posing the question "Did this program fail",
Or, "did it succeed?"
> and 0 is answering that in the negative, "No, it is false that the program failed". That's a pretty magical interpretation.
No magic at all. The standard explicitly specifies
that main() returning zero means 'success'. Whether
or not the value received by the host is also zero
is implementation/platform dependent.
-Mike
Hi,
Thanks to all of you that replied to my mail.
Jack Klein wrote:
> In addition to the things that Walter Roberson correctly mentioned...
Yes, I read that. But I had just written a test program. :-)
Thanks to Walter Roberson, although you're strict.
If signed int on your platform can hold all the values of an unsigned short, 'a' and 'b' will be promoted to signed int for the subtraction. Equivalent to:
unsigned short a, b, c; signed int sia, sib; a = 0; b = 7; sia = a; sib = b;
c = (sia - sib)%8;
The expression inside the parentheses evaluates to (int)-7. One of the two possible correct values for -7 % 8 allowed prior to C99 is -7.
This is what I can't fully understand: all the variables in this
calculation are unsigned short, there is no need to promote to int.
Anyway, I have to obey the rule.
When you assign (int)-7 to an unsigned short, the behavior of unsigned types with out of range values occurs. (int)-7 is converted to USHRT_MAX - 7. For a 16-bit unsigned short, this is 0xffff - 7, which equals 0xfff9.
printf("%x\n", c); return 0; }
/* Code 2 */ #include <stdio.h> #include <stdlib.h>
int main(int argc, char *argv[]) { unsigned short a, b, c; a = 0; b = 7;
c = (a-b)%8U;
The same thing happens here, the subtraction yields a value of (int)-7. Since the other operand of the % operator has the type unsigned int, however, the value (int)-7 is promoted to (unsigned int)-7 before the operation is performed.
printf("%x\n", c); return 0; }
When dealing with promotions of unsigned integer types to higher ranking integer types, these promotions sometimes result in signed values of the higher types.
Thanks, I understand the full process now. :-)
On 19 Oct 2005 22:17:36 -0700, "Hanzac Chen" <ha****@gmail.com> wrote
in comp.lang.c:
> Hi,
> Thanks to all of you that replied to my mail.
> Jack Klein wrote:
>> In addition to the things that Walter Roberson correctly mentioned...
> Yes, I read that. But I had just written a test program. :-)
> Thanks to Walter Roberson, although you're strict.
>> If signed int on your platform can hold all the values of an unsigned short, 'a' and 'b' will be promoted to signed int for the subtraction. Equivalent to:
>> unsigned short a, b, c;
>> signed int sia, sib;
>> a = 0;
>> b = 7;
>> sia = a;
>> sib = b;
>> c = (sia - sib)%8;
>> The expression inside the parentheses evaluates to (int)-7. One of the two possible correct values for -7 % 8 allowed prior to C99 is -7.
> This is what I can't fully understand: all the variables in this
> calculation are unsigned short, there is no need to promote to int.
There IS a need. The need is provided by the C standard, that
requires that almost all operators act on values of int or greater
rank.
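The promotion can be observed directly. The sketch below is an illustration only and needs a C11 compiler, because it relies on _Generic, which did not exist when this thread was written; the function name is made up.

```c
/* Requires C11 for _Generic. Returns 1 if the expression a - b has
   type int, 2 if it has type unsigned int, and 0 for anything else.
   The operands are never evaluated; only the type is inspected. */
static int difference_type_tag(void)
{
    unsigned short a = 0, b = 7;
    return _Generic(a - b, int: 1, unsigned int: 2, default: 0);
}
```

On a platform where int can represent every unsigned short value this returns 1: the subtraction is done on ints even though both variables are unsigned short. On a platform where unsigned short is as wide as int it would return 2.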
> Anyway, I have to obey the rule.
>> When you assign (int)-7 to an unsigned short, the behavior of unsigned types with out of range values occurs. (int)-7 is converted to USHRT_MAX - 7. For a 16-bit unsigned short, this is 0xffff - 7, which equals 0xfff9.
>>> printf("%x\n", c);
>>> return 0;
>>> }
>>> /* Code 2 */
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> int main(int argc, char *argv[])
>>> {
>>> unsigned short a, b, c;
>>> a = 0;
>>> b = 7;
>>> c = (a-b)%8U;
>> The same thing happens here, the subtraction yields a value of (int)-7. Since the other operand of the % operator has the type unsigned int, however, the value (int)-7 is promoted to (unsigned int)-7 before the operation is performed.
>>> printf("%x\n", c);
>>> return 0;
>>> }
>> When dealing with promotions of unsigned integer types to higher ranking integer types, these promotions sometimes result in signed values of the higher types.
> Thanks, I understand the full process now. :-)
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
"Jack Klein" <ja*******@spamcop.net> wrote in message
news:c8********************************@4ax.com... On 19 Oct 2005 22:17:36 -0700, "Hanzac Chen" <ha****@gmail.com> wrote in comp.lang.c:
Hi,
Thanks to all of you that replied to my mail.
Jack Klein wrote: > In addition to the things that Walter Roberson correctly mentioned... Yes, I read that. But I've just write a test program. :-) Thanks to Walter Roberson although you're strict.
> If signed int on your platform can hold all the values of an unsigned > short, 'a' and 'b' will be promoted to signed int for the subtraction. > Equivalent to: > > unsigned short a, b, c; > signed int sia, sib; > a = 0; > b = 7; > sia = a; > sib = b; > > c = (sia - sib)%8; > > The expression inside the parentheses evaluates to (int)-7. One of > the two possible correct values for -7 % 8 allowed prior to C99 is -7. This is what I can't fully understand: all the variables in this calculation are unsigned short, there is no need to promote to int.
There IS a need. The need is provided by the C standard, that requires that almost all operators act on values of int or greater rank.
Anyway, I have to obey the rule.
> When you assign (int)-7 to an unsigned short, the behavior of unsigned > types with out of range values occurs. (int)-7 is converted to > USHRT_MAX - 7. For a 16-bit unsigned short, this is 0xffff - 7, which > equals 0xfff9. > > > printf("%x\n", c); > > return 0; > > } > > > > /* Code 2 */ > > #include <stdio.h> > > #include <stdlib.h> > > > > int main(int argc, char *argv[]) > > { > > unsigned short a, b, c; > > a = 0; > > b = 7; > > > > c = (a-b)%8U; > > The same thing happens here, the subtraction yields a value of > (int)-7. Since the other operand of the % operator has the type > unsigned int, however, the value (int)-7 is promoted to (unsigned > int)-7 before the operation is performed. > > > > printf("%x\n", c); > > return 0; > > } > > When dealing with promotions of unsigned integer types to higher > ranking integer types, these promotions sometimes result in signed > values of the higher types.
Thanks, I understand the full process now. :-)
Side effects on code generation when unsigned types are promoted to signed:
--------------------
/* Signed constant promotes expression from */
/* unsigned short to signed int. */
/* Mod operator returns a range of -7 to +7 */
unsigned short modTest(unsigned short a, unsigned short b)
{
return ( (a - b) % 8 );
}
00401000 mov eax,dword ptr [esp+4]
00401004 mov ecx,dword ptr [esp+8]
00401008 and eax,0FFFFh
0040100D and ecx,0FFFFh
00401013 sub eax,ecx
00401015 and eax,80000007h
0040101A jns 00401021
0040101C dec eax
0040101D or eax,0F8h
00401020 inc eax
00401021 ret
--------------------
/* Unsigned constant promotes expression from */
/* unsigned short to unsigned int. */
/* Mod operator returns a range of 0 to +7 */
unsigned short modTest(unsigned short a, unsigned short b)
{
return ( (a - b) % 8U );
}
00401000 mov eax,dword ptr [esp+4]
00401004 mov ecx,dword ptr [esp+8]
00401008 sub eax,ecx
0040100A and eax,7
0040100D ret
--------------------
/* Included just because it's so strange. */
/* The explicit casts resulted in demoting */
/* the interim expression to unsigned char! */
unsigned short modTest(unsigned short a, unsigned short b)
{
return ( (unsigned short)( (unsigned int)(a - b) ) % 8U );
}
00401000 mov al,byte ptr [esp+4]
00401004 mov cl,byte ptr [esp+8]
00401008 sub al,cl
0040100A and eax,7
0040100D ret
--------------------
All code generated by Microsoft VC6, release build,
code optimizations set for Maximize Speed.
(and may be copyrighted (C) by Microsoft Inc.)
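If the intent of the listings above is a result that always lies in 0..7, one way to say so in the source is to force the subtraction itself into unsigned arithmetic. A minimal sketch; ring_distance8 is a hypothetical name, not something from the posted code:

```c
/* Hypothetical helper: (a - b) modulo 8, always in the range 0..7.
   Casting the operands makes the subtraction unsigned, so a "negative"
   difference wraps modulo UINT_MAX + 1. That modulus is a power of two
   and therefore a multiple of 8, so the low three bits are still the
   mathematically correct remainder. */
static unsigned short ring_distance8(unsigned short a, unsigned short b)
{
    return (unsigned short)(((unsigned int)a - (unsigned int)b) % 8U);
}
```

For example, ring_distance8(0, 7) is 1 and ring_distance8(0, 1) is 7, whereas (a - b) % 8 with a signed constant gives -7 and -1 before the conversion back to unsigned short.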
Keyser Soze wrote:
> <snip>
> Side effects on code generation when unsigned types are promoted to signed:
> --------------------
> /* Signed constant promotes expression from */
> /* unsigned short to signed int. */
> /* Mod operator returns a range of -7 to +7 */
> unsigned short modTest(unsigned short a, unsigned short b)
> {
> return ( (a - b) % 8 );
> }
> 00401000 mov eax,dword ptr [esp+4]
> 00401004 mov ecx,dword ptr [esp+8]
> 00401008 and eax,0FFFFh
> 0040100D and ecx,0FFFFh
> 00401013 sub eax,ecx
> 00401015 and eax,80000007h
> 0040101A jns 00401021
> 0040101C dec eax
> 0040101D or eax,0F8h
> 00401020 inc eax
> 00401021 ret
OT: Crappy code generation. It could simultaneously do the 'mov' and
'and' with 'movz', and eliminate branches using 'cmov' or some other
equivalent non-branching code transformation. Yawn.
Mark F. Haigh mf*****@sbcglobal.net
Mark F. Haigh wrote: Keyser Soze wrote:
<snip>
Side affects on code generation when unsigned types are promoted to signed:
--------------------
/* Signed constant promotes expression from */ /* unsigned short to signed int. */ /* Mod operator returns a range of -7 to +7 */
unsigned short modTest(unsigned short a, unsigned short b) { return ( (a - b) % 8 ); }
00401000 mov eax,dword ptr [esp+4] 00401004 mov ecx,dword ptr [esp+8] 00401008 and eax,0FFFFh 0040100D and ecx,0FFFFh 00401013 sub eax,ecx 00401015 and eax,80000007h 0040101A jns 00401021 0040101C dec eax 0040101D or eax,0F8h 00401020 inc eax 00401021 ret
OT: Crappy code generation. It could simulaneously do the 'mov' and 'and' with 'movz', and eliminate branches using 'cmov' or some other equivalent non-branching code transformation. Yawn.
Yes, OT.
There's also no indication of what architecture was used. 'cmov' only
exists on Pentium Pro and higher, while the 'mov'-'and' combination
beats 'movzx' on the 486 and 586. Indeed, "gcc -march=i386 -mtune=i486"
(use 386 instruction set but optimize for 486) produces code quite
similar to the above. On a pure 386 they should be equal, but gcc
chooses 'movzx' regardless, so it might know something I don't.
For better or worse, the default setting for most x86 compilers is to
use the 386 instruction set. For a 386-586, the VC compiler is producing
fine code. You can fault it for its default settings, but not for the
generated code.
Morals:
- Intel instruction sets are weird and optimal code differs from CPU to CPU.
- If code generation is a concern to you, tell your compiler exactly
what architecture you're compiling for: it matters.
- Don't assume too quickly that you're smarter than a compiler.
S.
Skarmander wrote: Mark F. Haigh wrote: Keyser Soze wrote:
<snip>
Side affects on code generation when unsigned types are promoted to signed:
--------------------
/* Signed constant promotes expression from */ /* unsigned short to signed int. */ /* Mod operator returns a range of -7 to +7 */
unsigned short modTest(unsigned short a, unsigned short b) { return ( (a - b) % 8 ); }
00401000 mov eax,dword ptr [esp+4] 00401004 mov ecx,dword ptr [esp+8] 00401008 and eax,0FFFFh 0040100D and ecx,0FFFFh 00401013 sub eax,ecx 00401015 and eax,80000007h 0040101A jns 00401021 0040101C dec eax 0040101D or eax,0F8h 00401020 inc eax 00401021 ret
OT: Crappy code generation. It could simulaneously do the 'mov' and 'and' with 'movz', and eliminate branches using 'cmov' or some other equivalent non-branching code transformation. Yawn.
Yes, OT.
There's also no indication of what architecture was used. 'cmov' only exists on Pentium Pro and higher, [...]
Gee, really? Why do you think I wrote "or some other equivalent
non-branching code transformation"?
[...] while the 'mov'-'and' combination beats 'movzx' on the 486 and 586.
Who cares? It's a single uop on anything modern, and (allegedly) costs
a cycle or two on 486 or 586's. The Intel optimization manuals say to
use it, and that's good enough for me, and good enough for gcc,
apparently.
Indeed, "gcc -march=i386 -mtune=i486" (use 386 instruction set but optimize for 486) produces code quite similar to the above. On a pure 386 they should be equal, but gcc chooses 'movzx' regardless, so it might know something I don't.
The newer gcc's will produce branchless code for the example when
tuning for a 586 or higher. A mispredicted branch is a minor
catastrophe. In fact, eliminating branches is the *first*
assembler/compiler coding rule in the Intel optimization manuals.
<snip>
Mark F. Haigh mf*****@sbcglobal.net
Mark F. Haigh wrote: Skarmander wrote:
Mark F. Haigh wrote:
Keyser Soze wrote:
<snip>
Side affects on code generation when unsigned types are promoted to signed:
--------------------
/* Signed constant promotes expression from */ /* unsigned short to signed int. */ /* Mod operator returns a range of -7 to +7 */
unsigned short modTest(unsigned short a, unsigned short b) { return ( (a - b) % 8 ); }
00401000 mov eax,dword ptr [esp+4] 00401004 mov ecx,dword ptr [esp+8] 00401008 and eax,0FFFFh 0040100D and ecx,0FFFFh 00401013 sub eax,ecx 00401015 and eax,80000007h 0040101A jns 00401021 0040101C dec eax 0040101D or eax,0F8h 00401020 inc eax 00401021 ret OT: Crappy code generation. It could simulaneously do the 'mov' and 'and' with 'movz', and eliminate branches using 'cmov' or some other equivalent non-branching code transformation. Yawn.
Yes, OT.
There's also no indication of what architecture was used. 'cmov' only exists on Pentium Pro and higher, [...]
Gee, really? Why do you think I wrote "or some other equivalent non-branching code transformation"?
Presumably because you expected me to believe you knew one, but felt
strengthening your argument by actually exhibiting it unnecessary. If
you feel this is trivial, I apologize. I'll admit I don't immediately
see it, and I was also distracted by what appeared to be unwarranted
assumptions on your end.
>> [...] while the 'mov'-'and' combination beats 'movzx' on the 486 and 586.
> Who cares? It's a single uop on anything modern, and (allegedly) costs a cycle or two on 486 or 586's. The Intel optimization manuals say to use it, and that's good enough for me, and good enough for gcc, apparently.
Yes, it's possible gcc chooses to use 'movzx' for "generic" 386 code on
the assumption that the instruction will on average be faster for newer
processors. It certainly isn't *worse* on an actual 386, so this is
acceptable.
Note that, the Intel optimization manuals and alleged cycle costs
notwithstanding, gcc 3.4.2 avoids the "good enough" on either the 486 or
the 586. So VC's choice may be suboptimal here (good only for the 386,
486 and 586, while gcc covers the 386, the 686 and (presumably, since
you never know with Intel) the future), but this doesn't seem to warrant
your snap judgment of "crappy code generation".
The newer gcc's will produce branchless code for the example when tuning for a 586 or higher.
I only have gcc 3.4.2 at my disposal here. You do mean tuning for 586
while using the 386 instruction set, right? Could you reproduce the
relevant code?
A mispredicted branch is a minor catastrophe. In fact, eliminating branches is the *first* assembler/compiler coding rule in the Intel optimization manuals.
Of course. It's quite possible that if you tell VC to tune for a 586 or
higher, the outcome would be different too. I'm fairly sure that
expecting a compiler to eliminate branches without giving it the
assumptions necessary to do this effectively will produce more than a
few minor catastrophes.
Like I said, you can fault VC for its default assumptions, but when you
know these, you have to evaluate its code generation in light of them.
S.
< snip of all other OT rants >
This thread has been hijacked down a rabbit hole of how many code
generators can dance on the head of an optimizer.
The point of the examples was to show how explicit type casts can affect the
generated code.
Not to start a discussion of the quality of the generated code.
I am too old to get a life but you youngsters should really get out more :)
Skarmander wrote: Mark F. Haigh wrote: Skarmander wrote:
<snip> There's also no indication of what architecture was used. 'cmov' only exists on Pentium Pro and higher, [...]
Gee, really? Why do you think I wrote "or some other equivalent non-branching code transformation"?
Presumably because you expected me to believe you knew one, but felt strengthening your argument by actually exhibiting it unnecessary. If you feel this is trivial, I apologize. I'll admit I don't immediately see it, and I was also distracted by what appeared to be unwarranted assumptions on your end.
Ok, this is getting too far OT, and this will be the last I have to say
on the matter. It has been my experience that the code fragment is
small and short enough that it can be solved (by an appropriately
advanced optimizing compiler) without branches.
>>> [...] while the 'mov'-'and' combination beats 'movzx' on the 486 and 586.
>> Who cares? It's a single uop on anything modern, and (allegedly) costs a cycle or two on 486 or 586's. The Intel optimization manuals say to use it, and that's good enough for me, and good enough for gcc, apparently.
> Yes, it's possible gcc chooses to use 'movzx' for "generic" 386 code on the assumption that the instruction will on average be faster for newer processors. It certainly isn't *worse* on an actual 386, so this is acceptable.
> Note that, the Intel optimization manuals and alleged cycle costs notwithstanding, gcc 3.4.2 avoids the "good enough" on either the 486 or the 586. So VC's choice may be suboptimal here (good only for the 386, 486 and 586, while gcc covers the 386, the 686 and (presumably, since you never know with Intel) the future), but this doesn't seem to warrant your snap judgment of "crappy code generation".
>> The newer gcc's will produce branchless code for the example when tuning for a 586 or higher.
> I only have gcc 3.4.2 at my disposal here. You do mean tuning for 586 while using the 386 instruction set, right? Could you reproduce the relevant code?
I suppose. Last week's gcc 4.1 from CVS:
[mark@icepick ~]$ gcc-4_1_cvs_20051015 --version
gcc-4_1_cvs_20051015 (GCC) 4.1.0 20051015 (experimental)
[...]
[mark@icepick ~]$ gcc-4_1_cvs_20051015 -Wall -ansi -pedantic -O2
-mtune=i686 -fomit-frame-pointer -c -o foo.o foo.c
modTest:
movzwl 8(%esp), %edx
movzwl 4(%esp), %eax
subl %edx, %eax
cltd
shrl $29, %edx
addl %edx, %eax
andl $7, %eax
subl %edx, %eax
movzwl %ax, %eax
ret
>> A mispredicted branch is a minor catastrophe. In fact, eliminating branches is the *first* assembler/compiler coding rule in the Intel optimization manuals.
> Of course. It's quite possible that if you tell VC to tune for a 586 or higher, the outcome would be different too. I'm fairly sure that expecting a compiler to eliminate branches without giving it the assumptions necessary to do this effectively will produce more than a few minor catastrophes.
IMO, probably not. It's been a while since I've used VC6, but I
remember it seemed to produce nearly the same code no matter what you
tell it. Give it a try yourself-- hopefully I'll never have to touch
it again.
> Like I said, you can fault VC for its default assumptions, but when you know these, you have to evaluate its code generation in light of them.
Fair enough. I think it's a waste of time to look at the code
generation of what I consider to be a mostly obsolete compiler that
targets mostly obsolete machines. Like I originally said: Yawn.
Mark F. Haigh mf*****@sbcglobal.net
Keyser Soze wrote: < snip of all other OT rants >
This thread has been hijacked down a rabbit whole of how many code generators can dance on the head of an optimizer.
The point of the examples was to show how explicit type casts can affect the generated code.
Not to start a discussion of the quality of the generated code.
I am too old to get a life but you youngsters should really get out more :)
That's why it was flagged "off-topic". And telling Usenet posters to get
out more? Hmm...
Hey, at least nobody was compared to Hitler. That should count for
something.
S.
Mark F. Haigh wrote: Skarmander wrote:
Mark F. Haigh wrote:
Skarmander wrote:[...] while the 'mov'-'and' combination beats 'movzx' on the 486 and 586.
Who cares? It's a single uop on anything modern, and (allegedly) costs a cycle or two on 486 or 586's. The Intel optimization manuals say to use it, and that's good enough for me, and good enough for gcc, apparently.
Yes, it's possible gcc chooses to use 'movzx' for "generic" 386 code on the assumption that the instruction will on average be faster for newer processors. It certainly isn't *worse* on an actual 386, so this is acceptable.
Note that, the Intel optimization manuals and alleged cycle costs notwithstanding, gcc 3.4.2 avoids the "good enough" on either the 486 or the 586. So VC's choice may be suboptimal here (good only for the 386, 486 and 586, while gcc covers the 386, the 686 and (presumably, since you never know with Intel) the future), but this doesn't seem to warrant your snap judgment of "crappy code generation".
The newer gcc's will produce branchless code for the example when tuning for a 586 or higher.
I only have gcc 3.4.2 at my disposal here. You do mean tuning for 586 while using the 386 instruction set, right? Could you reproduce the relevant code?
I suppose. Last week's gcc 4.1 from CVS:
[mark@icepick ~]$ gcc-4_1_cvs_20051015 --version gcc-4_1_cvs_20051015 (GCC) 4.1.0 20051015 (experimental) [...]
[mark@icepick ~]$ gcc-4_1_cvs_20051015 -Wall -ansi -pedantic -O2 -mtune=i686 -fomit-frame-pointer -c -o foo.o foo.c
modTest: movzwl 8(%esp), %edx movzwl 4(%esp), %eax subl %edx, %eax cltd shrl $29, %edx addl %edx, %eax andl $7, %eax subl %edx, %eax movzwl %ax, %eax ret
Yup, clever. Branchless and pipeline-optimized. Thanks, I didn't see this.
>>> A mispredicted branch is a minor catastrophe. In fact, eliminating branches is the *first* assembler/compiler coding rule in the Intel optimization manuals.
>> Of course. It's quite possible that if you tell VC to tune for a 586 or higher, the outcome would be different too. I'm fairly sure that expecting a compiler to eliminate branches without giving it the assumptions necessary to do this effectively will produce more than a few minor catastrophes.
> IMO, probably not. It's been a while since I've used VC6, but I remember it seemed to produce nearly the same code no matter what you tell it. Give it a try yourself-- hopefully I'll never have to touch it again.
I don't have it. What do you take me for, a Microsoft apologist? :-) The
point here was not to defend VC, but to verify whether your condemnation
was valid.
>> Like I said, you can fault VC for its default assumptions, but when you know these, you have to evaluate its code generation in light of them.
> Fair enough. I think it's a waste of time to look at the code generation of what I consider to be a mostly obsolete compiler that targets mostly obsolete machines. Like I originally said: Yawn.
Such is the danger of throwaway comments. That said, thanks for
indulging my demands that you expand your statements.
S.
Skarmander wrote: Mark F. Haigh wrote:
<snip> [mark@icepick ~]$ gcc-4_1_cvs_20051015 --version gcc-4_1_cvs_20051015 (GCC) 4.1.0 20051015 (experimental) [...]
[mark@icepick ~]$ gcc-4_1_cvs_20051015 -Wall -ansi -pedantic -O2 -mtune=i686 -fomit-frame-pointer -c -o foo.o foo.c
modTest: movzwl 8(%esp), %edx movzwl 4(%esp), %eax subl %edx, %eax cltd shrl $29, %edx addl %edx, %eax andl $7, %eax subl %edx, %eax movzwl %ax, %eax ret
Yup, clever. Branchless and pipeline-optimized. Thanks, I didn't see this.
One slight coda to this: gcc 3.4.2 will produce almost identical
branchless code if you manually crank up the cost of branches with
-mbranch-cost. Obviously, I didn't know about this option before, or I
would have tried it.
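For readers who do not read AT&T syntax, the branchless sequence in the gcc listing quoted above can be written back in C. This is a sketch under assumptions, not gcc's actual transformation: it assumes a two's complement int without padding bits (true on x86), and mod8_branchless is an invented name.

```c
#include <limits.h>

/* Sketch of x % 8 for signed x without a branch. Mirrors the quoted
   gcc listing: cltd followed by shrl $29 yields a bias of 7 when x is
   negative and 0 otherwise. */
static int mod8_branchless(int x)
{
    /* 1 if x is negative, 0 otherwise (the sign bit moved to bit 0). */
    unsigned int sign = (unsigned int)x >> (sizeof(int) * CHAR_BIT - 1);
    int bias = (int)(sign * 7U);

    /* Adding the bias before masking, and removing it afterwards,
       gives a remainder with the sign of x, as C99's % requires. */
    return ((x + bias) & 7) - bias;
}
```

For x = -7 the bias is 7, (x + 7) & 7 is 0, and the result is -7, the same value the idiv-based code would produce.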
S.
Hello,
this is cross-posted with follow-up to comp.lang.asm.x86, because I talk about
generated and optimized x86 code for a C program, which is rather OT in comp.lang.c.
Skarmander <in*****@dontmailme.com> wrote: I suppose. Last week's gcc 4.1 from CVS:
[mark@icepick ~]$ gcc-4_1_cvs_20051015 --version gcc-4_1_cvs_20051015 (GCC) 4.1.0 20051015 (experimental) [...]
[mark@icepick ~]$ gcc-4_1_cvs_20051015 -Wall -ansi -pedantic -O2 -mtune=i686 -fomit-frame-pointer -c -o foo.o foo.c
modTest: movzwl 8(%esp), %edx movzwl 4(%esp), %eax subl %edx, %eax cltd shrl $29, %edx addl %edx, %eax andl $7, %eax subl %edx, %eax movzwl %ax, %eax ret
Yup, clever. Branchless and pipeline-optimized. Thanks, I didn't see this.
Now, you are comparing something very weird. MSVC++ 6.0 is some years
older than "last week's gcc CVS".
OK, although it is OT, I did some tests with a newer version of the MS
compiler (as available with the latest release DDK, Win 2003 DDK SP
1, with default settings for release builds):
C:\test>cl
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.4035 for 80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.
Now, let's have a look at the compiled code:
unsigned short modTest1(unsigned short a, unsigned short b)
{
return ( (a - b) % 8 );
}
test!modTest1:
01001ba7 8bff mov edi,edi
01001ba9 55 push ebp
01001baa 8bec mov ebp,esp
01001bac 0fb74d0c movzx ecx,word ptr [ebp+0xc]
01001bb0 0fb74508 movzx eax,word ptr [ebp+0x8]
01001bb4 2bc1 sub eax,ecx
01001bb6 99 cdq
01001bb7 6a08 push 0x8
01001bb9 59 pop ecx
01001bba f7f9 idiv ecx
01001bbc 668bc2 mov ax,dx
01001bbf 5d pop ebp
01001bc0 c20800 ret 0x8
[...]
unsigned short modTest2(unsigned short a, unsigned short b)
{
return ( (a - b) % 8U );
}
test!modTest2:
01001bc8 8bff mov edi,edi
01001bca 55 push ebp
01001bcb 8bec mov ebp,esp
01001bcd 8b4508 mov eax,[ebp+0x8]
01001bd0 8b4d0c mov ecx,[ebp+0xc]
01001bd3 2bc1 sub eax,ecx
01001bd5 83e007 and eax,0x7
01001bd8 5d pop ebp
01001bd9 c20800 ret 0x8
[...]
unsigned short modTest3(unsigned short a, unsigned short b)
{
return ( (unsigned short)( (unsigned int)(a - b) ) % 8U );
}
test!modTest3:
01001be1 8bff mov edi,edi
01001be3 55 push ebp
01001be4 8bec mov ebp,esp
01001be6 33c0 xor eax,eax
01001be8 8a4508 mov al,[ebp+0x8]
01001beb 2a450c sub al,[ebp+0xc]
01001bee 83e007 and eax,0x7
01001bf1 5d pop ebp
01001bf2 c20800 ret 0x8
(Remark: I was not able to compile modTest2 and modTest3 if they were both
available with the main() calling it. The compiler always insisted on
optimizing them away. I had to use two different compilation units and link
them together to actually get code for them.)
Now, this looks much better than the MSVC++ 6.0 code, doesn't it?
Anyway, I am not sure if the IDIV approach with modTest1() is a good solution
compared with the solution of gcc 4.1. Any thoughts from the assembler gurus
here? ;)
Regards,
Spiro.
--
Spiro R. Trikaliotis http://cbm4win.sf.net/ http://www.trikaliotis.net/ http://www.viceteam.org/
"Skarmander" <in*****@dontmailme.com> wrote in message
news:43***********************@news.xs4all.nl... Keyser Soze wrote: < snip of all other OT rants >
This thread has been hijacked down a rabbit whole of how many code generators can dance on the head of an optimizer.
The point of the examples was to show how explicit type casts can affect the generated code.
Not to start a discussion of the quality of the generated code.
I am too old to get a life but you youngsters should really get out more :)
That's why it was flagged "off-topic". And telling Usenet posters to get out more? Hmm...
Hey, at least nobody was compared to Hitler. That should count for something.
S.
Yes that's a good thing.
---------------
"It's just a matter of time."
B. Mussolini http://cao.pt/image/hitler.jpg
On Wed, 19 Oct 2005 22:46:37 -0500, Jack Klein <ja*******@spamcop.net>
wrote:
<snip>
> If signed int on your platform can hold all the values of an unsigned short, 'a' and 'b' will be promoted to signed int for the subtraction. Equivalent to:
<snip>
Right.
> The expression inside the parentheses evaluates to (int)-7. One of the two possible correct values for -7 % 8 allowed prior to C99 is -7.
One of two correct values in C89/90 (and C95) and the only one in C99.
Prior to C89/90, I'm not sure if % was specified that precisely.
> When you assign (int)-7 to an unsigned short, the behavior of unsigned types with out of range values occurs. (int)-7 is converted to USHRT_MAX - 7. For a 16-bit unsigned short, this is 0xffff - 7, which equals 0xfff9.
(USHRT_MAX + 1) - 7, in this case 0x10000 - 7. In mathematically
correct arithmetic, not necessarily C or machine arithmetic.
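The corrected rule can be checked mechanically. The sketch below does the "mathematically correct" arithmetic in long, which is assumed here to be wide enough to hold USHRT_MAX + 1; wrapped_value is a name made up for the illustration.

```c
#include <limits.h>

/* Computes, in long arithmetic, the value that results when a
   negative int is converted to unsigned short: USHRT_MAX + 1 is added
   repeatedly until the value is in range. */
static long wrapped_value(int negative)
{
    long v = negative;

    while (v < 0) {
        v += (long)USHRT_MAX + 1;
    }
    return v;
}
```

wrapped_value(-7) equals USHRT_MAX - 6, which is 0xfff9 for a 16-bit unsigned short, and it agrees with what the cast (unsigned short)-7 produces.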
<snip rest>
- David.Thompson1 at worldnet.att.net