446,305 Members | 1,614 Online
Need help? Post your question and get tips & solutions from a community of 446,305 IT Pros & Developers. It's quick & easy.

# data types (bitwidth) and multiplication

 P: n/a Hi all! First of all: I am a C-newbie. I have noticed a "strange" behavior with the standart integer multiplication. The code is: void main(void) { int a = 0x1234; int b = 0xABCD; long int result_1 = (a * b); long int result_2 = (long int)(a * b); //the same as result_1 } My machine is a 16 bit microcontroller (Texas Instruments MSP430). Therefor "int" is 16 bits and "long int" is 32 bits long. The problem is independent from the C-compiler (tested with IAR embedded workbench and MSP430-GCC). For clarity I use the following syntax (like VHDL): a(15 downto 0) depicts all bits of variable a result_1(31 downto 0) depicts all bits of result_1 (long int) result_1(15 downto 0) depicts only the lower 16 bits (int) The problem: As everybody knows (16 bits) * (16 bits) = (32 bits). Therefor the correct result would be (a * b)(31 downto 0). But only result_1(15 downto 0) holds the correct lower bits of the result and result_1(31 downto 16) is sign-exteded result(15 downto 0). In other words: (a * b)(15 downto 0) is taken (and sign-extended) and (a * b)(31 downto 16) is thrown away. O.k., any arithmetic operation in C on two "int" variables lead to an "int" as result and the type conversion to "long int" is done afterwards. (So it is clear to see, that result_2==result_1.) Let's have a look closer to hardware, before going on: Every machine, that has a n*n hardware multiplier, computes the result of this multiplication with bitwidth 2n. Therefor the correct result is already stored in the output register of the hardware multiplier. The first "solution" of this problem could be this: void main(void) { long int a = 0x1234; long int b = 0xABCD; long int result_1 = (a * b); } It computes the correct result, but has a *major* disadvantage: Because both "a" and "b" are now declared to be 32 bits long, the compiler has to map this code to this multiplication (32 bits) * (32 bits) = (64 bits), that leads to the use of two times the hardware multiplier (and even in software a 32 bit multiplication has to be done). Finally - my questions: Does a code construct exist in C, that forces the compiler to take all 32 bits from the result of a 16*16 bit multiplication? Why is the result of the arithmetic operation "*" defined to be the same data type like the inputs and has not doubled bitwidth? Additionally: The same problem should occur on any other border of the bitwidths (if a data type exists, that has two times the bitwidth of the machine). Thanks in advantage. Ralf Nov 14 '05 #1
9 Replies

 P: n/a Ralf Hildebrandt wrote: I have noticed a "strange" behavior with the standart integer multiplication. The code is: void main(void) { int a = 0x1234; int b = 0xABCD; long int result_1 = (a * b); long int result_2 = (long int)(a * b); //the same as result_1 } Does this help? http://www.eskimo.com/~scs/C-faq/q3.14.html Nov 14 '05 #2

 P: n/a Ralf Hildebrandt wrote: void main(void) By the way, this may apply: http://www.eskimo.com/~scs/C-faq/q11.12.html Nov 14 '05 #3

 P: n/a On Thu, 18 Dec 2003 11:34:28 +0100, Ralf Hildebrandt wrote: Hi all! First of all: I am a C-newbie. I have noticed a "strange" behavior with the standart integer Not strange behaviour. Just behaviour you don't understand. multiplication. The code is:void main(void){ int a = 0x1234; int b = 0xABCD; long int result_1 = (a * b); long int result_2 = (long int)(a * b); //the same as result_1 Yes, so? You perform 16bit multplication, then explicitly cast the results to long int. Your (long int) cast doesn't affect precision of the multiplication, it affects the precision of the expression of the results of the integer multiplication. In other words, your (long int) cast simply converts the results of an int multiplication to long. If you wanted to increase the precision of the results, you should have cast the two values to long /before/ you multiplied. In other words, you should have long int result_3 = (long int)a * (long int)b; or even long int result_4 = (long int)a * b; So long as one of the two operands is of long int precision, the multiplication will be performed with long int precision, and the results will be of long int precision. The cast(s) on the operand(s) simply coerce one or both of the operands into long int precision, thus leading to long int multiplication, and a suitable long int result. }[snip] -- Lew Pitcher IT Consultant, Enterprise Technology Solutions Toronto Dominion Bank Financial Group (Opinions expressed are my own, not my employers') Nov 14 '05 #4

 P: n/a Hi Lew and Grumble! void main(void){ int a = 0x1234; int b = 0xABCD; long int result_1 = (a * b); Yes, so? You perform 16bit multplication, then explicitly cast the results to long int. Your (long int) cast doesn't affect precision of the multiplication, it affects the precision of the expression of the results of the integer multiplication. In other words, your (long int) cast simply converts the results of an int multiplication to long. This is exactly, what I have observed and what I explained in my question. If you wanted to increase the precision of the results, you should have cast the two values to long /before/ you multiplied. In other words, you should have long int result_3 = (long int)a * (long int)b; or even long int result_4 = (long int)a * b; or as Grumble said: http://www.eskimo.com/~scs/C-faq/q3.14.html So long as one of the two operands is of long int precision, the multiplication will be performed with long int precision, and the results will be of long int precision. The cast(s) on the operand(s) simply coerce one or both of the operands into long int precision, thus leading to long int multiplication, and a suitable long int result. Is this the only way to compute a (16 bits) * (16 bits) = (32 bits) multiplication? I can't imagine, that this *really silly* workaround is the only way. Let me explain again: If one of the multiplicants is of type "long int", the other one will be converted to "long int" too (if not already been done). This is a basic principle in C when handling arithmetic operations. But now -both operands are "long int"- truly a full (32 bits) * (32 bits) = (64 bits) operation is computed and the upper 32 bits of the result are thrown away. Again - for clarity: I can see, that the hardware multiplier computes *two times* a multiplication (2 times a 16 bit multiplications, that result in a 32 bit multiplication.). But the correct result of the 16 bit multiplication can be computed with only one 16 bit multiplication. As I am a hardware engineer, I will translate the problem to software: If I don't have any hardware multiplier, the multiplication is done in Software (some conditional accumulations). This means long int result_3 = (long int)a * (long int)b; // and derivatives results in truely a complete 32 bit multiplication. All nessecary steps and the correct result of a complete 32 bit multiplication are computed (64 bits wide). (I can see the correct result in the registers.) But then, after all this unnessecary computing, the upper half of the result is moved to trash. Both realisations -with hardware multiplier or in software- result in the *doubled* computing load. I proved it with studying the assembler code. Back to the 16 bit multiplication: I can see, studying the assembler code, that the correct result of the "int" * "int" multiplication is visible in the registers (both, if a hardware multiplier is used and if all is done in software), but after this, the upper half part of the result is destroyed. Conclusion: On a n bit machine, where type "int" is n bits wide, AND there exists a integer data type with 2n bithwidth (let's call it a "int2"), the doubled and unnessecary computing load is done, if one wants to have the result of a multiplication as wide as "int2". I am not sure, but this problem should occur on a standart 32 bit x86 machine, if one wants to compute the "int" * "int" = "qword" multiplication, too. (qword is 64 bits wide, isn't it?) O.k. - today nearly everybodes does not care, if there are 1 or 2 multiplications computed, but only one is nessecary, but I do! Especially for low power and even for high speed applications on small (embedded) systems, the (low) CPU power has a major influence and I do not want to waste it. So is coding the multiplication in assembler the only solution? Thanks for your comments. Ralf Nov 14 '05 #5

 P: n/a Ralf Hildebrandt wrote: Conclusion: On a n bit machine, where type "int" is n bits wide, AND there exists a integer data type with 2n bithwidth (let's call it a "int2"), the doubled and unnessecary computing load is done, if one wants to have the result of a multiplication as wide as "int2". I am not sure, but this problem should occur on a standart 32 bit x86 machine, if one wants to compute the "int" * "int" = "qword" multiplication, too. (qword is 64 bits wide, isn't it?) O.k. - today nearly everybodes does not care, if there are 1 or 2 multiplications computed, but only one is nessecary, but I do! Especially for low power and even for high speed applications on small (embedded) systems, the (low) CPU power has a major influence and I do not want to waste it. So is coding the multiplication in assembler the only solution? [ I think this is called the "widening multiply" problem. ] I'm no C expert, but I don't think you can compute an NxN product and expect to grab 2N bits in C. I could be wrong. As far as I can tell, the problem does occur on IA32: MUL will take two 32-bit registers as input, and store the result in two 32-bit registers. You could use inline assembly, if your compiler supports it. Nov 14 '05 #6

 P: n/a Ralf Hildebrandt wrote: Hi Lew and Grumble! (snip) > If you wanted to increase the precision of the results, you should > have cast the two values to long /before/ you multiplied. In other > words, you should have > >> long int result_3 = (long int)a * (long int)b; > > or even > >> long int result_4 = (long int)a * b; or as Grumble said: > http://www.eskimo.com/~scs/C-faq/q3.14.html (snip) Is this the only way to compute a (16 bits) * (16 bits) = (32 bits) multiplication? I can't imagine, that this *really silly* workaround is the only way. I believe that some compilers recognize that both operands were converted from 16 bits, and do a 16 bit multiply, when you cast one or both to long. I don't at all know how many or which compilers do that. It is a tradition for high level languages, but consider if they didn't do it, what would happen when multiplying a series of numbers? i=a*b*c*d*e*f; all variables are 16 bit: a*b is 32 bit, 32 bit product times c, convert c and a 64 bit product, convert e, a 128 bit product, convert f, a 256 bit product. So it is always that you get out what you put in. (PL/I has the MULTIPLY function that allows one to specify the precision of the result. Maybe C could add that.) -- glen Nov 14 '05 #7

 P: n/a Ralf Hildebrandt wrote in message news:... For context: 16-bit int, 32-bit long { int a = 0x1234; int b = 0xABCD; long int result_1 = (a * b); gives a 16-bit result in result_1. If you wanted to increase the precision of the results, you should have cast the two values to long /before/ you multiplied. In other words, you should have long int result_3 = (long int)a * (long int)b; or even long int result_4 = (long int)a * b; So long as one of the two operands is of long int precision, the multiplication will be performed with long int precision, and the results will be of long int precision. The cast(s) on the operand(s) simply coerce one or both of the operands into long int precision, thus leading to long int multiplication, and a suitable long int result. Is this the only way to compute a (16 bits) * (16 bits) = (32 bits) multiplication? I can't imagine, that this *really silly* workaround is the only way. It's the normal way to get the result you want in C. That's not directly related to the width of multiplication that the compiler generates for the processor. Let me explain again: If one of the multiplicants is of type "long int", the other one will be converted to "long int" too (if not already been done). This is a basic principle in C when handling arithmetic operations. But now -both operands are "long int"- truly a full (32 bits) * (32 bits) = (64 bits) operation is computed and the upper 32 bits of the result are thrown away. Yes, that's how the C Virtual Machine is required to behave (in essence - I'm ignoring any edge conditions here). Again - for clarity: I can see, that the hardware multiplier computes *two times* a multiplication (2 times a 16 bit multiplications, that result in a 32 bit multiplication.). But the correct result of the 16 bit multiplication can be computed with only one 16 bit multiplication. As I am a hardware engineer, I will translate the problem to software: If I don't have any hardware multiplier, the multiplication is done in Software (some conditional accumulations). This means long int result_3 = (long int)a * (long int)b; // and derivatives results in truely a complete 32 bit multiplication. All nessecary steps and the correct result of a complete 32 bit multiplication are computed (64 bits wide). (I can see the correct result in the registers.) But then, after all this unnessecary computing, the upper half of the result is moved to trash. Both realisations -with hardware multiplier or in software- result in the *doubled* computing load. I proved it with studying the assembler code. You proved something about the behaviour of your compiler used in the way you invoked it. You didn't prove anything about C. Back to the 16 bit multiplication: I can see, studying the assembler code, that the correct result of the "int" * "int" multiplication is visible in the registers (both, if a hardware multiplier is used and if all is done in software), but after this, the upper half part of the result is destroyed. Conclusion: On a n bit machine, where type "int" is n bits wide, AND there exists a integer data type with 2n bithwidth (let's call it a "int2"), the doubled and unnessecary computing load is done, if one wants to have the result of a multiplication as wide as "int2". The compiler can do what it wants as long as it emulates the defined behaviour of the C Virtual Machine. It could widen them all to 128 bits, do the arithmetic, and throw away the extra bits, if it wanted to. This is an issue of the quality of implementation of the compiler. In the case of long int result = (long int)a * b; where a and b are ints, the compiler can "see" that it is being asked to multiply two 16-bit numbers to give a 32-bit result. There's no reason why it shouldn't generate machine code to do exactly that, since the program wouldn't be able to tell the difference between its doing that and its following the semantics of the C Virtual Machine. So is coding the multiplication in assembler the only solution? Either that, or find a decent C compiler. A compiler tuned for getting the best out of small processors in embedded environments ought to make use of an optimization like this. Nov 14 '05 #8

 P: n/a Ralf Hildebrandt wrote: .... snip ... Conclusion: On a n bit machine, where type "int" is n bits wide, AND there exists a integer data type with 2n bithwidth (let's call it a "int2"), the doubled and unnessecary computing load is done, if one wants to have the result of a multiplication as wide as "int2". In addition, any time you perform a multiplication (or any other arithmetic operation) where the result exceeds the capacity of the signed type, the result is undefined behavior, and absolutely anything can happen. This does not apply to unsigned types, where the overflow action is carefully defined. I.E. if you have 16 bit integers, then int i, a, b; long lg; a = 4; b = 10000; i = a * b; results in undefined behavior. However lg = (long)a * b; is well defined. but a following "i = lg; is not. -- Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net) Available for consulting/temporary embedded and systems. USE worldnet address! Nov 14 '05 #9

 P: n/a In message glen herrmannsfeldt wrote: Ralf Hildebrandt wrote: Hi Lew and Grumble! (snip) > If you wanted to increase the precision of the results, you should > have cast the two values to long /before/ you multiplied. In other > words, you should have > >> long int result_3 = (long int)a * (long int)b; > > or even > >> long int result_4 = (long int)a * b; or as Grumble said: > http://www.eskimo.com/~scs/C-faq/q3.14.html (snip) Is this the only way to compute a (16 bits) * (16 bits) = (32 bits) multiplication? I can't imagine, that this *really silly* workaround is the only way. I believe that some compilers recognize that both operands were converted from 16 bits, and do a 16 bit multiply, when you cast one or both to long. I don't at all know how many or which compilers do that. The Norcroft ARM C and C++ compilers (supplied as part of ARM Ltd's development tools), distinguishes the 5 cases s32 x s32 -> 64 (note that the signedness matters for narrow u32 x u32 -> 64 arguments) s32 x 64 -> 64 u32 x 64 -> 64 64 x 64 -> 64 This is analogous to the OP's original example, given that it's a 32-bit CPU, with 32x32->64 multiply instructions. It's a very obvious optimisation to add for architectures that benefit from it, and fairly straightforward to implement. One only needs to transform the expression tree: <64*64> / \ / \ / \ / \ / \ / \ into | | | and have separate code for each type of multiply. I'd agree that it looks damned ugly in the source though, and it's not intuitive to the end user that the compiler doesn't actually have to do a full 64x64->64 multiply. The OP should put the casts in, inspect the compiler's output, and if it is actually performing a full 32x32 multiply in his case, suggest the improvement to his vendor. -- Kevin Bracey, Principal Software Engineer Tematic Ltd Tel: +44 (0) 1223 503464 182-190 Newmarket Road Fax: +44 (0) 1223 503458 Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/ Nov 14 '05 #10

### This discussion thread is closed

Replies have been disabled for this discussion.