473,385 Members | 1,531 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

data types (bitwidth) and multiplication

Hi all!
First of all: I am a C-newbie.
I have noticed a "strange" behavior with the standart integer
multiplication. The code is:

void main(void)
{
int a = 0x1234;
int b = 0xABCD;
long int result_1 = (a * b);
long int result_2 = (long int)(a * b); //the same as result_1
}

My machine is a 16 bit microcontroller (Texas Instruments MSP430).
Therefor "int" is 16 bits and "long int" is 32 bits long. The problem is
independent from the C-compiler (tested with IAR embedded workbench and
MSP430-GCC).

For clarity I use the following syntax (like VHDL):

a(15 downto 0) depicts all bits of variable a
result_1(31 downto 0) depicts all bits of result_1 (long int)
result_1(15 downto 0) depicts only the lower 16 bits (int)

The problem:
As everybody knows (16 bits) * (16 bits) = (32 bits).
Therefor the correct result would be (a * b)(31 downto 0). But only
result_1(15 downto 0) holds the correct lower bits of the result and
result_1(31 downto 16) is sign-exteded result(15 downto 0).
In other words: (a * b)(15 downto 0) is taken (and sign-extended) and (a
* b)(31 downto 16) is thrown away.

O.k., any arithmetic operation in C on two "int" variables lead to an
"int" as result and the type conversion to "long int" is done
afterwards. (So it is clear to see, that result_2==result_1.)

Let's have a look closer to hardware, before going on:
Every machine, that has a n*n hardware multiplier, computes the result
of this multiplication with bitwidth 2n. Therefor the correct result is
already stored in the output register of the hardware multiplier.

The first "solution" of this problem could be this:

void main(void)
{
long int a = 0x1234;
long int b = 0xABCD;
long int result_1 = (a * b);
}

It computes the correct result, but has a *major* disadvantage: Because
both "a" and "b" are now declared to be 32 bits long, the compiler has
to map this code to this multiplication (32 bits) * (32 bits) = (64
bits), that leads to the use of two times the hardware multiplier (and
even in software a 32 bit multiplication has to be done).
Finally - my questions:
Does a code construct exist in C, that forces the compiler to take all
32 bits from the result of a 16*16 bit multiplication?

Why is the result of the arithmetic operation "*" defined to be the same
data type like the inputs and has not doubled bitwidth?

Additionally: The same problem should occur on any other border of the
bitwidths (if a data type exists, that has two times the bitwidth of the
machine).
Thanks in advantage.
Ralf

Nov 14 '05 #1
9 7971
Ralf Hildebrandt wrote:
I have noticed a "strange" behavior with the standart integer
multiplication. The code is:

void main(void)
{
int a = 0x1234;
int b = 0xABCD;
long int result_1 = (a * b);
long int result_2 = (long int)(a * b); //the same as result_1
}


Does this help?
http://www.eskimo.com/~scs/C-faq/q3.14.html

Nov 14 '05 #2
Ralf Hildebrandt wrote:
void main(void)


By the way, this may apply:
http://www.eskimo.com/~scs/C-faq/q11.12.html

Nov 14 '05 #3
On Thu, 18 Dec 2003 11:34:28 +0100, Ralf Hildebrandt <Ra**************@gmx.de>
wrote:
Hi all!
First of all: I am a C-newbie.
I have noticed a "strange" behavior with the standart integer
Not strange behaviour. Just behaviour you don't understand.
multiplication. The code is:

void main(void)
{
int a = 0x1234;
int b = 0xABCD;
long int result_1 = (a * b);
long int result_2 = (long int)(a * b); //the same as result_1
Yes, so? You perform 16bit multplication, then explicitly cast the results to
long int. Your (long int) cast doesn't affect precision of the multiplication,
it affects the precision of the expression of the results of the integer
multiplication. In other words, your (long int) cast simply converts the results
of an int multiplication to long.

If you wanted to increase the precision of the results, you should have cast the
two values to long /before/ you multiplied. In other words, you should have
long int result_3 = (long int)a * (long int)b;
or even
long int result_4 = (long int)a * b;
So long as one of the two operands is of long int precision, the multiplication
will be performed with long int precision, and the results will be of long int
precision. The cast(s) on the operand(s) simply coerce one or both of the
operands into long int precision, thus leading to long int multiplication, and a
suitable long int result.
}
[snip]


--
Lew Pitcher
IT Consultant, Enterprise Technology Solutions
Toronto Dominion Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 14 '05 #4
Hi Lew and Grumble!

void main(void)
{
int a = 0x1234;
int b = 0xABCD;
long int result_1 = (a * b);
Yes, so? You perform 16bit multplication, then explicitly cast the
results to long int. Your (long int) cast doesn't affect precision of
the multiplication, it affects the precision of the expression of the
results of the integer multiplication. In other words, your (long int)
cast simply converts the results of an int multiplication to long.
This is exactly, what I have observed and what I explained in my question.

If you wanted to increase the precision of the results, you should
have cast the two values to long /before/ you multiplied. In other
words, you should have
long int result_3 = (long int)a * (long int)b;
or even
long int result_4 = (long int)a * b;


or as Grumble said: http://www.eskimo.com/~scs/C-faq/q3.14.html

So long as one of the two operands is of long int precision, the
multiplication will be performed with long int precision, and the
results will be of long int precision. The cast(s) on the operand(s)
simply coerce one or both of the operands into long int precision,
thus leading to long int multiplication, and a suitable long int
result.


Is this the only way to compute a (16 bits) * (16 bits) = (32 bits)
multiplication? I can't imagine, that this *really silly* workaround is
the only way.

Let me explain again:
If one of the multiplicants is of type "long int", the other one will be
converted to "long int" too (if not already been done). This is a basic
principle in C when handling arithmetic operations.
But now -both operands are "long int"- truly a full (32 bits) * (32
bits) = (64 bits) operation is computed and the upper 32 bits of the
result are thrown away.

Again - for clarity: I can see, that the hardware multiplier computes
*two times* a multiplication (2 times a 16 bit multiplications, that
result in a 32 bit multiplication.). But the correct result of the 16
bit multiplication can be computed with only one 16 bit multiplication.
As I am a hardware engineer, I will translate the problem to software:
If I don't have any hardware multiplier, the multiplication is done in
Software (some conditional accumulations). This means

long int result_3 = (long int)a * (long int)b; // and derivatives

results in truely a complete 32 bit multiplication. All nessecary steps
and the correct result of a complete 32 bit multiplication are computed
(64 bits wide). (I can see the correct result in the registers.) But
then, after all this unnessecary computing, the upper half of the result
is moved to trash.

Both realisations -with hardware multiplier or in software- result in
the *doubled* computing load. I proved it with studying the assembler code.

Back to the 16 bit multiplication: I can see, studying the assembler
code, that the correct result of the "int" * "int" multiplication is
visible in the registers (both, if a hardware multiplier is used and if
all is done in software), but after this, the upper half part of the
result is destroyed.

Conclusion: On a n bit machine, where type "int" is n bits wide, AND
there exists a integer data type with 2n bithwidth (let's call it a
"int2"), the doubled and unnessecary computing load is done, if one
wants to have the result of a multiplication as wide as "int2".

I am not sure, but this problem should occur on a standart 32 bit x86
machine, if one wants to compute the "int" * "int" = "qword"
multiplication, too. (qword is 64 bits wide, isn't it?)
O.k. - today nearly everybodes does not care, if there are 1 or 2
multiplications computed, but only one is nessecary, but I do!
Especially for low power and even for high speed applications on small
(embedded) systems, the (low) CPU power has a major influence and I do
not want to waste it.
So is coding the multiplication in assembler the only solution?
Thanks for your comments.
Ralf

Nov 14 '05 #5
Ralf Hildebrandt wrote:
Conclusion: On a n bit machine, where type "int" is n bits wide, AND
there exists a integer data type with 2n bithwidth (let's call it a
"int2"), the doubled and unnessecary computing load is done, if one
wants to have the result of a multiplication as wide as "int2".

I am not sure, but this problem should occur on a standart 32 bit x86
machine, if one wants to compute the "int" * "int" = "qword"
multiplication, too. (qword is 64 bits wide, isn't it?)
O.k. - today nearly everybodes does not care, if there are 1 or 2
multiplications computed, but only one is nessecary, but I do!
Especially for low power and even for high speed applications on small
(embedded) systems, the (low) CPU power has a major influence and I do
not want to waste it.

So is coding the multiplication in assembler the only solution?


[ I think this is called the "widening multiply" problem. ]

I'm no C expert, but I don't think you can compute an NxN product
and expect to grab 2N bits in C. I could be wrong.

As far as I can tell, the problem does occur on IA32: MUL will take
two 32-bit registers as input, and store the result in two 32-bit
registers.

You could use inline assembly, if your compiler supports it.

Nov 14 '05 #6
Ralf Hildebrandt wrote:
Hi Lew and Grumble!
(snip)
> If you wanted to increase the precision of the results, you should
> have cast the two values to long /before/ you multiplied. In other
> words, you should have
>
>> long int result_3 = (long int)a * (long int)b;

>
> or even
>
>> long int result_4 = (long int)a * b;


or as Grumble said:
> http://www.eskimo.com/~scs/C-faq/q3.14.html


(snip)
Is this the only way to compute a (16 bits) * (16 bits) = (32 bits)
multiplication? I can't imagine, that this *really silly* workaround is
the only way.


I believe that some compilers recognize that both operands were
converted from 16 bits, and do a 16 bit multiply, when you
cast one or both to long.

I don't at all know how many or which compilers do that.

It is a tradition for high level languages, but consider if
they didn't do it, what would happen when multiplying a
series of numbers?

i=a*b*c*d*e*f;

all variables are 16 bit: a*b is 32 bit,
32 bit product times c, convert c and a 64 bit product,
convert e, a 128 bit product, convert f, a 256 bit product.

So it is always that you get out what you put in.

(PL/I has the MULTIPLY function that allows one to specify
the precision of the result. Maybe C could add that.)

-- glen

Nov 14 '05 #7
Ralf Hildebrandt <Ra**************@gmx.de> wrote in message news:<br************@ID-8609.news.uni-berlin.de>...

For context: 16-bit int, 32-bit long
{
int a = 0x1234;
int b = 0xABCD;
long int result_1 = (a * b);

gives a 16-bit result in result_1.
If you wanted to increase the precision of the results, you should
have cast the two values to long /before/ you multiplied. In other
words, you should have
long int result_3 = (long int)a * (long int)b;


or even
long int result_4 = (long int)a * b;


So long as one of the two operands is of long int precision, the
multiplication will be performed with long int precision, and the
results will be of long int precision. The cast(s) on the operand(s)
simply coerce one or both of the operands into long int precision,
thus leading to long int multiplication, and a suitable long int
result.


Is this the only way to compute a (16 bits) * (16 bits) = (32 bits)
multiplication? I can't imagine, that this *really silly* workaround is
the only way.


It's the normal way to get the result you want in C. That's not
directly related to the width of multiplication that the compiler
generates for the processor.
Let me explain again:
If one of the multiplicants is of type "long int", the other one will be
converted to "long int" too (if not already been done). This is a basic
principle in C when handling arithmetic operations.
But now -both operands are "long int"- truly a full (32 bits) * (32
bits) = (64 bits) operation is computed and the upper 32 bits of the
result are thrown away.
Yes, that's how the C Virtual Machine is required to behave (in
essence - I'm ignoring any edge conditions here).
Again - for clarity: I can see, that the hardware multiplier computes
*two times* a multiplication (2 times a 16 bit multiplications, that
result in a 32 bit multiplication.). But the correct result of the 16
bit multiplication can be computed with only one 16 bit multiplication.
As I am a hardware engineer, I will translate the problem to software:
If I don't have any hardware multiplier, the multiplication is done in
Software (some conditional accumulations). This means

long int result_3 = (long int)a * (long int)b; // and derivatives

results in truely a complete 32 bit multiplication. All nessecary steps
and the correct result of a complete 32 bit multiplication are computed
(64 bits wide). (I can see the correct result in the registers.) But
then, after all this unnessecary computing, the upper half of the result
is moved to trash.

Both realisations -with hardware multiplier or in software- result in
the *doubled* computing load. I proved it with studying the assembler code.
You proved something about the behaviour of your compiler used
in the way you invoked it. You didn't prove anything about C.
Back to the 16 bit multiplication: I can see, studying the assembler
code, that the correct result of the "int" * "int" multiplication is
visible in the registers (both, if a hardware multiplier is used and if
all is done in software), but after this, the upper half part of the
result is destroyed.

Conclusion: On a n bit machine, where type "int" is n bits wide, AND
there exists a integer data type with 2n bithwidth (let's call it a
"int2"), the doubled and unnessecary computing load is done, if one
wants to have the result of a multiplication as wide as "int2".
The compiler can do what it wants as long as it emulates the
defined behaviour of the C Virtual Machine. It could widen
them all to 128 bits, do the arithmetic, and throw away the
extra bits, if it wanted to. This is an issue of the quality
of implementation of the compiler.

In the case of

long int result = (long int)a * b;

where a and b are ints, the compiler can "see" that it is
being asked to multiply two 16-bit numbers to give a 32-bit
result. There's no reason why it shouldn't generate machine
code to do exactly that, since the program wouldn't be able
to tell the difference between its doing that and its
following the semantics of the C Virtual Machine.
So is coding the multiplication in assembler the only solution?


Either that, or find a decent C compiler. A compiler tuned
for getting the best out of small processors in embedded
environments ought to make use of an optimization like this.
Nov 14 '05 #8
Ralf Hildebrandt wrote:
.... snip ...
Conclusion: On a n bit machine, where type "int" is n bits wide, AND
there exists a integer data type with 2n bithwidth (let's call it a
"int2"), the doubled and unnessecary computing load is done, if one
wants to have the result of a multiplication as wide as "int2".


In addition, any time you perform a multiplication (or any other
arithmetic operation) where the result exceeds the capacity of the
signed type, the result is undefined behavior, and absolutely
anything can happen. This does not apply to unsigned types, where
the overflow action is carefully defined.

I.E. if you have 16 bit integers, then

int i, a, b;
long lg;

a = 4; b = 10000;
i = a * b;

results in undefined behavior. However

lg = (long)a * b;

is well defined. but a following "i = lg; is not.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #9
In message <n4mEb.423972$ao4.1359319@attbi_s51>
glen herrmannsfeldt <ga*@ugcs.caltech.edu> wrote:
Ralf Hildebrandt wrote:
Hi Lew and Grumble!


(snip)
> If you wanted to increase the precision of the results, you should
> have cast the two values to long /before/ you multiplied. In other
> words, you should have
>
>> long int result_3 = (long int)a * (long int)b;
>
> or even
>
>> long int result_4 = (long int)a * b;


or as Grumble said:
> http://www.eskimo.com/~scs/C-faq/q3.14.html


(snip)
Is this the only way to compute a (16 bits) * (16 bits) = (32 bits)
multiplication? I can't imagine, that this *really silly* workaround is
the only way.


I believe that some compilers recognize that both operands were
converted from 16 bits, and do a 16 bit multiply, when you
cast one or both to long.

I don't at all know how many or which compilers do that.


The Norcroft ARM C and C++ compilers (supplied as part of ARM Ltd's
development tools), distinguishes the 5 cases

s32 x s32 -> 64 (note that the signedness matters for narrow
u32 x u32 -> 64 arguments)
s32 x 64 -> 64
u32 x 64 -> 64
64 x 64 -> 64

This is analogous to the OP's original example, given that it's a 32-bit CPU,
with 32x32->64 multiply instructions.

It's a very obvious optimisation to add for architectures that benefit from
it, and fairly straightforward to implement. One only needs to transform the
expression tree:
<64*64> <s32*64>
/ \ / \
/ \ / \
/ \ / \
<cast to ll> <ll expr> into <int expr> <ll expr>
|
|
|
<int expr>

and have separate code for each type of multiply. I'd agree that it looks
damned ugly in the source though, and it's not intuitive to the end user that
the compiler doesn't actually have to do a full 64x64->64 multiply.

The OP should put the casts in, inspect the compiler's output, and if it
is actually performing a full 32x32 multiply in his case, suggest the
improvement to his vendor.

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1223 503458
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

21
by: Batista, Facundo | last post by:
Here I send it. Suggestions and all kinds of recomendations are more than welcomed. If it all goes ok, it'll be a PEP when I finish writing/modifying the code. Thank you. .. Facundo
10
by: Andrew Thompson | last post by:
http://www.physci.org/test/chem/element.html, represents information on a chemical element. (http://www.physci.org/test/chem/ for the CSS's) Not yet coded for links to other forms of the...
2
by: Chris Foster | last post by:
Hi, I'm having some difficulty using types which are defined in a base class inside a derived class. The problem crops up using template classes. The following test code encapsulates what I'd...
87
by: Vijay Kumar R Zanvar | last post by:
Hi, Why multiplication of pointers is not allowed? Till now I only know this, but not the reason why! PS: As a rule, I searched the FAQ, but could not find an answer. -- Vijay Kumar R...
27
by: jacob navia | last post by:
As Richard Bos rightly pointed out, I had left in my classification of types the C99 types Complex and boolean. Here is a new classification. Those are not mentioned in the classification of...
6
by: sara | last post by:
I have what I think is a little strange...I have to get data from our payroll system into a specific format (fixed record length) as a .txt or .prn file only to upload to our 401k custodian. I...
10
by: Robert Dailey | last post by:
Hi, Is there build-in or third party support for large integer types, such as 96 or 128 bits in size? I require such large sizes for precision issues (nanoseconds). Thanks.
11
by: a | last post by:
To solve the 100k * 100k data, I finally adopt the file read method and use malloc. The system somehow knows to use virtual memory. Now I first test 1k * 1k data but something uninterpretable...
1
by: Sozos | last post by:
Hi guys. I have a problem with writing the base case for the following matrix multiplication function I have implemented. Please help. #define index(i,j,power) (((i)<<(power))+(j)) void...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.