32-bit IEEE float multiplication

Andy

Hi,
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

TIA
Andy

Nov 13 '05 #1

Subscribe Post Reply

8278

Arthur J. O'Dwyer

On Wed, 3 Dec 2003, Andy wrote:

I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
It's in the FAQ. In general, Computers Are Not Math. Computers,
even big fast ones, deal exclusively in fixed-width data. That
means you only get a finite amount of precision to work with.
Keep dividing X by 2 over and over again, and eventually X will
get so small that it's indistinguishable from zero. And on a
computer, that means it's *equal* to zero. It's called "truncation"
or "underflow" or "overflow" or "rounding error" (depending on
exactly what's going on), and it's something that every numeric
programmer should understand.
Your subject line suggests you understand what a "bit" is; so
think about it this way. You're using 32-bit numbers -- 32 bits
can hold 2**32 different values -- so if the number you want to
compute isn't one of those 2**32 particular values representable
by your 32-bit numbers, then you'll get rounding error. If your
number is very close to zero, then it'll *become* zero, just because
that's the nearest representation the computer can get.
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

if (a != 0 && b != 0 && a*b == 0)
puts("duh");

-Arthur

Nov 13 '05 #2

Christian Bau

In article <ae**************************@posting.google.com >,
bi*****@hotmail.com (Andy) wrote:

Hi,
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

This is unusual. Could you post which compiler is used, and post source
code that demonstrates the problem? You mean something like

double result = 1e300 * 1e-300;

?

Nov 13 '05 #3

Sidney Cadot

Andy wrote:

Hi,
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

TIA
Andy

This is not really on-topic here, but I wouldn't know where it would be,
so let's give it a go.

Assuming 32-bit IEEE floating point numbers (from your subject line),
their product should be the representation of the number that is closest
to the nearest representable number, subject to rounding. To quote the
first paragraph of Section 4 of IEEE 754-1985:

"Rounding takes a number regarded as infinitely precise and, if
necessary, modifies it to fit in the destination’s format while
signaling the inexact exception (7.5). Except for binary <---> decimal
conversion (whose weaker conditions are specified in 5.6), every
operation specified in Section 5 shall be performed as if it first
produced an intermediate result correct to infinite precision and with
unbounded range, and then rounded that result according to one of the
modes in this section. "

(and, of course, multiplication is specified in Section 5).

So if you are working using IEEE 754-compliant floating point, this
could be the case. To work this out, please show the exact (hex)
representation of the two numbers being multiplied and the result (is
the result a single- or double-precision number)?

Furher, it would help to know the hardware or software your running this on.

And lastly, the minimal program you can produce to show the problem
would be of assistance in understanding this.
Best regards,

Sidney Cadot

Nov 13 '05 #4

Sidney Cadot

Arthur J. O'Dwyer wrote:

On Wed, 3 Dec 2003, Andy wrote:
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?

It's in the FAQ. In general, Computers Are Not Math. Computers,
even big fast ones, deal exclusively in fixed-width data. That
means you only get a finite amount of precision to work with.
Keep dividing X by 2 over and over again, and eventually X will
get so small that it's indistinguishable from zero.

That being true of course in general, IEEE-754 has some pretty strict
rules on what to expect, and I think they preclude the behavior that the
OP describes. Probably a dud, but worth checking out if only because it
beats the 'Indian C programmers and "u"' hands-down for interestingness.

Best regards,

Sidney

Nov 13 '05 #5

John Smith

Arthur J. O'Dwyer wrote:

think about it this way. You're using 32-bit numbers -- 32 bits
can hold 2**32 different values -- so if the number you want to
compute isn't one of those 2**32 particular values representable
by your 32-bit numbers, then you'll get rounding error. If your
number is very close to zero, then it'll *become* zero, just because
that's the nearest representation the computer can get.

I have always been a little hazy on this issue (some fractional base 10
values not representable in base 2). I have "almost" understood it --
until hearing it expressed in this way. It is like the understanding you
acquire after finally hearing the sound of one hand clapping. Thanks,
Arthur.

JS

Nov 13 '05 #6

Kevin D. Quitt

ftp://ftp.quitt.net/Outgoing/goldbergFollowup.pdf
--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list

Nov 13 '05 #7

Keith Thompson

bi*****@hotmail.com (Andy) writes:

I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

Floating-point arithmetic is strange, but it's not usually *that*
strange. I'll call your huge number huge_num, and your small number
small_num. Presumably they meet the following conditions:

small_num > 0.0
small_num < 1.0
huge_num > 1.0

I would expect the following to be true, even in the presence of
rounding errors:

small_num * 1.0 == small_num
small_num * X >= small_num, for any X >= 1.0 (barring overflow)
small_num * huge_num > small_num
therefore
small_num * huge_num > 0.0

I can imagine small_num * huge_num yielding 0.0 if small_num is a
denorm (maybe), or if small_num and huge_num are of different types.

Are you sure that the result of the multiplication is 0.0? If you're
displaying it with something like a printf "%f" format, it may be
rounding to zero on output, not in the computation.

What are the actual types and values of small_num and huge_num? Can
you post a small self-contained program that exhibits the problem?
Try using printf's "%g" format for output, or even something like
"%.50g" to display more digits.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
(Note new e-mail address)

Nov 13 '05 #8

Dik T. Winter

In article <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:

Floating-point arithmetic is strange, but it's not usually *that*
strange. I'll call your huge number huge_num, and your small number
small_num. Presumably they meet the following conditions:

small_num > 0.0
small_num < 1.0
huge_num > 1.0

I would expect the following to be true, even in the presence of
rounding errors:

small_num * 1.0 == small_num
small_num * X >= small_num, for any X >= 1.0 (barring overflow)
Overflow can not occur when 0.0 < small_num < 1.0, because the product
is smaller than or equal to X (equality is possible due to rounding
when small_num is very close to 1.0).
small_num * huge_num > small_num
therefore
small_num * huge_num > 0.0

I can imagine small_num * huge_num yielding 0.0 if small_num is a
denorm (maybe), or if small_num and huge_num are of different types.
Not even when small_num is a denormalised number. Because huge_num > 1.0
the product must be larger than or equal to small_num (equality is possible
when some rounding occurs and huge_num is very close to 1.0).
Are you sure that the result of the multiplication is 0.0? If you're
displaying it with something like a printf "%f" format, it may be
rounding to zero on output, not in the computation.

I think this is the most likely cause.

Under IEEE 0.0 can only be the result of a multiplication when one of the
operands is 0.0 (obviously) or when both operands are in absolute value
smaller than 1.0 and their mathematical product is in absulute value
smaller than the smallest representable number.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 13 '05 #9

Arthur J. O'Dwyer

On Thu, 4 Dec 2003, John Smith wrote:

Arthur J. O'Dwyer wrote:

think about it this way. You're using 32-bit numbers -- 32 bits
can hold 2**32 different values -- so if the number you want to
compute isn't one of those 2**32 particular values representable
by your 32-bit numbers, then you'll get rounding error. If your
number is very close to zero, then it'll *become* zero, just because
that's the nearest representation the computer can get.

I have always been a little hazy on this issue (some fractional base 10
values not representable in base 2). I have "almost" understood it --
until hearing it expressed in this way. It is like the understanding you
acquire after finally hearing the sound of one hand clapping. Thanks,
Arthur.

You're welcome!
But it occurs to me (after reading some of the other replies)
that there is at least one point I should have mentioned in my
response, and one more point that you still seem to be slightly
"hazy" on:

First, as several others have pointed out, if we have

double a,b,c; /* initialized somehow */
assert(a > 1.0);
assert(b > 0.0);
assert(b < 1.0);
c = a*b;

then it is always the case that

assert(c != 0.0);

However, it is plausible that the OP might have initialized
'b' in such a way as to make him *think* it was a small positive
value, while in fact it had already been corrupted by round-off
error. For example, I think

b = 1e-400; /* Really small positive number? -- NOT! */

is likely to set 'b' to 0.0 on many common platforms. (If I'm
mistaken, just add one more zero to the end of that exponent. ;)
So that's where the OP should be looking in this case. (Or he
should look and see whether he's using the %f or %g format
specifiers to display the numbers, as others have said.)

Secondly, you talk about "fractional base 10 values not
representable in base 2." Such values do exist (modulo certain
objections about the nature of "representable") -- for example,

0.1 (base 10) = 0.00011001100110011... (base 2)

That is to say, 1/10 has a binary representation that can't be
precisely represented with any finite number of bits -- as opposed
to, say, 1/2 or 3/16, which can.
However, I was talking mostly about numbers that *can* be
represented with a finite number of bits -- the problem is just
that they require *more* bits than the OP's computer is willing
to allocate. For example,

pow(2, -10000)

is mathematically representable as

0.000000... (several thousand more zeroes) ...000001 (base 2),
or just
1 (base 2) multiplied by 2 to the -10011100010000 (base 2) power.

But the OP's 32-bit floating-point format doesn't have enough
bits in the exponent to represent this number (we need 14 bits
to store "-10000" in binary, and the IEEE format only has 8 bits
in its exponent field).
So even though the number pow(2, -10000) has a finite binary
expansion, it's *still* not perfectly representable by the OP's
computer. And so it rounds off to 0.0, and we have problems. :)

HTH,
-Arthur

Nov 13 '05 #10

Dan Pop

In <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:

Hi,
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
Show us the code. A minimal, but complete program illustrating your
problem.
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

If the (positive) result is less than FLT_MIN, i.e. it cannot be
represented as a floating number, it is zero.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 13 '05 #11

Andy

The compiler is Keil for Intel 8051 and derivatives. It's
an embedded compiler. It was actually my mistake. I think the
code is actually working, but the test code I wrote had a race
condition on the variable gcGCLK with the ISR that actually
increments this variable. The actual code is appended at the
end of my message.
But here's a real question, given the code below

unsigned long ticks = 0; /* 32-bit */
float fVal; /* 32-bit IEEE-754 */

while(++ticks) {
fVal = (float)ticks * 0.004;
}

1. Can I expect fVal to always not decreasing as ticks is
incremented? (until of course when ticks wraps around
to 0)
2. Does fVal covers all the integral values? ie, could it
go from say 56 to 60 skipping 57, 58, and 59?
3. In this example, each integral value equals to 250
ticks. Are all intervals between any two consecutive
integral values of fVal always 250 ticks? (within tolerance
of course). ie, between fVal of 250 and 251, there're
250 ticks. Is this true for all integral intervals of fVal?
3. How about for a smaller number like 0.00004.

Sorry to have asked so many questions. I'm not too good with
this IEEE float thing.

/* the actual test code I wrote */
typedef unsigned long GCLK_T;
#define SECS_PER_TICK 0.004
GCLK_T RealElapsedTicks(GCLK_T zStart) {
return(gzGCLK - zStart); /* gzGCLK gets incremented in an ISR */
}
GCLK_T RealElapsedSecs(GCLK_T zStart) {
float fVal;

gzGCLK = 0xffffffff; /* ERROR. Race condition with ISR */
zStart = 0x00000000;

fVal = (float)RealElapsedTicks(zStart) * ((float)GCLK_SEC_PER_TICK);
return(fVal);
}

TIA
Andy
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...

In article <ae**************************@posting.google.com >,
bi*****@hotmail.com (Andy) wrote:
Hi,
I don't know if this is the correct group to post this, but
when I multiply a huge floating point value by a really
small (non-zero) floating point value, I get 0 (zero) for the result.
This creates a big hole in a 32-bit timer routine I wrote.

Questions.
1. Why does this happen?
2. Is there C macros/functions I can call to tell me
when two non-zero numbers are multiplied and the
result will be zero?

This is unusual. Could you post which compiler is used, and post source
code that demonstrates the problem? You mean something like

double result = 1e300 * 1e-300;

?

Nov 13 '05 #12

Dik T. Winter

In article <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:

But here's a real question, given the code below

unsigned long ticks = 0; /* 32-bit */
float fVal; /* 32-bit IEEE-754 */

while(++ticks) {
fVal = (float)ticks * 0.004;
}

1. Can I expect fVal to always not decreasing as ticks is
incremented? (until of course when ticks wraps around
to 0)
Yes.
2. Does fVal covers all the integral values? ie, could it
go from say 56 to 60 skipping 57, 58, and 59?
No. It will not go from 56 to 60, but it may go from slightly less than
59 to slightly more than 59, skipping 59 itself.
3. In this example, each integral value equals to 250
ticks. Are all intervals between any two consecutive
integral values of fVal always 250 ticks? (within tolerance
of course). ie, between fVal of 250 and 251, there're
250 ticks. Is this true for all integral intervals of fVal?
Because of the above observation, no, not necessarily.
3. How about for a smaller number like 0.00004.

Similar answer. 0.04 and 0.00004 are not exactly representable as
floating point numbers, so rounding occurs both when the representation
of those numbers is created and when this value is used in the
multiplication. A better way would be to calculate:
(float)ticks / 250.0
but that may be slower on your system. (250.0 is exactly representable,
so IEEE mandates that when ticks is an integer that is a multiple of
250 the result should be exact.)

Another problem with your code is that when ticks exceeds 2**24 the
number is no longer exactly represenatable as a floating point number,
so all bets are off.

To get the best possible answer you need something like:
(float)(ticks / 250) + (float)(ticks % 250)/250.0
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 13 '05 #13

Dan Pop

In <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:

The compiler is Keil for Intel 8051 and derivatives. It's
an embedded compiler. It was actually my mistake. I think the
code is actually working, but the test code I wrote had a race
condition on the variable gcGCLK with the ISR that actually
increments this variable. The actual code is appended at the
end of my message.
But here's a real question, given the code below

unsigned long ticks = 0; /* 32-bit */
float fVal; /* 32-bit IEEE-754 */

while(++ticks) {
fVal = (float)ticks * 0.004;
You're evaluating this expression using double precision, which is
probably the last thing you want on a 8051 or any other processor with
no hardware support for floating point. The right expression is:

fVal = ticks * 0.004f;
}

1. Can I expect fVal to always not decreasing as ticks is
incremented? (until of course when ticks wraps around
to 0)
Yes.
2. Does fVal covers all the integral values? ie, could it
go from say 56 to 60 skipping 57, 58, and 59?
If fVal ever reaches the value 56, its next values will be 56.004,
56.008 and so on. Actually, approximations of these values, because
IEEE-754 cannot represent them exactly.
3. In this example, each integral value equals to 250
ticks. Are all intervals between any two consecutive
integral values of fVal always 250 ticks? (within tolerance
of course). ie, between fVal of 250 and 251, there're
250 ticks. Is this true for all integral intervals of fVal?
0.004 cannot be represented by IEEE-754 exactly, because it is not a
combination of negative powers of two. Therefore, your computations
may seldom yield an integer value when the mathematically exact result
is an integer value.

Things would be different for 0.00390625 (256 ticks per integral value)
because this value can be represented exactly.
3. How about for a smaller number like 0.00004.
No difference.
Sorry to have asked so many questions. I'm not too good with
this IEEE float thing.

You may want to understand it, to avoid surprises. Binary floating point
arithmetic doesn't work the same way as normal arithmetic, mostly because
most real numbers cannot be represented exactly.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 13 '05 #14

John Smith

Arthur J. O'Dwyer wrote:

HTH,
-Arthur

It does help. Understanding refined. Thanks.

JS

Nov 13 '05 #15

Eric Sosman

"Dik T. Winter" wrote:

In article <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:
> But here's a real question, given the code below
>
> unsigned long ticks = 0; /* 32-bit */
> float fVal; /* 32-bit IEEE-754 */
>
> while(++ticks) {
> fVal = (float)ticks * 0.004;
> }
>

[...]
Another problem with your code is that when ticks exceeds 2**24 the
number is no longer exactly represenatable as a floating point number,
so all bets are off.

Not "all" bets, but it certainly scuttles any hope of
strict monotonicity.

Here's the scenario: `ticks' counts up to a large power
of two, about 2**24 or 2**25 (I'd have to pull out my IEEE
reference to be sure of the value, but the exact location
of this boundary isn't essential to understanding the effect).
Up to this point, the expression `(float)ticks' has produced
an exact conversion: the result is exactly equal to the
original value of `ticks'.

But at the next upward count a problem arises: `ticks'
now has one more significant bit than a `float' can handle.
(Imagine counting upwards in decimal arithmetic with three
significant digits: 999 is fine, 1000==1e3 is fine, but
1001 has too many digits.) So the conversion is inexact,
and if "round to even" is in effect the result will be a
hair too small -- in fact, it will be the same result as was
obtained from the preceding exact conversion. That is, `ticks'
increased but `(float)ticks' did not.

On the next count, the problem disappears momentarily:
the low-order bit of `ticks' is now a zero, so the number of
significant bits is once again within the capacity of `float'.
The conversion is again exact -- but look what's happened: the
value `(float)ticks' has advanced two steps at once. You've
entered a regime where `(float)ticks' "sticks" at one value
for two ticks before advancing to the correct result; it
"increments every other time."

As you go higher still, `ticks' will eventually attain
two more significant bits than a `float' can handle, and
will "stick" at one value for four cycles before increasing.
And then you'll get to three bits too many and an eight-
cycle plateau, and so on. (Consider the decimal analogy
again: after you reach 1000000==100e4, you've got to count
upward many more times before reaching 101e4==1010000.)

However, the cure for your case seems obvious: Whether
you know it or not, you're actually employing `double'
arithmetic in this expression because the constant `0.004'
has type `double'. So why convert to `float', losing a few
bits in the process, only to go ahead and re-convert that
mangled value to a `double'? Just make `fVal' a `double'
to begin with and get rid of the `(float)' cast, and you
should be immune to this effect.

There may be other problems elsewhere, of course, but
this problem, at least, will cease to bother you.

--
Er*********@sun.com

Nov 13 '05 #16

Dan Pop

In <3F***************@sun.com> Eric Sosman <Er*********@sun.com> writes:

However, the cure for your case seems obvious: Whether
you know it or not, you're actually employing `double'
arithmetic in this expression because the constant `0.004'
has type `double'. So why convert to `float', losing a few
bits in the process, only to go ahead and re-convert that
mangled value to a `double'? Just make `fVal' a `double'
to begin with and get rid of the `(float)' cast, and you
should be immune to this effect.

Given its target platform, what he may really want to do is to replace
0.004 by 0.004f so that double precision arithmetic is completely
avoided. That is, unless his application has plenty of spare cpu cycles
to burn...

Single precision IEEE-754 floating point is already painfully slow
on an 8-bit micro with no hardware floating point support. Any trick
that allows avoiding floating point completely is a big win (and a big
saver of ROM memory space).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 13 '05 #17

Eric Sosman

Dan Pop wrote:

In <3F***************@sun.com> Eric Sosman <Er*********@sun.com> writes:
However, the cure for your case seems obvious: Whether
you know it or not, you're actually employing `double'
arithmetic in this expression because the constant `0.004'
has type `double'. So why convert to `float', losing a few
bits in the process, only to go ahead and re-convert that
mangled value to a `double'? Just make `fVal' a `double'
to begin with and get rid of the `(float)' cast, and you
should be immune to this effect.

Given its target platform, what he may really want to do is to replace
0.004 by 0.004f so that double precision arithmetic is completely
avoided. That is, unless his application has plenty of spare cpu cycles
to burn...

Single precision IEEE-754 floating point is already painfully slow
on an 8-bit micro with no hardware floating point support. Any trick
that allows avoiding floating point completely is a big win (and a big
saver of ROM memory space).

I'm not familiar with his platform; maybe `double'
is out of the question. If so, I don't see how he can
avoid the "stair-step" problem that occurs with large
counts, except possibly by breaking the count into two
pieces and extracting two `float' quantities instead
of one. E.g.,

float hi, lo;
lo = (ticks & 0xFFFF) * 0.004f;
hi = (ticks >> 16) * (0.004f * 65536);

Of course, then he's stuck with two `float' values and the
necessity to handle them both, more or less doubling the
amount of work that needs to be done with them elsewhere.
`double' might, perhaps, turn out to be cheaper after all.

To the O.P.: What is the purpose of this floating-point
result? What do you do with it; what decisions are based
upon it? Perhaps we can come up with a way to avoid floating-
point altogether, and stay strictly in integer-land.

--
Er*********@sun.com

Nov 13 '05 #18

Christian Bau

In article <ae**************************@posting.google.com >,
bi*****@hotmail.com (Andy) wrote:

The compiler is Keil for Intel 8051 and derivatives. It's
an embedded compiler. It was actually my mistake. I think the
code is actually working, but the test code I wrote had a race
condition on the variable gcGCLK with the ISR that actually
increments this variable. The actual code is appended at the
end of my message.
But here's a real question, given the code below

unsigned long ticks = 0; /* 32-bit */
float fVal; /* 32-bit IEEE-754 */

while(++ticks) {
fVal = (float)ticks * 0.004;
}

1. Can I expect fVal to always not decreasing as ticks is
incremented? (until of course when ticks wraps around
to 0)
2. Does fVal covers all the integral values? ie, could it
go from say 56 to 60 skipping 57, 58, and 59?
3. In this example, each integral value equals to 250
ticks. Are all intervals between any two consecutive
integral values of fVal always 250 ticks? (within tolerance
of course). ie, between fVal of 250 and 251, there're
250 ticks. Is this true for all integral intervals of fVal?
3. How about for a smaller number like 0.00004.

You will have a problem with large numbers. IEEE 32 bit float has 24 bit
for the mantissa including the leading "1" in the mantissa which is
never stored.

So for floating point numbers 1 <= x < 2, the resolution is 2^-23. That
means the difference between x and the next largest floating point
number is 2^-23. For 2^23 <= x < 2^24 the resolution is 1. For 2^31 <= x
< 2^32 the resolution is 2^8 = 256, so the difference between one
floating point number and the next larger floating point number is 256.
256 * 0.004 = 1.024 > 1 so yes, you will not cover all integral values
when x is large.

Is there a good reason why you don't write

ticks / 250

?

Nov 13 '05 #19

Dik T. Winter

In article <3F***************@sun.com> Er*********@Sun.COM writes:

Dan Pop wrote: ....
Single precision IEEE-754 floating point is already painfully slow
on an 8-bit micro with no hardware floating point support. Any trick
that allows avoiding floating point completely is a big win (and a big
saver of ROM memory space).

I'm not familiar with his platform; maybe `double'
is out of the question.

Yup. Actually normal fp is also out of the question...
If so, I don't see how he can
avoid the "stair-step" problem that occurs with large
counts, except possibly by breaking the count into two
pieces and extracting two `float' quantities instead
of one. E.g.,

float hi, lo;
lo = (ticks & 0xFFFF) * 0.004f;
hi = (ticks >> 16) * (0.004f * 65536);

This ignores something he wants. When ticks is a multiple of 250
the exact integer value should be produced as a floating point number.
What I wrote in a previous post still stands:
(float)(ticks / 250) + (float)(ticks % 250) * 0.004f.
I am looking for a way to do "ticks / 250" faster on an 8-bit micro
than just division (which, again, is slow).
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 13 '05 #20

Andy

Thanks for your great answers. As I understand it, as the tick
gets greater then 2^24, then all I'm loosing are the lower 8-bits
of the tick. Then since 256 is only 1 second, the worse thing
that happens is that the result will be rounded to the nearest
second or a little more than that. I will not even loose
anything close to a minute or an hour. Would I?

TIA
Andy
"Dik T. Winter" <Di********@cwi.nl> wrote in message news:<Hp********@cwi.nl>...

In article <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:
> But here's a real question, given the code below
>
> unsigned long ticks = 0; /* 32-bit */
> float fVal; /* 32-bit IEEE-754 */
>
> while(++ticks) {
> fVal = (float)ticks * 0.004;
> }
>
> 1. Can I expect fVal to always not decreasing as ticks is
> incremented? (until of course when ticks wraps around
> to 0)

Yes.
> 2. Does fVal covers all the integral values? ie, could it
> go from say 56 to 60 skipping 57, 58, and 59?

No. It will not go from 56 to 60, but it may go from slightly less than
59 to slightly more than 59, skipping 59 itself.
> 3. In this example, each integral value equals to 250
> ticks. Are all intervals between any two consecutive
> integral values of fVal always 250 ticks? (within tolerance
> of course). ie, between fVal of 250 and 251, there're
> 250 ticks. Is this true for all integral intervals of fVal?

Because of the above observation, no, not necessarily.
> 3. How about for a smaller number like 0.00004.

Similar answer. 0.04 and 0.00004 are not exactly representable as
floating point numbers, so rounding occurs both when the representation
of those numbers is created and when this value is used in the
multiplication. A better way would be to calculate:
(float)ticks / 250.0
but that may be slower on your system. (250.0 is exactly representable,
so IEEE mandates that when ticks is an integer that is a multiple of
250 the result should be exact.)

Another problem with your code is that when ticks exceeds 2**24 the
number is no longer exactly represenatable as a floating point number,
so all bets are off.

To get the best possible answer you need something like:
(float)(ticks / 250) + (float)(ticks % 250)/250.0

Nov 13 '05 #21

Andy

Using double is pretty much out of the question. I can tolerate
a stair-step of say less than 1 minute when the number gets huge.
As long as the error is relatively a small percentage of the
actual count. Is the worst case error 250/(2^32)? Or how do
you calculate the worse case error?
By the way, I'm kinda slow replying to the posts because
I'm using the google newfeed service. It has response time
of about 9 hours...

TIA
Andy
"Dik T. Winter" <Di********@cwi.nl> wrote in message news:<Hp********@cwi.nl>...

In article <3F***************@sun.com> Er*********@Sun.COM writes:
> Dan Pop wrote:

...
> > Single precision IEEE-754 floating point is already painfully slow
> > on an 8-bit micro with no hardware floating point support. Any trick
> > that allows avoiding floating point completely is a big win (and a big
> > saver of ROM memory space).

>
> I'm not familiar with his platform; maybe `double'
> is out of the question.

Yup. Actually normal fp is also out of the question...
> If so, I don't see how he can
> avoid the "stair-step" problem that occurs with large
> counts, except possibly by breaking the count into two
> pieces and extracting two `float' quantities instead
> of one. E.g.,
>
> float hi, lo;
> lo = (ticks & 0xFFFF) * 0.004f;
> hi = (ticks >> 16) * (0.004f * 65536);

This ignores something he wants. When ticks is a multiple of 250
the exact integer value should be produced as a floating point number.
What I wrote in a previous post still stands:
(float)(ticks / 250) + (float)(ticks % 250) * 0.004f.
I am looking for a way to do "ticks / 250" faster on an 8-bit micro
than just division (which, again, is slow).

Nov 13 '05 #22

Andy

Yes. I do not want to wast CPU cycles. My intend is not really
to cover all the integral values when the number gets huge. If I
only loose one second for anything greater than 2^24 (that's >18 hours
BTW), then that's ok. With 32-bits, I should be able to cover
something like 198 days and if the error is even one minute out
of 180 days, then that's fine, but one day is not. What's the
maximum error I can expect?

TIA
Andy

Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...

when x is large.

Is there a good reason why you don't write

ticks / 250

?

Nov 13 '05 #23

Eric Sosman

Andy wrote:

Thanks for your great answers. As I understand it, as the tick
gets greater then 2^24, then all I'm loosing are the lower 8-bits
of the tick. Then since 256 is only 1 second, the worse thing
that happens is that the result will be rounded to the nearest
second or a little more than that. I will not even loose
anything close to a minute or an hour. Would I?

For the benefit of those who (like me) do not entirely
understand exactly what you're trying to do, could you
describe what these "ticks" are supposed to be and what
you are trying to do with them?

... and if you're trying to convert a 256 Hz "tick" to
seconds, multiplying by a floating-point value is surely a
poor way to proceed. Even an integer division is overkill;
a simple shift-and-mask will do all that's necessary. If
you're willing to think about fixed-point arithmetic, even
that tiny amount of work is more than required!

So, what's the goal?

--
Er*********@sun.com

Nov 13 '05 #24

Christian Bau

In article <ae**************************@posting.google.com >,
bi*****@hotmail.com (Andy) wrote:

Yes. I do not want to wast CPU cycles. My intend is not really
to cover all the integral values when the number gets huge. If I
only loose one second for anything greater than 2^24 (that's >18 hours
BTW), then that's ok. With 32-bits, I should be able to cover
something like 198 days and if the error is even one minute out
of 180 days, then that's fine, but one day is not. What's the
maximum error I can expect?

Couldn't you just use two separate counters for seconds and ticks?

You are multiplying ticks by 0.004, so every 250 times you would add a
second. You could do something like this:

static unsigned long whole_seconds = 0;
static unsigned int sub_seconds = 0;
static unsigned long last_ticks;

Set last_ticks to ticks when you start. Then whenever you check ticks,
you do the following:
ticks = <calculate current time>
while (last_ticks != ticks) {
++last_ticks;
if (++sub_seconds == 250) { sub_seconds = 0; ++whole_seconds; }
}

No floating point arithmetic; that should be a bit faster on an 8051.

Nov 13 '05 #25

J. J. Farrell

bi*****@hotmail.com (Andy) wrote in message news:<ae**************************@posting.google. com>...

Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...

Is there a good reason why you don't write

ticks / 250

Yes. I do not want to wast CPU cycles. My intend is not really
to cover all the integral values when the number gets huge. If I
only loose one second for anything greater than 2^24 (that's >18 hours
BTW), then that's ok. With 32-bits, I should be able to cover
something like 198 days and if the error is even one minute out
of 180 days, then that's fine, but one day is not. What's the
maximum error I can expect?

Excuse me if this point is stupid, as I know next to nothing
about FP arithmetic. I don't see how ((float)ticks * 0.004f) can
possibly use fewer cycles than (float)(ticks / 250), particularly
on a system without hardware floating point arithmetic.

I confess I'm puzzled by this whole discussion. What are you trying
to achieve? Why are you using FP arithmetic instead of integer -
particularly if CPU cycles are important? Unless I'm really missing
something, you'd get perfect accuracy up to 136 years with integer.

I guess I'm missing something obvious ...

Nov 13 '05 #26

Andy

Eric,
My goal is to provide a generic timing device which would
provide accuracy (although not exact) from days down to
milliseconds. The idea is to have a free running 32-bit
timer (tick) that all others compare for timing. I'm using
multiplication because the idea is to not limit the
free running tick counter to 1 frequence (as in my previous
examples 4 milliseconds). Maybe a concrete example would help.

typedef unsigned long GCLK_T;
GCLK_T gzFreeTicks; /* this gets incremented in an ISR */
GCLK_T GCLK(void); /* provides atomic read of gzFreeTicks */

#define SECS_PER_TICK 0.004 /* seconds for each tick */
#define MSECS_TO_TICKS(MSECS) xxxx /* converting milliseconds to ticks */

GCLK_T ElapsedTicks(GCLK_T ticks) {
return(GCLK() - ticks);
}

unsigned long ElapsedSecs(GCLK_T ticks) {
return((float)ElapsedTicks(ticks) * (float)(SECS_PER_TICK));
}

/* has a endless loop with one 100 millisecond task */
void main(void) {
GCLK_T zMyTicks;

zMyTicks = GCLK();
while(1) {
/* this provides a very fast compare */
if (ElapsedTicks(zMyTicks) > MSECS_TO_TICKS(100)) {
Do100MillisecondTask();
zMyTicks = GCLK();
}
}
}

Hope this helps.
Andy

Eric Sosman <Er*********@sun.com> wrote in message news:<3F***************@sun.com>...

For the benefit of those who (like me) do not entirely
understand exactly what you're trying to do, could you
describe what these "ticks" are supposed to be and what
you are trying to do with them?

... and if you're trying to convert a 256 Hz "tick" to
seconds, multiplying by a floating-point value is surely a
poor way to proceed. Even an integer division is overkill;
a simple shift-and-mask will do all that's necessary. If
you're willing to think about fixed-point arithmetic, even
that tiny amount of work is more than required!

So, what's the goal?

Nov 13 '05 #27

Andy

Keeping a seconds counter is out of the question since then
you're forced to increment the ticks at a frequency exactly
divisable by one second. Please see my previous reply to
Eric and maybe you will get a good idea what I'm trying to
accomplish. But the basic idea is to allow the ticks to
be incremented at any frequency because so often, timers
are hard to come by. I do not want to dedicate an entire
timer just for this.
The equation

unsigned long elapsedSeconds;

seconds = (float)elasedSeconds * 0.004;

will always yield a valid number for all 32-bits of
elapsedSeconds, right? I mean it won't give a number that's
hours, days, or years away from the actual value when
elapsedSeconds is greater than 24-bits, right?
If that's the case, then I think I'm happy with it.

Andy
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...

Couldn't you just use two separate counters for seconds and ticks?

You are multiplying ticks by 0.004, so every 250 times you would add a
second. You could do something like this:

static unsigned long whole_seconds = 0;
static unsigned int sub_seconds = 0;
static unsigned long last_ticks;

Set last_ticks to ticks when you start. Then whenever you check ticks,
you do the following:
ticks = <calculate current time>
while (last_ticks != ticks) {
++last_ticks;
if (++sub_seconds == 250) { sub_seconds = 0; ++whole_seconds; }
}

No floating point arithmetic; that should be a bit faster on an 8051.

Nov 13 '05 #28

Andy

I'm sorry, I was thinking of this equation

(float)(ticks / 250) + (float)(ticks % 250)/250.0

posted by another person when I wrote the reply. Please see my
reply to Eric to get an idea of what I'm trying to accomplish.

TIA
Andy
jj*@bcs.org.uk (J. J. Farrell) wrote in message news:<5c**************************@posting.google. com>...

bi*****@hotmail.com (Andy) wrote in message news:<ae**************************@posting.google. com>...
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...

Is there a good reason why you don't write

ticks / 250

Yes. I do not want to wast CPU cycles. My intend is not really
to cover all the integral values when the number gets huge. If I
only loose one second for anything greater than 2^24 (that's >18 hours
BTW), then that's ok. With 32-bits, I should be able to cover
something like 198 days and if the error is even one minute out
of 180 days, then that's fine, but one day is not. What's the
maximum error I can expect?

Excuse me if this point is stupid, as I know next to nothing
about FP arithmetic. I don't see how ((float)ticks * 0.004f) can
possibly use fewer cycles than (float)(ticks / 250), particularly
on a system without hardware floating point arithmetic.

I confess I'm puzzled by this whole discussion. What are you trying
to achieve? Why are you using FP arithmetic instead of integer -
particularly if CPU cycles are important? Unless I'm really missing
something, you'd get perfect accuracy up to 136 years with integer.

I guess I'm missing something obvious ...

Nov 13 '05 #29

Andy

How about

float a,b,c;
assert (a > 1.0);
assert (a < pow(2,32)); /* full range (excluding 0) of unsigned long */
assert (b != 0); /* b is some small but representable non-zero value */
assert (b < 1.0);
c = a*b;
assert (c != 0);
assert (!isnan(c))

Please note float instead of double. And also, what kind of
error can I expect out the product of a*b?

TIA
Andy

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> wrote in message news:<Pi***********************************@unix48 .andrew.cmu.edu>...

On Thu, 4 Dec 2003, John Smith wrote:

Arthur J. O'Dwyer wrote:

First, as several others have pointed out, if we have

double a,b,c; /* initialized somehow */
assert(a > 1.0);
assert(b > 0.0);
assert(b < 1.0);
c = a*b;

then it is always the case that

assert(c != 0.0);

However, it is plausible that the OP might have initialized
'b' in such a way as to make him *think* it was a small positive
value, while in fact it had already been corrupted by round-off
error. For example, I think

<snip..>
HTH,
-Arthur

Nov 13 '05 #30

Eric Sosman

Andy wrote:

Eric,
My goal is to provide a generic timing device which would
provide accuracy (although not exact) from days down to
milliseconds. The idea is to have a free running 32-bit
timer (tick) that all others compare for timing. I'm using
multiplication because the idea is to not limit the
free running tick counter to 1 frequence (as in my previous
examples 4 milliseconds). Maybe a concrete example would help.

typedef unsigned long GCLK_T;
GCLK_T gzFreeTicks; /* this gets incremented in an ISR */
GCLK_T GCLK(void); /* provides atomic read of gzFreeTicks */

#define SECS_PER_TICK 0.004 /* seconds for each tick */
#define MSECS_TO_TICKS(MSECS) xxxx /* converting milliseconds to ticks */

GCLK_T ElapsedTicks(GCLK_T ticks) {
return(GCLK() - ticks);
}

unsigned long ElapsedSecs(GCLK_T ticks) {
return((float)ElapsedTicks(ticks) * (float)(SECS_PER_TICK));
}
Since you only care about the whole seconds, why bother
with floating-point at all?

unsigned long ElapsedSecs(GCLC_T ticks) {
return ticks / TICKS_PER_SEC; // new constant
/* or, if you want rounding: */
return (ticks + TICKS_PER_SEC / 2) / TICKS_PER_SEC;
}
/* has a endless loop with one 100 millisecond task */
void main(void) {

This is the first time I've seen `void main' used
properly.

--
Er*********@sun.com

Nov 13 '05 #31

Dan Pop

In <5c**************************@posting.google.com > jj*@bcs.org.uk (J. J. Farrell) writes:

bi*****@hotmail.com (Andy) wrote in message news:<ae**************************@posting.google. com>...
Christian Bau <ch***********@cbau.freeserve.co.uk> wrote in message news:<ch*********************************@slb-newsm1.svr.pol.co.uk>...
>
> Is there a good reason why you don't write
>
> ticks / 250

Yes. I do not want to wast CPU cycles. My intend is not really
to cover all the integral values when the number gets huge. If I
only loose one second for anything greater than 2^24 (that's >18 hours
BTW), then that's ok. With 32-bits, I should be able to cover
something like 198 days and if the error is even one minute out
of 180 days, then that's fine, but one day is not. What's the
maximum error I can expect?

Excuse me if this point is stupid, as I know next to nothing
about FP arithmetic. I don't see how ((float)ticks * 0.004f) can
possibly use fewer cycles than (float)(ticks / 250), particularly
on a system without hardware floating point arithmetic.

A system without hardware floating point arithmetic may not have hardware
support for 32-bit integer division, either. This is, indeed, the
case with the OP's system. So, if you perform integer division on
ticks, you have to perform 32-bit integer division. If you perform
single precision floating point multiplication, you have to actually
perform 24-bit integer multiplication plus a few other simple operations.
Also keep in mind that multiplication algorithms are usually faster than
division algorithms.

But the OP doesn't need any form of multiplication or division for
what he wants to achieve, everything can be done with integer addition,
using a tick counter and a second counter. When the tick counter reaches
the value 250, it is reset and the second counter is incremented. And
the tick counter can be a single byte, which is very convenient for an
8-bit microcontroller.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 13 '05 #32

Eric Sosman

Dan Pop wrote:

But the OP doesn't need any form of multiplication or division for
what he wants to achieve, everything can be done with integer addition,
using a tick counter and a second counter. When the tick counter reaches
the value 250, it is reset and the second counter is incremented. And
the tick counter can be a single byte, which is very convenient for an
8-bit microcontroller.

It's not clear to me that he gets access to each tick
as it occurs (if it does, keeping a pair of counters for
"seconds" and "ticks since last second" is easy). The
usage he showed involved reading an arbitrary tick value
and converting to seconds; division seems the straightforward
approach.

If even integer division is really expensive *and* if
the conversions are usually performed on tick values that
are "close together," a simple cache might help:

unsigned long ElapsedSecs(GCLK_T ticks) {
static unsigned long lastsecs = 0;
static GCLK_T loticks = 0;
static GCLK_T hiticks = TICKS_PER_SEC;

if (loticks <= ticks && ticks < hiticks) {
/* Still in the same second as last time;
* no arithmetic required.
*/
}
else if (hiticks <= ticks && ticks < hiticks + TICKS_PER_SEC) {
/* Advanced to the next second; the arithmetic
* is very simple.
*/
loticks = hiticks;
hiticks += TICKS_PER_SEC;
++lastsecs;
}
else {
/* Aw, shucks: Do it the hard way. With luck
* this won't happen very often.
*/
lastsecs = ticks / TICKS_PER_SEC;
loticks = lastsecs * TICKS_PER_SEC;
hiticks = loticks + TICKS_PER_SEC;
}

return lastsecs;
}

--
Er*********@sun.com

Nov 13 '05 #33

J. J. Farrell

bi*****@hotmail.com (Andy) wrote in message news:<ae**************************@posting.google. com>...

My goal is to provide a generic timing device which would
provide accuracy (although not exact) from days down to
milliseconds. The idea is to have a free running 32-bit
timer (tick) that all others compare for timing. I'm using
multiplication because the idea is to not limit the
free running tick counter to 1 frequence (as in my previous
examples 4 milliseconds). Maybe a concrete example would help.

typedef unsigned long GCLK_T;
GCLK_T gzFreeTicks; /* this gets incremented in an ISR */
GCLK_T GCLK(void); /* provides atomic read of gzFreeTicks */

#define SECS_PER_TICK 0.004 /* seconds for each tick */
#define MSECS_TO_TICKS(MSECS) xxxx /* converting milliseconds to ticks */

GCLK_T ElapsedTicks(GCLK_T ticks) {
return(GCLK() - ticks);
}

unsigned long ElapsedSecs(GCLK_T ticks) {
return((float)ElapsedTicks(ticks) * (float)(SECS_PER_TICK));
}

/* has a endless loop with one 100 millisecond task */
void main(void) {
GCLK_T zMyTicks;

zMyTicks = GCLK();
while(1) {
/* this provides a very fast compare */
if (ElapsedTicks(zMyTicks) > MSECS_TO_TICKS(100)) {
Do100MillisecondTask();
zMyTicks = GCLK();
}
}
}

You seem to be talking about a straightforward tick-based timer
system. I've seen many implementations done in a variety of ways,
but I've never seen anyone bring FP into it before.

I suggest stepping back and rethinking exactly what you are trying
to achieve. Depending on the granularity required, you can almost
certainly do everything you need with integer multiplication and
division. More likely than not, you can do it efficiently with
integer addition and subtraction. For example, your ISR could
keep a count of milliseconds by having another counter to hold
milliseconds and wrapping the tick counter to zero each time it
crosses a millisecond boundary. If your ISR timing is so tight
that this can't be done in the ISR, you could have some lower
priority code (or even the routines that use the counter) watch
for the tick count passing the millisecond boundary and do the
carry arithmetic by iterative subtraction from the tick count
(with suitable interlocking with the ISR). There are any number
of variants depending on your exact requirements and constraints,
but you shouldn't need to pain yourself with the overheads and
inaccuracies inherent in FP arithmetic done in software.

There are lots of examples of this sort of thing available on
the net - the clock and timer routines in some of the free UNIX-
like OSes (such as OpenBSD) might be useful.

Nov 13 '05 #34

Dik T. Winter

In article <3F***************@sun.com> Er*********@Sun.COM writes:

Dan Pop wrote:

But the OP doesn't need any form of multiplication or division for
what he wants to achieve, everything can be done with integer addition,
using a tick counter and a second counter. When the tick counter reaches
the value 250, it is reset and the second counter is incremented. And
the tick counter can be a single byte, which is very convenient for an
8-bit microcontroller.
It's not clear to me that he gets access to each tick
as it occurs (if it does, keeping a pair of counters for
"seconds" and "ticks since last second" is easy). The
usage he showed involved reading an arbitrary tick value
and converting to seconds; division seems the straightforward
approach.

Indeed, that is also the way I did read it (and I think he later did
confirm this). So he has in his hand a 32 bit tick counter, and he
wants to divide it by 250.
If even integer division is really expensive *and* if
the conversions are usually performed on tick values that
are "close together," a simple cache might help:

unsigned long ElapsedSecs(GCLK_T ticks) {
static unsigned long lastsecs = 0;
static GCLK_T loticks = 0;
static GCLK_T hiticks = TICKS_PER_SEC;

if (loticks <= ticks && ticks < hiticks) {
/* Still in the same second as last time;
* no arithmetic required.
*/
}
else if (hiticks <= ticks && ticks < hiticks + TICKS_PER_SEC) {
/* Advanced to the next second; the arithmetic
* is very simple.
*/
loticks = hiticks;
hiticks += TICKS_PER_SEC;
++lastsecs;
}
else { /* the assumption here is: no wrap around... */ /* Aw, shucks: Do it the hard way. With luck
* this won't happen very often.
*/
lastsecs = ticks / TICKS_PER_SEC; Better do the calculations with 'ticks - hiticks'. When this is fairly
small it may be done using only a few shifts and adds on 16-bit values. loticks = lastsecs * TICKS_PER_SEC;
hiticks = loticks + TICKS_PER_SEC;
}

Rewriting the body:
if (loticks <= ticks && ticks < hiticks) {
return lastsecs;
}
if (ticks < loticks) {
/* wraparound, do not know what to do now */
}
if (ticks >= hiticks) {
/* going a harder way, more than one second. */
if (ticks - hiticks <= 32765) {
unsigned short i = ticks - highticks;
unsigned short j, diff;
j = (i >> 5) - (i >> 7);
diff = (i + j) >> 8;
/* at most one too small. */
lastsecs += diff;
diff = (diff << 8) - (diff << 2) - (diff << 1);
loticks += diff;
hiticks += diff;
} else {
/* a bit more brute force here. only when the
difference can exceed 131 seconds. can be done
in a similar way with 32 bit values. */
}
}
if (ticks >= hiticks) {
loticks = hiticks;
hiticks += TICKS_PER_SEC;
++lastsecs;
}
return lastsecs;
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 14 '05 #35

Andy

Yeah maybe you're right. All I need is a 32-bit millisecond
counter. That would give me almost 50 days. More than enough
for my needs.

Thanks
Andy

jj*@bcs.org.uk (J. J. Farrell) wrote in message news:<5c**************************@posting.google. com>...

bi*****@hotmail.com (Andy) wrote in message news:<ae**************************@posting.google. com>...

You seem to be talking about a straightforward tick-based timer
system. I've seen many implementations done in a variety of ways,
but I've never seen anyone bring FP into it before.

I suggest stepping back and rethinking exactly what you are trying
to achieve. Depending on the granularity required, you can almost
certainly do everything you need with integer multiplication and
division. More likely than not, you can do it efficiently with
integer addition and subtraction. For example, your ISR could
keep a count of milliseconds by having another counter to hold
milliseconds and wrapping the tick counter to zero each time it
crosses a millisecond boundary. If your ISR timing is so tight
that this can't be done in the ISR, you could have some lower
priority code (or even the routines that use the counter) watch
for the tick count passing the millisecond boundary and do the
carry arithmetic by iterative subtraction from the tick count
(with suitable interlocking with the ISR). There are any number
of variants depending on your exact requirements and constraints,
but you shouldn't need to pain yourself with the overheads and
inaccuracies inherent in FP arithmetic done in software.

There are lots of examples of this sort of thing available on
the net - the clock and timer routines in some of the free UNIX-
like OSes (such as OpenBSD) might be useful.

Nov 14 '05 #36

Dan Pop

In <Hp********@cwi.nl> "Dik T. Winter" <Di********@cwi.nl> writes:

In article <3F***************@sun.com> Er*********@Sun.COM writes:
Dan Pop wrote:

But the OP doesn't need any form of multiplication or division for
what he wants to achieve, everything can be done with integer addition,
using a tick counter and a second counter. When the tick counter reaches
the value 250, it is reset and the second counter is incremented. And
the tick counter can be a single byte, which is very convenient for an
8-bit microcontroller.

It's not clear to me that he gets access to each tick
as it occurs (if it does, keeping a pair of counters for
"seconds" and "ticks since last second" is easy). The
usage he showed involved reading an arbitrary tick value
and converting to seconds; division seems the straightforward
approach.

Indeed, that is also the way I did read it (and I think he later did
confirm this). So he has in his hand a 32 bit tick counter, and he
wants to divide it by 250.

The OP mentioned the 8051 microcontroller. The thing doesn't have any
kind of 32-bit hardware tick counter, hence my assumption that this
counter is implemented in software.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #37

Andy

Yes, the tick is incremented in an ISR. All this discussion about
using integer arithmetic is fine as long the tick resolution
is even multiples of a second, but happens say if you need to increment
the tick once every 17.321 milliseconds? How would you do that using
integer?
I guess what I'm interested in is what is the maximum error
can I expect for the formula

fElapsedSecs = ulElapsedTicks * 0.004;

If I'm correct, the max error is very small like < 1e-4% for
the full range of 32-bit value of ulElapsedTicks. Correct??

TIA
Andy

Da*****@cern.ch (Dan Pop) wrote in message news:<br**********@sunnews.cern.ch>...

The OP mentioned the 8051 microcontroller. The thing doesn't have any
kind of 32-bit hardware tick counter, hence my assumption that this
counter is implemented in software.

Dan

Nov 14 '05 #38

Eric Sosman

Andy wrote:

Yes, the tick is incremented in an ISR. All this discussion about
using integer arithmetic is fine as long the tick resolution
is even multiples of a second, but happens say if you need to increment
the tick once every 17.321 milliseconds? How would you do that using
integer?

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}
--
Er*********@sun.com

Nov 14 '05 #39

glen herrmannsfeldt

Andy wrote:

Yes, the tick is incremented in an ISR. All this discussion about
using integer arithmetic is fine as long the tick resolution
is even multiples of a second, but happens say if you need to increment
the tick once every 17.321 milliseconds? How would you do that using
integer?

Multiply by one integer, generating a product that might take more
bits than the original number, then divide by a second number.

Last I knew, the 8051 didn't have floating point, but the above
method is my preference, even for processors that do.

-- glen

Nov 14 '05 #40

glen herrmannsfeldt

Eric Sosman wrote:

Andy wrote:
Yes, the tick is incremented in an ISR. All this discussion about
using integer arithmetic is fine as long the tick resolution
is even multiples of a second, but happens say if you need to increment
the tick once every 17.321 milliseconds? How would you do that using
integer?
#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

Even better than the multiply/divide I suggested.

Just like a phase accumulator.

-- glen

Nov 14 '05 #41

Andy

That's good stuff Eric. I'm tempted to use this
method to keep a milliseconds counter, but the problem
is that if I do that, then I'll end up using unsigned
long divisions to convert the milliseconds count
to seconds. This requires 200 bytes more ROM
space than if I were to use float division. (I'm
working with 8K ROM at the present). Do you
know another way to divide an unsigned long value by
1000 using say, short or unsigned short operations?

TIA
Andy
Eric Sosman <Er*********@sun.com> wrote in message news:<3F***************@sun.com>...

Andy wrote:

Yes, the tick is incremented in an ISR. All this discussion about
using integer arithmetic is fine as long the tick resolution
is even multiples of a second, but happens say if you need to increment
the tick once every 17.321 milliseconds? How would you do that using
integer?

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

Nov 14 '05 #42

glen herrmannsfeldt

Andy wrote:

That's good stuff Eric. I'm tempted to use this
method to keep a milliseconds counter, but the problem
is that if I do that, then I'll end up using unsigned
long divisions to convert the milliseconds count
to seconds. This requires 200 bytes more ROM
space than if I were to use float division. (I'm
working with 8K ROM at the present). Do you
know another way to divide an unsigned long value by
1000 using say, short or unsigned short operations?

(snip)

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

Yes, the trick is to use a power of 2, or of 256, for the unit, so
that the long division is by a power of 2 or 256.

Instead of microseconds per tick, use a unit if 1/16777216 of a second.

At each tick, add 290598 to the accumulator, seconds will accumulate
three bytes from the least significant byte.

Though the above code doesn't require a division. The seconds are
incremented at the appropriate time, such that they are the result
of dividing a large number by 1000000, and excess is the remainder.

Using add with carry it is very easy to add multibyte numbers.

-- glen

Nov 14 '05 #43

Dan Pop

In <ae*************************@posting.google.com> bi*****@hotmail.com (Andy) writes:

That's good stuff Eric. I'm tempted to use this
method to keep a milliseconds counter, but the problem
is that if I do that, then I'll end up using unsigned
long divisions to convert the milliseconds count
to seconds.
I'm afraid I don't understand your problem. What millisecond count are
you talking about and why do you need to convert it to seconds?

Eric's code has a microsecond count and a second count. Do you need to
deal with fractions of a second or what?

Dan
Eric Sosman <Er*********@sun.com> wrote in message news:<3F***************@sun.com>...
Andy wrote:
>
> Yes, the tick is incremented in an ISR. All this discussion about
> using integer arithmetic is fine as long the tick resolution
> is even multiples of a second, but happens say if you need to increment
> the tick once every 17.321 milliseconds? How would you do that using
> integer?

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #44

Andy

What would you do then if you needed to know the milliseconds?
I'm lost...

TIA
Andy

glen herrmannsfeldt <ga*@ugcs.caltech.edu> wrote in message news:<ykwBb.7929$8y1.37484@attbi_s52>...

Yes, the trick is to use a power of 2, or of 256, for the unit, so
that the long division is by a power of 2 or 256.

Instead of microseconds per tick, use a unit if 1/16777216 of a second.

At each tick, add 290598 to the accumulator, seconds will accumulate
three bytes from the least significant byte.

Though the above code doesn't require a division. The seconds are
incremented at the appropriate time, such that they are the result
of dividing a large number by 1000000, and excess is the remainder.

Using add with carry it is very easy to add multibyte numbers.

-- glen

Nov 14 '05 #45

CBFalconer

*** RUDE top-posting fixed ***

Andy wrote:

Eric Sosman <Er*********@sun.com> wrote:
Andy wrote:

Yes, the tick is incremented in an ISR. All this discussion
about using integer arithmetic is fine as long the tick
resolution is even multiples of a second, but happens say if
you need to increment the tick once every 17.321 milliseconds?
How would you do that using integer?

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

That's good stuff Eric. I'm tempted to use this
method to keep a milliseconds counter, but the problem
is that if I do that, then I'll end up using unsigned
long divisions to convert the milliseconds count
to seconds. This requires 200 bytes more ROM
space than if I were to use float division. (I'm
working with 8K ROM at the present). Do you
know another way to divide an unsigned long value by
1000 using say, short or unsigned short operations?

PLEASE STOP THE TOP-POSTING. It is very annoying and rude. Your
answer goes after the material to which you are replying, after
snipping non-germane stuff.

Simply change the constants to maintain a counter in 1/1024 second
intervals. This won't be too accurate, because your fundamental
resolution is about 17 millisecs. The idea is to maintain
something that can be converted to seconds with only a shift
(power of 2 factor).

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #46

Dan Pop

In <ae**************************@posting.google.com > bi*****@hotmail.com (Andy) writes:

What would you do then if you needed to know the milliseconds?

Use a microsecond counter and a millisecond counter. Update the
microsecond counter first and then update the millisecond counter.
Still no need for any multiplication or division, either integer of
floating point.

Sorry, but if you can't figure these things out yourself, after all the
advice you have received, you're in the wrong business.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #47

Andy

I was thinking of keeping only the excess counter and
not the second counter. I guess I should keep both
and compare to the millisecond counter if I need
millisecond resolution and compare to the second counter
if I need seconds resolution. That would work. I'm
sorry if I seemed stupid, but all along I was thinking
of using only one counter for both milliseconds and
seconds resolution. I didn't stop and think..

TIA
Andy
Da*****@cern.ch (Dan Pop) wrote in message news:<br**********@sunnews.cern.ch>...

In <ae*************************@posting.google.com> bi*****@hotmail.com

I'm afraid I don't understand your problem. What millisecond count are
you talking about and why do you need to convert it to seconds?

Eric's code has a microsecond count and a second count. Do you need to
deal with fractions of a second or what?

Dan
Eric Sosman <Er*********@sun.com> wrote in message news:<3F***************@sun.com>...
Andy wrote:
>
> Yes, the tick is incremented in an ISR. All this discussion about
> using integer arithmetic is fine as long the tick resolution
> is even multiples of a second, but happens say if you need to increment
> the tick once every 17.321 milliseconds? How would you do that using
> integer?

#define USECS_PER_TICK 17321
static unsigned long seconds = 0, excess = 0;

/* At each tick: */
excess += USECS_PER_TICK;
if (excess >= 1000000) {
excess -= 1000000;
++seconds;
}

Nov 14 '05 #48

Andy

Da*****@cern.ch (Dan Pop) wrote in message news:<br**********@sunnews.cern.ch>...

In <ae*************************@posting.google.com> bi*****@hotmail.com >
I'm afraid I don't understand your problem. What millisecond count are
you talking about and why do you need to convert it to seconds?

Eric's code has a microsecond count and a second count. Do you need to
deal with fractions of a second or what?

Yes, I'd need a way to measure from milliseconds to days.

Andy

Nov 14 '05 #49

Andy

Da*****@cern.ch (Dan Pop) wrote in message news:<br**********@sunnews.cern.ch>...

Sorry, but if you can't figure these things out yourself, after all the
advice you have received, you're in the wrong business.

Please stop wasting bandwidth with your senseless remarks.
If I understand you correctly, I would compare to the microseconds
counter if I need milliseconds resolution and I would compare
to the milliseconds counter if I need seconds resolution? But
ideally, I'd want a single counter to compare against for all
my timing needs from milliseconds, to seconds, to days. I guess
I can put these two counters in a struct, but then I'd have
a 64-bit timing variable. Too wastful..
If I missed your point completely, then maybe I'm too stupid for
this business and should look for something else soon...
Anyone needs a "no-good-for-low-level, but maybe good for high-level
programmer"? :-)

TIA
Andy

Nov 14 '05 #50

32-bit IEEE float multiplication

Similar topics