michalk@awpi.com (Brian K. Michalk) writes:
[color=blue]
> I have a perl app that is calculating the standard deviation of a 4000
> element 16 bit integer array, that has large dynamic content. I.e,
> the range spans a significant portion of the 16 bits.
>
> I am trying to increase the performance of this critical loop, and
> I've found that I am exceeding the 32 bit registers causing Perl to
> switch to an infinite precision math library. I've rewritten the loop
> such that I now have only one term that overflows 32 bits.
>
> I am at the point of inlining assembler to speed up this loop.
> However, the MMX terminology is confusing. I see there are 64bit MMX
> registers that can do multiply and accumulate, but I don't see any
> 64bit add commands. My loop is something like this:
> for (i=0; i<4000; i++) { sum += array[i]; }
> Of course my loop is slightly bigger than this, but even this toy
> example has terrible performance on real data when sum exceeds 32
> bits.
>
> Before I go into the time of inlining C (and assembler), are the MMX
> registers cabable of accumulating a 64 bit register with adding a
> 16(or 32) bit operand?
>
> By the way, I've already tried the GMP library, and it did not help.
> I also recompiled everything with the MMX optimizations turned on. Oh
> yeah, this is on Linux.[/color]
Sounds like you need a better algorithm for calculating the standard
deviation. The Welford update formula updates the mean (which can be 16
bits) rather than the usual formula of sum/n (which needs to be 16+log2(n)
bits)
For example, have a look at this code
From
http://www.cs.trinity.edu/~joldham0/...var-Welford.pl
[color=blue]
> #! /usr/bin/perl -w
> ##
> ## Oldham, Jeffrey D.
> ## 2000Mar13
> ## util
> ##
> ## Compute the Mean and Variance Using Welford's Formula
>
> ## Compute the mean and variance of a stream of floating-point
> ## numbers. The numbers should appear at the beginning of each line.
> $mean = 0.0;
> $variance = 0.0;
> $count = 0;
>
> # Compute the mean and maximum.
> while (<ARGV>) {
> ($number, $_) = split;
> ++$count;
> $diff = $number - $mean;
> $mean += $diff / $count;
> $variance += $diff * $diff * ($count -1) / $count;
> }
> $variance /= $count;
>
> printf("The mean is %g.\n", $mean);
> printf("The variance is %g.\n", $variance);
> printf("The standard deviation is %g.\n", sqrt($variance));[/color]
Also if you're desperate for speed then, since the standard deviation is
only an estimate, you can double the speed of your code by calculating the
standard deviation of every other number (ie use 2000 of them) and it will
be pretty good estimate. You could probably just use a few hundred samples
and still get a very good approximation.
P.S.
If you do code this algorithm in MMX, please post it here as more people
should understand how to calculate means and SDs with this more accurate
formula.
--
"Thinks: I can't think of a thinks. End of thinks routine": Blue Bottle
** Aunty Spam says: Remove the trailing x from the To: field to reply **