
# How to write a small graceful gcd function?

My small function works, but I have some questions, and I would like
to hear your opinions on how it is implemented.

1. The function does not check if parameter x is larger or smaller than
parameter y.

2. Is it better to use unsigned int or unsigned long as the type of
parameters x and y? This change may remove the if statement.

/*The Greatest Common Divisor, GCD for short, of two positive integers
can be computed with Euclid's division algorithm. Let the given numbers
be a and b, a >= b. Euclid's division algorithm has the following
steps:
1. Compute the remainder c of dividing a by b.
2. If the remainder c is zero, b is the greatest common divisor.
3. If c is not zero, replace a with b and b with the remainder c. Go
back to step (1).

http://www.cs.mtu.edu/~shene/COURSES...hap04/gcd.html */

/*computes the GCD (Greatest Common Divisor) of positive integers x and
y with Euclid's algorithm. return the GCD, or -1 for failure. - jhl,
Jul 2006*/

int gcd(int x, int y){
if (x > 0 && y > 0){
while (x % y){
y = x + y;
x = y - x;
y = (y - x) % x;
}
} else {
y = -1; /*input invalid*/
}
return y;
}

lovecreatesbeauty

Jul 15 '06 #1
Hi,
/*The Greatest Common Divisor, GCD for short, of two positive integers
can be computed with Euclid's division algorithm. Let the given numbers
be a and b, a >= b. Euclid's division algorithm has the following
steps:
1. Compute the remainder c of dividing a by b.
2. If the remainder c is zero, b is the greatest common divisor.
3. If c is not zero, replace a with b and b with the remainder c. Go
back to step (1).
int gcd (unsigned int a, unsigned int b)
{
unsigned int c;

if ( !a || !b ) /* b == 0 would otherwise divide by zero below */
return -1;

if (a < b)
{
c = a;
a = b;
b = c;
}
while( (c=a%b) > 0 )
{
a = b;
b = c;
}
return b;
}

Similar code in Fortran is also available at the link you gave, but I
could not understand why you have done this:
y = x + y;
x = y - x;
y = (y - x) % x;
Regards,

Nabeel Shaheen

lovecreatesbeauty wrote:
My small function works, but I have some questions, and I would like
to hear your opinions on how it is implemented.

1. The function does not check if parameter x is larger or smaller than
parameter y.

2. Is it better to use unsigned int or unsigned long as the type of
parameters x and y? This change may remove the if statement.

/*The Greatest Common Divisor, GCD for short, of two positive integers
can be computed with Euclid's division algorithm. Let the given numbers
be a and b, a >= b. Euclid's division algorithm has the following
steps:
1. Compute the remainder c of dividing a by b.
2. If the remainder c is zero, b is the greatest common divisor.
3. If c is not zero, replace a with b and b with the remainder c. Go
back to step (1).

http://www.cs.mtu.edu/~shene/COURSES...hap04/gcd.html */

/*computes the GCD (Greatest Common Divisor) of positive integers x and
y with Euclid's algorithm. return the GCD, or -1 for failure. - jhl,
Jul 2006*/

int gcd(int x, int y){
if (x > 0 && y > 0){
while (x % y){
y = x + y;
x = y - x;
y = (y - x) % x;
}
} else {
y = -1; /*input invalid*/
}
return y;
}

lovecreatesbeauty
Jul 15 '06 #2
lovecreatesbeauty wrote:
My small function works, but I have some questions, and I would like
to hear your opinions on how it is implemented.

1. The function does not check if parameter x is larger or smaller than
parameter y.

2. Is it better to use unsigned int or unsigned long as the type of
parameters x and y? This change may remove the if statement.

/*The Greatest Common Divisor, GCD for short, of two positive integers
can be computed with Euclid's division algorithm. Let the given numbers
be a and b, a >= b. Euclid's division algorithm has the following
steps:
1. Compute the remainder c of dividing a by b.
2. If the remainder c is zero, b is the greatest common divisor.
3. If c is not zero, replace a with b and b with the remainder c. Go
back to step (1).

http://www.cs.mtu.edu/~shene/COURSES...hap04/gcd.html */

/*computes the GCD (Greatest Common Divisor) of positive integers x and
y with Euclid's algorithm. return the GCD, or -1 for failure. - jhl,
Jul 2006*/

int gcd(int x, int y){
if (x > 0 && y > 0){
while (x % y){
y = x + y;
x = y - x;
y = (y - x) % x;
}
} else {
y = -1; /*input invalid*/
}
return y;
}

lovecreatesbeauty
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int t = m % n; return gcd( n, t ); }
}

--
Julian V. Noble
Professor Emeritus of Physics
University of Virginia
Jul 15 '06 #3
"Julian V. Noble" <jv*@virginia.edu> writes:
[...]
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int t = m % n; return gcd( n, t ); }
}
The temporary is unnecessary, and the formatting is just odd. Here's
how I'd write it:

int gcd(int m, int n)
{
if (n == 0) {
return m;
}
else {
return gcd(n, m % n);
}
}

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 15 '06 #4

Keith Thompson wrote:
"Julian V. Noble" <jv*@virginia.edu> writes:
[...]
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int t = m % n; return gcd( n, t ); }
}

The temporary is unnecessary, and the formatting is just odd. Here's
how I'd write it:

int gcd(int m, int n)
{
if (n == 0) {
return m;
}
else {
return gcd(n, m % n);
}
}
Thanks.

The function accepts zeros and negative integers as its arguments.
How are users without much mathematical knowledge, like me, supposed to
provide correct input to use it?

When comparing an integer against zero, would the following changes be
better?

1.

/*...*/ /*some necessary check*/

if (!n) {
return m;
} else {
return gcd(n, m % n);
}

2.

/*...*/ /*some necessary check*/

if (n) {
return gcd(n, m % n);
} else {
return m;
}

lovecreatesbeauty

Jul 15 '06 #5
"lovecreatesbeauty" <lo***************@gmail.com> writes:
Keith Thompson wrote:
>"Julian V. Noble" <jv*@virginia.edu> writes:
[...]
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int t = m % n; return gcd( n, t ); }
}

The temporary is unnecessary, and the formatting is just odd. Here's
how I'd write it:

int gcd(int m, int n)
{
if (n == 0) {
return m;
}
else {
return gcd(n, m % n);
}
}

Thanks.

The function accepts zeros and negative integers as its arguments.
How are users without much mathematical knowledge, like me, supposed to
provide correct input to use it?
The obvious solution to that is to change the arguments and result
from int to unsigned.

Different mathematical functions have different domains and ranges
(values that make sense for arguments and results). C, unlike some
other languages, doesn't let you declare a type with a specified range
of values -- but in this case, unsigned happens to be just what you
want.
When comparing an integer against zero, would the following changes be
better?

1.

/*...*/ /*some necessary check*/

if (!n) {
return m;
} else {
return gcd(n, m % n);
}

2.

/*...*/ /*some necessary check*/

if (n) {
return gcd(n, m % n);
} else {
return m;
}
In my opinion, not at all. Using "if (n)" or "if (!n)" makes sense if
n is a condition, something that can be logically true or false.
Here, n is a numeric value; comparing it to 0 isn't checking whether
it's true or false, it's just comparing it to 0.

It means exactly the same thing to the compiler, of course, but
clarity for the human reader is just as important.

Jul 15 '06 #6
Keith Thompson wrote:
"lovecreatesbeauty" <lo***************@gmail.com> writes:
Keith Thompson wrote:
"Julian V. Noble" <jv*@virginia.edu> writes:
[...]
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int t = m % n; return gcd( n, t ); }
}

The temporary is unnecessary, and the formatting is just odd. Here's
how I'd write it:

int gcd(int m, int n)
{
if (n == 0) {
return m;
}
else {
return gcd(n, m % n);
}
}
Thanks.

The function accepts zeros and negative integers as its arguments.
How are users without much mathematical knowledge, like me, supposed to
provide correct input to use it?

The obvious solution to that is to change the arguments and result
from int to unsigned.

Different mathematical functions have different domains and ranges
(values that make sense for arguments and results). C, unlike some
other languages, doesn't let you declare a type with a specified range
of values -- but in this case, unsigned happens to be just what you
want.
The recursive version turns out to be the weakest one here: it cannot
easily be extended with more parameter validation. If the function
accepts input from other automated software systems, then someone
should still keep an eye on it, because the result may be wrong without
warning.

int gcd(int x, int y){
if (x > 0 && y > 0){
while (x % y){
y = x + y;
x = y - x;
y = (y - x) % x;
}
} else {
y = -1; /*input invalid*/
}
return y;
}

int gcd3(int m, int n){
if (n == 0){
return m;
} else {
return gcd3(n, m % n);
}
}

$ ./a.out
gcd(8, 8): 8
gcd3(8, 8): 8
$ ./a.out
gcd(0, 8): -1
gcd3(0, 8): 8
$ ./a.out
gcd(8, 0): -1
gcd3(8, 0): 8
$

lovecreatesbeauty

Jul 16 '06 #7
/*
The small and graceful versions are not as robust as those that
use a bit more code and also perform some sanity checks.
IMO-YMMV.
*/
/*
recursive variants:
*/
/* Recursive, using Knuth's subtraction trick: */
int gcd(int a, int b)
{
return a == b ? a : a > b ? gcd(b, a - b) : gcd(b, b - a);
}
/* Recursive via simplest definition: */
int gcd(int a, int b)
{
return b ? gcd(b, a % b) : a;
}
/* Slight variant of one directly above: */
int gcd(int a, int b)
{
if (!b)
return a;
return gcd(b, a % b);
}

/*
Iterative variants:
*/
/* Iterative version with Knuth's subtraction method: */
int gcd(int a, int b)
{
while (a) {
if (a < b) {
int t = a;
a = b;
b = t;
}
a = a - b;
}
return b;
}
/* Iterative via fundamental definition: */
int gcd(int a, int b)
{
while (b) {
int t = b;
b = a % b;
a = t;
}
return a;
}
/* Not quite as safe as version directly above this one: */
int gcd(int a, int b)
{
do {
int r = a % b;
a = b;
b = r;
} while (b);
return a;
}
/* Sanity checks tossed in (a good idea): */
int gcd(int a, int b)
{
int t;
if (a < 1 || b < 1)
return -1; /* reject non-positive input; avoids % by zero below */
if (a < b)
t = a;
else
t = b;
while (a % t || b % t)
t = t - 1;
return t;
}

/*
With a few tricks and using a bigger type:
*/
/************************************************** *****/
/* This function uses the Euclidean Algorithm to */
/* calculate the greatest common divisor of two long */
/* double floating point numbers. */
/************************************************** *****/
/* Programmer: Danniel R. Corbit */
/* */
/* Copyright (C) 1992 by Danniel R. Corbit */
/************************************************** *****/
/* Reference: "Factorization and Primality Testing", */
/* by David M. Bressoud, */
/* Springer-Verlag 1989. */
/* pages 7-12. */
/************************************************** *****/
#include <math.h>
long double ldGcd(long double a, long double b)
{
int shiftcount = 0;
long double tmp;
int i;
/************************************************** *****************/
/* This zero testing stuff may seem odd, since zero is not likely. */
/* However, knowing that neither a nor b is zero will speed up */
/* later operations greatly by elimination of tests for zero. */
/************************************************** *****************/
if (a == 0.0e0L)
{
tmp = b;
}
else if (b == 0.0e0L)
{
tmp = a;
}
else /* Neither a NOR b is zero! */
{
/* Absolute values used. */
a = a > 0.0e0L ? a : -a;
b = b > 0.0e0L ? b : -b;

/************************************************** ************/
/* While all this fuss about numbers divisible by 2 may seem */
/* like quite a bother, half of the integers in the universe */
/* are evenly divisible by 2. Hence, on a random sample of */
/* input values, great benefit will be realized. The odds */
/* that at least one of a,b is even is 1 - (1/2)*(1/2) = .75 */
/* since the probability that both are odd is .25. */
/************************************************** ************/
/* If a & b are divisible by 2, gcd(a,b) = 2*gcd(a/2,b/2). */
/************************************************** ************/
while (fmodl(a,2.0e0L) == 0.0e0L && fmodl(b,2.0e0L) == 0.0e0L)
{
a /= 2.0e0L;
b /= 2.0e0L;
shiftcount++;
}

/************************************************** ************/
/* If the a is divisible by 2 and b is not divisible by 2, */
/* then gcd(a,b) = gcd(a/2,b). */
/************************************************** ************/
while (fmodl(a,2.0e0L) == 0.0e0L)
{
a /= 2.0e0L;
}

/************************************************** *****/
/* If b is divisible by 2 and a is not divisible by 2, */
/* then gcd(a,b) = gcd(a,b/2). */
/************************************************** *****/
while (fmodl(b,2.0e0L) == 0.0e0L)
{
b /= 2.0e0L;
}

/************************************************** ********************/
/* Make sure the numbers are in the proper order (swap if necessary). */
/************************************************** ********************/
if (b > a)
{
tmp = a;
a = b;
b = tmp;
}

/****************************************/
/* Euclid's famous gcd algorithm: */
/* Iterate until the remainder is <= 1. */
/****************************************/
while (b > 1.0e0L)
{
tmp = b;
b = fmodl(a,b);
a = tmp;
}
if (b == 0.0e0L)
tmp = a;
else
tmp = b;

/************************************************** *********************/
/* If we divided BOTH numbers by factors of 2, we must compensate now. */
/************************************************** *********************/
if (shiftcount > 0 && tmp > 0.0e0L)
for (i = 0; i < shiftcount; i++)
tmp += tmp;
}
return (tmp);
}
Jul 17 '06 #8

lovecreatesbeauty wrote:
My small function works, but I have some questions, and I would like
to hear your opinions on how it is implemented.
Divisions are for chumps...

unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
while (!(b & 1)) { b >>= 1; }
}
return a << k;
}

:-)

[untested, from memory, from my book too...]

Tom

Jul 17 '06 #9
"Tom St Denis" <to********@gmail.com> wrote in message
>
lovecreatesbeauty wrote:
>My small function works, but I have some questions, and I would like
to hear your opinions on how it is implemented.

Divisions are for chumps...

unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
while (!(b & 1)) { b >>= 1; }
}
return a << k;
}

:-)
I guess that most of the time repeated subtraction will not be faster than
division with modern processors. I do like this implementation, though.
Of course, it assumes 2's complement machines, but that is pretty much a
foregone conclusion nowadays.
Jul 17 '06 #10
Dann Corbit wrote:
I guess that most of the time repeated subtraction will not be faster than
division with modern processors. I do like this implementation, though.
Of course, it assumes 2's complement machines, but that is pretty much a
forgone conclusion now-days.
if b > a and b has k bits then you loop at most k times [hint: b - a is
always either even or zero]

So are 16, 32 or 64 subtractions/shifts faster than a small set of
divisions? Most likely yes, especially on things like an ARM. It
really depends on the numbers and the platform.

Also my code doesn't link in the division support [for processors that
lack a divider] so it's a bit more compact. It's also kernel-safe
[using divisions in the kernel was, last I heard, a no-no].

Tom

Jul 17 '06 #11
"Dann Corbit" <dc*****@connx.com> writes:
"Tom St Denis" <to********@gmail.com> wrote in message
>unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
while (!(b & 1)) { b >>= 1; }
}
return a << k;
}
[...]
Of course, it assumes 2's complement machines, but that is pretty much a
forgone conclusion now-days.
Does it need that assumption? To me it looks like all the
arithmetic is done with unsigned ints, which behave the same way
regardless of how negative integers are represented.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Jul 17 '06 #12
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
>I guess that most of the time repeated subtraction will not be faster
than
division with modern processors. I do like this implementation, though.
Of course, it assumes 2's complement machines, but that is pretty much a
forgone conclusion now-days.

if b > a and b has k bits then you loop at most k times [hint: b - a is
always either even or zero]

So are 16, 32 or 64 subtractions/shifts faster than a small set of
divisions? Most likely yes, specially on things like an ARM. It
really depends on the numbers and the platform.

Also my code doesn't link in the division support [for processors that
lack a divider] so it's a bit more compact. It's also Kernel save
[using divisions in the Kernel last I heard was a no no].

Tom
/* Let me know how long this takes on your system: */

unsigned long long subtractions = 0;
unsigned long long gcd(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
subtractions++;
while (!(b & 1)) { b >>= 1; }
}
return a << k;
}

unsigned long long big=129746337890625;
unsigned long long med=8649755859375*7;

#include <stdio.h>
int main(void)
{
unsigned long long answer = gcd(big, med);
printf("gcd %llu and %llu = %llu\n. Number of subtractions was %llu\n",
big, med, answer, subtractions);
return 0;
}
Jul 17 '06 #13
"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@benpfaff.org...
"Dann Corbit" <dc*****@connx.com> writes:
>"Tom St Denis" <to********@gmail.com> wrote in message
>>unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
while (!(b & 1)) { b >>= 1; }
}
return a << k;
}

[...]
>Of course, it assumes 2's complement machines, but that is pretty much a
forgone conclusion now-days.

Does it need that assumption? To me it looks like all the
arithmetic is done with unsigned ints, which behave the same way
regardless of how negative integers are represented.
Right, I was thinking of negative numbers, which this does not have to deal
with.

Jul 17 '06 #14
Dann Corbit wrote:
k = 0;
while (a && b && !(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (a && !(a & 1)) { a >>= 1; }
while (b && !(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
subtractions++;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}

The bug you think you noted was because the internal loop wasn't
stopping. With the fixes you get

gcd 129746337890625 and 60548291015625 = 8649755859375
.. Number of subtractions was 4

Which is what gp gives me. [Note: I *did* say it was from memory and
off the top of my head]

Tom

Jul 17 '06 #15

"Dann Corbit" <dc*****@connx.com> wrote in message
news:e9**********@nntp.aioe.org...
"Tom St Denis" <to********@gmail.com> wrote in message
>Dann Corbit wrote:
>>I guess that most of the time repeated subtraction will not be faster
than
division with modern processors. I do like this implementation, though.
Of course, it assumes 2's complement machines, but that is pretty much a
forgone conclusion now-days.

if b > a and b has k bits then you loop at most k times [hint: b - a is
always either even or zero]

So are 16, 32 or 64 subtractions/shifts faster than a small set of
divisions? Most likely yes, specially on things like an ARM. It
really depends on the numbers and the platform.

Also my code doesn't link in the division support [for processors that
lack a divider] so it's a bit more compact. It's also Kernel save
[using divisions in the Kernel last I heard was a no no].

Tom
/* Let me know how long this takes on your system: */

unsigned long long subtractions = 0;
unsigned long long gcd(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
subtractions++;
while (!(b & 1)) { b >>= 1; }
Better use:
while (b && !(b & 1)) { b >>= 1; }

Or there will be problems.
}
return a << k;
}

unsigned long long big=129746337890625;
unsigned long long med=8649755859375*7;

#include <stdio.h>
int main(void)
{
unsigned long long answer = gcd(big, med);
printf("gcd %llu and %llu = %llu\n. Number of subtractions was %llu\n",
big, med, answer, subtractions);
return 0;
}

Jul 17 '06 #16
/*
Now that I actually understand how it works, I like the subtraction trick a
lot better. In this instance, (lots of small repeated factors) it takes the
same number of iterations as the modulo version
*/
unsigned long long modulos = 0;
unsigned long long subtractions = 0;
unsigned long long gcdm(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b % a;
modulos++;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}
unsigned long long gcds(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
subtractions++;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}
unsigned long long big=129746337890625;
unsigned long long med=8649755859375*7;

#include <stdio.h>
int main(void)
{
unsigned long long answer = gcdm(big, med);
printf("gcdm %llu and %llu = %llu\n. Number of modulos was %llu\n",
big, med, answer, modulos);
answer = gcds(big, med);
printf("gcds %llu and %llu = %llu\n. Number of subtractions was %llu\n",
big, med, answer, subtractions);
return 0;
}

Jul 17 '06 #17
Tom St Denis wrote:

The bug you think you noted was because the internal loop wasn't
stopping. With the fixes you get

gcd 129746337890625 and 60548291015625 = 8649755859375
. Number of subtractions was 4

Which is what gp gives me. [Note: I *did* say it was from memory and
off the top of my head]
Of course a=0 will kill this too. So the fix would be

unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;
if (!a) return b;
if (!b) return a;

k = 0;
while (a && b && !(a & 1 || b & 1)) { ++k; a >>= 1; b >>= 1; }
while (a && !(a & 1)) { a >>= 1; }
while (b && !(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a ;
while (b && !(b & 1)) { b >>= 1; }
}
return a<<k;
}

That iterates a max of log_2(max(a,b)) times. It's fully linear time.
Whereas the division way is quadratic [unless you get into interpolated
divisions...]

Tom

Jul 17 '06 #18
"Tom St Denis" <to********@gmail.com> wrote in message
Tom St Denis wrote:

The bug you think you noted was because the internal loop wasn't
stopping. With the fixes you get

gcd 129746337890625 and 60548291015625 = 8649755859375
. Number of subtractions was 4

Which is what gp gives me. [Note: I *did* say it was from memory and
off the top of my head]

Of course a=0 will kill this too. So the fix would be

unsigned gcd(unsigned a, unsigned b)
{
unsigned k, t;
if (!a) return b;
if (!b) return a;

k = 0;
while (a && b && !(a & 1 || b & 1)) { ++k; a >>= 1; b >>= 1; }
while (a && !(a & 1)) { a >>= 1; }
while (b && !(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a ;
while (b && !(b & 1)) { b >>= 1; }
}
return a<<k;
}

That iterates a max of log_2(max(a,b)) times. It's fully linear time.
Whereas the division way is quadratic [unless you get into interpolated
divisions...]
How can the division way be quadratic? It takes out repeated factors in
every iteration. The above example takes 2 modulus cycles.

gcdm 129746337890625 and 60548291015625 = 8649755859375
.. Number of modulos was 2
gcds 129746337890625 and 60548291015625 = 8649755859375
.. Number of subtractions was 2

C:\tmp>factor 60548291015625
first trying brute force division by small primes
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 7

C:\tmp>factor 129746337890625
first trying brute force division by small primes
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 3
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5
PRIME FACTOR 5

Jul 17 '06 #19
Dann Corbit wrote:
How can the division way be quadratic? It takes out repeated factors in
every iteration. The above example takes 2 modulus cycles.
You assume division is atomic and equivalent to subtraction?

.... really?

Try saying that for 1000 it numbers.

Tom

Jul 17 '06 #20
"Dann Corbit" <dc*****@connx.com> writes:
/*
Now that I actually understand how it works, I like the subtraction trick a
lot better. In this instance, (lots of small repeated factors) it takes the
same number of iterations as the modulo version
*/
Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)
Jul 17 '06 #21
"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@benpfaff.org...
"Dann Corbit" <dc*****@connx.com> writes:
>/*
Now that I actually understand how it works, I like the subtraction trick
a
lot better. In this instance, (lots of small repeated factors) it takes
the
same number of iterations as the modulo version
*/

Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)
He wasn't aware of using a priority queue to do a merge. I sent him some
email about it a long time ago. Lots better than the snowplow.

Jul 17 '06 #22
Ben Pfaff wrote:
Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)
Hmm, lots of them are, but not all. Especially when you start moving
off into more crypto-like problems [e.g. elliptic curves, real
exponentiation[*]]. Also he misses a lot more practical stuff than
you'd think. If we all implemented multiplication by his book we
wouldn't be using Comba techniques, for instance.
[*] I realize he talks about addition chains but I don't recall how
much he talks about sliding windows, for instance.

Definitely worth having them. I bought the entire set when I was still
in high school. Served me well through college and afterwards so far.

Tom

Jul 17 '06 #23
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
>How can the division way be quadratic? It takes out repeated factors in
every iteration. The above example takes 2 modulus cycles.

You assume division is atomic and equivalent to subtraction?

... really?
On modern CPUs in hardware, it is. Modulus will be slower by a small,
constant factor. Since the algorithm itself has the same performance, and
since the loop does not execute very many times, I guess that the difference
will be very small.
Try saying that for 1000 it numbers.
I guess that you mean for 1000 digit numbers. If you have to emulate the
math in software, it might be a significant difference.
Tom

Jul 17 '06 #24

Dann Corbit wrote:
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
How can the division way be quadratic? It takes out repeated factors in
every iteration. The above example takes 2 modulus cycles.
You assume division is atomic and equivalent to subtraction?

... really?

On modern CPUs in hardware, it is. Modulus will be slower by a small,
constant factor. Since the algorithm itself has the same performance, and
since the loop does not execute very many times, I guess that the difference
will be very small.
First off, that's not even remotely true. Division is a LOT slower
than subtraction. Or put it another way, if they took the same number
of cycles you'd be looking at a 37MHz processor.

Second, division is often dozens if not hundreds of cycles.

Third, most processors don't even have division opcodes [mips, ARM, PPC
anyone?]
Try saying that for 1000 it numbers.

I guess that you mean for 1000 digit numbers. If you have to emulate the
math in software, it might be a significant difference.
digit bit whatever.

Division is normally quadratic unless you start getting fancy with the
Karatsuba.

Division: O(n^2)

GCD: O(cn^2) for any [say] small c

What does the O() of GCD reduce to?

Subtraction is fully linear O(n). So GCD == O(cn) means it's also
linear.

*In practice* for small numbers they may act comparably. Especially if
a mod b is small. Which is why it's not entirely a bad approach.

For small types, unless you can prove otherwise it's usually best to
use the binary approach. It's more "portable" [in that you don't need
division] and usually just as fast[*]

Tom
[*] in fact it's guaranteed to be no slower on platforms without a
division opcode.

Jul 17 '06 #25
Tom St Denis <to********@gmail.com> wrote:
>Third, most processors don't even have division opcodes [mips, ARM, PPC
anyone?]
The original MIPS design did not have division, but division was
provided right from the first commercial offering, the R2000.

It could be that there is no division in some of the processors aimed
at the embedded market; I'm only familiar with the MIPS general purpose
chips.
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
Jul 17 '06 #26
Walter Roberson wrote:
The original MIPS design did not have division, but division was
provided right from the first commercial offering, the R2000.

It could be that there is no division in some of the processors aimed
at the embedded market; I'm only familiar with the MIPS general purpose
chips.
And likely it's not one cycle right?

Kinda violates the MIPS RISC principles ... no single-cycle "add with
carry" hehehehe...

Though the 4Km series is nice in that regard.

tom

Jul 17 '06 #27
"Tom St Denis" <to********@gmail.com> wrote in message
>
Dann Corbit wrote:
>"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
How can the division way be quadratic? It takes out repeated factors in
every iteration. The above example takes 2 modulus cycles.

You assume division is atomic and equivalent to subtraction?

... really?

On modern CPUs in hardware, it is. Modulus will be slower by a small,
constant factor. Since the algorithm itself has the same performance, and
since the loop does not execute very many times, I guess that the
difference will be very small.

First off, that's not even remotely true. Division is a LOT slower
than subtraction. Or put it another way, if they took the same number
of cycles you'd be looking at a 37MHz processor.

Second, division is often dozens if not hundreds of cycles.

Third, most processors don't even have division opcodes [mips, ARM, PPC
anyone?]
Try saying that for 1000 bit numbers.

I guess that you mean for 1000 digit numbers. If you have to emulate the
math in software, it might be a significant difference.

digit bit whatever.

Division is normally quadratic unless you start getting fancy with the
Karatsuba.

Division is not quadratic in hardware.
It will take a definite, fixed number of cycles at most. (17 for a 64 bit
AMD CPU, IIRC).

Subtraction will be faster, but only by a small, fixed constant.

If your hardware does not support division, then the subtraction method
Division: O(n^2)

GCD: O(cn^2) for any [say] small c

What does the O() of GCD reduce to?

Subtraction is fully linear O(n). So GCD == O(cn) means it's also
linear.

*In practice* for small numbers they may act comparably. Specially if
a mod b is small. Which is why it's not entirely a bad approach.

For small types, unless you can prove otherwise it's usually best to
use the binary approach. It's more "portable" [in that you don't need
division] and usually just as fast[*]

Tom
[*] in fact it's guaranteed to be no slower on platforms without a
division opcode.
/*
Now that I actually understand how it works, I like the subtraction trick a
lot better. In this instance, (lots of small repeated factors) it takes the
same number of iterations as the modulo version
*/
unsigned long long modulos = 0;
unsigned long long subtractions = 0;
unsigned long long gcdm(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b % a;
modulos++;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}
unsigned long long gcds(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
subtractions++;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}
unsigned long long big=129746337890625;
unsigned long long med=8649755859375*7;

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
unsigned long long randvals[1000000];
int main(void)
{
clock_t start;
clock_t end;
size_t index;
for (index = 0; index < 1000000; index++)
{
randvals[index] = rand();
randvals[index] <<= 32;
randvals[index] += rand();
}
start = clock();
for (index = 0; index < 1000000-1; index++)
gcdm(randvals[index], randvals[index+1]);
end = clock();
printf("clocks to do one million modular gcd's = %u\n",
(unsigned)(end - start));
start = clock();
for (index = 0; index < 1000000-1; index++)
gcds(randvals[index], randvals[index+1]);
end = clock();
printf("clocks to do one million subtraction gcd's = %u\n",
(unsigned)(end - start));
return 0;
}

/*
clocks to do one million modular gcd's = 1031
clocks to do one million subtraction gcd's = 703
Press any key to continue . . .
*/
Jul 17 '06 #28
Dann Corbit wrote:

Division is not quadratic in hardware.
It will take a definite, fixed number of cycles at most. (17 for a 64 bit
AMD CPU, IIRC).
Actually you're still wrong. It's either quadratic in time or space
[or both]. You can make it O(1) time by taking O(n^2) space but that's
still "quadratic". You linearize it by taking linear space O(n) for
O(n) which is NOT quadratic [just not practical for the size of numbers
you work with in processors] by using interpolation [e.g. Karatsuba]

hint: Why do you think multipliers are fast. It isn't because they
violated the O(n^2) rule. It's because the multipliers are big[*].
They take up nearly O(n^2) space to do the multiplication in a trivial
amount of time.
Subtraction will be faster, but only by a small, fixed constant.

If your hardware does not support division, then the subtraction method
Or if you have to work with larger numbers. Even with an O(1) division
opcode, the division of n-digit numbers is O(n^2) time. So even if
divide == sub for time on a single digit it'd be slower for larger
numbers [proof: division by shift/subtract is slower than a single
subtraction].

Tom
[*] Some processors use a Karatsuba trick but so far I haven't heard of
any x86 series using it as a fact....

Jul 17 '06 #29
"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@benpfaff.org...
"Dann Corbit" <dc*****@connx.com> writes:
>/*
Now that I actually understand how it works, I like the subtraction trick a
lot better. In this instance, (lots of small repeated factors) it takes the
same number of iterations as the modulo version
*/

Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)
Tom's version seems somewhat faster than the web article version.
The timings below are after profile guided optimization.

/*
Now that I actually understand how it works, I like the subtraction trick a
lot better. In this instance, (lots of small repeated factors) it takes the
same number of iterations as the modulo version
*/
unsigned long long modulos = 0;
unsigned long long subtractions = 0;
unsigned long long gcdm(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b % a;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}
unsigned long long gcds(unsigned long long a, unsigned long long b)
{
unsigned long long k, t;

k = 0;
while (!(a & 1 || b & 1)) {
++k; a >>= 1; b >>= 1;
}
while (!(a & 1)) { a >>= 1; }
while (!(b & 1)) { b >>= 1; }

while (b) {
if (a > b) { t = a; a = b; b = t; }
b = b - a;
while (b && !(b & 1)) { b >>= 1; }
}
return a << k;
}

unsigned long long gcd(unsigned long long u, unsigned long long v) {
unsigned long long k = 0;
if (u == 0)
return v;
if (v == 0)
return u;
while ((u & 1) == 0 && (v & 1) == 0) { /* while both u and v are even */
u >>= 1; /* shift u right, dividing it by 2 */
v >>= 1; /* shift v right, dividing it by 2 */
k++; /* add a power of 2 to the final result */
}
/* At this point either u or v (or both) is odd */
do {
if ((u & 1) == 0) /* if u is even */
u >>= 1; /* divide u by 2 */
else if ((v & 1) == 0) /* else if v is even */
v >>= 1; /* divide v by 2 */
else if (u >= v) /* u and v are both odd */
u = (u-v) >> 1;
else /* u and v both odd, v > u */
v = (v-u) >> 1;
} while (u > 0);
return v << k; /* returns v * 2^k */
}

unsigned long long big=129746337890625;
unsigned long long med=8649755859375*7;

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
unsigned long long randvals[1000000];
int main(void)
{
clock_t start;
clock_t end;
size_t index;
unsigned long long answer;
for (index = 0; index < 1000000; index++)
{
randvals[index] = rand();
randvals[index] <<= 32;
randvals[index] += rand();
}
answer = 0;
start = clock();
for (index = 0; index < 1000000-1; index++)
answer += gcdm(randvals[index], randvals[index+1]);
end = clock();
printf("clocks to do one million modular gcd's = %u, summed GCD = %llu\n",
(unsigned)(end - start), answer);
answer = 0;
start = clock();
for (index = 0; index < 1000000-1; index++)
answer += gcds(randvals[index], randvals[index+1]);
end = clock();
printf("clocks to do one million subtraction gcd's = %u, summed GCD = %llu\n",
(unsigned)(end - start), answer);
answer = 0;
start = clock();
for (index = 0; index < 1000000-1; index++)
answer += gcd(randvals[index], randvals[index+1]);
end = clock();
printf("clocks to do one million subtraction gcd's {web article version} = %u, summed GCD = %llu\n",
(unsigned)(end - start), answer);
return 0;
}

/*
clocks to do one million modular gcd's = 1031, summed GCD = 9203554
clocks to do one million subtraction gcd's = 766, summed GCD = 9203554
clocks to do one million subtraction gcd's {web article version} = 1109,
summed GCD = 9203554
Press any key to continue . . .

*/

--
int main(void){char
p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int
putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof
p-1;putchar(p[i]\
);}return 0;}

Jul 17 '06 #30
"Tom St Denis" <to********@gmail.com> writes:
Ben Pfaff wrote:
>Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)

Hmm, lots of them are, but not all. [...]
For what it's worth, that was meant as a joke.
--
"Give me a couple of years and a large research grant,
and I'll give you a receipt." --Richard Heathfield
Jul 17 '06 #31

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@benpfaff.org...
"Tom St Denis" <to********@gmail.com> writes:
>Ben Pfaff wrote:
>>Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)

Hmm, lots of them are, but not all. [...]

For what it's worth, that was meant as a joke.
--
"Give me a couple of years and a large research grant,
and I'll give you a receipt." --Richard Heathfield

Jul 17 '06 #32
"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@benpfaff.org...
"Tom St Denis" <to********@gmail.com> writes:
>Ben Pfaff wrote:
>>Are we talking about the same binary GCD algorithm that's in
http://en.wikipedia.org/wiki/Binary_GCD_algorithm

(Every algorithm is in Knuth if you look hard enough.)

Hmm, lots of them are, but not all. [...]

For what it's worth, that was meant as a joke.
It's pretty darn close to correct. It is very unusual to need an algorithm
that is not covered in one of the volumes.

As an aside, some of his algorithms are superior to those in common use...
{And TAOCP was written when?!}

E.g. Merge insertion sorting is fabulous.
--
"Give me a couple of years and a large research grant,
and I'll give you a receipt." --Richard Heathfield

Jul 17 '06 #33
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:

Division is not quadratic in hardware.
It will take a definite, fixed number of cycles at most. (17 for a 64 bit
AMD CPU, IIRC).

Actually you're still wrong. It's either quadratic in time or space
[or both]. You can make it O(1) time by taking O(n^2) space but that's
still "quadratic". You linearize it by taking linear space O(n) for
O(n) which is NOT quadratic [just not practical for the size of numbers
you work with in processors] by using interpolation [e.g. Karatsuba]

hint: Why do you think multipliers are fast. It isn't because they
violated the O(n^2) rule. It's because the multipliers are big[*].
They take up nearly O(n^2) space to do the multiplication in a trivial
amount of time.
I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

If something takes a fixed number of cycles to complete at most, then it is
not O(log(n)) or O(n^2) or anything else. It is O(1).

Subtraction, division, multiplication, modulus, etc. are ALL O(1) on modern
64 bit hardware if the integer values are also 64 bit.
>Subtraction will be faster, but only by a small, fixed constant.

If your hardware does not support division, then the subtraction method

Or if you have to work with larger numbers. Even with an O(1) division
opcode, the division of n-digit numbers is O(n^2) time. So even if
divide == sub for time on a single digit it'd be slower for larger
numbers [proof: division by shift/subtract is slower than a single
subtraction].
I am not arguing this point. I also agree that the subtraction method is
better. The notion that the hardware version of the algorithm is O(n^2) is
pure fantasy on your part, though.
Tom
[*] Some processors use a Karatsuba trick but so far I haven't heard of
any x86 series using it as a fact....

Jul 17 '06 #34
"Dann Corbit" <dc*****@connx.com> writes:
[...]
I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

If something takes a fixed number of cycles to complete at most, then it is
not O(log(n)) or O(n^2) or anything else. It is O(1).

Subtraction, division, multiplication, modulus, etc. are ALL O(1) on modern
64 bit hardware if the integer values are also 64 bit.
In this case, n is fixed at 64, so O(n) and O(1) are effectively the
same thing.

If you want to handle n > 64 (multiplication of arbitrarily large
numbers), it's probably going to exceed O(1).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 17 '06 #35
Dann Corbit wrote:
I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

If something takes a fixed number of cycles to complete at most, then it is
not O(log(n)) or O(n^2) or anything else. It is O(1).
No, that's wrong. In the case of multiplication they traded space for
performance. That's still a O(n^2) algorithm even though it may take
O(1) time.

What do you think processors just magically multiply 64 bit quantities
with a O(1) space algorithm?

If that were the case we'd be doing RSA in x86 processors in a dozen
cycles with a single opcode.
Subtraction, division, multiplication, modulus, etc. are ALL O(1) on modern
64 bit hardware if the integer values are also 64 bit.
They may take O(1) TIME [which even then is debatable] but they
certainly are not O(1) algorithms.

Think of O() as a unit of resources. Multiplication takes n^2
bit-multiplications to compute for n bits [where here a mult is and
AND]. You can compute it serially in n^2 time or if you have n^2
parallel multipliers in O(1) time. The algorithm still takes O(n^2)
resources.
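The n^2 bit-multiplications described above are easiest to see in a serial schoolbook multiplier. The sketch below (my own code and naming, not anyone's posted implementation) runs the n passes serially; hardware does the same passes in parallel by spending the space instead:

```c
#include <stdint.h>
#include <assert.h>

/* Schoolbook binary multiplication: one pass per bit of b, each pass
   doing an AND (the gate) and an n-bit shifted addition -- n^2 bit
   operations in total when run serially. */
uint64_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint64_t acc = 0;
    int i;

    for (i = 0; i < 32; i++)
        if ((b >> i) & 1)               /* bit i of b selects a partial product */
            acc += (uint64_t)a << i;    /* add a shifted copy of a */
    return acc;
}
```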

By your logic, expanding a basic multiplier would always be more
efficient than Karatsuba since "it's linear" which simply isn't true.
Or if you have to work with larger numbers. Even with an O(1) division
opcode, the division of n-digit numbers is O(n^2) time. So even if
divide == sub for time on a single digit it'd be slower for larger
numbers [proof: division by shift/subtract is slower than a single
subtraction].

I am not arguing this point. I also agree that the subtraction method is
better. The notion that the hardware version of the algorithm is O(n^2) is
pure fantasy on your part, though.
I said the ALGORITHM is quadratic. Not the implementation. For all I
know your processor HAS a linear time divider [a real one] and
therefore the algorithm is linear. But using the typical notions of a
processor that algorithm is fully quadratic.

Think about it, replace "unsigned long" with "bigint_t". The ALGORITHM
is the same just the numbers changed size. Why is it now quadratic and
before it wasn't?

Tom

Jul 17 '06 #36
"Tom St Denis" <to********@gmail.com> writes:
Dann Corbit wrote:
>I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

If something takes a fixed number of cycles to complete at most, then it is
not O(log(n)) or O(n^2) or anything else. It is O(1).

No, that's wrong. In the case of multiplication they traded space for
performance. That's still a O(n^2) algorithm even though it may take
O(1) time.
You seem to believe that you multiply time by space to obtain the
cost of an algorithm. I've never seen anyone do that before.
Merge sort takes O(n lg n) time and O(n) additional space, so
does that make it an O(n^2 lg n) algorithm?

I could accept that multiplication of fixed-size integers can be
implemented given O(1) time and O(n^2) space. That doesn't make
it "an O(n^2) algorithm"; that's a meaningless thing to say. It
makes it an O(1) time, O(n^2) space algorithm.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Jul 17 '06 #37
Ben Pfaff wrote:
You seem to believe that you multiply time by space to obtain the
cost of an algorithm. I've never seen anyone do that before.
Merge sort takes O(n lg n) time and O(n) additional space, so
does that make it an O(n^2 lg n) algorithm?
No, because it doesn't take n space per unit of time.

Basic multiplication is O(n^2) because there are n^2 steps. Whether
they happen in parallel or not doesn't matter.

Karatsuba is a O(n^1.583) time algorithm because there are n^1.583
multiplications. Whether you do them all in parallel doesn't matter
(and in fact in hardware that's usually the case).
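One level of the Karatsuba trick can be shown with fixed-width integers. This is a sketch under my own naming (not a real hardware design): three half-width products replace the four the schoolbook method needs.

```c
#include <stdint.h>
#include <assert.h>

/* One Karatsuba step: 32x32 -> 64 using three 16x16 products
   instead of four. */
uint64_t karatsuba32(uint32_t a, uint32_t b)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    uint64_t hi  = (uint64_t)a_hi * b_hi;   /* product of high halves */
    uint64_t lo  = (uint64_t)a_lo * b_lo;   /* product of low halves  */
    /* (a_hi + a_lo)(b_hi + b_lo) - hi - lo == a_hi*b_lo + a_lo*b_hi */
    uint64_t mid = (uint64_t)(a_hi + a_lo) * (b_hi + b_lo) - hi - lo;

    return (hi << 32) + (mid << 16) + lo;
}
```

Recursing on each of the three products, instead of stopping at one level, is what gives the n^1.583 multiplication count cited above.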

The complexity of an algorithm doesn't change just because you trade
memory or space for time.

Besides, you could do all compares in parallel. Therefore, merge sort
is a O(ln n) algorithm. But wait, space complexity doesn't matter. So
therefore all sorting should be O(1) complexity!!!
I could accept that multiplication of fixed-size integers can be
implemented given O(1) time and O(n^2) space. That doesn't make
it "an O(n^2) algorithm"; that's a meaningless thing to say. It
makes it an O(1) time, O(n^2) space algorithm.
Well first off O() refers to complexity. If you TRADE space for time
that doesn't lower the complexity. It lowers the time. I'll grant you
the TIME complexity of a O(n^2) space multiplier is O(1).

Tom

Jul 17 '06 #38
"Tom St Denis" <to********@gmail.comwrote in message
Dann Corbit wrote:
>I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

If something takes a fixed number of cycles to complete at most, then it is
not O(log(n)) or O(n^2) or anything else. It is O(1).

No, that's wrong. In the case of multiplication they traded space for
performance. That's still a O(n^2) algorithm even though it may take
O(1) time.
If you use the martian definition.
What do you think processors just magically multiply 64 bit quantities
with a O(1) space algorithm?
Space is constant also. Do you think that the table grows and shrinks as I
multiply different numbers? It does not.
If that were the case we'd be doing RSA in x86 processors in a dozen
cycles with a single opcode.
RSA has to worry about 512 bit numbers and larger, which are not even in
this conversation.
>Subtraction, division, multiplication, modulus, etc. are ALL O(1) on
modern
64 bit hardware if the integer values are also 64 bit.

They may take O(1) TIME [which even then is debatable] but they
certainly are not O(1) algorithms.
O(1) means that it completes in constant time. I guess that A*B or A/C or
A-C will complete in constant time if A,B,C are hardware integers.
Think of O() as a unit of resources. Multiplication takes n^2
bit-multiplications to compute for n bits [where here a mult is and
AND]. You can compute it serially in n^2 time or if you have n^2
parallel multipliers in O(1) time. The algorithm still takes O(n^2)
resources.
The size of the lookup table is fixed. The time to complete the unit of
work is fixed. Both are O(1) by the raw definition of the terms.

If you are talking about implementing arbitrary precision operations, then
your statements make sense. They make no sense at all in this context.
By your logic, expanding a basic multiplier would always be more
efficient than Karatsuba since "it's linear" which simply isn't true.
Something that executes in a fixed amount of time is O(1). That's the
definition of O(1). An O(1) algorithm could take a century to complete
(that's a fixed amount of time) and an exponential algorithm could take a
nanosecond to complete (because of the problem given to the algorithm).
Or if you have to work with larger numbers. Even with an O(1) division
opcode, the division of n-digit numbers is O(n^2) time. So even if
divide == sub for time on a single digit it'd be slower for larger
numbers [proof: division by shift/subtract is slower than a single
subtraction].

I am not arguing this point. I also agree that the subtraction method is
better. The notion that the hardware version of the algorithm is O(n^2) is
pure fantasy on your part, though.

I said the ALGORITHM is quadratic. Not the implementation. For all I
know your processor HAS a linear time divider [a real one] and
therefore the algorithm is linear. But using the typical notions of a
processor that algorithm is fully quadratic.
The algorithm is not quadratic. The divider finishes in constant time.
Think about it, replace "unsigned long" with "bigint_t". The ALGORITHM
is the same just the numbers changed size. Why is it now quadratic and
before it wasn't?
Why should I replace unsigned long long with arbitrary precision? I made it
very clear that I was discussing the algorithm for hardware implementations.
Tom

Jul 18 '06 #39
Dann Corbit wrote:
No, that's wrong. In the case of multiplication they traded space for
performance. That's still a O(n^2) algorithm even though it may take
O(1) time.

If you use the martian definition.
I don't see the problem here. The algorithm has a complexity of O(n^2)
whether that is space or time it's still that complex.

Why do things get more efficient as you use more resources?
What do you think processors just magically multiply 64 bit quantities
with a O(1) space algorithm?

Space is constant also. Do you think that the table grows and shrinks as I
multiply different numbers? It does not.
Different SIZED NUMBERS. Yes, in fact the size grows quadratically
with the size of the input if you want to keep O(1) time.

What? You think a 16-bit and 64-bit multiplier are the same size for
constant time?
If that were the case we'd be doing RSA in x86 processors in a dozen
cycles with a single opcode.

RSA has to worry about 512 bit numbers and larger, which are not even in
this conversation.
They're the same thing. Fundamentally a "multiplication algorithm" is
used even in the cpu. You're treating the processor as a huge black
box where everything is linear time/space. That's just not the case.
[hint: if it were processors would have more multipliers]
Subtraction, division, multiplication, modulus, etc. are ALL O(1) on
modern
64 bit hardware if the integer values are also 64 bit.
They may take O(1) TIME [which even then is debatable] but they
certainly are not O(1) algorithms.

O(1) means that it completes in constant time. I guess that A*B or A/C or
A-C will complete in constant time if A,B,C are hardware integers.
TIME is not the only measure of complexity.
Think of O() as a unit of resources. Multiplication takes n^2
bit-multiplications to compute for n bits [where here a mult is and
AND]. You can compute it serially in n^2 time or if you have n^2
parallel multipliers in O(1) time. The algorithm still takes O(n^2)
resources.

The size of the lookup table is fixed. The time to complete the unit of
work is fixed. Both are O(1) by the raw definition of the terms.
You're either ignorant of computer science or really obtuse. O()
notation is in terms of the size of the input. And no, for O(1) time
multiplication the table size is neither fixed, nor linearly dependent
on the size of the input.
If you are talking about implementing arbitrary precision operations, then
your statements make sense. They make no sense at all in this context.
It applies just as much to processor components as it does software.
You're just really naive and think otherwise [regardless, gcdm is still
quadratic unless you specify a different way to divide so it doesn't
matter]
By your logic, expanding a basic multiplier would always be more
efficient than Karatsuba since "it's linear" which simply isn't true.

Something that executes in a fixed amount of time is O(1). That's the
definition of O(1). An O(1) algorithm could take a century to complete
(that's a fixed amount of time) and an exponential algorithm could take a
nanosecond to complete (because of the problem given to the algorithm).
TIME is not the only measure of complexity.
I said the ALGORITHM is quadratic. Not the implementation. For all I
know your processor HAS a linear time divider [a real one] and
therefore the algorithm is linear. But using the typical notions of a
processor that algorithm is fully quadratic.

The algorithm is not quadratic. The divider finishes in constant time.
This is impossible. The algorithm isn't constant. The implementation
may be [due to a specific multiplier] but as you wrote it the algorithm
itself is quadratic time because division is of O(n^2) complexity.
Think about it, replace "unsigned long" with "bigint_t". The ALGORITHM
is the same just the numbers changed size. Why is it now quadratic and
before it wasn't?

Why should I replace unsigned long long with arbitrary precision? I made it
very clear that I was discussing the algorithm for hardware implementations.
Ok but unless you use a specific division opcode which is of O(1)
complexity [none exist btw] then the algorithm itself isn't of O(1)
complexity.

You're confusing runtime with the actual implementation details. Your
multiplier is fast because it takes O(n^2) space. It's still the same
O(n^2) multiplication algorithm you learned as a 6 year old just all of
the mults are done in parallel.

Complexity is a measure of both space and time not just time.

I suppose it's too much to tell you that addition is O(n) complexity
too then right?

....

Tom

Jul 18 '06 #40
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
No, that's wrong. In the case of multiplication they traded space for
performance. That's still a O(n^2) algorithm even though it may take
O(1) time.

If you use the martian definition.

I don't see the problem here. The algorithm has a complexity of O(n^2)
whether that is space or time it's still that complex.
If an algorithm finishes in a fixed time and never takes more than a limited
number of clocks to complete, then it is a constant time algorithm. That's
the definition. I am talking about a hardware multiply of a hardware
integer on some C implementation.
Why do things get more efficient as you use more resources?
There is no claim on my part of increased efficiency with greater resources.
What do you think processors just magically multiply 64 bit quantities
with a O(1) space algorithm?

Space is constant also. Do you think that the table grows and shrinks as I
multiply different numbers? It does not.

Different SIZED NUMBERS. Yes, in fact the size grows quadratically
with the size of the input if you want to keep O(1) time.
All the integers involved are 64 bit (or less). The algorithm I supplied
(and the one that you supplied) do not use variable sized registers or
numbers. It is an implementation that uses integers of a single, constant
size.
What? You think a 16-bit and 64-bit multiplier are the same size for
constant time?
In what way does the implementation that you supplied change the size of its
registers as it runs? The answer is, 'the integers in this algorithm are
all of the same size and they do not change width during the operation of
the algorithm'.

I think that a 64 bit hardware number uses an ALU and mutiplies in constant
time.
I have not made any claims about changing the size of the multiplier.

The algorithm that YOU supplied uses a constant sized hardware integer.

I think that you are smart enough to know that you are wrong. If not, then
I don't see any point in continuing the discussion.

It's OK to be wrong. I'm wrong a lot. But when you are wrong and you know
you are wrong and you insist that you are right it looks pretty strange.

[snip]
Jul 18 '06 #41
Dann Corbit wrote:
If an algorithm finishes in a fixed time and never takes more than a limited
number of clocks to complete, then it is a constant time algorithm. That's
the definition. I am talking about a hardware multiply of a hardware
integer on some C implementation.
I never said the algorithm was quadratic time. I said it's O(n^2)
complexity. It's linear because the processor trades space for
division time (even then though division algos usually have early outs
so they're not constant time anyways but let's disregard reality for
this conversation since "facts" have waved bye bye long ago).
Different SIZED NUMBERS. Yes, in fact the size grows quadratically
with the size of the input if you want to keep O(1) time.

All the integers involved are 64 bit (or less). The algorithm I supplied
(and the one that you supplied) do not use variable sized registers or
numbers. It is an implementation that uses integers of a single, constant
size.
Your implementation may be constant time (though I wouldn't say that in
a portable C group) but the algorithm is not constant complexity.
I think that you are smart enough to know that you are wrong. If not, then
I don't see any point in continuing the discussion.
Your asserting something I'm not trying to maintain. I said your
algorithm is O(n^2) complexity. I never said it takes that time [or if
I did originally that was in error, and I've been saying otherwise for
the last 10 replies or so...]
It's OK to be wrong. I'm wrong a lot. But when you are wrong and you know
you are wrong and you insist that you are right it looks pretty strange.
Well given that bignum math is one of my fortes, I'm comfortable in my
position. It's you who doesn't really seem to understand complexity
notation or discussion.

If we used your logic, sorting N numbers with bubble sort is N^2 steps,
but what if N is constant? Well now bubble sort is constant time.
Therefore, bubble sort is the most efficient sort algorithm.

No, bubble sort is still N^2 but your instance is constant time because
the size doesn't change. Similarly, if you always multiply 64-bit
quantities you have a constant time, but the complexity is not constant
since multiplication is not constant complexity. Big-Oh notation is
about asymptotics and in our case about the complexity of something as
the size of the input changes.

Getting back to it. Division is a O(n^2) complexity problem when
implemented serially. There is no getting around that. You have n
bits and you have to perform n n-bit subtractions. That's n^2 bit
operations. You may implement various operations in parallel but that
doesn't change the complexity just the time.
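The serial division described here, n compare/subtract steps over n-bit values, can be sketched as a restoring shift-and-subtract divider (my own sketch and naming; no real ALU is wired exactly this way):

```c
#include <stdint.h>
#include <assert.h>

/* Restoring shift-and-subtract division: 32 iterations, each doing an
   n-bit compare and possibly an n-bit subtraction -- the O(n^2) bit
   operations when run serially. Assumes den != 0. */
uint32_t div_shift_sub(uint32_t num, uint32_t den, uint32_t *rem)
{
    uint32_t q = 0;
    uint64_t r = 0;   /* 64 bits so the shift never overflows */
    int i;

    for (i = 31; i >= 0; i--) {
        r = (r << 1) | ((num >> i) & 1);  /* bring down the next bit */
        if (r >= den) {                    /* n-bit compare */
            r -= den;                      /* n-bit subtract */
            q |= 1u << i;
        }
    }
    *rem = (uint32_t)r;
    return q;
}
```

For example, div_shift_sub(100, 7, &r) returns 14 with remainder 2.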

Anyways enough ranting. You're too hung up on the specifics of the
problem and are ignoring the actual meaning of what I'm trying to say.
So think whatever the hell you want. Doesn't really matter I guess.

Tom

Jul 18 '06 #42
"Tom St Denis" <to********@gmail.com> wrote in message
Dann Corbit wrote:
>If an algorithm finishes in a fixed time and never takes more than a limited
number of clocks to complete, then it is a constant time algorithm. That's
the definition. I am talking about a hardware multiply of a hardware
integer on some C implementation.

I never said the algorithm was quadratic time. I said it's O(n^2)
complexity. It's linear because the processor trades space for
division time (even then though division algos usually have early outs
so they're not constant time anyways but let's disregard reality for
this conversation since "facts" have waved bye bye long ago).
Different SIZED NUMBERS. Yes, in fact the size grows quadratically
with the size of the input if you want to keep O(1) time.

All the integers involved are 64 bit (or less). The algorithm I supplied
(and the one that you supplied) do not use variable sized registers or
numbers. It is an implementation that uses integers of a single, constant
size.

Your implementation may be constant time (though I wouldn't say that in
a portable C group) but the algorithm is not constant complexity.
>I think that you are smart enough to know that you are wrong. If not, then
I don't see any point in continuing the discussion.

You're asserting something I'm not trying to maintain. I said your
algorithm is O(n^2) complexity. I never said it takes that time [or if
I did originally that was in error, and I've been saying otherwise for
the last 10 replies or so...]
>It's OK to be wrong. I'm wrong a lot. But when you are wrong and you
>know you are wrong and you insist that you are right it looks pretty
>strange.

Well given that bignum math is one of my fortes, I'm comfortable in my
position. It's you who doesn't really seem to understand complexity
notation or discussion.

If we used your logic, sorting N numbers with bubble sort is N^2 steps,
but what if N is constant? Well now bubble sort is constant time.
Therefore, bubble sort is the most efficient sort algorithm.

No, bubble sort is still N^2 but your instance is constant time because
the size doesn't change. Similarly, if you always multiply 64-bit
quantities you have a constant time, but the complexity is not constant
since multiplication is not constant complexity. Big-Oh notation is
about asymptotics and in our case about the complexity of something as
the size of the input changes.

Getting back to it. Division is a O(n^2) complexity problem when
implemented serially. There is no getting around that. You have n
bits and you have to perform n n-bit subtractions. That's n^2 bit
operations. You may implement various operations in parallel but that
doesn't change the complexity just the time.

Anyways enough ranting. You're too hung up on the specifics of the
problem and are ignoring the actual meaning of what I'm trying to say.
So think whatever the hell you want. Doesn't really matter I guess.
According to your definition this algorithm:

int multiply(int a, int b)
{
return a*b;
}
is O(n^2)

and this algorithm:

int subtract(int a, int b)
{
return a-b;
}
is O(log(n))

It just shows that you have no idea what the terms mean.

Both of those algorithms are O(1).

Now (on the other hand) if we were implementing multiple precision
algorithms to perform the fundamental operations, then (depending on the
implementation) your argument might have some validity.

As it is, you have just demonstrated your preposterous ignorance for all time
in USENET, to be salted away into the archives for posterity.
Congratulations.
At least some future denizens of earth may get a good chuckle from it.
Jul 18 '06 #43
Dann Corbit wrote:
According to your definition this algorithm:

int multiply(int a, int b)
{
return a*b;
}
is O(n^2)
*complexity*. Yes, that's true.

You can't use O() notation for constant sized data sets. You're
basically saying "as 64 -> oo this function takes constant time".
That's nonsense.

However, for the general term "int" without upper bounds the algorithm
has a O(n^2) complexity.
and this algorithm:

int subtract(int a, int b)
{
return a-b;
}
is O(log(n))
Actually, addition and subtraction have O(n) complexity.
It just shows that you have no idea what the terms mean.
Funny, I was just going to say the same thing.
Both of those algorithms are O(1).
No. They're not. And for the very reason you think, O() notation does
not apply.
Now (on the other hand) if we were implementing multiple precision
algorithms to perform the fundamental operations, then (depending on the
implementation) your argument might have some validity.
Might ... does ... whatever.
As it is, you have just demonstrated your preposterous ignorance for all time
in USENET, to be salted away into the archives for posterity.
Congratulations.
At least some future denizens of earth may get a good chuckle from it.
First off, supposing I was wrong [which in this regard I'm not], this
is far from the worst post I've ever made. I've been on usenet since I
was ~16 or so and have enough good and bad posts to be rather "famous",
at least in the sci.crypt world.

I don't get why you're arguing this. You can't seem to discuss
complexity with any sense of proficiency.

Tom

Jul 18 '06 #44
"Tom St Denis" <to********@gmail.com> writes:
Dann Corbit wrote:
[snip]

I wonder if both of you would consider taking this elsewhere.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jul 18 '06 #45
Dann Corbit wrote:
I'm sorry, but I'm not wrong. You fundamentally misunderstand what O(f(n))
means.

Subtraction, division, multiplication, modulus, etc. are ALL O(1) on modern
64 bit hardware if the integer values are also 64 bit.
I'll add my voice to the chorus: you fundamentally misunderstand
what it means.

To say that an algorithm is O(n^2) means that as n approaches
infinity, the complexity of the algorithm approaches n^2 (perhaps
multiplied by some constant).

It is meaningless to assert "Operation X is O(1) on two 64-bit
numbers". Big-O notation is for describing how the metrics of
an algorithm grow as the size of the parameters grows. It
cannot be used to assess one specific instance.

To say that multiplication is O(n^2) is to say that a 128-bit
multiplication takes roughly 4 times as many bit operations
as a 64-bit multiplication. It says nothing about how long a 64-bit
multiplication takes in real time, nor about how a 64-bit
multiplication compares to a 64-bit addition.

A further note: in this case, "n" is the size of the number (eg.
the number of bits) and complexity is taken as the number of
bit-operations needed.

Other types of complexity can also be measured by this notation,
eg. time-complexity and space-complexity. Also, complexity can
be measured against other parameters besides the size of the inputs.

Jul 18 '06 #46

lovecreatesbeauty wrote:
Keith Thompson wrote:
"lovecreatesbeauty" <lo***************@gmail.com> writes:
Keith Thompson wrote:
>"Julian V. Noble" <jv*@virginia.edu> writes:
>[...]
int gcd( int m, int n ) // recursive
{
if(n==0) { return m; }
else { int temp = m % n; return gcd( n, temp ); }
}
>>
>The temporary is unnecessary, and the formatting is just odd. Here's
>how I'd write it:
>>
>int gcd(int m, int n)
>{
> if (n == 0) {
> return m;
> }
> else {
> return gcd(n, m % n);
> }
>}
>
Thanks.
>
The function accepts zeros and negative integers as their arguments.
How are users without much mathematical knowledge like me supposed to
provide correct input to use it?
The obvious solution to that is to change the arguments and result
from int to unsigned.

Different mathematical functions have different domains and ranges
(values that make sense for arguments and results). C, unlike some
other languages, doesn't let you declare a type with a specified range
of values -- but in this case, unsigned happens to be just what you
want.

The recursive version is then the worst choice, since it cannot easily
be extended with more parameter validation. If the function accepts
input from other automated software systems, then someone should still
keep an eye on it, because the result may be wrong without warning.

int gcd(int x, int y){
if (x > 0 && y > 0){
while (x % y){
y = x + y;
x = y - x;
y = (y - x) % x;
}
} else {
y = -1; /*input invalid*/
}
return y;
}

int gcd3(int m, int n){
if (n == 0){
return m;
} else {
return gcd3(n, m % n);
}
}
...
The greatest common divisor is perfectly well
defined for negative integers: gcd(-4,-6) == 2.

int gcd(int a, int b)
{ return b ? gcd(b,a%b) : a<0 ? -a : a ? a : 1; }

Change the 1 to a -1 if you want gcd(0,0) to
return an "error indicator".

By the way, any halfway decent compiler will turn
a recursive version like the one shown here into
an iterative one. No need to micro-optimize.

Jul 18 '06 #47
On 17 Jul 2006 20:33:49 -0700, in comp.lang.c , "Tom St Denis"
<to********@gmail.com> wrote:
>You can't use O() notation for constant sized data sets. You're
basically saying "as 64 -> oo this function takes constant time".
That's nonsense.
You apparently have absolutely no idea what this notation means.
Perhaps you should read some books, and then creep quietly back.
>I don't get why you're arguing this. You can't seem to discuss
complexity with any sense of proficiency.
Most amusing.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Jul 18 '06 #48
Mark McIntyre wrote:
On 17 Jul 2006 20:33:49 -0700, in comp.lang.c , "Tom St Denis"
<to********@gmail.com> wrote:
You can't use O() notation for constant sized data sets. You're
basically saying "as 64 -> oo this function takes constant time".
That's nonsense.

You apparently have absolutely no idea what this notation means.
Perhaps you should read some books, and then creep quietly back.
Um ok. Show me where O() notation applies to a constant sized input.

I realize this is clc and not some math group but computer scientists
should understand how to speak about complexity. Saying multiplication
is O(1) is a surefire sign that you don't know what the heck you're
talking about.

Tom

Jul 18 '06 #49
Tom St Denis wrote:
Mark McIntyre wrote:
>On 17 Jul 2006 20:33:49 -0700, in comp.lang.c , "Tom St Denis"
<to********@gmail.com> wrote:
>You can't use O() notation for constant sized data sets. You're
basically saying "as 64 -> oo this function takes constant time".
That's nonsense.

You apparently have absolutely no idea what this notation means.
Perhaps you should read some books, and then creep quietly back.

Um ok. Show me where O() notation applies to a constant sized input.

I realize this is clc and not some math group but computer scientists
should understand how to speak about complexity. Saying multiplication
is O(1) is a surefire sign that you don't know what the heck you're
talking about.
Isn't saying "multiplication is O(1)" saying "multiplication takes
constant time", independent of the operands?

Which - for fixed-size arguments, eg 32-bit int(egers) such as are
found on many machines & in many programming languages - is, I thought,
true.

Of course (as someone alluded to) if the operands are of variable
size, not fixed in advance, as might be found in an unbounded-precision
library or (sort of equivalently) in Lisp/Pop11/Smalltalk general arithmetic,
then multiplication is ... unlikely ... to be O(1).

[It /could/ be O(1). But I'd prefer not to use that implementation.]

--
Chris "I'd rather owe(1) than owe(10)" Dollin
"People are part of the design. It's dangerous to forget that." /Star Cops/

Jul 18 '06 #50
