Long Num speed

For a while now I have been "playing" with a little C program to
compute the factorial of large numbers. Currently it takes about 1
second per 1000 multiplications; that is, 25000P1000 will take about a
second. It will be longer for 50000P1000, as expected, since more digits
will be in the answer. Now, on the Num Analysis forum/group there is a
post reporting that someone wrote Java code that computed 1000000!
in about a second. That is about 10000 times faster than I would expect
my code to do it. So the two possibilities are:
1) I am doing something terribly wrong
2) The other person is lying

At the moment I am inclined to believe that it's number 1.

I am posting my code below. I would like to hear your opinions about
why it is slow and how I can improve its speed.

I know that there are public BIGNUM libraries which are already
optimized for such calculations, but I don't want to use them, because I
want to approach this problem at a lower level. I am mostly interested
in finding out how to get this code to perform faster or what alternative
algorithms I should consider. The factorial calculation is just a test
program.

=================== start paste ===================

#include<stdio.h>
#include<stdlib.h>
#include<math.h>

#define al 1024*20
#define base 1000
typedef long int IntegerArrayType;

struct AEI{
IntegerArrayType data[al];
long int digits;
};

void pack(IntegerArrayType i, struct AEI *N1);
void Amult(struct AEI * A, struct AEI * B, struct AEI * C);
void AEIprintf(struct AEI * N1);
void AEIfprintf(FILE * fp, struct AEI * N1);
int main(void)
{

struct AEI *N1, *MO, *Ans;
long i = 0, j = 0, ii, NUM, iii;
FILE *ff;

N1=malloc(sizeof(struct AEI));
MO=malloc(sizeof(struct AEI));
Ans=malloc(sizeof(struct AEI));
while (i < al){
N1->data[i] = 0;
MO->data[i] = 0;
Ans->data[i]=0;
i++;
}

printf("Enter integer to Factorialize: ");
scanf("%ld", &NUM);

pack(1, N1);
pack(1, Ans);
ff = fopen("Results.txt", "w");
printf("you entered: %ld", NUM);

i=1;
while(i < NUM ){

iii=0;
while(iii<NUM && iii<1000){
ii = 1;
while (ii < al)
{
MO->data[ii] = 0;
ii++;
}
pack((i+iii), MO);
Amult(N1, MO, N1);
iii++;
}
i+=iii;
Amult(Ans, N1, Ans);
printf("\nProgress is: %ld",i);
pack(1, N1);
}
if(ff!=NULL){
fprintf(ff,"\n%ld\n",i-1);
AEIfprintf(ff, Ans);
}
fclose(ff);

printf("\nProgress: 100\%");

return 0;
}
void AEIprintf(struct AEI *N1){

float fieldLength;
double temp;
char format1[8];
long j, FL0;
j = N1->digits-1;
FL0=(long)log10((float)base);
fieldLength = (float)log10((float)base);
temp = modf(fieldLength, &fieldLength);
format1[0] = '%';
format1[1] = '0';
format1[2] = fieldLength + 48;
format1[3] = 'd';
format1[4] = 0x00;

printf("%*d", FL0, N1->data[j]);
j--;

while (j >= 0)
{
printf(format1, N1->data[j]);

j--;
}

return;
}
void AEIfprintf(FILE * fp, struct AEI *N1){
long j = N1->digits-1;

double fieldLength, temp;
char format0[8], format1[8];

fieldLength = (int)log10(base);
temp = modf(fieldLength, &fieldLength);

format0[0] = '%';
format0[1] = fieldLength + 48;
format0[2] = 'd';
format0[3] = 0x00;
format1[0] = '%';
format1[1] = '0';
format1[2] = fieldLength + 48;
format1[3] = 'd';
format1[4] = 0x00;

fprintf(fp,format0, N1->data[j]);
j--;

while (j >= 0){
fprintf(fp, format1, N1->data[j]);
j--;
}
return;
}

void pack(IntegerArrayType i, struct AEI * N1)
{
long t = 1, i1, j = 0;

while (t == 1){
i1 = i % base;
N1->data[j] = i1;
i = (i - i1) / base;
j++;
if (i == 0)
t = 0;
}
N1->digits=j;
return;
}


void Amult(struct AEI * A, struct AEI * B, struct AEI * C){
/*C = A * B; */
long i, ii,d, result, carry=0, digits=0;
struct AEI *Ans;
Ans=malloc(sizeof(struct AEI));
i=0;
d= (A->digits+B->digits-1);
while(i<d){
Ans->data[i]=carry;
carry=0;
ii=0;
while(ii<=i){
if(B->data[ii]!=0){
Ans->data[i]+=A->data[i-ii]*B->data[ii];
carry+=Ans->data[i]/base;
Ans->data[i]=Ans->data[i]%base;
}
ii++;
}
carry+=Ans->data[i]/base;
Ans->data[i]=Ans->data[i]%base;

i++;
}
if(carry!=0){
d++;
Ans->data[i]=carry;
}

C->digits=d;
i=0;
while(i<d){
C->data[i]=Ans->data[i];
i++;
}
return;
}

==================== end paste ====================

I tried to indent the code with spaces instead of tabs, but if some
parts end up not properly indented, I hope no one will hold it against
me.

Thanks ahead

Nov 1 '06 #1
fermineutron wrote:
> For a while now I have been "playing" with a little C program to
> compute the factorial of large numbers. Currently it takes about 1
> second per 1000 multiplications; that is, 25000P1000 will take about a
> second. It will be longer for 50000P1000, as expected, since more digits
> will be in the answer. Now, on the Num Analysis forum/group there is a
> post reporting that someone wrote Java code that computed 1000000!
> in about a second. That is about 10000 times faster than I would expect
> my code to do it. So the two possibilities are:
> 1) I am doing something terribly wrong
> 2) The other person is lying
>
> At the moment I am inclined to believe that it's number 1.
>
> I am posting my code below. I would like to hear your opinions about
> why it is slow and how I can improve its speed.

Before anyone reviews the code, have you profiled it?

If not, why not? If you have, where were the bottlenecks?

--
Ian Collins.
Nov 1 '06 #2
fermineutron:
> Now, on the Num Analysis forum/group there is a
> post reporting that someone wrote Java code that computed 1000000!
> in about a second. That is about 10000 times faster than I would expect
> my code to do it. So the two possibilities are:
> 1) I am doing something terribly wrong
> 2) The other person is lying

The burden of proof is on the Java dude.

Probability suggests that he's lying, because only a very small proportion of
proficient programmers waste their time on mickey-mouse hold-my-hand
languages such as Java. Also, Java is slow.

Either that or the algorithm's something stupid like:

char const *Func(void)
{
return "234950723957325847125797509327509237509327592387509";
}

Ask the Java dude if you can see his code. If he refuses, assume that he's a
liar, then egg his house.

--

Frederick Gotham
Nov 1 '06 #3

Ian Collins wrote:
> Before anyone reviews the code, have you profiled it?
>
> If not, why not? If you have, where were the bottlenecks?

I profiled it, but there were no obvious bottlenecks that I would not
expect to be there by design.

Here is the profiler output:

http://igorpetrusky.awardspace.com/Temp/RunStats.html

I was thinking that maybe there is some other algorithm that is better
than mine for the long int arithmetic?

Factorial calculation is just a driver for the multiplication function.

Nov 2 '06 #4
On 1 Nov 2006 14:28:20 -0800, "fermineutron" <fr**********@yahoo.com>
wrote:

> For a while now I have been "playing" with a little C program to
> compute the factorial of large numbers. Currently it takes about 1
> second per 1000 multiplications; that is, 25000P1000 will take about a
> second. It will be longer for 50000P1000, as expected, since more digits
> will be in the answer. Now, on the Num Analysis forum/group there is a
> post reporting that someone wrote Java code that computed 1000000!
> in about a second. That is about 10000 times faster than I would expect
> my code to do it. So the two possibilities are:
> 1) I am doing something terribly wrong
> 2) The other person is lying
>
> At the moment I am inclined to believe that it's number 1.
>
> I am posting my code below. I would like to hear your opinions about
> why it is slow and how I can improve its speed.
>
> I know that there are public BIGNUM libraries which are already
> optimized for such calculations, but I don't want to use them, because I
> want to approach this problem at a lower level. I am mostly interested
> in finding out how to get this code to perform faster or what alternative
> algorithms I should consider. The factorial calculation is just a test
> program.

I don't know if my comment in Amult addresses the question you voiced,
but you need to pay attention to your compiler diagnostics.
>
> =================== start paste ===================
>
> #include<stdio.h>
> #include<stdlib.h>
> #include<math.h>
>
> #define al 1024*20
> #define base 1000
> typedef long int IntegerArrayType;
>
> struct AEI{
> IntegerArrayType data[al];
> long int digits;
> };
>
> void pack(IntegerArrayType i, struct AEI *N1);
> void Amult(struct AEI * A, struct AEI * B, struct AEI * C);
> void AEIprintf(struct AEI * N1);
> void AEIfprintf(FILE * fp, struct AEI * N1);
> int main(void)
> {
>
> struct AEI *N1, *MO, *Ans;
> long i = 0, j = 0, ii, NUM, iii;
> FILE *ff;
>
> N1=malloc(sizeof(struct AEI));
> MO=malloc(sizeof(struct AEI));
> Ans=malloc(sizeof(struct AEI));
> while (i < al){
> N1->data[i] = 0;
> MO->data[i] = 0;
> Ans->data[i]=0;
> i++;
> }
>
> printf("Enter integer to Factorialize: ");

Please learn to indent consistently. This is like trying to roller
skate on cobblestones.

snip rest of function until

> printf("\nProgress: 100\%");

The correct way to include a '%' in a printf string is %%, not \%.
Didn't your compiler complain about an undefined escape sequence?

> return 0;
> }
> void AEIprintf(struct AEI *N1){
>
> float fieldLength;
> double temp;
> char format1[8];
> long j, FL0;
> j = N1->digits-1;
> FL0=(long)log10((float)base);
> fieldLength = (float)log10((float)base);
> temp = modf(fieldLength, &fieldLength);

This is a constraint violation. The second argument should be a
double*, not a float*. During execution it will invoke undefined
behavior. Didn't your compiler complain about a type mismatch?

> format1[0] = '%';
> format1[1] = '0';
> format1[2] = fieldLength + 48;

What is 48? You probably mean '0' and should use that.

> format1[3] = 'd';
> format1[4] = 0x00;

While the hex value is correct, '\0' is more in keeping with common C
idiom.

> printf("%*d", FL0, N1->data[j]);
> j--;
>
> while (j >= 0)
> {
> printf(format1, N1->data[j]);
>
> j--;
> }
>
> return;
> }
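
A minimal sketch of one way to sidestep the issues raised above, assuming the base is a power of ten and a C99 library (for `snprintf`). The helper names `limb_width` and `bignum_to_string` are hypothetical and not part of the posted program. The field width is computed with integer arithmetic, so neither `log10()` nor `modf()` is needed, and the width is passed to printf through `%0*ld` instead of building a format string by hand.

```c
#include <assert.h>
#include <stdio.h>

/* Number of decimal digits per limb for a power-of-ten base,
   e.g. 3 for base 1000. Integer arithmetic only. */
static int limb_width(long base)
{
    int width = 0;

    assert(base >= 10);
    while (base > 1) {
        base /= 10;
        width++;
    }
    return width;
}

/* Write limbs[0..count-1] (least significant limb first) as decimal text.
   Returns the number of characters written, or -1 if buf is too small. */
static int bignum_to_string(char *buf, size_t size,
                            const long *limbs, long count, long base)
{
    int width = limb_width(base);
    size_t used = 0;
    long j;

    if (size > 0)
        buf[0] = '\0';
    for (j = count - 1; j >= 0; j--) {
        /* The top limb is unpadded; the rest are zero-padded to `width`. */
        int n = (j == count - 1)
            ? snprintf(buf + used, size - used, "%ld", limbs[j])
            : snprintf(buf + used, size - used, "%0*ld", width, limbs[j]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With base 1000 and limbs {7, 45, 12}, the buffer ends up holding "12045007". A literal percent sign in a format string is written `%%`, as in `printf("Progress: 100%%\n");`.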

snip
>void Amult(struct AEI * A, struct AEI * B, struct AEI * C){
> /*C = A * B; */
> long i, ii,d, result, carry=0, digits=0;
> struct AEI *Ans;
> Ans=malloc(sizeof(struct AEI));
> i=0;
> d= (A->digits+B->digits-1);
> while(i<d){
> Ans->data[i]=carry;
> carry=0;
> ii=0;
> while(ii<=i){
> if(B->data[ii]!=0){
> Ans->data[i]+=A->data[i-ii]*B->data[ii];
> carry+=Ans->data[i]/base;
> Ans->data[i]=Ans->data[i]%base;
> }
> ii++;
> }
> carry+=Ans->data[i]/base;
> Ans->data[i]=Ans->data[i]%base;
>
> i++;
> }
> if(carry!=0){
> d++;
> Ans->data[i]=carry;
> }
>
> C->digits=d;
> i=0;
> while(i<d){
> C->data[i]=Ans->data[i];

You could have simply said *C = *Ans and let the compiler do this,
possibly more efficiently.

Is there a reason you did not do the work in C directly (and avoid the
time spent in malloc and copying *Ans to *C)?

> i++;

Why do you go to so much trouble to avoid for loops?
for (i = 0; i < d; i++)
C->...;

> }
> return;

Where do you free Ans? You are causing repeated memory leaks.

> }
>
> ==================== end paste ====================

> I tried to indent the code with spaces instead of tabs, but if some
> parts end up not properly indented, I hope no one will hold it against
> me.
>
> Thanks ahead

Remove del for email
Nov 2 '06 #5
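
A minimal sketch of how the review comments above could be applied, assuming the same representation as the posted code (base-1000 limbs, least significant first). The names `BN_BASE`, `bn_mul_small` and `bn_mul` are hypothetical. Nothing is allocated per call, so there is nothing to free, and every loop is bounded by the operands' limb counts rather than by the output index. For a factorial, where one operand is always a small integer, the single-pass `bn_mul_small` is probably the more relevant routine.

```c
#include <assert.h>
#include <stddef.h>

#define BN_BASE 1000L   /* same radix as the original post; limbs are 0..999 */

/* In-place multiply of a bignum by a small positive factor.
   One pass over the limbs, so the cost is linear in the limb count.
   Returns the new limb count; the caller must leave room for growth.
   Assumes limbs[i] * factor + carry fits in a long. */
static size_t bn_mul_small(long *limbs, size_t count, long factor)
{
    long carry = 0;
    size_t i;

    assert(factor > 0);
    for (i = 0; i < count; i++) {
        long t = limbs[i] * factor + carry;
        limbs[i] = t % BN_BASE;
        carry = t / BN_BASE;
    }
    while (carry != 0) {                /* append remaining carry limbs */
        limbs[count++] = carry % BN_BASE;
        carry /= BN_BASE;
    }
    return count;
}

/* out = a * b by the schoolbook method. `out` must not alias a or b and
   must have room for na + nb limbs. Returns the product's limb count. */
static size_t bn_mul(long *out, const long *a, size_t na,
                     const long *b, size_t nb)
{
    size_t i, j, n = na + nb;

    for (i = 0; i < n; i++)
        out[i] = 0;
    for (i = 0; i < na; i++) {
        long carry = 0;
        for (j = 0; j < nb; j++) {
            long t = out[i + j] + a[i] * b[j] + carry;
            out[i + j] = t % BN_BASE;
            carry = t / BN_BASE;
        }
        out[i + nb] = carry;            /* this slot has not been written yet */
    }
    while (n > 1 && out[n - 1] == 0)    /* drop leading zero limbs */
        n--;
    return n;
}
```

Starting from a one-limb value of 1 and calling `bn_mul_small` with 2, 3, ..., 10 leaves the limbs {800, 628, 3}, that is 3628800 = 10!.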
Barry Schwarz said:
> On 1 Nov 2006 14:28:20 -0800, "fermineutron" <fr**********@yahoo.com>
> wrote:

<snip>

> > float fieldLength;
> > double temp;
> > char format1[8];
> > long j, FL0;
> > j = N1->digits-1;
> > FL0=(long)log10((float)base);
> > fieldLength = (float)log10((float)base);
> > temp = modf(fieldLength, &fieldLength);
>
> This is a constraint violation. The second argument should be a
> double*, not a float*. During execution it will invoke undefined
> behavior. Didn't your compiler complain about a type mismatch?

In my experience, fermineutron pays no heed to correctness issues, and is
concerned only with speed. He has a track record of ignoring corrections. I
think you're wasting your time, Barry.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 2 '06 #6
Hi
The math software in my TI-89 calculator, with a 12 MHz 68k CPU, can
calculate all 613 digits of 299! in 1 second.
It then takes 5 seconds to convert that to a string for display. But
for giggles I once wrote my own display routine that does the integer
to string conversion in 1/3 of a second.

Frederick Gotham wrote:
> fermineutron:
> > Now, on the Num Analysis forum/group there is a
> > post reporting that someone wrote Java code that computed 1000000!
> > in about a second. That is about 10000 times faster than I would expect
> > my code to do it. So the two possibilities are:
> > 1) I am doing something terribly wrong
> > 2) The other person is lying
>
> The burden of proof is on the Java dude.
>
> Probability suggests that he's lying, because only a very small proportion of
> proficient programmers waste their time on mickey-mouse hold-my-hand
> languages such as Java. Also, Java is slow.
>
> Either that or the algorithm's something stupid like:
>
> char const *Func(void)
> {
> return "234950723957325847125797509327509237509327592387509";
> }
>
> Ask the Java dude if you can see his code. If he refuses, assume that he's a
> liar, then egg his house.
>
> --
>
> Frederick Gotham
Nov 2 '06 #7
fermineutron wrote:
> Ian Collins wrote:
> > Before anyone reviews the code, have you profiled it?
> >
> > If not, why not? If you have, where were the bottlenecks?

It's a factorial calculation. The bottleneck is in the big integer
multiply, which itself should have a bottleneck in the platform's
multiply.

> I profiled it, but there were no obvious bottlenecks that I would not
> expect to be there by design.
>
> Here is the profiler output:
>
> http://igorpetrusky.awardspace.com/Temp/RunStats.html
>
> I was thinking that maybe there is some other algorithm that is better
> than mine for the long int arithmetic?

Probably so. Consider that you are doing nothing more than computing
the straight product of the numbers using no arithmetic short cuts at
all.

So here's what comes off the top of my head: ask yourself the
following question. How many factors of 2 are there in 1000000!?
Certainly every other number is even. But every 4th number has 2
factors of 2, and every 8th number has 3 factors of 2 in it. So the
answer is:

f(2) = floor(1000000/2) + floor(1000000/4) + floor(1000000/8) + ...

Similarly we can figure out the number of factors of 3s, 5s, 7s, 11s,
and all the primes less than 1000000, as f(3), f(5), etc. Then the
result you are looking for is:

1000000! = pow(2,f(2))*pow(3,f(3))*pow(5,f(5))*...
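
The counting rule above is usually known as Legendre's formula. A minimal sketch of it follows; the function name `prime_exponent_in_factorial` is hypothetical. It divides n repeatedly rather than computing powers of p, which gives the same floors without any risk of overflowing p^k.

```c
#include <assert.h>

/* Exponent of the prime p in n!, i.e.
   f(p) = floor(n/p) + floor(n/p^2) + floor(n/p^3) + ...
   Uses floor(n/p^k) == floor(floor(n/p^(k-1)) / p). */
static unsigned long prime_exponent_in_factorial(unsigned long n,
                                                 unsigned long p)
{
    unsigned long count = 0;

    assert(p >= 2);
    while (n >= p) {
        n /= p;
        count += n;
    }
    return count;
}
```

For n = 10 the exponents of 2, 3, 5 and 7 come out as 8, 4, 2 and 1, and indeed 2^8 * 3^4 * 5^2 * 7 = 3628800 = 10!.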

Now, the question is -- what makes us think this will be any faster?
Well, the pow() function can be computed with successive squaring
tricks. Squaring is faster than straight multiplying because
(a*q^r + b)^2 = a^2*q^(2*r) + 2*a*b*q^r + b^2, so the cross product
a*b only has to be computed once. And the resulting big number
multiplies that you have to perform here can be accelerated using any
number of big number multiply acceleration tricks (Karatsuba,
Toom-Cook, or DFTs).
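
A minimal sketch of the two ideas in this paragraph, with `unsigned long long` (C99) standing in for a bignum type, so results must fit in 64 bits. The names `pow_by_squaring` and `square_split` are hypothetical, and `q` plays the role of q^r in the identity.

```c
#include <assert.h>

/* base^exp by successive squaring: about log2(exp) squarings plus one
   multiply per set bit of exp, instead of exp - 1 multiplies. */
static unsigned long long pow_by_squaring(unsigned long long base,
                                          unsigned long exp)
{
    unsigned long long result = 1;

    while (exp > 0) {
        if (exp & 1UL)
            result *= base;     /* this bit of the exponent is set */
        exp >>= 1;
        if (exp > 0)
            base *= base;       /* square for the next bit */
    }
    return result;
}

/* Square of the two-part number a*q + b. Only three distinct products of
   the parts are needed (a*a, a*b, b*b); a general two-part multiply needs
   four. In a real bignum, scaling by q or q*q is just a limb shift. */
static unsigned long long square_split(unsigned long long a,
                                       unsigned long long b,
                                       unsigned long long q)
{
    assert(b < q);              /* b is the low part, below the radix */
    return a * a * q * q + 2 * a * b * q + b * b;
}
```

For example, `square_split(123, 456, 1000)` evaluates 123456 squared from the parts 123 and 456.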

I don't know how much faster, if at all, doing things this way would be.
If you find a faster way in the literature, I would be interested in
knowing it.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 2 '06 #8
Frederick Gotham wrote:
> Probability suggests that he's lying, because only a very small proportion of
> proficient programmers waste their time on mickey-mouse hold-my-hand
> languages such as Java. Also, Java is slow.

Can we drop the insults, please?

--
Chris "unhashedup hashed up hashing" Dollin
"I'm still here and I'm holding the answers" - Karnataka, /Love and Affection/

Nov 2 '06 #10
