473,324 Members | 2,581 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,324 software developers and data experts.

strange problem of sorting

I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.

I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!

I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.

Here goes the code.

/* qsortints.c -- Sort input set of integers using qsort */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define iMax 2000000
#define iSize 1000000

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a + ( 32767*myrand() + myrand()) % (b + 1 - a);
}

int intcomp(int *x, int *y)
{
return *x - *y;
}

int a[iMax];

int main()
{
int i, p, t;
clock_t t_begin, t_end;

srand((unsigned) time(NULL));

for (i = 0; i < iSize+2000; i++)
a[i] = i;

t_begin = clock();
for (i = 0; i < iSize; i++)
{
p = randint(i, iSize-1);
t = a[p]; a[p] = a[i]; a[i] = t;
}
qsort(a, iSize, sizeof(int), intcomp);
t_end = clock();
printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);

/*system("pause");

for (i = 0; i < iSize; i++)
printf("%d ", a[i]);*/

return 0;
}
/* Output is 1.270000 on my SuSE 10 with gcc -O3 */
/* End of qsortints.c */

/* sortints.cpp -- Sort input set of integers using STL set */
#include <iostream>
#include <ctime>
#include <algorithm>

using namespace std;

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a + ( 32767*myrand() + myrand()) % (b + 1 - a);
}

const int iMax = 2000000;
const int iSize = 1000000;

int a[iMax];

int main()
{
int i, p, t;
clock_t t_begin, t_end;
srand((unsigned) time(NULL));

for (i = 0; i < iSize+2000; i++)
a[i] = i;

t_begin = clock();
for (i = 0; i < iSize; i++)
{
p = randint(i, iSize-1);
t = a[p]; a[p] = a[i]; a[i] = t;
}
sort(a, a+iSize-1);
t_end = clock();
cout << (t_end - t_begin)/1.0/CLOCKS_PER_SEC << endl;

/*system("pause");

for (i = 0; i < iSize; i++)
cout << a[i] << " ";*/
}
/* Output is 0.38 on SuSE */
/* End of sortints.cpp */

/* bitsort.c Sort input set of integers using bitmap */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i)
{
a[i>>SHIFT] |= (1<<(i & MASK));
}

void clr(int i)
{
a[i>>SHIFT] &= ~(1<<(i & MASK));
}

int test(int i)
{
return a[i>>SHIFT] & (1<<(i & MASK));
}

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a+(32767*myrand()+myrand())%(b+1-a);
}

int main()
{
int i;
clock_t t_begin, t_end;

srand((unsigned)time(NULL));

for (i = 0; i < N; i++)
clr(i);

t_begin = clock();
for (i = 0; i< N-2000; i++)
set(randint(i, N-1));
t_end = clock();
printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);

/*for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);*/

return 0;
}
/* Output is 1.45000 on SuSE */
/* End of bitsort.c */

Any suggestions?

Jul 6 '07 #1
7 2367
"abracadabra" writes:
>I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.

I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!

I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.

Here goes the code.

/* qsortints.c -- Sort input set of integers using qsort */
#include <stdio.h>
massive snippage
printf("%d\n", i);*/

return 0;
}
/* Output is 1.45000 on SuSE */
/* End of bitsort.c */

Any suggestions?
My first question would be, how long did the three methods take in MKS
units? (clocks/sec in not an MKS unit). I think you may be trying to
extract too much information from a small sample. ISTM the author's point
was simply that STL sort was easier to use than qsort. Given a programmer
with sufficient background, I guess that is marginally true.
Jul 6 '07 #2
On 2007-07-06 16:22, abracadabra wrote:
I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.

I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!

I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.

Here goes the code.
[snip]
Any suggestions?
The book is outdated?

It's really no surprise that std::sort is faster than qsort, it's much
easier to optimise and (to my understanding) often uses introspective
sort, which has slightly better worst case performance than quicksort
(though there's probably nothing stopping someone to use the same
algorithm in qsort).
--
Erik Wikström
Jul 6 '07 #3
Erik Wikström <Er***********@telia.comwrote:
It's really no surprise that std::sort is faster than qsort, it's much
easier to optimise
Yes, probably due to it being much easier to inline the comparison
function in std::sort() than in qsort().

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
Jul 6 '07 #4
On Jul 6, 11:23 pm, Erik Wikström <Erik-wikst...@telia.comwrote:
On 2007-07-06 16:22, abracadabra wrote:
I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.
I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!
I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.
Here goes the code.

[snip]
Any suggestions?

The book is outdated?

It's really no surprise that std::sort is faster than qsort, it's much
easier to optimise and (to my understanding) often uses introspective
sort, which has slightly better worst case performance than quicksort
(though there's probably nothing stopping someone to use the same
algorithm in qsort).

--
Erik Wikström
Ah, yes. I typed a gcc -pg, and gprof. I saw lots of introsort
something in the output. But I don't know why the bitmap method is the
slowest. The book referred it as the fastest.

Jul 7 '07 #5
On Jul 6, 7:22 am, abracadabra <jerry_cq...@yahoo.comwrote:
I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.

I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!

I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.

Here goes the code.

/* sortints.cpp -- Sort input set of integers using STL set */
#include <iostream>
#include <ctime>
#include <algorithm>

using namespace std;
....
int a[iMax];

int main()
{
int i, p, t;
clock_t t_begin, t_end;
srand((unsigned) time(NULL));

for (i = 0; i < iSize+2000; i++)
a[i] = i;

t_begin = clock();
for (i = 0; i < iSize; i++)
{
p = randint(i, iSize-1);
t = a[p]; a[p] = a[i]; a[i] = t;
}
sort(a, a+iSize-1);
t_end = clock();
....
/* End of sortints.cpp */

/* bitsort.c Sort input set of integers using bitmap */
....
int main()
{
int i;
clock_t t_begin, t_end;

srand((unsigned)time(NULL));

for (i = 0; i < N; i++)
clr(i);

t_begin = clock();
for (i = 0; i< N-2000; i++)
set(randint(i, N-1));
t_end = clock();
printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);

/*for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);*/

return 0;}

/* Output is 1.45000 on SuSE */
/* End of bitsort.c */

Any suggestions?
These programs are including the time taken to generate and fill in
the array of ints - and not just the actual time spent to sort the
array; whereas a more accurate measurement of sorting performance
would exclude this kind of preparation time. Note that in the case of
the bit sort, it will be necessary to allocate an array for the
generated ints, since the current version both generates the int
values and "sorts" them at the same time.

I found that once the preparation time was excluded from the sorting
timings, that the bit sort program was about twice as fast as the STL
program in sorting the int values. Note that this outcome is not at
all surprising - since the bit sort is actually just flipping bits in
a bit vector and is not "sorting" the int values - at least not in the
way that most people usually think of sorting..

Greg

Jul 7 '07 #6
Because in the bitmap method, sorting and setting the integers are the
same step, I had to add the randint function into the time counting
for three method. I guess it might be the bit set operation that takes
a long time.

Abracadabra

Jul 8 '07 #7
Because in the bitmap method, sorting and setting the integers are the
same step, I had to add the randint function into the time counting
for three method. I guess it might be the bit set operation that takes
a long time.

Abracadabra

Jul 8 '07 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: dont bother | last post by:
This is really driving me crazy. I have a dictionary feature_vectors{}. I try to sort its keys using #apply sorting on feature_vectors sorted_feature_vector=feature_vectors.keys()...
4
by: shyner | last post by:
Hi Everyone, I've been battling this for two days with no luck. I'm using SQL Server 2000. Here's the mystery: I've got a stored procedure that takes a single varchar parameter to determine...
1
by: Kepler | last post by:
I'm fighting a really strange bug that involves both a DataGrid and a Repeater disappearing on postback. The strange thing is that I've distilled the problem down to a simple program, that...
10
by: Dustan | last post by:
I'm hiding some of the details here, because I don't want to say what I'm actually doing. I have a special-purpose class with a __cmp__ method all set up and ready to go for sorting. Then I have...
20
by: SpreadTooThin | last post by:
I have a list and I need to do a custom sort on it... for example: a = #Although not necessarily in order def cmp(i,j): #to be defined in this thread. a.sort(cmp) print a
2
by: Paul Furman | last post by:
I don't know, maybe this isn't strange but someone else set up the shopping card coding I'm working with, the way it works is to get the time() in seconds like 1172693735 and that's the shopper_ID...
2
by: escher4096 | last post by:
Hi all, In C# 2.0, if I take a set of strings and toss them into an ArrayList and then sort them it does not always work as expected. I used some garbage data (I just grabbed the using...
7
by: christery | last post by:
Anyone got a clue to why ther is a T between date and time in the "formatted for sorting" or whatewer they call it, and a Z at the end after seconds- got it fixed for my cobol programmer by...
5
by: jrod11 | last post by:
hi, I found a jquery html table sorting code i have implemented. I am trying to figure out how to edit how many colums there are, but every time i remove code that I think controls how many colums...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.