473,574 Members | 3,159 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Benchmarks

The task: Write a program that reads a set of words from standard
input and prints the number of distinct words.

I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations,
so I wrote my own. But for some reason, my version turned out slower
that ALL of the versions in the website, even though it seems to
perform less operations (yes, I benchmarked them on my own computer).

According to the website, the slowest version is:

#include <set>
#include <string>
#include <iostream>

int main(int argc, char **argv)
{
// Declare and Initialize some variables
std::string word;
std::set<std::s tringwordcount;
// Read words and insert in rb-tree
while (std::cin >word) wordcount.inser t(word);
// Print the result
std::cout << "Words: " << wordcount.size( ) << std::endl;
return 0;
}

My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

struct SetNode
{
char *word;
struct SetNode *next;
};

// An unorderd set of words
//
static struct SetNode *gSet = 0;
static int gSetSize = 0;

#define kInitWordSize 32

// Returns a word read from stdin. The returned pointer must be
// deallocated with free().
//
static char *
ReadOneWord(voi d)
{
int ch = getchar();

while (ch != EOF && isspace(ch))
ch = getchar();
if (ch == EOF)
return 0;

char *word = (char *) malloc(kInitWor dSize);
if (!word)
return 0;

int size = kInitWordSize;
int i = 0;

while (ch != EOF && !isspace(ch)) {
if (i >= size) {
size *= 2;

char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}

word[i++] = ch;
ch = getchar();
}

if (i >= size) {
size *= 2;

char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}
word[i] = '\0';

return word;
}

// Inserts a word into the set if it isn't in the set.
// The passed string is expected to have been allocated with
// a memory allocation function, and it should be considered
// lost after passed to this function.
//
static void
InsertWord(char *aWord)
{
struct SetNode *node;

for (node = gSet; node; node = node->next) {
if (strcmp(node->word, aWord) == 0) {
free(aWord);
return;
}
}

node = (struct SetNode *) malloc(sizeof(s truct SetNode));
if (!node) {
free(aWord);
return;
}

node->word = aWord;
node->next = gSet;
gSet = node;
++gSetSize;
}

static void
DeleteSet(void)
{
struct SetNode *node = gSet;
struct SetNode *temp;

while (node) {
temp = node;
node = node->next;
free(temp->word);
free(temp);
}

gSet = 0;
gSetSize = 0;
}

int
main(void)
{
char *word;

while ((word = ReadOneWord()))
InsertWord(word );

printf("Words: %d\n", gSetSize);

// Skip cleanup for now...
//DeleteSet();
}

Any ideas as to what causes the big slowdown?

Sebastian

Nov 6 '08
120 5530
Nick Keighley wrote:
CAREFUL ALGORITH SELECTION CAN CREAM MICRO-OPTIMISATION
That's so true. However, we shouldn't completely dismiss
"micro-optimization" and "hacker-optimization" once we have come
up with the fastest possible overall algorithm. Even though two
implementations might both by eg. O(n lg n) with a rather small
constant factor, one of them might still be 10 times faster than
the other if it performs clever low-level tricks inside the
innermost loop (which might be run eg. millions of times).

Of course when dealing with hacker optimization, there's always
a point where it's too detrimental to the readability and
maintainability of the code in relation to the speed benefit it
gives. If the code becomes significantly more obfuscated while
giving a speed boost of 1%, it might not be worth the trouble.
Personally I prefer to write 100 lines of clear, readable and
maintainable code, even if it's 1% slower than 1000 lines of code
worthy of the IOCCC.
Nov 7 '08 #21
Phil Carmody <th************ *****@yahoo.co. ukwrites:
Juha Nieminen <no****@thanks. invalidwrites:
>user923005 wrote:
>>The hash table solution is O(n).

You would have hard time proving that.

He shouldn't. It should be pretty clear. No hash table needs
to examine each element more than once.
Oooops, suffering from Falconeritis - but you did over-snip rather.
The whole task is indeed o(n^2), but omega(n).

Phil
--
We must respect the other fellow's religion, but only in the sense and to the
extent that we respect his theory that his wife is beautiful and his children
smart. -- Henry Louis Mencken (1880-1956), American editor and critic
Nov 7 '08 #22
Juha Nieminen wrote:
[...]
Personally I prefer to write 100 lines of clear, readable and
maintainable code, even if it's 1% slower than 1000 lines of code
worthy of the IOCCC.
It's always easier to make a bug-free program fast than to
bug-fix a fast program. :)

Schobi
Nov 7 '08 #23
user923005 wrote:
Cuckoo hashing has guaranteed O(1) lookup and delete and amortized
O(1) insert.
Wikipedia says: "Lookup requires inspection of just two locations in
the hash table, which takes constant time in the worst case".

I don't understand how that is possible. Suppose an element is
inserted into the table. Then later a new element displaces this first
element to a second location. Then later another new element, to be put
in this second location, displaces this first element to a third
location. How exactly do you find this first element anymore? It's not
in either of the two alternative locations defined by the two hashing
functions for that element.
Nov 7 '08 #24
On Nov 7, 10:25 am, Juha Nieminen <nos...@thanks. invalidwrote:
Nick Keighley wrote:
CAREFUL ALGORITH SELECTION CAN CREAM MICRO-OPTIMISATION
That's so true. However, we shouldn't completely dismiss
"micro-optimization" and "hacker-optimization" once we have
come up with the fastest possible overall algorithm.
Yes, but this should only be undertaken when necessary, under
the control of a profiler---what optimizes on one platform may
slow down on another. And don't forget that a good compiler
will do many of these optimizations for you. A long time ago, I
tried replacing "h = 127 * h + *p" (in a hashing algorithm) with
"h = (h << 7 - h) + *p", knowing that my hardware didn't have
hardware multiply. The results were slower than the original;
the compiler also converted the multiplication into a shift and
subtract, and since it knew the final purpose in those
operations, was actually able to save one machine instruction
over an explicit shift and subtract.
Even though two implementations might both by eg. O(n lg n)
with a rather small constant factor, one of them might still
be 10 times faster than the other if it performs clever
low-level tricks inside the innermost loop (which might be run
eg. millions of times).
I've never seen a factor of 10 here. Even a factor of 2 is
exceptional (and usually indicative of a very poor optimizer in
the compiler).

--
James Kanze (GABI Software) email:ja******* **@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Nov 7 '08 #25
On Nov 7, 4:51*am, c...@tiac.net (Richard Harter) wrote:
Just as a note, the common claim that hash table are O(1) per
access whereas search trees are O(log n) per access is
misleading, ...computing the hash code ...
is necessarily greater
than log2(n)....
hash table * * *- O(log n)
Note that, for sufficiently pedantic O(.)
estimates, many of the usual O(.) measures are wrong.
A single arithmetic op may be considered to require
O(log N) time since log N bits are needed to represent
an index on a set of size N.

Returning to OP's problem, I wonder: are Compact
Hash Trees are in wide use? If there's interest
I'll post my implementation.

James Dow Allen
Nov 7 '08 #26
James Kanze wrote:
>Even though two implementations might both by eg. O(n lg n)
with a rather small constant factor, one of them might still
be 10 times faster than the other if it performs clever
low-level tricks inside the innermost loop (which might be run
eg. millions of times).

I've never seen a factor of 10 here. Even a factor of 2 is
exceptional (and usually indicative of a very poor optimizer in
the compiler).
There's just so much a compiler can do. There are cases where only the
programmer can understand what's going on (rather than the compiler) and
be able to optimize something.

For example, comparing two strings for inequality is O(1) if the
maximum size of the string is fixed (eg. you know that no string has
more than 30 characters), but still much slower than comparing two
integers. If you can somehow change the string comparison to an integer
comparison, and these comparisons are performed millions of times in the
innermost loop of your code, the program may speed up considerably. This
is a kind of optimization which is completely out or the realm of
compiler optimizations.
Nov 7 '08 #27
In article <79b10b2f-6abf-4440-9d04-f731e6b51bf2
@f40g2000pri.go oglegroups.com> , ja*********@gma il.com says...

[ ... ]
Even though two implementations might both by eg. O(n lg n)
with a rather small constant factor, one of them might still
be 10 times faster than the other if it performs clever
low-level tricks inside the innermost loop (which might be run
eg. millions of times).

I've never seen a factor of 10 here. Even a factor of 2 is
exceptional (and usually indicative of a very poor optimizer in
the compiler).
I've seen a factor of 10, and even a _little_ bit more -- but it was
_extremely_ hardware specific, depending almost entirely on the cache
organization of the processor. As the data was originally placed in
memory, two crucial items ended up contending for the same cache lines.
Rearranging the data allowed those items to live in different cache
lines.

In this case, the CPU involved wasn't even made when the compiler was
written, so I have a hard time blaming the compiler for the poor
optimization. Unfortunately, I don't know of any newer compilers that
would do much better (and when they do, it's mostly by accident...)

--
Later,
Jerry.

The universe is a figment of its own imagination.
Nov 7 '08 #28
In article <gf**********@h oshi.visyn.net> , sp******@gmx.de says...
Juha Nieminen wrote:
[...]
Personally I prefer to write 100 lines of clear, readable and
maintainable code, even if it's 1% slower than 1000 lines of code
worthy of the IOCCC.

It's always easier to make a bug-free program fast than to
bug-fix a fast program. :)
Oft repeated, but not really true. Sometimes a fast program still
contains a trivial bug, but I can remember one program in particular
that was bug-free as far as I know, but a horrible mess, to the point
that by _far_ the easiest way to improve it was write a set of
specifications for what it did, and start over. By the time I realized
that, however, I'd spent more time trying to optimize one small part
than it eventually took to rewrite the whole thing...

--
Later,
Jerry.

The universe is a figment of its own imagination.
Nov 7 '08 #29
On Fri, 07 Nov 2008 09:41:08 GMT, Juha Nieminen
<no****@thanks. invalidwrote:
>user923005 wrote:
>Cuckoo hashing has guaranteed O(1) lookup and delete and amortized
O(1) insert.

Wikipedia says: "Lookup requires inspection of just two locations in
the hash table, which takes constant time in the worst case".

I don't understand how that is possible. Suppose an element is
inserted into the table. Then later a new element displaces this first
element to a second location. Then later another new element, to be put
in this second location, displaces this first element to a third
location. How exactly do you find this first element anymore? It's not
in either of the two alternative locations defined by the two hashing
functions for that element.
It doesn't work like that, though you had me perplexed for a
moment. The first element goes back to its original location,
displacing the second. Thus, suppose that h0, h1, etc are
locations and the first three element are e0, e1, e2, and that
their address pairs are:

e0: h0,h1
e1: h0,h2
e2: h1,h3

Initially e0 is inserted in location h0. When e1 is inserted, e0
is moved to h1, and e1 is inserted in h0. The contents of the
table look like this:

h0: e1
h1: e0
h2: -
h3: -

Now e2 is inserted; it goes into location h1, displacing e0. In
turn e0 goes back to its alternate location h0, displacing e1,
which has to go to its alternate location, h2. The table now
looks like this:

h0: e0
h1: e2
h2: e1
h3: -


Richard Harter, cr*@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.
Nov 7 '08 #30

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1433
by: aaronwmail-usenet | last post by:
I've been wondering about benchmarks recently. What is a fair benchmark? How should benchmarks be vetted or judged? I decided to see what you folks thought, so for discussion I compared two priority queue implementations I published for Python in 1995 against the "heap" priority queue implementation added to the python distribution
0
1175
by: Dathan Vance Pattishall | last post by:
------=_NextPart_000_0401_01C35A8D.94584FD0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit At the conference a few months ago, the mysql team said that the benchmarks they where running would be published here http://www.mysql.com/benchmarks but it doesn't look like this is the link.
13
1857
by: Shane Wright | last post by:
Hi, I've been trying to spec a new server for my company's database for a few weeks and one of the biggest problems I've had is trying to find meaningful performance information about how PostgreSQL will perfom under various disk configurations. But, we have now taken the plunge and I'm in a position to do some benchmarking to actually...
15
6883
by: roberts.noah | last post by:
Are there any decent benchmarks of the difference between using c strings and std::string out there? Google isn't being friendly about it. Obviously this would be dependant on different implementations but I don't care. I would be happy to find ANY comparison at this point if it was valid and semi scientifically done.
4
1598
by: Dibur | last post by:
I have been told to set the performance benchmarks for all the db2 databases hosted on all servers. I have no clue as to how to approach this problem. Can this be done theoritically through extrapolation techniques.One thing that is important here is that all the servers are in production phase so practically carrying out load testing is not...
10
3032
by: StephQ | last post by:
I found that old post: http://groups.google.com/group/comp.lang.c++/browse_frm/thread/3a2562c9a5f8998/15519204726d01e8?lnk=gst&q=vector+no+surprise&rnum=2#15519204726d01e8 I just erased the #include <kubux.....lines. ****** old post for your convenince ******** You are right: #include <vector> #include <iostream>
80
3482
by: tech | last post by:
Hi, i have the following problem In file1.h namespace A { class Bar { void foo();
318
10901
by: King Raz | last post by:
The shootout site has benchmarks comparing different languages. It includes C# Mono vs Java but not C# .NET vs Java. So I went through all the benchmark on the site ... http://kingrazi.blogspot.com/2008/05/shootout-c-net-vs-java-benchmarks.html Just to keep the post on topic for my friends at comp.lang.c++, how do I play default windows...
84
571
by: s0suk3 | last post by:
The task: Write a program that reads a set of words from standard input and prints the number of distinct words. I came across a website that listed a few programs to accomplish this task: http://unthought.net/c++/c_vs_c++.html (ignore all the language flaming :-), and thought that all of them did unnecessary operations, so I wrote my own....
0
7760
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
1
7858
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
6511
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
1
5654
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5335
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3774
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
1
2273
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1369
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
1099
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.