473,770 Members | 4,029 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to implement a Hash Table in C

Hi
can anybody tell me that which ds will be best suited to implement a
hash table in C/C++

thanx. in advanced

Aug 11 '07
139 14231
On Mon, 13 Aug 2007 21:40:52 +0100, in comp.lang.c , "Malcolm McLean"
<re*******@btin ternet.comwrote :
>"Ben Bacarisse" <be********@bsb .me.ukwrote in message
news:87******* *****@bsb.me.uk ...
>>
I am sorry that my report was not clear. I am not reacting to this
topic rationally.
It's surprising how often people react like that. I get similar accusations
all the time about 12 Common Atheist Arguments (refuted).
<OT>
What, you mean as opposed to the irrationality of the book itself?
</ot>
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 13 '07 #31
"Malcolm McLean" <re*******@btin ternet.comwrite s:
"Ben Bacarisse" <be********@bsb .me.ukwrote in message
news:87******** ****@bsb.me.uk. ..
>>
>>Yes, that is another bug.
fixedalloca te should have an if(answer) before setting top to answer-
next.
Why not say that instead of all that vague talk about catastrophic
behaviour when the hash table gets full?

I am sorry that my report was not clear. I am not reacting to this
topic rationally. I find the idea if charging money for a book of
this quality hard to accept, and my anger at that has affected the
clarity of my postings.
It's surprising how often people react like that. I get similar
accusations all the time about 12 Common Atheist Arguments
(refuted).
Do you push that book in groups where only a critique of the grammar
is on topic? If so, then I can see the similarity. If not, then you
missed my point. I am trying very hard to be topical for c.l.c. That
explains, I think, what some have seen as nitpicking. Post in
comp.programmin g or comp.theory if you want to see some comments on
your book as a whole.
There of course there can't be a technical justification.
You have a strange sense of humour (I assume that was a joke?).

<snip>
>Will you accept that readers should get a refund when they see the
code you present for implementing a queue as a circular buffer?
Maybe it's just me incapable of seeing the problems with my own code,
but what is wrong with a circular buffer? Sure, it doesn't grown
dynamically, but generally you want to ration queue length anyway. If
it consistently has more coming in than you're taking out, there's
probably something wrong with your logic, or the machine is
underpowered.
<stay clam>...

No. You are thinking of one kind of queue (where you do want to
balance input rate and output rate) and you've generalised from that.
There are uses of queues where your last statement is simply wrong.
For example, in a breadth first search of a tree using a queue,
continued growth of the queue simply indicates something about the
shape of the tree, not an underpowered machine or faulty logic.

Of course, limiting the queue size is often done, and a circular
buffer is an excellent way to implement a bounded queue. My objection
was that the code you present does not implement a bounded queue.
What you wrote

int queue[100];
int head;
int tail;

void add(int x)
{
if(tail == 100] /* sic */
tail = 0;
queue[tail++] = x;
}

int remove(void)
{
int answer = queue[head];
head++;
if(head == 100)
head = 0;
return answer;
}

is just a circular buffer. A bounded queue would have 'full' and
'empty' tests. The add operation would fail if the queue were full,
and the remove operation (badly named, as has been pointed out) would
fail if it were empty.

What algorithms would your queue implementation be suitable for?

--
Ben.
Aug 14 '07 #32
Malcolm McLean said:
"Richard Heathfield" wrote...
<snip>
>>
Primes aren't just an advantage for quadratic probing - they're
essential.
The hash table chapter includes an example of quadratic probing, which
does as you say.
Well, I didn't buy the book, so I wouldn't know.
Maybe the primality test should be more rigorous? The worst code is
that which seems to work, then fails when someone includes it in a
life-support system.
The primality test should be *definitive*. There's no excuse for it not
to be. It took me less than 30 seconds - on a relatively ancient
machine - to generate all of the 5761455 primes in the range 1 to
100,000,000. And less than two seconds to generate all the 664579
primes in the range 1 to 10,000,000.

Just how big a hash table did your users want? If it has more than a
hundred million entries, I suspect that the cost of definitive
primality testing will be the least of your problems.

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 14 '07 #33
On Aug 12, 6:11 pm, Richard Heathfield <r...@see.sig.i nvalidwrote:
Malcolm McLean said:
There are certain advantages in making the hash table length prime.

If you use quadratic probing as your collision avoidance mechanism, a
prime hash table length would seem to me to be essential, not just a
"certain advantage".
A robust quadratic prober will switch to
linear probing after "a while" (since it
would otherwise be able to access only half
the indexes). For this reason, Richard's
comment would be more robust if "essential"
is replaced with "very highly desireable."
If, on the other hand, you treat the hash table as
an array of collections of some kind, then I see no advantage at all to
the hash table length being prime.
I think there is a good practical reason to
use prime size, unrelated to quadratic probing.

A hash function needs to accomplish two
things: scramble, and reduce to range {0,1,...,N-1}
where N is table size. Two birds can be
(partially) killed with one stone when the last
step is taking remainder (index %= N), but this
scrambles best when N is prime.

(Flamers will be happy to note that we "should"
have scrambled well first but, unless we're
extremely confident in our hash function, why
not get the scrambling benefit of prime-number
remaindering?)

Another interesting issue, related to hashing
though perhaps unrelated to table size, concerns
reversible hash functions ("scrambling bijections").
These are easy to implement on {0,1,2,...,N-1}
*if* N is prime.

James Dow Allen
Aug 14 '07 #34
James Dow Allen said:

<snip>
A robust quadratic prober will switch to
linear probing after "a while" (since it
would otherwise be able to access only half
the indexes). For this reason, Richard's
comment would be more robust if "essential"
is replaced with "very highly desireable."
I'll buy that. Here's your dollar.

<snip>
I think there is a good practical reason to
use prime size, unrelated to quadratic probing.

A hash function needs to accomplish two
things: scramble, and reduce to range {0,1,...,N-1}
where N is table size. Two birds can be
(partially) killed with one stone when the last
step is taking remainder (index %= N), but this
scrambles best when N is prime.
Two bucks in one day? Hmmm. I'm not talking to you any more without my
lawyer.

But seriously, there are also advantages in making the table size an
integer power of 2 (which, in all but one case, fails the primality
test with a vengeance), since the hash value can then be reduced to the
table size more rapidly.

Measure, measure, measure!

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 14 '07 #35
On Mon, 13 Aug 2007 17:41:33 +0000, Frodo Baggins wrote:
On Aug 12, 6:19 pm, Ben Bacarisse <ben.use...@bsb .me.ukwrote:
>"Malcolm McLean" <regniz...@btin ternet.comwrite s:
As for the derefencing null pointer, that will happen if the
constructor fails, and return NULL. That is standard behaviour
throughout the book. If objects cannot be created, the constructing
function returns a null pointer to show they have failed.
I couldn't see another. The code has been tested, but only on two
systems (a UNIX mainframe and Windows) so that doesn't mean no errors
remain, and of course typographical errors do creep in in the process
of reformatting for print - I wish I had a tool to format source code
automatically but I don't. The normal thing when you an error such as
dereference of a null is to say "your code derererences a null" rather
than to talk airly about "catastropi c behaviour" in an arrogant
manner.

I am sorry you do not like my tone. I have certainly got worked up
about this subject. Your fixed allocator has an error. The code
cannot return a null pointer without causing undefined behaviour.
Calm enough? You decide how seriously you categorise it.

I think you should fix that and the other clear errors[1] that have been
pointed out before you call other people arrogant. You say in the
introduction that bugs can be "corrected very quickly", but you don't
seem to have corrected anything.

I would suggest using splint (http://www.splint.org/) if you haven't
done so already.
I wouldn't. Many of the warning it finds are bogus. It claims that
in long k = 1; printf("%g\n", k / 2.0) the second argument to
printf() has a wrong type. It complains that the controlling
expression of if(!ferror(stdo ut)) isn't Boolean, and suggests me
a way to disable warnings when pointers are used as controlling
expressions. It complains about "assigning a character to an int"
when I write i = '\n', but then it remembers "anyway it's safe,
since actually '\n' has type int". If I turn up the warning level
enough, it will also complain about the signed operands in (1<<8)
in the macro expansion of isspace(x) found in ctype.h.
Or better, use it, but do not take what it says too seriously.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 14 '07 #36
Malcolm McLean wrote:
"Ben Bacarisse" <be********@bsb .me.ukwrote in message
.... snip ...
>
>I am sorry you do not like my tone. I have certainly got worked up
about this subject. Your fixed allocator has an error. The code
cannot return a null pointer without causing undefined behaviour.
Calm enough? You decide how seriously you categorise it.

Yes, that is another bug. fixedallocate should have an if(answer)
before setting top to answer->next. Why not say that instead of
all that vague talk about catastrophic behaviour when the hash
table gets full? (The other bug complaint about the code getting
stuck in a loop when the table gets full is false. Which I don't
mind, we can all make mistakes, but at least admit that you are
wrong. Otherwise people wonder if you really understand hash
tables at all.)
I haven't looked at your code, but it appears to be bug infested
from the comments here. Take a look at my hashlib package, which
has been out for some years, and I have had zero bug reports since
the first revision (and that was for a minor memory loss on
closing). Very stable, and purely standard C content. See:

<http://cbfalconer.home .att.net/download>

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 14 '07 #37
Richard Heathfield wrote:
CBFalconer said:
>ravi wrote:
>>>
can anybody tell me that which ds will be best suited to implement
a hash table in C/C++

I am not sure what your question is.

It seems pretty clear to me. But he needs to decide what language
he's writing in before he starts worrying about difficult stuff.
>However, a complete, and
portable, hash table implementation under GPL

...does not answer the question, any more than "what's the best kind
of honey?" is answered by "here, have one of my honey sandwiches".
What question? What's a 'ds'? What's "C/C++"? In c.l.c I think
we can assume the language is intended to be C.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 14 '07 #38
Richard Heathfield wrote:
>
.... snip ...
>
For example, consider a hash table with ten elements, of which
elements 0, 3, 4, 5, 8 and 9 are full (which leaves elements 1, 2,
6 and 7 empty, so the table is only 60% full).

Now try to add an element at index 4, using quadratic probing, and
you will find yourself trying elements 4, 5, 8, 3, 0, 9, 0, 3, 8,
5, 4, 5, 8, 3, 0, 9, 0, 3, 8, 5, 4, 5, ... ad nauseam.

Using a prime number guarantees that you won't get into this cycle
unless the table is actually full (which you can check separately,
in advance).

Primes aren't just an advantage for quadratic probing - they're
essential.
That's a nice, compact, and clear explanation. I tried explaining
this to someone some time ago (1 year or more) without success. He
continued to insist that non-prime sizes were suitable.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 14 '07 #39
CBFalconer said:
Richard Heathfield wrote:
>CBFalconer said:
<snip>
>>However, a complete, and
portable, hash table implementation under GPL

...does not answer the question, any more than "what's the best kind
of honey?" is answered by "here, have one of my honey sandwiches".

What question?
It seems that he wishes to know what primitive data structures would be
most suitable for implementing a hash table. Elsethread, I have
answered this question already, so I won't repeat the answer here.
What's a 'ds'?
The consensus appears to be that he means "data structure".
What's "C/C++"? In c.l.c I think
we can assume the language is intended to be C.
You have answered your own question. :-)

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 14 '07 #40

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
1372
by: Yat | last post by:
I want to use hash table in C. Have any build-in library in C so that I can use it without implement hash table myself? Thx
24
4309
by: kdotsky | last post by:
Hello, I am using some very large dictionaries with keys that are long strings (urls). For a large dictionary these keys start to take up a significant amount of memory. I do not need access to these keys -- I only need to be able to retrieve the value associated with a certain key, so I do not want to have the keys stored in memory. Could I just hash() the url strings first and use the resulting integer as the key? I think what I'm...
0
9602
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9439
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10071
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
9882
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8905
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7431
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6690
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
3987
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3589
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.