473,796 Members | 2,520 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to implement a Hash Table in C

Hi
can anybody tell me that which ds will be best suited to implement a
hash table in C/C++

thanx. in advanced

Aug 11 '07
139 14240
Malcolm McLean wrote:
"CBFalconer " <cb********@yah oo.comwrote in message
>Eric Sosman wrote:
>>Malcolm McLean wrote:
... snip ...
>>>>
The objection is to a regular publishing such a book.
I have written a good book.

You have done no such thing. I have read good books, and I know.
... snip all further elucidation ...

I suspect you need go no further to make an enemy.

No, I'm happy to have my works condemned. Even with Ivor Rockbrain
it's more a case of "this post is pure abuse, I'd better not
tolerate it" than anything personal.

Keith's point that "the general consensus on clc is that the book
is no good, therefore it is no good" was false, and naturally I had
to reply to that, by giving the real explanation. But I don't blame
people for acting as they do. Kenny McCormack is essentially right
when he sees most of the bandwidth on clc as dominance games, but
where he is wrong is in imagining that life could be any different.
McCormack is a troll and a joke. Plonk it. Keith is generally
respected, and accurate, as is Eric Sosman.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 16 '07 #61
Kelsey Bjarnason wrote, On 16/08/07 17:21:

<snip>
So I can't type - compilers generally complain about mistyped identifiers. :)
Not if you mistype them consistently :-)
--
Flash Gordon
Dyslexic SW Developer.
At least the compiler ensures I spell identifiers consistently wrong.
Aug 16 '07 #62
On Thu, 16 Aug 2007 18:17:36 +0100, Flash Gordon wrote:
Kelsey Bjarnason wrote, On 16/08/07 17:21:

<snip>
>So I can't type - compilers generally complain about mistyped identifiers. :)

Not if you mistype them consistently :-)
If I could do that, I'd be happy. As it stands, I live in constant fear
that even the automated spell checkers around me will at some point rise
up against me for cruel and unusual punishment.

Well, not quite, but good goat, my non-formal writing does tend to contain
more than its fair share of typos.
Aug 16 '07 #63

"Ben Bacarisse" <be********@bsb .me.ukwrote in message
news:87******** ****@bsb.me.uk. ..
"Malcolm McLean" <re*******@btin ternet.comwrite s:

Since you complained about what you see as "trivial" criticisms, I
raised a bigger issue which you have not commented on, so I'll ask
yet again: what use is your (non) queue implementation? How could it
do anything but baffle a student? If you are worried about c.l.c
topicality, why not re-post on comp.programmin g where the wider issues
can be discussed?
I didn't complain about trivial criticisms. If there is typo that means a
closing parenthesis is a bracket, that's worth knowing and pointing out.
What I dispute is that several trivial criticisms of this sort add up to a
major criticism. I suppose there is a point at which they do, but I don't
think we're anywhere near there.

A lot of algorithm books make a big meal of the basic data structures. I
checked the Java queue classes before writing this post, and they were
pretty complicated. I takes two interfaces, the Collection and the Iterable,
and there are nine implementations - linked list queues, delay queues, and
so on. Then you've got add methods, which throw execeptions if the queue is
full, and offer methods which return an error code, so that you can decide
whether the queue filling up is or is not part of normal operation. It's
quite an impressive interface.

However the structures chapter is onyl chapter two. I want to describe the
simplicity of the structure instead of its complexity. I could have provided
something similar to the Java classes, using void pointers and access
functions. But it is unlikely that anyone would use such a function suite.
If you want a trivial queue you have a buffer, a circular buffer if you
don't want the inefficiency of shifting everything up on every remove
operation. Just as you build linked lists by incorporating a "next" pointer
in your structure, and trees built as you go. Whether this says something
good or somethign bad about C is a moot point. The fact is that no generic
basic data structure libraries ahve gained acceptance. If I was writing the
book in Java it would be different.

The whole chapter says "these are are basic data structures you will need,
here are some code fragments which show how to implement them". The reader
should be able to extend the expandable array to produce a dynamically
expanding queue, add functions to check for length and spare capacity, and
so on. An "empty" function is pretty patronising. The priority queue,
implemented efficiently, is the red black tree, but it is appropriate to
hold that back. Actually it is possible to speed up the deletion, I
believe - a web seach on "priority queue" and "efficiency " turns up a slew
of algorithms. The book is Basic Algorithms, covering a wide range of topics
in outline, not a massively detailed description of any one.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 16 '07 #64

"Ben Pfaff" <bl*@cs.stanfor d.eduwrote in message
news:87******** ****@blp.benpfa ff.org...
"Malcolm McLean" <re*******@btin ternet.comwrite s:
>I could have provided something similar to the Java classes,
using void pointers and access functions. But it is unlikely
that anyone would use such a function suite. If you want a
trivial queue you have a buffer, a circular buffer if you don't
want the inefficiency of shifting everything up on every remove
operation.

A circular buffer can be encapsulated in C and still have some
benefits for code clarity, without reducing efficiency. At
least, that's my own evaluation of my own implementation of
circular queues in C:

http://cvs.savannah.gnu.org/viewvc/*...e=text%2Fplain

http://cvs.savannah.gnu.org/viewvc/*...e=text%2Fplain
Commentary is welcome of course.
>>
/* Initializes DEQUE as an empty deque of elements ELEM_SIZE
bytes in size, with an initial capacity of at least
CAPACITY. Returns the initial deque data array. */
void *
deque_init (struct deque *deque, size_t capacity, size_t elem_size)
{
void *data = NULL;
deque_init_null (deque);
if (capacity 0)
{
deque->capacity = 1;
while (deque->capacity < capacity)
deque->capacity <<= 1;
data = xnmalloc (deque->capacity, elem_size);
}
return data;
}

You need to check that capacity is less than 1 << (size_t_bits -1) or the
code will go into an infinite loop.
It's not a practical problem, of course, but it is typical of the sorts of
bugs that do creep into code. I was going to launch into an anti-size_t
tirade and the way in which it gives false promises of security, but I'm too
tired and I've got to prepare a mighty defence of another of my books which
is under attack on another ng.
--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Aug 16 '07 #65
while (deque->capacity < capacity)
deque->capacity <<= 1;
You need to check that capacity is less than 1 << (size_t_bits -1) or
the code will go into an infinite loop.
This is a good point. Thank you. However, there's no way to
solve the problem that doesn't violate the postcondition stated
in the function header, so I think I'll "solve" it by adding an
assertion.

--
char a[]="\n .CJacehknorstu" ;int putchar(int);in t main(void){unsi gned long b[]
={0x67dffdff,0x 9aa9aa6a,0xa77f fda9,0x7da6aa6a ,0xa67f6aaa,0xa a9aa9f6,0x11f6} ,*p
=b,i=24;for(;p+ =!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)bre ak;else default:continu e;if(0)case 1:putchar(a[i&15]);break;}}}
Aug 16 '07 #66
Malcolm McLean wrote:
>
.... snip ...
>
/* Initializes DEQUE as an empty deque of elements ELEM_SIZE
bytes in size, with an initial capacity of at least
CAPACITY. Returns the initial deque data array. */
void *
deque_init (struct deque *deque, size_t capacity, size_t elem_size)
{
void *data = NULL;

deque_init_null (deque);
if (capacity 0) {
deque->capacity = 1;
while (deque->capacity < capacity)
deque->capacity <<= 1;
data = xnmalloc (deque->capacity, elem_size);
}
return data;
}
Makes no sense, lacking descriptions of struct deque,
deque_init_null , xnmalloc.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 17 '07 #67
On Aug 13, 8:50 pm, Richard Heathfield <r...@see.sig.i nvalidwrote:
Two bucks in one day? Hmmm. I'm not talking to
you any more without my lawyer.
But I thought you had a huge supply of rounding-error
halfpennies from your days hacking bank computers :-)
But seriously, there are also advantages in making the table size an
integer power of 2 (which, in all but one case, fails the primality
test with a vengeance), since the hash value can then be reduced to the
table size more rapidly.
Gak! You're not coding ala Falconer are you?
(Doing a division on every reprobe, instead
of a simple add-compare-subtract.) Most hashers
will "want to" divide by a prime for scrambling
anyway, so the index reduction is free.

(BTW, when memory is at a premium one may wish
to expand tables, when required, by less than 100%,
precluding any guarantee of power-of-two.)

Speaking of Falconer, his method (linear probing
with second hash value used for probe increment)
*requires* (even more so than quadratic probing,
see upthread) prime table size.

In my posting I mentioned "interestin g" recent
methods like Cuckoo and Compact Hash and got
little response. Is this because the
"Google Groups" kill-files me? (Or, because
people have taken to kill-filing
"James Dow Allen" specifically?)
I'm not talking to
you any more without my lawyer.
Well, have a nice day anyway!
James

Aug 17 '07 #68
On Aug 16, 11:58 pm, James Dow Allen <jdallen2...@ya hoo.comwrote:
On Aug 13, 8:50 pm, Richard Heathfield <r...@see.sig.i nvalidwrote:
Two bucks in one day? Hmmm. I'm not talking to
you any more without my lawyer.

But I thought you had a huge supply of rounding-error
halfpennies from your days hacking bank computers :-)
But seriously, there are also advantages in making the table size an
integer power of 2 (which, in all but one case, fails the primality
test with a vengeance), since the hash value can then be reduced to the
table size more rapidly.

Gak! You're not coding ala Falconer are you?
(Doing a division on every reprobe, instead
of a simple add-compare-subtract.) Most hashers
will "want to" divide by a prime for scrambling
anyway, so the index reduction is free.
When table size is an integral power of 2, the address can be
calculated from the hash with a single AND operation.
(BTW, when memory is at a premium one may wish
to expand tables, when required, by less than 100%,
precluding any guarantee of power-of-two.)

Speaking of Falconer, his method (linear probing
with second hash value used for probe increment)
*requires* (even more so than quadratic probing,
see upthread) prime table size.

In my posting I mentioned "interestin g" recent
methods like Cuckoo and Compact Hash and got
little response. Is this because the
"Google Groups" kill-files me? (Or, because
people have taken to kill-filing
"James Dow Allen" specifically?)
I mentioned cuckoo hash also.
This one is easy to understand:
http://www.mpi-inf.mpg.de/~sanders/programs/cuckoo/
It is written in C++ but would be trivial to rewrite in C.

I do not know of any useful implementation of compact hashing that has
a license that would allow me to use it, and I have not read the paper
on compact hashing yet.
I'm not talking to
you any more without my lawyer.

Well, have a nice day anyway!
James

Aug 17 '07 #69
/* Here is that cuckoo hash function converted to C */
/* Dann Corbit did the quick & dirty C conversion. */
/* */
/* K-ary cuckoo hashing (c) 2002-2003 by Peter Sanders */
/* this is a simple implementation for the purposes */
/* of measuring the performance of cuckoo hashing in abstract */
/* terms (number of probes per insert, storage efficiency) */
/* usage: compile using g++ or another ISO C++ compiler */
/* a.out <K<n<r<repeat<s eed for rng*/
/* there is also a constant tMax that defines the maximum number */
/* of trials for an insertion */
/* allocates space for n elements in K subtables, and repeats */
/* the follwing measurements repeat time: */
/* - find a hash function by filling full lookup tables */
/* with pseudo-random numbers. */
/* - insert elements i=0..n-1 into the table until this fails */
/* for the first time. (The cost of these insertions is not */
/* measured.) */
/* Every n/r successfully inserted elements, the follwing */
/* measurement is repeated n/r times: */
/* * a random element i2 is deleted */
/* * the hash table entries for i2 are filled with new random values
*/
/* * i2 is reinserted. */
/* Note that this is equivalent to removing a random element */
/* inserting a new element that */
/* has never been in the table before. */
/* The output is a table that gives */
/* - x the number of elements in the table at each measuremnt interval
*/
/* - the average number of probes for an insertion during the
measruements */
/* for the given number of inserted elements */
/* - K */
/* - n */
/* - repeat */
/* - seed */
#define DEBUGLEVEL 1
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <time.h>
#include "util.h"
#include "mt-real.c"

#define split 1
const int tMax = 10000; /* max number of random walk trials */
int K; /* number of probes */
int n; /* number of elements */
int m; /* number of elements per table */
int **hash; /* K times n array */
int **table; /* K times m array */
double limit(double x, double bound)
{
if (x bound) {
return bound;
} else if (x < -bound) {
return -bound;
} else
return x;
}
double cpuTime()
{
return clock() * 1e-6;
}

/* generate random int in 0..x-1 */
int rand0K(int x)
{
return (int) (genrand() * x);
}

/* insert element i into table */
/* return value: */
/* -1 failure */
/* otherwise number of hash function evaluation */
int insert(int i)
{
int forbidden = -1;
int j = rand0K(K);
int t;
for (t = 1; t <= tMax; t++) {
int p = hash[j][i];
int newI = table[j][p];
table[j][p] = i; /* play the cuckoo */
if (newI == -1)
return t; /* done */
forbidden = j;
i = newI; /* find new home for cuckoo victim */
j = rand0K(K - 1);
if (j == forbidden)
j = K - 1;
}
return tMax + 1; /* insertion failed */
}

/* delete element i from the table */
void delete(int i)
{
int j;
for (j = 0; j < K; j++) {
int p = hash[j][i];
if (table[j][p] == i) {
table[j][p] = -1;
return;
}
}
}

int main(int argc, char **argv)
{
int i,
j;
int r;
int step;
int repeat;

int seed;
double *sumT;
int *cf;
int rep;

assert(argc == 6);
K = atoi(argv[1]); /* number of probes */
n = atoi(argv[2]); /* number of elements */
r = atoi(argv[3]); /* number of measured densities */
repeat = atoi(argv[4]); /* how often to start from scratch */
seed = atoi(argv[5]);
m = (int) (n / K + 0.5);
step = n / r;
sgenrand(seed);
puts("# x tAvg(x) K N repeat seed");

/* allocate hash function and table */
/* and an empty table */
hash = calloc(K, sizeof(int *));
table = calloc(K, sizeof(int *));
for (j = 0; j < K; j++) {
hash[j] = calloc(n, sizeof(int));
table[j] = calloc(m, sizeof(int));
}

/* initialize statistics */
/* sumT[i] is the average time for size i*step */
sumT = calloc(r + 1, sizeof(double)) ;
cf = calloc(r + 1, sizeof(int));
for (i = 0; i < r; i++) {
sumT[i] = 0;
cf[i] = 0;
}

/* main loop */
for (rep = 0; rep < repeat; rep++) {
/* init hash function and empty table */
for (j = 0; j < K; j++) {
for (i = 0; i < n; i++) {
hash[j][i] = rand0K(m);
}
for (i = 0; i < m; i++) {
table[j][i] = -1;
}
}

/* fill table and keep measuring from time to time */
for (i = 0; i < n; i++) {
if (insert(i) tMax)
break; /* table is full */
/* measure in detail here */
if (((i + 1) % step) == 0) {
int i1;
for (i1 = 0; i1 < step; i1++) {
int t;
/* delete & reinsert a random element */
int i2 = rand0K(i);
delete(i2);
for (j = 0; j < K; j++)
hash[j][i2] = rand0K(m);
t = insert(i2);
cf[i / step] += (t tMax);
/* cout << t << endl; */
sumT[i / step] += t;
}
}
}
}

for (rep = 0; rep < r; rep++) {
printf("%d %f %d %d %d %d %d\n",
rep * step + step,
sumT[rep] / step / repeat,
K,
n,
repeat,
seed,
cf[rep]);
}
return 0;
}

/*
mt-real.c follows:
*/
/* A C-program for MT19937B: Real number version */
/* genrand() generate one pseudorandom number with double precision
*/
/* which is uniformly distributed on [0,1]-interval for each call.
*/
/* sgenrand(seed) set initial values to the working area of 624
words.*/
/* sgenrand(seed) must be called once before calling genrand()
*/
/* (seed is any integer except 0).
*/

/*
LICENCE CONDITIONS:

Matsumoto and Nishimura consent to GNU General
Public Licence

NOTE:
When you use it in your program, please let Matsumoto
<ma******@math. keio.ac.jpknow it.

Because of a machine-trouble, Matsumoto lost emails
which arrived during May 28-29.
*/
#include<stdio. h>

/* Period parameters */
#define N 624
#define M 397
#define MATRIX_A 0x9908b0df /* constant vector a */
#define UPPER_MASK 0x80000000 /* most significant w-r bits */
#define LOWER_MASK 0x7fffffff /* least significant r bits */

/* for tempering */
#define TEMPERING_MASK_ B 0x9d2c5680
#define TEMPERING_MASK_ C 0xefc60000
#define TEMPERING_SHIFT _U(y) (y >11)
#define TEMPERING_SHIFT _S(y) (y << 7)
#define TEMPERING_SHIFT _T(y) (y << 15)
#define TEMPERING_SHIFT _L(y) (y >18)

static unsigned long ptgfsr[N]; /* set initial seeds: N = 624 words */

void
sgenrand(unsign ed long seed) /* seed should not be 0 */
{
int k;

/* setting initial seeds to ptgfsr[N] using */
/* the generator Line 25 of Table 1 in */
/* [KNUTH 1981, The Art of Computer Programming */
/* Vol. 2 (2nd Ed.), pp102] */

ptgfsr[0]= seed & 0xffffffff;
for (k=1; k<N; k++)
ptgfsr[k] = (69069 * ptgfsr[k-1]) & 0xffffffff;
}

double
genrand()
{
unsigned long y;
static int k = 1;
static unsigned long mag01[2]={0x0, MATRIX_A};
/* mag01[x] = x * MATRIX_A for x=0,1 */

if(k == N){ /* generate N words at one time */
int kk;
for (kk=0;kk<N-M;kk++) {
y = (ptgfsr[kk]&UPPER_MASK)|(p tgfsr[kk+1]&LOWER_MASK) ;
ptgfsr[kk] = ptgfsr[kk+M] ^ (y >1) ^ mag01[y & 0x1];
}
for (;kk<N-1;kk++) {
y = (ptgfsr[kk]&UPPER_MASK)|(p tgfsr[kk+1]&LOWER_MASK) ;
ptgfsr[kk] = ptgfsr[kk+(M-N)] ^ (y >1) ^ mag01[y & 0x1];
}
y = (ptgfsr[N-1]&UPPER_MASK)|(p tgfsr[0]&LOWER_MASK) ;
ptgfsr[N-1] = ptgfsr[M-1] ^ (y >1) ^ mag01[y & 0x1];

k = 0;
}

y = ptgfsr[k++];
y ^= TEMPERING_SHIFT _U(y);
y ^= TEMPERING_SHIFT _S(y) & TEMPERING_MASK_ B;
y ^= TEMPERING_SHIFT _T(y) & TEMPERING_MASK_ C;
y &= 0xffffffff; /* you may delete this line if word size = 32 */
y ^= TEMPERING_SHIFT _L(y);

return ( (double)y / (unsigned long)0xffffffff );
}

/*
util.h follows:
*/
/*
// this files contains all the application independent little
// functions and macros used for the optimizer.
// In particular Peters debug macros and Dags stuff
// from dbasic.h cdefs, random,...

//////////////// stuff originally from
debug.h ///////////////////////////////
// (c) 1997 Peter Sanders
// some little utilities for debugging adapted
// to the paros conventions
*/

#ifndef UTIL
#define UTIL
#ifndef Max
#define Max(x,y) ((x)>=(y)?(x):( y))
#endif

#ifndef Min
#define Min(x,y) ((x)<=(y)?(x):( y))
#endif

#ifndef Abs
#define Abs(x) ((x) < 0 ? -(x) : (x))
#endif

#ifndef PI
#define PI 3.1415926535
#endif

#endif

Aug 17 '07 #70

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
1373
by: Yat | last post by:
I want to use hash table in C. Have any build-in library in C so that I can use it without implement hash table myself? Thx
24
4313
by: kdotsky | last post by:
Hello, I am using some very large dictionaries with keys that are long strings (urls). For a large dictionary these keys start to take up a significant amount of memory. I do not need access to these keys -- I only need to be able to retrieve the value associated with a certain key, so I do not want to have the keys stored in memory. Could I just hash() the url strings first and use the resulting integer as the key? I think what I'm...
0
9683
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10457
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10231
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10176
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10013
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
6792
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5443
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
2
3733
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2927
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.