bjeremy wrote:

Nikolaos D. Bougalis wrote:
>bjeremy wrote:
>>Douglas Dude wrote:

Douglas Dude wrote:

how much time does it take to lok for a value - key in a hash_map ?

Thanks

I mean the O-notation. Please help me because I think it is O(1). Is

this correct ?

Lookups are O(1) on average

Assuming that the size of the hash table is fixed and that the hash function

equally distributes the data, then, on average, chains would be N/M items

long. That suggests that the order of lookups is an O(N/M) operation.

Two cases can be said to be O(1): if M is a linear function of N which means

that N/M = C, or if N is small and M is large. But it's really not accurate to

suggest that a hash table can, in general, perform lookups in constant time.

For more detailed analysis, look up "Algorithms in C++" by Robert Sedgewick

or volume 3 of "The Art of Computer Programming" by Donald Knuth.

-n

The hash buckets will be able to be accessed in constant time..

You can locate a bucket in constant time -- but, that in itself, is

meaningless. When dealing with chaining (although exactly the same argument

also applies to the other collision resolution strategies, and yields the same

conclusion) you end up having to iterate over the items that collided. And no

matter how you cut, slice or dice it, iterating a list of N items is an O(N)

operation. And such an operation cannot be constant.

the number of M depends on the frequency of your collisions... If you

handle collisions through chaining your hash may degrade in performance

to a linear search over M elements (all those that mapped to the

particular N location)...

Degradation will happen regardless of your collision resolution strategy as

the load factor of the hash table goes up.

But in practice average case you will

experience constant time lookup. Or if you like it better... In my

practice I have experienced constant time lookup before degradation. I

The thing is this: a hash table cuts search times by orders of magnitude

because it dramatically reduces the amount of items that must be inspected.

That can make it "feel" like a constant time operation. But that doesn't make

it a constant time operation.

It's common sense, really. Once you have more items that buckets, you

invariably end up having to perform some operations over N items, where N is

the number of items that have the same hash value. And no matter how one cuts,

slices or dices the operations in question, they will, at best, end up taking

linear time. And linear time is not constant time.

Of course, you could say "well, all my hash tables have 500 buckets and 1000

items, and every bucket has exactly 2 items, so according to your O(N/M) logic

that works out to O(2) which is a constant, so it's constant time! HA! HA!"

But really, you might as well say "my linked list always has a million items,

so iterating the list takes O(1000000) which means I can iterate my million

item list from start to finish in constant time." You might as well say that

because all pennies are coins and dimes are coins, 100 pennies and 100 dimes

are the same thing.

-n