bjeremy wrote:
Quote:
Nikolaos D. Bougalis wrote:
Quote:
>bjeremy wrote:
Quote:
>>Douglas Dude wrote:
>>>Douglas Dude wrote:
>>>>how much time does it take to lok for a value - key in a hash_map ?
>>>>Thanks
>>>I mean the O-notation. Please help me because I think it is O(1). Is
>>>this correct ?
>>Lookups are O(1) on average
> Assuming that the size of the hash table is fixed and that the hash function
>equally distributes the data, then, on average, chains would be N/M items
>long. That suggests that the order of lookups is an O(N/M) operation.
>>
> Two cases can be said to be O(1): if M is a linear function of N which means
>that N/M = C, or if N is small and M is large. But it's really not accurate to
>suggest that a hash table can, in general, perform lookups in constant time.
>>
> For more detailed analysis, look up "Algorithms in C++" by Robert Sedgewick
>or volume 3 of "The Art of Computer Programming" by Donald Knuth.
>>
> -n
>
The hash buckets will be able to be accessed in constant time..
You can locate a bucket in constant time -- but, that in itself, is
meaningless. When dealing with chaining (although exactly the same argument
also applies to the other collision resolution strategies, and yields the same
conclusion) you end up having to iterate over the items that collided. And no
matter how you cut, slice or dice it, iterating a list of N items is an O(N)
operation. And such an operation cannot be constant.
Quote:
the number of M depends on the frequency of your collisions... If you
handle collisions through chaining your hash may degrade in performance
to a linear search over M elements (all those that mapped to the
particular N location)...
Degradation will happen regardless of your collision resolution strategy as
the load factor of the hash table goes up.
Quote:
But in practice average case you will
experience constant time lookup. Or if you like it better... In my
practice I have experienced constant time lookup before degradation. I
The thing is this: a hash table cuts search times by orders of magnitude
because it dramatically reduces the amount of items that must be inspected.
That can make it "feel" like a constant time operation. But that doesn't make
it a constant time operation.
It's common sense, really. Once you have more items that buckets, you
invariably end up having to perform some operations over N items, where N is
the number of items that have the same hash value. And no matter how one cuts,
slices or dices the operations in question, they will, at best, end up taking
linear time. And linear time is not constant time.
Of course, you could say "well, all my hash tables have 500 buckets and 1000
items, and every bucket has exactly 2 items, so according to your O(N/M) logic
that works out to O(2) which is a constant, so it's constant time! HA! HA!"
But really, you might as well say "my linked list always has a million items,
so iterating the list takes O(1000000) which means I can iterate my million
item list from start to finish in constant time." You might as well say that
because all pennies are coins and dimes are coins, 100 pennies and 100 dimes
are the same thing.
-n