What are the advantages and disadvantages of the tree-based std::set versus
the hash-based tr1::unordered_set?
set advantages:
1) iterators remain valid after insert and erase (except for iterators
to the erased element).
2) Sets can be compared using operator== or operator<. The STL set
functions, (set_union, set_intersection, etc), easily work on sets.
3) Can do searches based on ordering (for instance, given a set of
strings, it is easy to find all the elements that start with 'e').
set disadvantages:
1) Searches, insertions, and deletions are O(log n).
2) Depends on operator< or another ordering predicate.
unordered_set advantages:
1) Searches, insertions, and deletions are amortized O(1).
2) Iterators remain valid after erase (except for iterators to the erased
element).
unordered_set disadvantages:
1) Depends on a GOOD hash function, as well as operator== or another
equivalence predicate.
2) Iterators are invalidated by insertions (this can be avoided by
calling rehash() before loading the table.)
3) It is more difficult to compare unordered_sets or to do union,
intersection, etc.
One thing I'm not sure about is the space issue. Sets require three
pointers and at least a bool for balance information for every element.
Unordered_sets require at least O(bucket_count) space for the buckets, but I
don't know what the space overhead is for the buckets or how much extra
space is required for each element of a bucket.
Joe Gottman