449,002 Members | 1,425 Online
Need help? Post your question and get tips & solutions from a community of 449,002 IT Pros & Developers. It's quick & easy.

# distance between two iterators in map

 P: n/a In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? Thanks, --j [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 8 '06 #1
19 Replies

 P: n/a John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? First of all, are you sure you have to modify 'std::map'? Second, yes, you can and may inherit from 'std::map'. However, you will then need to also modify/inherit 'std::map::iterator'. Third, perhaps you should contact your library implementors and suggest any improvements you want them to make to the library. V -- Please remove capital As from my address when replying by mail Mar 8 '06 #2

 P: n/a Victor Bazarov wrote: John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? Maybe from the standard [24.3.4/1]. [snip} Best Kai-Uwe Bux Mar 8 '06 #3

 P: n/a Kai-Uwe Bux wrote: Victor Bazarov wrote: John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? Maybe from the standard [24.3.4/1]. Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard for bidirectional iterators. While 'std::map's iterator is of 'bidirectional' category, nothing really prevents its real-world implementation from providing better than linear 'advance' and 'distance', does it? John says "implementation ... takes O(K)". Which implementation? Whose implementation? Catch my drift? V -- Please remove capital As from my address when replying by mail Mar 8 '06 #4

 P: n/a Victor Bazarov wrote: Kai-Uwe Bux wrote: Victor Bazarov wrote: John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? Maybe from the standard [24.3.4/1]. Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard for bidirectional iterators. Where in the wording do you find support for that opinion? The standard says: [...], the library provides two function templates advance and distance. ...; for input, forward, and bidirectional iterators they use ++ to provide linear time implementations. Seems to be pretty specific in saying what std::distance() is supposed to do. While 'std::map's iterator is of 'bidirectional' category, nothing really prevents its real-world implementation from providing better than linear 'advance' and 'distance', does it? Not sure, but maybe you can direct me to the point in the standard where it allows to do better. There could be some clause the standard stating that implementations are always allowed to provide specializations for their own containers/iterators that do better than the generic implementation specified for some algorithm. However, I am not aware of such language and would be grateful for a pointer. John says "implementation ... takes O(K)". Which implementation? Whose implementation? Catch my drift? Best Kai-Uwe Bux Mar 8 '06 #5

 P: n/a Victor Bazarov wrote: John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? I guessed it :) That is the easy way out for coders, and I'm almost sure STL implementors at SGI took it (Prove me wrong! :) I wud be glad) In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? First of all, are you sure you have to modify 'std::map'? Unless someone tells me otherwise that distance can be calculated in O(log K) time on map. Second, yes, you can and may inherit from 'std::map'. However, you will then need to also modify/inherit 'std::map::iterator'. I think it will be a mess if I try to do that, I'll have to modify everything...inserts deletes...what is stored in the map structure... Thats why I asked, is there an easy way out... Third, perhaps you should contact your library implementors and suggest any improvements you want them to make to the library. You mean SGI? Since when did they start to listen to common folks? And not everyone needs fast "distance" computation between iterators. So not sure if this is worth the effort in the STL. Thanks, --j [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 8 '06 #6

 P: n/a In article <1141772730.133338.151370 @p10g2000cwp.googlegroups.com>, we**********@yahoo.com says... [ ... ] In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? I can see how you could make this proportional to something close to O(log n) in a perfectly balanced tree. I can see how you could derive some sort of rough estimate in O(Log n) on an RB tree as well, but I can't see how you'd do an accurate distance in O(log N) on an RB tree (or any other somewhat-imbalanced tree such as AVL). -- Later, Jerry. The universe is a figment of its own imagination. [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 8 '06 #7

 P: n/a Kai-Uwe Bux wrote: Victor Bazarov wrote: Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard for bidirectional iterators. Where in the wording do you find support for that opinion? The standard says: [...], the library provides two function templates advance and distance. ...; for input, forward, and bidirectional iterators they use ++ to provide linear time implementations. Seems to be pretty specific in saying what std::distance() is supposed to do. An implementation is always free to do better than the specification. This is covered by the "as if"-rule: a user cannot detect whether the implementation really used 'operator++()' on the bidirectional iterator. Thus, an implementation can do whatever it wants as long as it meets the explicit constraints like being no worse than linear. Actually, some operations are specified by code snippets but the implementer of the library is not at all bound to use code which is even similar to these! -- - Efficient Artificial Intelligence Mar 8 '06 #8

 P: n/a Dietmar Kuehl wrote: Kai-Uwe Bux wrote: Victor Bazarov wrote: Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard for bidirectional iterators. Where in the wording do you find support for that opinion? The standard says: [...], the library provides two function templates advance and [distance. ...; for input, forward, and bidirectional iterators they use ++ to provide linear time implementations. Seems to be pretty specific in saying what std::distance() is supposed to do. An implementation is always free to do better than the specification. This is covered by the "as if"-rule: a user cannot detect whether the implementation really used 'operator++()' on the bidirectional iterator. Thus, an implementation can do whatever it wants as long as it meets the explicit constraints like being no worse than linear. I pondered for a while whether the as if rule covers this case. However, I think that, as an interpretation of the standard, it is stretching the as if rule a little too far. The as if rule is concerned with the "semantic descriptions" and not with performance requirements. If you can quote the as if rule to do better than linear (because performance is not observable by definition of the standard), then why can't you take it as license to do worse than linear (because performance is not observable)? I should add that I certainly would want the standard to allow for doing better and for ruling out doing worse; also I would consider anything else a bug in the standard. Moreover, I am pretty sure that the intend of the standard (or its authors) goes in this direction. It just so happens that I cannot match the intent with the actual wording of the standard. Anyway, this discussion should be purely academic since (a) no sensible person would complain if an implementation does better than the standard requires/permits and (b) nobody would want to change the wording of the standard just to license better implementations running the risk of introducing worse inconsistencies. Best Kai-Uwe Bux Mar 9 '06 #9

 P: n/a Kai-Uwe Bux wrote: Dietmar Kuehl wrote: An implementation is always free to do better than the specification. This is covered by the "as if"-rule: a user cannot detect whether the implementation really used 'operator++()' on the bidirectional iterator. Thus, an implementation can do whatever it wants as long as it meets the explicit constraints like being no worse than linear. I pondered for a while whether the as if rule covers this case. However, I think that, as an interpretation of the standard, it is stretching the as if rule a little too far. The as if rule is concerned with the "semantic descriptions" Right. and not with performance requirements. Well, not exactly. It is pretty clear that if there is performance requirement, it has to be met for an implementation. The "as if" rule does not touch this. If you can quote the as if rule to do better than linear (because performance is not observable by definition of the standard), then why can't you take it as license to do worse than linear (because performance is not observable)? The performance requirements are always of the form "no worse than". They are not of the form "exactly this performance characteristic". The notation O(f(n)) just states that the asymptotic growth is less than 'c * f(n)'. If 'f(n)' is a linear function, e.g. 'n', the requirement is also met if the actual function is 'ln n'. IMO this settles the case. -- - Efficient Artificial Intelligence Mar 9 '06 #10

 P: n/a Kai-Uwe Bux wrote: I should add that I certainly would want the standard to allow for doing better and for ruling out doing worse; also I would consider anything else a bug in the standard. Moreover, I am pretty sure that the intend of the standard (or its authors) goes in this direction. It just so happens that I cannot match the intent with the actual wording of the standard. 17.3.1.3/5: "Complexity requirements specified in the library clauses are upper bounds, and implementations that provide better complexity guarantees satisfy the requirements." -- Pete Becker Roundhouse Consulting, Ltd. Mar 9 '06 #11

 P: n/a John wrote: Victor Bazarov wrote: John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. Where did you get this information? I guessed it :) That is the easy way out for coders, and I'm almost sure STL implementors at SGI took it (Prove me wrong! :) I wud be glad) You don't have to guess -- in fact, in cases like these, you shouldn't guess. The standard says that the iterators for std::map are bidirectional iterators. Bidirectional iterators don't support addition or subtraction, and the distance algorithm is specified to be O(k) for them. In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? First of all, are you sure you have to modify 'std::map'? Unless someone tells me otherwise that distance can be calculated in O(log K) time on map. I don't think it can be, off hand. Or rather, support for doing so would increase the memory footprint of the map, and increase times for other operations, like insertion or deletion. Which isn't necessarily acceptable. Second, yes, you can and may inherit from 'std::map'. However, you will then need to also modify/inherit 'std::map::iterator'. I think it will be a mess if I try to do that, I'll have to modify everything...inserts deletes...what is stored in the map structure... Thats why I asked, is there an easy way out... I can't see anything less than implementing the whole thing yourself. You need a more complex data structure for the nodes, which affects just about everything else. -- James Kanze GABI Software Conseils en informatique orientée objet/ Beratung in objektorientierter Datenverarbeitung 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 9 '06 #12

 P: n/a You can subtract tree iterators in O(log N) time if you store the left subtree size at each node. This will allow the ordinal position of a node to be computed by adding the sizes of subtrees that precede the node. For example, here is an in-order traversal of a tree, with the left-subtree sizes shown. A (left = 0) C (left = 1) E (left = 0) F (left = 0) G (left = 4) H (left = 0) J (left = 1) K (left = 0) M (left = 1) N (left = 3) Q (left = 0) R (left = 1) S (left = 0) T (left = 0) To locate node "M" we pass through nodes G and J, and compute the index of "M" as: 4 (G's left subtree) + 1 (G) + 1 (J's left subtree) + 1 (J) + 1 (M's left subtree) = 8. This computation is O(log N). Similarly, the left subtree sizes can be updated in O(log N) time as nodes are inserted or removed. [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 9 '06 #13

 P: n/a Pete Becker wrote: Kai-Uwe Bux wrote: I should add that I certainly would want the standard to allow for doing better and for ruling out doing worse; also I would consider anything else a bug in the standard. Moreover, I am pretty sure that the intend of the standard (or its authors) goes in this direction. It just so happens that I cannot match the intent with the actual wording of the standard. 17.3.1.3/5: "Complexity requirements specified in the library clauses are upper bounds, and implementations that provide better complexity guarantees satisfy the requirements." Thanks, that settles it. Best Kai-Uwe Bux Mar 9 '06 #14

 P: n/a In article <1141921279.131237.47120 @z34g2000cwc.googlegroups.com>, ke*********@gmail.com says... You can subtract tree iterators in O(log N) time if you store the left subtree size at each node. Yes, storing extra data allows you to do it. That, of course, raises the question of whether it's worth it. This data would have to be updated at essentially every rotation, so you'd have to know that distance was going to be used quite a bit to justify it. Considering the number of times I've (personally) seen distance used on maps, I doubt it's justified in any but a few relatively special cases -- though I may be in the minority. -- Later, Jerry. The universe is a figment of its own imagination. [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 10 '06 #15

 P: n/a James Kanze wrote: You don't have to guess -- in fact, in cases like these, you shouldn't guess. The standard says that the iterators for std::map are bidirectional iterators. Bidirectional iterators don't support addition or subtraction, and the distance algorithm is specified to be O(k) for them. Of course the library is perfectly entitled to make std::distance *better* than O(k) if it can. It already has to handle the special case that the iterators are random access iterators, so adding another special case for std::map iterators ought to be easy. I don't think it can be, off hand. Or rather, support for doing so would increase the memory footprint of the map, and increase times for other operations, like insertion or deletion. Which isn't necessarily acceptable. Agreed -- I can't see how inserting with a (correct) hint and erasing a single iterator can be amortized constant time. I *think* the rest of the requirements can be satisfied, though, as the obvious implementation simply adds another O(log N) term to an already O(log N) operation. The standard aside, though, the benefits of O(1) deletion and, in some circumstances, insertion, with O(N) iterator differences certainly seems a better general purpose design than O(log N) insertion, deletion, and iterator differences. I can't think of many situations when I've wanted to call std::distance on map iterators -- the size of the range returned by equal_range is the obvious case, and in this case the distance is often small. -- Richard Smith [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 10 '06 #16

 P: n/a John wrote: In STL's map implementation, the distance between two iterators it1, it2 takes O(K) to compute where K is the actual distance between the two iterators. In theory, a red black tree can do this in O(log K) time. Anyone knows if there is a way to do this in map<> or if map<> can be inherited/modified to do this? Thanks, --j [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] It's my understanding that most STL implementations use Red Black trees to implement map. Naturally the time taken to calculate the distance between 2 iterators is dependent on how the container is implemented. Now, as I understand it std::list normally uses a linked list making it fast when doing a lot of insertions and deletions of elements, but slow for calculating the distance between 2 iterators. Whereas std::vector normally effectively uses an array which gives fast random access and iterator distance calculation but slow insertion and deletion operations. So any design should pick the STL container best suited for the task. Finally, as you probably already know re balancing a Red Black tree on insertion and deletion operations is quicker than an AVL tree. JB Mar 10 '06 #17

 P: n/a Richard Smith wrote: Of course the library is perfectly entitled to make std::distance *better* than O(k) if it can. It already has to handle the special case that the iterators are random access iterators, so adding another special case for std::map iterators ought to be easy. Random access iterators can be recognized because iterator_traits::iterator_category is random_access_iterator_tag. There is no such easy identification of a map iterator. It's just a bidirectional iterator. -- Pete Becker Roundhouse Consulting, Ltd. [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 10 '06 #18

 P: n/a Pete Becker wrote: Random access iterators can be recognized because iterator_traits::iterator_category is random_access_iterator_tag. There is no such easy identification of a map iterator. It's just a bidirectional iterator. .... and still the library implementation can [probably] detect its iterator type used for 'std::map's: if the iterator type is not nested within the 'std::map' class template it is fairly easy to detect and a special overload for the 'std::distance()' function template can take advantage of the known structure, e.g.: namespace std { // ... template struct _Map_iterator { // funny stuff dealing with the iterator }; template struct map { typedef _Map_iterator<_K, _V, _C, _A> iterator; }; template std::size_t // just a short-hand; might use iterator_traits distance(_Map_iterator<_K, _V, _C, _A> begin, _Map_iterator<_K, _V, _C, _A> end) { // do fancy ln n algorithm to determine the distance } } There is no need to have special machinery for detecting the iterators. It is actually not even a really generic algorithm although it is a template: it just operates on a fixed data structure and could possibly even work on a non-templatized base class of the 'std::map's nodes which represent the links in the tree (although I'm not sure I would realize things this way because it sacrifices some type-safety; of course, if profiling shows that it is indicated, I would do it). That said, I'm not sure that any optimization of computing the distance or moving a pointer on 'std::map's is warranted: it is a rare use and to prepare the data structure for this would cost some performance even if the operations are not used. Thus, people would pay for something they don't really use. An optimization like this might be reasonable for a special purpose map where it is common that iterators are advanced in large step or the distance between iterators is frequently computed. -- - Efficient Artificial Intelligence [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 11 '06 #19

 P: n/a Pete Becker wrote: Richard Smith wrote: Of course the library is perfectly entitled to make std::distance *better* than O(k) if it can. It already has to handle the special case that the iterators are random access iterators, so adding another special case for std::map iterators ought to be easy. Random access iterators can be recognized because iterator_traits::iterator_category is random_access_iterator_tag. There is no such easy identification of a map iterator. It's just a bidirectional iterator. But I'm talking about what the library implementation might choose to do. If the implementation of std::map happens to allow a fast implemention of std::distance, then it would be easy to recognise the map's iterator type and overload accordingly: just overload one of its internal implementation functions and rely on partial ordering to find it. The library implementators can reliably do this even if an end user cannot. -- Richard Smith [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] Mar 13 '06 #20

### This discussion thread is closed

Replies have been disabled for this discussion.