By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
 464,596 Members | 984 Online
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 464,596 IT Pros & Developers. It's quick & easy.

# STL list.size() operation - O(1) or O(n) ?

 P: n/a Hi, I have had trouble determining whether the STL list.size() operation is O(1) or O(n). I know the list is a doubly-linked list, so if the size() operation begins at the head, then counts to the end - obviously its O(n). However, keeping track of inserts and deletes isn't that tough, so its conceivable (to me!) that size() could be O(1). For my application, the lists can be quite long (millions of elements), so the answer greatly impacts my design decisions. OR is this compiler dependent? (I am using gcc 3.2.2.) Thanks, Brett Jul 19 '05 #1
Share this Question
12 Replies

 P: n/a "Brett L. Moore" wrote... I have had trouble determining whether the STL list.size() operation is O(1) or O(n). I know the list is a doubly-linked list, so if the size() operation begins at the head, then counts to the end - obviously its O(n). However, keeping track of inserts and deletes isn't that tough, so its conceivable (to me!) that size() could be O(1). Actually, the Standard requires it to be of constant complexity. See table 65 and "Notes" after it. For my application, the lists can be quite long (millions of elements), so the answer greatly impacts my design decisions. OR is this compiler dependent? (I am using gcc 3.2.2.) No, it should not be compiler-dependent. Victor Jul 19 '05 #2

 P: n/a "Victor Bazarov" "Brett L. Moore" I have had trouble determining whether the STL list.size() operation is O(1) or O(n). I know the list is a doubly-linked list, so if the size() operation begins at the head, then counts to the end - obviously its O(n). However, keeping track of inserts and deletes isn't that tough, so its conceivable (to me!) that size() could be O(1). Actually, the Standard requires it to be of constant complexity. See table 65 and "Notes" after it. Are you sure? This would forsake constant time splicing. This does not correspond with waht Iread in Effective STL Item 4. For my application, the lists can be quite long (millions of elements), so the answer greatly impacts my design decisions. OR is this compiler dependent? (I am using gcc 3.2.2.) No, it should not be compiler-dependent. Again this does not correspond with what I read in Effective STL Item 4. Fraser. Jul 19 '05 #3

 P: n/a In article <12**************************@posting.google.com >, Brett L. Moore wrote: | Hi, | | I have had trouble determining whether the STL list.size() operation | is O(1) or O(n). I know the list is a doubly-linked list, so if the | size() operation begins at the head, then counts to the end - | obviously its O(n). However, keeping track of inserts and deletes | isn't that tough, so its conceivable (to me!) that size() could be | O(1). For my application, the lists can be quite long (millions of | elements), so the answer greatly impacts my design decisions. | | OR is this compiler dependent? (I am using gcc 3.2.2.) It is compiler dependent (unfortunately). The standard says this about size() on all of the containers: | should have constant complexity And actually the same note applies to the member swap function, and the max_size() member function. The "should" in the above quote does not mean "must". In standardeze it means "maybe, maybe not". In the case of list::swap, there are some implementations that have O(N) size. I'm not positive, but I think gcc may be one of them. I believe Dinkumware, and I know Metrowerks have a constant complexity list::size(). The subject has been debated on and off again over the years on comp.std.c++. You might try a search of that newsgroup for the words "list", "size" and maybe "complexity" for more information. A constant time size() forsake's constant time splicing in one out of the 5 possible splicing situations: splicing some elements from another list. That operation is (probably) O(some) if size is O(1). And the constant on the O(some) is quite small. The other 4 splicing situations remain O(1) even with an O(1) size: splice one from self splice one from other splice some from self splice all from other -- Howard Hinnant Metrowerks Jul 19 '05 #4

 P: n/a Brett L. Moore wrote: ... I have had trouble determining whether the STL list.size() operation is O(1) or O(n). I know the list is a doubly-linked list, so if the size() operation begins at the head, then counts to the end - obviously its O(n). However, keeping track of inserts and deletes isn't that tough, so its conceivable (to me!) that size() could be O(1). For my application, the lists can be quite long (millions of elements), so the answer greatly impacts my design decisions. OR is this compiler dependent? (I am using gcc 3.2.2.) ... This issue has been discussed at length in comp.std.c++. Try googling for it. In short - yes, it is implementation dependent. An implementation might choose to count list elements, thus making 'list<>::size' an O(1) operation, but the complexity of list-into-list splice will become O(n) in this case. Or it might choose to implement this splice as an O(1) operation, paying for it by O(n) for 'list<>::size'. -- Best regards, Andrey Tarasevich Brainbench C and C++ Programming MVP Jul 19 '05 #5

 P: n/a "Fraser Ross" wrote in message news:<3f******@news.greennet.net>... "Victor Bazarov" "Brett L. Moore" I have had trouble determining whether the STL list.size() operation is O(1) or O(n). I know the list is a doubly-linked list, so if the size() operation begins at the head, then counts to the end - obviously its O(n). However, keeping track of inserts and deletes isn't that tough, so its conceivable (to me!) that size() could be O(1). Actually, the Standard requires it to be of constant complexity. See table 65 and "Notes" after it. Are you sure? This would forsake constant time splicing. This does not correspond with waht Iread in Effective STL Item 4. It doesn't have to I don't think... You could probably rig it so that when you insert or remove something a counter in the list class is incremented or decremented. I don't know if that's actually how it's done, but it would probably be possible. Jul 19 '05 #6

 P: n/a Evan wrote: ... It doesn't have to I don't think... You could probably rig it so that when you insert or remove something a counter in the list class is incremented or decremented. I don't know if that's actually how it's done, but it would probably be possible. ... The problem is that when you splice a range of elements into a list you don't know the number of elements being inserted. And you don't really need to know it (which gives you a O(1) splicing), unless you want to update the element counter on-the-fly. If you choose to do the latter, you'll have to cope with the range-into-list splice being an O(n) operation. It is a trade-off. The implementation is forced to make a choice. -- Best regards, Andrey Tarasevich Brainbench C and C++ Programming MVP Jul 19 '05 #7

 P: n/a On Fri, 11 Jul 2003 14:52:33 -0700, Andrey Tarasevich wrote: Evan wrote: ... It doesn't have to I don't think... You could probably rig it so that when you insert or remove something a counter in the list class is incremented or decremented. I don't know if that's actually how it's done, but it would probably be possible. ...The problem is that when you splice a range of elements into a list youdon't know the number of elements being inserted. And you don't reallyneed to know it (which gives you a O(1) splicing), unless you want toupdate the element counter on-the-fly. If you choose to do the latter,you'll have to cope with the range-into-list splice being an O(n)operation.It is a trade-off. The implementation is forced to make a choice. Yes, but there are more choices than (A) counting the elements in the range when the splice is done, or (B) counting the elements in the result list every time its size is requested. For example, (C) keeping track of whether the size of a list is known or not, updating a known size whenever it can be done in constant time. Option (C) would defer a potentially very costly operation until it's actually needed, which with luck might be never. With option (C) size() would in general be O(1), unless you've recently spliced range of elements into the list (say) and not requested the list size after that, in which case it would be O(n); I'm not sure of the amortization involved here, it feels complicated whereas the concept is very simple. What it illustrates is, I believe, a design _defect_: the arguments of list::splice do not seem to offer (the implementation) any way to pass in the size of the range although that size might well be known by the client code. The decision seems to have been to "go safe" with a very low- level and general specification of a range. I think "go safe" is not compatible with a very low-level and general specification. In short, I think that that choice (efficiency versus safety) should be the library user's decision to make, not something forced and frozen, especially since there are so many other ways of introducing bugs. Cheers, - Alf Jul 19 '05 #8

 P: n/a Stuart Golodetz wrote: > ... > I have had trouble determining whether the STL list.size() operation > is O(1) or O(n). I know the list is a doubly-linked list, so if the > size() operation begins at the head, then counts to the end - > obviously its O(n). However, keeping track of inserts and deletes > isn't that tough, so its conceivable (to me!) that size() could be > O(1). For my application, the lists can be quite long (millions of > elements), so the answer greatly impacts my design decisions. > > OR is this compiler dependent? (I am using gcc 3.2.2.) > ... This issue has been discussed at length in comp.std.c++. Try googling for it. In short - yes, it is implementation dependent. An implementation might choose to count list elements, thus making 'list<>::size' an O(1) operation, but the complexity of list-into-list splice will become O(n) in this case. Or it might choose to implement this splice as an O(1) operation, paying for it by O(n) for 'list<>::size'. Why not have a flag noting whether or not a list-into-list splice has been carried out since the last call to size? Then if splices were never carried out, size would always be O(1), otherwise size would be O(n) for the call after the splice only, and then O(1) again. Then the cost of recalculating the size would be deferred until the next size call (i.e. you don't pay for recalculating the size unless you actually need to) and size's complexity would remain O(1) if you never did list-into-list splices (i.e. you don't pay for splicing if you don't use it). I don't know if any actual implementations do it this way? ... That would be a perfectly reasonable implementation technique. It won't change the formal complexity of these operations in general case, of course, but for most practical cases this approach looks very attractive. As I said before, this issue has already been discussed before. And not once. -- Best regards, Andrey Tarasevich Brainbench C and C++ Programming MVP Jul 19 '05 #9

 P: n/a In article <3f***************@News.CIS.DFN.DE>, Alf P. Steinbach wrote: | What it illustrates is, I believe, a design _defect_: the arguments | of list::splice do not seem to offer (the implementation) any way to pass | in the size of the range although that size might well be known by the | client code. | | The decision seems to have been to "go safe" with a very low- | level and general specification of a range. I think "go safe" is not | compatible with a very low-level and general specification. In short, | I think that that choice (efficiency versus safety) should be the | library user's decision to make, not something forced and frozen, | especially since there are so many other ways of introducing bugs. Agreed. It would be great to add an additional splice signature: void splice(iterator position, list& x, iterator first, iterator last, size_type n); Precondition: n == distance(first, last) I reviewed the places I'm using splice: Most of the time I'm splicing in an entire list, so this isn't an issue (it's always constant time). But in one place I'm splicing in a range. And in that particular example, I already know n before I call splice. It is a waste to have list::splice recompute it. ----------- Another aspect of this eternal debate that usually gets neglected is that an O(1) size can also aid several other list operations. It doesn't actually change the complexity of any of these other functions, but it does make them faster (sometimes significantly so), or enable strong exception safety in some circumstances: resize operator== operator!= operator= assign maybe sort For example, in resize you just have to flat out compute size if you don't already have it, so that you can decide whether you want to append or erase. In operator== you can check size first (if it is O(1)) and avoid the element to element compare if the size's are different. In operator= and assign you can use O(1) size to bump the exception safety from basic to strong if T::operator= doesn't throw, at no addtional computational or memory expense, while at the same time recycling existing nodes as needed. ----------- On the "lazy-size" concept: It seems to me like a significant amount of logic/checking/code in order to avoid this one operation under void splice(iterator position, list& x, iterator first, iterator last) { ... size_type delta_size = distance(first, last); ... } When you take into account increased code size, possibly increased per-container overhead, and possible increased run-time checking in most of the other list members, it seems like a heavy price to pay just to avoid computing the distance between first and last in this one function (which the standard says has linear complexity). ----------- The standard says that list::size() /should/ have constant complexity. It also says that deque::size() /should/ have constant complexity. Can you imagine the noise it would generate if someone decided to ship a deque (or vector or string or map) with a size() function that was O(N)? Personally I would rather get a compile time error if list::size() was O(N). At least then I would know that I need to distance(l.begin(), l.end()) and the computational cost would be clear and explicit. Same logic goes against list::operator[](size_type). It would be easy to implement, but unexpectedly expensive. ----------- All that being said, I'd still love to see: void splice(iterator position, list& x, iterator first, iterator last, size_type n); Precondition: n == distance(first, last) as an addition to the existing interface. I believe such an addition would settle this debate once and for all. And then we could make std::container::size() O(1), instead of "should be" O(1). -- Howard Hinnant Metrowerks Jul 19 '05 #10

 P: n/a On Sat, 12 Jul 2003 01:00:15 +0000, Howard Hinnant wrote: [snip]...... I agree with all of it, but: ----------- All that being said, I'd still love to see: void splice(iterator position, list& x, iterator first, iterator last, size_type n); Precondition: n == distance(first, last) as an addition to the existing interface. I believe such an addition would settle this debate once and for all. And then we could make std::container::size() O(1), instead of "should be" O(1). How would the client know n, again, distance would be a O(N) operation...... It would be better to just stick to the current solution. Actually, there are many ways to implement this particular splice O(N) for size() == (1). I'd like to know which one would be the best. Some that come to mind are: 1. Individually decrement size for the other list, and increment the current size, while splicing the individual elements (seems like inefficient). 2. Splice the range into a temporary list, and then the size for that list is known, so its fine. Again, here, we will have to use a special splice, which does not do anything to size, so it can be O(1). Then, we subtract, and add size() of the temporary list to the other list, and *this. 3. Same thing as (2), but in a loop. Regards, -Dhruv. Jul 19 '05 #11

 P: n/a In article , Dhruv wrote: | > ----------- | > | > All that being said, I'd still love to see: | > | > void splice(iterator position, list& x, | > iterator first, iterator last, size_type n); | > | > Precondition: n == distance(first, last) | > | > as an addition to the existing interface. I believe such an addition | > would settle this debate once and for all. And then we could make | > std::container::size() O(1), instead of "should be" O(1). | | How would the client know n, again, distance would be a O(N) | operation...... Often the work of defining the range to splice from is O(N) whether you count the number of elements or not. Please reread: In article <11************************@metrowerks.com>, Howard Hinnant wrote: | But in one place I'm splicing in a range. And in that particular | example, I already know n before I call splice. It is a waste to have | list::splice recompute it. In article , Dhruv continued: | It would be better to just stick to the current solution. In article <11************************@metrowerks.com>, Howard Hinnant wrote: | as an addition to the existing interface. How does this addition to the current interface make the situation worse? If you don't know n, just don't supply it. I'm not proposing to remove any existing signatures. And the addition does not create any ambiguities or other problems that I can see. void splice(iterator position, list& x); void splice(iterator position, list& x, iterator i); void splice(iterator position, list& x, iterator first, iterator last); void splice(iterator position, list& x, iterator first, iterator last, size_type n); // proposed | Actually, there | are many ways to implement this particular splice O(N) for size() == (1). | I'd like to know which one would be the best. | Some that come to mind are: | | 1. Individually decrement size for the other list, and increment the | current size, while splicing the individual elements (seems like | inefficient). Seems inefficient to me too. | 2. Splice the range into a temporary list, and then the size for that list | is known, so its fine. Again, here, we will have to use a special splice, | which does not do anything to size, so it can be O(1). Then, we subtract, | and add size() of the temporary list to the other list, and *this. How does the temporary list know the size? | 3. Same thing as (2), but in a loop. You've lost me on that one. The best way I know of to implement void splice(iterator position, list& x, iterator first, iterator last); (assuming O(1) size) is to first check if the range [first, last) is empty and if so, do nothing. Then if this != &x, compute the distance [first, last) and subtract that amount from x.size() and add it to this->size(). Then (regardless if this == &x or not) twiddle the links at first and at last to disconnect [first, last) from x, and connect it to *this. No need to visit each node, except for computing distance(first, last). No need for a temporary list. In the proposed variant, there is no need to compute distance(first, last) as it is supplied by the client. Though a debug build might compute distance(first, last) anyway just to make sure the client knows what he's talking about. -- Howard Hinnant Metrowerks Jul 19 '05 #12

 P: n/a On Mon, 14 Jul 2003 18:05:58 +0000, Howard Hinnant wrote: In article , Dhruv wrote: | > ----------- | > | > All that being said, I'd still love to see: | > | > void splice(iterator position, list& x, | > iterator first, iterator last, size_type n); | > | > Precondition: n == distance(first, last) | > | > as an addition to the existing interface. I believe such an addition | > would settle this debate once and for all. And then we could make | > std::container::size() O(1), instead of "should be" O(1). | | How would the client know n, again, distance would be a O(N) | operation...... Often the work of defining the range to splice from is O(N) whether you count the number of elements or not. Please reread: In article <11************************@metrowerks.com>, Howard Hinnant wrote: | But in one place I'm splicing in a range. And in that particular | example, I already know n before I call splice. It is a waste to have | list::splice recompute it. In article , Dhruv continued: | It would be better to just stick to the current solution. In article <11************************@metrowerks.com>, Howard Hinnant wrote: | as an addition to the existing interface. Oops, missed that part :-) How does this addition to the current interface make the situation worse? If you don't know n, just don't supply it. I'm not proposing to remove any existing signatures. And the addition does not create any ambiguities or other problems that I can see. void splice(iterator position, list& x); void splice(iterator position, list& x, iterator i); void splice(iterator position, list& x, iterator first, iterator last); void splice(iterator position, list& x, iterator first, iterator last, size_type n); // proposed | Actually, there | are many ways to implement this particular splice O(N) for size() == (1). | I'd like to know which one would be the best. | Some that come to mind are: | | 1. Individually decrement size for the other list, and increment the | current size, while splicing the individual elements (seems like | inefficient). Seems inefficient to me too. | 2. Splice the range into a temporary list, and then the size for that list | is known, so its fine. Again, here, we will have to use a special splice, | which does not do anything to size, so it can be O(1). Then, we subtract, | and add size() of the temporary list to the other list, and *this. How does the temporary list know the size? We provide a recalculate size function for the list, which will obviously be private. | 3. Same thing as (2), but in a loop. You've lost me on that one. I meant what you wrote. The best way I know of to implement void splice(iterator position, list& x, iterator first, iterator last); (assuming O(1) size) is to first check if the range [first, last) is empty and if so, do nothing. Then if this != &x, compute the distance [first, last) and subtract that amount from x.size() and add it to this->size(). Then (regardless if this == &x or not) twiddle the links at first and at last to disconnect [first, last) from x, and connect it to *this. No need to visit each node, except for computing distance(first, last). No need for a temporary list. In the proposed variant, there is no need to compute distance(first, last) as it is supplied by the client. Though a debug build might compute distance(first, last) anyway just to make sure the client knows what he's talking about. Ok, so when the client does know the size, it will be helpful. I was thinking that it would be highly unprobale that the client does know the distance between 2 nodes, because at least I haven't encountered any such situation. Maybe you have? Most of the time, I've been splicing a single element or, the whole list (ops. for which size is known), and when I splice in a subrange, then I have no idea of the size. Of cource, if someone does know the size, then this addition will help then a lot. Regards, -Dhruv. Jul 19 '05 #13

### This discussion thread is closed

Replies have been disabled for this discussion.