STL list.size() operation - O(1) or O(n) ?

Brett L. Moore

Hi,

I have had trouble determining whether the STL list.size() operation
is O(1) or O(n). I know the list is a doubly-linked list, so if the
size() operation begins at the head, then counts to the end -
obviously its O(n). However, keeping track of inserts and deletes
isn't that tough, so its conceivable (to me!) that size() could be
O(1). For my application, the lists can be quite long (millions of
elements), so the answer greatly impacts my design decisions.

OR is this compiler dependent? (I am using gcc 3.2.2.)

Thanks, Brett

Jul 19 '05 #1

Subscribe Post Reply

10879

Victor Bazarov

"Brett L. Moore" <br*********@ttu.edu> wrote...

I have had trouble determining whether the STL list.size() operation
is O(1) or O(n). I know the list is a doubly-linked list, so if the
size() operation begins at the head, then counts to the end -
obviously its O(n). However, keeping track of inserts and deletes
isn't that tough, so its conceivable (to me!) that size() could be
O(1).
Actually, the Standard requires it to be of constant complexity.
See table 65 and "Notes" after it.
For my application, the lists can be quite long (millions of
elements), so the answer greatly impacts my design decisions.

OR is this compiler dependent? (I am using gcc 3.2.2.)

No, it should not be compiler-dependent.

Victor

Jul 19 '05 #2

Fraser Ross

"Victor Bazarov"

"Brett L. Moore"
I have had trouble determining whether the STL list.size() operation
is O(1) or O(n). I know the list is a doubly-linked list, so if the
size() operation begins at the head, then counts to the end -
obviously its O(n). However, keeping track of inserts and deletes
isn't that tough, so its conceivable (to me!) that size() could be
O(1).

Actually, the Standard requires it to be of constant complexity.
See table 65 and "Notes" after it.

Are you sure? This would forsake constant time splicing. This does not
correspond with waht Iread in Effective STL Item 4.

For my application, the lists can be quite long (millions of
elements), so the answer greatly impacts my design decisions.

OR is this compiler dependent? (I am using gcc 3.2.2.)

No, it should not be compiler-dependent.

Again this does not correspond with what I read in Effective STL Item 4.

Fraser.

Jul 19 '05 #3

Howard Hinnant

In article <12**************************@posting.google.com >, Brett L.
Moore <br*********@ttu.edu> wrote:

| Hi,
|
| I have had trouble determining whether the STL list.size() operation
| is O(1) or O(n). I know the list is a doubly-linked list, so if the
| size() operation begins at the head, then counts to the end -
| obviously its O(n). However, keeping track of inserts and deletes
| isn't that tough, so its conceivable (to me!) that size() could be
| O(1). For my application, the lists can be quite long (millions of
| elements), so the answer greatly impacts my design decisions.
|
| OR is this compiler dependent? (I am using gcc 3.2.2.)

It is compiler dependent (unfortunately).

The standard says this about size() on all of the containers:

| should have constant complexity

And actually the same note applies to the member swap function, and the
max_size() member function.

The "should" in the above quote does not mean "must". In standardeze
it means "maybe, maybe not".

In the case of list::swap, there are some implementations that have
O(N) size. I'm not positive, but I think gcc may be one of them. I
believe Dinkumware, and I know Metrowerks have a constant complexity
list::size().

The subject has been debated on and off again over the years on
comp.std.c++. You might try a search of that newsgroup for the words
"list", "size" and maybe "complexity" for more information.

A constant time size() forsake's constant time splicing in one out of
the 5 possible splicing situations: splicing some elements from
another list. That operation is (probably) O(some) if size is O(1).
And the constant on the O(some) is quite small. The other 4 splicing
situations remain O(1) even with an O(1) size:

splice one from self
splice one from other
splice some from self
splice all from other

--
Howard Hinnant
Metrowerks

Jul 19 '05 #4

Andrey Tarasevich

Brett L. Moore wrote:

...
I have had trouble determining whether the STL list.size() operation
is O(1) or O(n). I know the list is a doubly-linked list, so if the
size() operation begins at the head, then counts to the end -
obviously its O(n). However, keeping track of inserts and deletes
isn't that tough, so its conceivable (to me!) that size() could be
O(1). For my application, the lists can be quite long (millions of
elements), so the answer greatly impacts my design decisions.

OR is this compiler dependent? (I am using gcc 3.2.2.)
...

This issue has been discussed at length in comp.std.c++. Try googling
for it.

In short - yes, it is implementation dependent. An implementation might
choose to count list elements, thus making 'list<>::size' an O(1)
operation, but the complexity of list-into-list splice will become O(n)
in this case. Or it might choose to implement this splice as an O(1)
operation, paying for it by O(n) for 'list<>::size'.

--
Best regards,
Andrey Tarasevich
Brainbench C and C++ Programming MVP

Jul 19 '05 #5

Evan

"Fraser Ross" <fraserATmembers.v21.co.unitedkingdom> wrote in message news:<3f******@news.greennet.net>...

"Victor Bazarov"
"Brett L. Moore"
I have had trouble determining whether the STL list.size() operation
is O(1) or O(n). I know the list is a doubly-linked list, so if the
size() operation begins at the head, then counts to the end -
obviously its O(n). However, keeping track of inserts and deletes
isn't that tough, so its conceivable (to me!) that size() could be
O(1).

Actually, the Standard requires it to be of constant complexity.
See table 65 and "Notes" after it.

Are you sure? This would forsake constant time splicing. This does not
correspond with waht Iread in Effective STL Item 4.

It doesn't have to I don't think... You could probably rig it so that
when you insert or remove something a counter in the list class is
incremented or decremented. I don't know if that's actually how it's
done, but it would probably be possible.

Jul 19 '05 #6

Andrey Tarasevich

Evan wrote:

...
It doesn't have to I don't think... You could probably rig it so that
when you insert or remove something a counter in the list class is
incremented or decremented. I don't know if that's actually how it's
done, but it would probably be possible.
...

The problem is that when you splice a range of elements into a list you
don't know the number of elements being inserted. And you don't really
need to know it (which gives you a O(1) splicing), unless you want to
update the element counter on-the-fly. If you choose to do the latter,
you'll have to cope with the range-into-list splice being an O(n)
operation.

It is a trade-off. The implementation is forced to make a choice.

--
Best regards,
Andrey Tarasevich
Brainbench C and C++ Programming MVP

Jul 19 '05 #7

Alf P. Steinbach

On Fri, 11 Jul 2003 14:52:33 -0700, Andrey Tarasevich <an**************@hotmail.com> wrote:

Evan wrote:
...
It doesn't have to I don't think... You could probably rig it so that
when you insert or remove something a counter in the list class is
incremented or decremented. I don't know if that's actually how it's
done, but it would probably be possible.
...

The problem is that when you splice a range of elements into a list you
don't know the number of elements being inserted. And you don't really
need to know it (which gives you a O(1) splicing), unless you want to
update the element counter on-the-fly. If you choose to do the latter,
you'll have to cope with the range-into-list splice being an O(n)
operation.

It is a trade-off. The implementation is forced to make a choice.

Yes, but there are more choices than (A) counting the elements in the
range when the splice is done, or (B) counting the elements in the
result list every time its size is requested. For example, (C) keeping
track of whether the size of a list is known or not, updating a known
size whenever it can be done in constant time.

Option (C) would defer a potentially very costly operation until it's
actually needed, which with luck might be never.

With option (C) size() would in general be O(1), unless you've recently
spliced range of elements into the list (say) and not requested the list
size after that, in which case it would be O(n); I'm not sure of the
amortization involved here, it feels complicated whereas the concept is
very simple.

What it illustrates is, I believe, a design _defect_: the arguments
of list::splice do not seem to offer (the implementation) any way to pass
in the size of the range although that size might well be known by the
client code.

The decision seems to have been to "go safe" with a very low-
level and general specification of a range. I think "go safe" is not
compatible with a very low-level and general specification. In short,
I think that that choice (efficiency versus safety) should be the
library user's decision to make, not something forced and frozen,
especially since there are so many other ways of introducing bugs.
Cheers,

- Alf

Jul 19 '05 #8

Andrey Tarasevich

Stuart Golodetz wrote:

> ...
> I have had trouble determining whether the STL list.size() operation
> is O(1) or O(n). I know the list is a doubly-linked list, so if the
> size() operation begins at the head, then counts to the end -
> obviously its O(n). However, keeping track of inserts and deletes
> isn't that tough, so its conceivable (to me!) that size() could be
> O(1). For my application, the lists can be quite long (millions of
> elements), so the answer greatly impacts my design decisions.
>
> OR is this compiler dependent? (I am using gcc 3.2.2.)
> ...

This issue has been discussed at length in comp.std.c++. Try googling
for it.

In short - yes, it is implementation dependent. An implementation might
choose to count list elements, thus making 'list<>::size' an O(1)
operation, but the complexity of list-into-list splice will become O(n)
in this case. Or it might choose to implement this splice as an O(1)
operation, paying for it by O(n) for 'list<>::size'.

Why not have a flag noting whether or not a list-into-list splice has been
carried out since the last call to size? Then if splices were never carried
out, size would always be O(1), otherwise size would be O(n) for the call
after the splice only, and then O(1) again. Then the cost of recalculating
the size would be deferred until the next size call (i.e. you don't pay for
recalculating the size unless you actually need to) and size's complexity
would remain O(1) if you never did list-into-list splices (i.e. you don't
pay for splicing if you don't use it). I don't know if any actual
implementations do it this way?
...

That would be a perfectly reasonable implementation technique. It won't
change the formal complexity of these operations in general case, of
course, but for most practical cases this approach looks very
attractive. As I said before, this issue has already been discussed
before. And not once.

--
Best regards,
Andrey Tarasevich
Brainbench C and C++ Programming MVP

Jul 19 '05 #9

Howard Hinnant

In article <3f***************@News.CIS.DFN.DE>, Alf P. Steinbach
<al***@start.no> wrote:

| What it illustrates is, I believe, a design _defect_: the arguments
| of list::splice do not seem to offer (the implementation) any way to pass
| in the size of the range although that size might well be known by the
| client code.
|
| The decision seems to have been to "go safe" with a very low-
| level and general specification of a range. I think "go safe" is not
| compatible with a very low-level and general specification. In short,
| I think that that choice (efficiency versus safety) should be the
| library user's decision to make, not something forced and frozen,
| especially since there are so many other ways of introducing bugs.

Agreed. It would be great to add an additional splice signature:

void splice(iterator position, list& x, iterator first, iterator last,
size_type n);

Precondition: n == distance(first, last)

I reviewed the places I'm using splice: Most of the time I'm splicing
in an entire list, so this isn't an issue (it's always constant time).
But in one place I'm splicing in a range. And in that particular
example, I already know n before I call splice. It is a waste to have
list::splice recompute it.

-----------

Another aspect of this eternal debate that usually gets neglected is
that an O(1) size can also aid several other list operations. It
doesn't actually change the complexity of any of these other functions,
but it does make them faster (sometimes significantly so), or enable
strong exception safety in some circumstances:

resize
operator==
operator!=
operator=
assign
maybe sort

For example, in resize you just have to flat out compute size if you
don't already have it, so that you can decide whether you want to
append or erase.

In operator== you can check size first (if it is O(1)) and avoid the
element to element compare if the size's are different.

In operator= and assign you can use O(1) size to bump the exception
safety from basic to strong if T::operator= doesn't throw, at no
addtional computational or memory expense, while at the same time
recycling existing nodes as needed.

-----------

On the "lazy-size" concept: It seems to me like a significant amount
of logic/checking/code in order to avoid this one operation under

void splice(iterator position, list& x, iterator first, iterator last)
{
...
size_type delta_size = distance(first, last);
...
}

When you take into account increased code size, possibly increased
per-container overhead, and possible increased run-time checking in
most of the other list members, it seems like a heavy price to pay just
to avoid computing the distance between first and last in this one
function (which the standard says has linear complexity).

-----------

<shrug> The standard says that list::size() /should/ have constant
complexity. It also says that deque::size() /should/ have constant
complexity. Can you imagine the noise it would generate if someone
decided to ship a deque (or vector or string or map) with a size()
function that was O(N)?

Personally I would rather get a compile time error if list::size() was
O(N). At least then I would know that I need to distance(l.begin(),
l.end()) and the computational cost would be clear and explicit. Same
logic goes against list::operator[](size_type). It would be easy to
implement, but unexpectedly expensive.

-----------

All that being said, I'd still love to see:

void splice(iterator position, list& x,
iterator first, iterator last, size_type n);

Precondition: n == distance(first, last)

as an addition to the existing interface. I believe such an addition
would settle this debate once and for all. And then we could make
std::container::size() O(1), instead of "should be" O(1).

--
Howard Hinnant
Metrowerks

Jul 19 '05 #10

Dhruv

On Sat, 12 Jul 2003 01:00:15 +0000, Howard Hinnant wrote:
[snip]......
I agree with all of it, but:

-----------

All that being said, I'd still love to see:

void splice(iterator position, list& x,
iterator first, iterator last, size_type n);

Precondition: n == distance(first, last)

as an addition to the existing interface. I believe such an addition
would settle this debate once and for all. And then we could make
std::container::size() O(1), instead of "should be" O(1).

How would the client know n, again, distance would be a O(N)
operation......

It would be better to just stick to the current solution. Actually, there
are many ways to implement this particular splice O(N) for size() == (1).
I'd like to know which one would be the best.
Some that come to mind are:

1. Individually decrement size for the other list, and increment the
current size, while splicing the individual elements (seems like
inefficient).
2. Splice the range into a temporary list, and then the size for that list
is known, so its fine. Again, here, we will have to use a special splice,
which does not do anything to size, so it can be O(1). Then, we subtract,
and add size() of the temporary list to the other list, and *this.
3. Same thing as (2), but in a loop.

Regards,
-Dhruv.

Jul 19 '05 #11

Howard Hinnant

In article <pa****************************@gmx.net>, Dhruv
<dh*******@gmx.net> wrote:

| > -----------
| >
| > All that being said, I'd still love to see:
| >
| > void splice(iterator position, list& x,
| > iterator first, iterator last, size_type n);
| >
| > Precondition: n == distance(first, last)
| >
| > as an addition to the existing interface. I believe such an addition
| > would settle this debate once and for all. And then we could make
| > std::container::size() O(1), instead of "should be" O(1).
|
| How would the client know n, again, distance would be a O(N)
| operation......

Often the work of defining the range to splice from is O(N) whether you
count the number of elements or not.

Please reread:

In article <11************************@metrowerks.com>, Howard Hinnant
<hi*****@metrowerks.com> wrote:

| But in one place I'm splicing in a range. And in that particular
| example, I already know n before I call splice. It is a waste to have
| list::splice recompute it.

In article <pa****************************@gmx.net>, Dhruv
<dh*******@gmx.net> continued:

| It would be better to just stick to the current solution.

In article <11************************@metrowerks.com>, Howard Hinnant
<hi*****@metrowerks.com> wrote:

| as an addition to the existing interface.

How does this addition to the current interface make the situation
worse? If you don't know n, just don't supply it. I'm not proposing
to remove any existing signatures. And the addition does not create
any ambiguities or other problems that I can see.

void splice(iterator position, list& x);
void splice(iterator position, list& x, iterator i);
void splice(iterator position, list& x, iterator first, iterator last);
void splice(iterator position, list& x, iterator first, iterator last,
size_type n); // proposed

| Actually, there
| are many ways to implement this particular splice O(N) for size() == (1).
| I'd like to know which one would be the best.
| Some that come to mind are:
|
| 1. Individually decrement size for the other list, and increment the
| current size, while splicing the individual elements (seems like
| inefficient).

Seems inefficient to me too.

| 2. Splice the range into a temporary list, and then the size for that list
| is known, so its fine. Again, here, we will have to use a special splice,
| which does not do anything to size, so it can be O(1). Then, we subtract,
| and add size() of the temporary list to the other list, and *this.

How does the temporary list know the size?

| 3. Same thing as (2), but in a loop.

You've lost me on that one.

The best way I know of to implement

void splice(iterator position, list& x, iterator first, iterator last);

(assuming O(1) size)

is to first check if the range [first, last) is empty and if so, do
nothing. Then if this != &x, compute the distance [first, last) and
subtract that amount from x.size() and add it to this->size(). Then
(regardless if this == &x or not) twiddle the links at first and at
last to disconnect [first, last) from x, and connect it to *this.

No need to visit each node, except for computing distance(first, last).
No need for a temporary list.

In the proposed variant, there is no need to compute distance(first,
last) as it is supplied by the client. Though a debug build might
compute distance(first, last) anyway just to make sure the client knows
what he's talking about.

--
Howard Hinnant
Metrowerks

Jul 19 '05 #12

Dhruv

On Mon, 14 Jul 2003 18:05:58 +0000, Howard Hinnant wrote:

In article <pa****************************@gmx.net>, Dhruv
<dh*******@gmx.net> wrote:

| > -----------
| >
| > All that being said, I'd still love to see:
| >
| > void splice(iterator position, list& x,
| > iterator first, iterator last, size_type n);
| >
| > Precondition: n == distance(first, last)
| >
| > as an addition to the existing interface. I believe such an addition
| > would settle this debate once and for all. And then we could make
| > std::container::size() O(1), instead of "should be" O(1).
|
| How would the client know n, again, distance would be a O(N)
| operation......

Often the work of defining the range to splice from is O(N) whether you
count the number of elements or not.

Please reread:

In article <11************************@metrowerks.com>, Howard Hinnant
<hi*****@metrowerks.com> wrote:

| But in one place I'm splicing in a range. And in that particular
| example, I already know n before I call splice. It is a waste to have
| list::splice recompute it.

In article <pa****************************@gmx.net>, Dhruv
<dh*******@gmx.net> continued:

| It would be better to just stick to the current solution.

In article <11************************@metrowerks.com>, Howard Hinnant
<hi*****@metrowerks.com> wrote:

| as an addition to the existing interface.
Oops, missed that part :-)
How does this addition to the current interface make the situation
worse? If you don't know n, just don't supply it. I'm not proposing
to remove any existing signatures. And the addition does not create
any ambiguities or other problems that I can see.

void splice(iterator position, list& x);
void splice(iterator position, list& x, iterator i);
void splice(iterator position, list& x, iterator first, iterator last);
void splice(iterator position, list& x, iterator first, iterator last,
size_type n); // proposed

| Actually, there
| are many ways to implement this particular splice O(N) for size() == (1).
| I'd like to know which one would be the best.
| Some that come to mind are:
|
| 1. Individually decrement size for the other list, and increment the
| current size, while splicing the individual elements (seems like
| inefficient).

Seems inefficient to me too.

| 2. Splice the range into a temporary list, and then the size for that list
| is known, so its fine. Again, here, we will have to use a special splice,
| which does not do anything to size, so it can be O(1). Then, we subtract,
| and add size() of the temporary list to the other list, and *this.

How does the temporary list know the size?
We provide a recalculate size function for the list, which will obviously
be private.
| 3. Same thing as (2), but in a loop.

You've lost me on that one.

I meant what you wrote.

The best way I know of to implement

void splice(iterator position, list& x, iterator first, iterator last);

(assuming O(1) size)

is to first check if the range [first, last) is empty and if so, do
nothing. Then if this != &x, compute the distance [first, last) and
subtract that amount from x.size() and add it to this->size(). Then
(regardless if this == &x or not) twiddle the links at first and at
last to disconnect [first, last) from x, and connect it to *this.

No need to visit each node, except for computing distance(first, last).
No need for a temporary list.

In the proposed variant, there is no need to compute distance(first,
last) as it is supplied by the client. Though a debug build might
compute distance(first, last) anyway just to make sure the client knows
what he's talking about.

Ok, so when the client does know the size, it will be helpful.
I was thinking that it would be highly unprobale that the client does know
the distance between 2 nodes, because at least I haven't encountered any
such situation. Maybe you have? Most of the time, I've been splicing a
single element or, the whole list (ops. for which size is known), and when
I splice in a subrange, then I have no idea of the size. Of cource, if
someone does know the size, then this addition will help then a lot.

Regards,
-Dhruv.

Jul 19 '05 #13

STL list.size() operation - O(1) or O(n) ?

Similar topics