zaineb@gmail.com wrote:[color=blue]
> The whole point of doing this was not having to iterate all the way
> down the list.
> So the way this is used is that I have a bunch of records. Depending on
> the record x used, you would get 25xth to 25(x+1)th . So if there are
> multiple records called, for each record you would have to iterate all
> the way down the list to get the values. For lists that are in the
> thousands, this is very hard. So for imagine this for x = 1000 with x =
> 1500. Now imagine doing this for 100 or so records all with different x
> values. There's a lot of unnecessary iteration going on.
> Since insertions or removals might happen 6-7 times a second, it makes
> more sense to have an index instead of using +=25 to iterate linearly.
> The gains in doing that are substantial.[/color]
Unless the creation of the indices introduces overhead equivalent to
the linear-search work you're trying to avoid by creating them. Which
is what I stated. I'm well aware, and said explicitly, that the +=
operation doesn't magically introduce constant-time random access to
std::map. Neither does your scheme. If you've got 6-7
insertions/removals per second, and calculate the set of indices (or a
significant fraction thereof -- half, on average, if they're evenly
distributed) each time, you're doing 6-7 operations with time
complexity that's linear in the size of the map per second. If you're
willing to recalculate the set of indices less often, at the expense of
not spanning exactly 25 elements each time, you can incrementally
reduce this. You didn't respond to my suggestion along these lines,
though.
The asymptotic time-complexities of your current scheme, from my best
understanding of your descriptions, are:
Insert/remove element:
O(n*log(n)) // log(n) for std::map insertion, plus O(n) insertions of
recalculated indices
Access element number x*25, given cached index set:
O(log(n)) // can be improved to O(1) if you use vector<map::iterator>
Iterate across all elements, 25 at a time, assuming no insert/delete
interruptions:
O(n*log(n)) // O(n) elements accessed, log(n) look up each time. Can
be improved to O(n) if you use vector<map::iterator>
There's no way you'll get better than O(n) for accessing every 25th
element, since the number of elements you want to actually look at has
cardinality of O(n). The problem is the O(n*log(n)) step on
insert/remove. That's as bad as doing push_back on a vector and
calling std::sort each time you insert. Do you still like this scheme?
Finally, getting back to the original point, it's not accurate to
characterize the += step as "iterating all the way down the list." 25
is a constant. You've got an iterator to one point in the list, and
+=25 involves a constant amount of work to get to the next point you're
interested in. If you want to do this n times (or n/25 times, as it
happens), you're going to have linear-time complexity regardless of
whether your constant-time operation is an efficient random access
(which is not available) or the one I described involving looping over
operator++.
I hope this is more clear. I feel like your response was a little
impolite, incidentally -- I feel I've demonstrated adequate
understanding of asymptotic complexity in this thread not to merit a
remedial explanation of why linear search is inefficient.
Unless you've got some better-than-O(n) way of caching those indices
each time you insert/remove, perform fewer insertions/removals, or
recache your indices far less frequently, you're not going to improve
on O(n*log(n)) here. Given that fact, you might as well give up on
this scheme for a more straightforward O(n*log(n)) scheme (one with a
better constant factor, no doubt), or try approaching the problem from
a different perspective entirely.
According to what you've said, it sounds like perhaps you're
over-valuing random access. You said that, after doing one of these
random accesses, you proceed to iterate over the entire collection
starting at that point (on average, half of the elements), 25 at a
time. That's inherently an O(n) operation, so you wouldn't increase
your big-O even by doing linear search for the starting point. Of
course, the constant factor in that lookup is greater by a factor of
25. But, well, food for thought.
Luke