468,467 Members | 2,634 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 468,467 developers. It's quick & easy.

std::set and insert speed

I have a program that reads sorted data from a database and inserts it
element by element into a set.

If the data is sorted is there a faster way to insert ? Meaning is
there a way to tell the std::set to check the end of the tree before
doing the insert as a fast check on where the key is going to end up ?

Thanks.

Jul 7 '06 #1
5 5296
asdf wrote:
I have a program that reads sorted data from a database and inserts it
element by element into a set.

If the data is sorted is there a faster way to insert ? Meaning is
there a way to tell the std::set to check the end of the tree before
doing the insert as a fast check on where the key is going to end up ?
RTFM. There is more than one 'insert' function. Also see the return
values of those 'insert' members.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jul 7 '06 #2
In article <11**********************@s13g2000cwa.googlegroups .com>,
qj*********@yahoo.com says...
I have a program that reads sorted data from a database and inserts it
element by element into a set.

If the data is sorted is there a faster way to insert ? Meaning is
there a way to tell the std::set to check the end of the tree before
doing the insert as a fast check on where the key is going to end up ?
Yes, you can use the version of insert that allows you to pass a
"hint" about where the insertion is likely to take place.

Given that you're starting with sorted data, you might consider
putting it into a vector, and using a binary search. You can then
supplement that (if necessary) with a set to hold newly-inserted
data. Since it uses contiguous memory, searching a vector is nearly
always substantially faster than searching a set. The shortcoming, of
course, is insertions (and deletions, if you insist on really
deleting data instead of just marking a location as empty).

Combining the two makes it fairly easy to get most of the good points
of each: you get faster searching because most data is in the vector,
but also fast inserts because you insert new data into the set.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 7 '06 #3
Victor Bazarov wrote:
asdf wrote:
>I have a program that reads sorted data from a database and inserts it
element by element into a set.

If the data is sorted is there a faster way to insert ? Meaning is
there a way to tell the std::set to check the end of the tree before
doing the insert as a fast check on where the key is going to end up ?

RTFM. There is more than one 'insert' function. Also see the return
values of those 'insert' members.
For building a set from sorted data you would not need the insert return
value just use end() (or begin() resp.) as hint.

Jul 7 '06 #4

Jerry Coffin wrote:
In article <11**********************@s13g2000cwa.googlegroups .com>,
qj*********@yahoo.com says...
I have a program that reads sorted data from a database and inserts it
element by element into a set.

If the data is sorted is there a faster way to insert ? Meaning is
there a way to tell the std::set to check the end of the tree before
doing the insert as a fast check on where the key is going to end up ?

Yes, you can use the version of insert that allows you to pass a
"hint" about where the insertion is likely to take place.
Thanks ! Just wondering - if the data happens not to be sorted and the
hint is pretty much useless what is the performance hit going to be
like ? I think it should still be comparable to without using the hint
- likely just a little slower. Eg what does the algorithm do when it
sees the hint ?
Given that you're starting with sorted data, you might consider
putting it into a vector, and using a binary search. You can then
supplement that (if necessary) with a set to hold newly-inserted
data. Since it uses contiguous memory, searching a vector is nearly
always substantially faster than searching a set. The shortcoming, of
course, is insertions (and deletions, if you insist on really
deleting data instead of just marking a location as empty).

Combining the two makes it fairly easy to get most of the good points
of each: you get faster searching because most data is in the vector,
but also fast inserts because you insert new data into the set.
Sounds good ! I will keep this in mind.

Jul 7 '06 #5
In article <11**********************@h48g2000cwc.googlegroups .com>,
qj*********@yahoo.com says...

[ ... ]
Thanks ! Just wondering - if the data happens not to be sorted and the
hint is pretty much useless what is the performance hit going to be
like ? I think it should still be comparable to without using the hint
- likely just a little slower. Eg what does the algorithm do when it
sees the hint ?
Right -- basically it normally just checks whether the hint is
correct. If it is, it can insert there in roughly constant time. If
not, you've lost that time to do the check, and then it does the
insert as if there was no hint. IOW, a fairly substantial gain when
it's right, but a pretty minimal penalty when it's wrong.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 8 '06 #6

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

26 posts views Thread by Michael Klatt | last post: by
7 posts views Thread by ma740988 | last post: by
1 post views Thread by jimmy | last post: by
4 posts views Thread by Gernot Frisch | last post: by
2 posts views Thread by shuisheng | last post: by
7 posts views Thread by Renzr | last post: by
2 posts views Thread by mathieu | last post: by
reply views Thread by NPC403 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.