Connecting Tech Pros Worldwide Help | Site Map

STL: how to get the sequence number of a newly added item into a set

bilgekhan
Guest
 
Posts: n/a
#1: Jun 27 '08
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)

Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.

Jerry Coffin
Guest
 
Posts: n/a
#2: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


In article <g1c3fs$ts8$1@aioe.org>, bilgekhan@bilgekhanbilgekhan.net.tr
says...
Quote:
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)
>
Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
your_iterator = yourset.find(whatever);
std::distance(yourset.begin(), your_iterator);

--
Later,
Jerry.

The universe is a figment of its own imagination.
Sam
Guest
 
Posts: n/a
#3: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


bilgekhan writes:
Quote:
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)
There are no "sequence numbers" for members of a set. Individual members of
sets are accessed via iterators.
Quote:
Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
No, because there is no such a concept as a "sequence number". It doesn't
exist. A set is an opaque collection, with certain properties. The
collection contains a "set" of items. A number of operations are defined on
a set. That's it.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEABECAAYFAkg5nFIACgkQx9p3GYHlUOJdOACffjSkVb+8zH iUD+HIyucsSB9d
AVUAn20V5OkYlH+SNv0ook90UT02hj0h
=FY7F
-----END PGP SIGNATURE-----

Daniel T.
Guest
 
Posts: n/a
#4: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
>
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)
>
Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.
set<int>::iterator it = s.insert( x ).first;
// or = s.find( x );

int seqNumb = distance( s.begin(), it );
bilgekhan
Guest
 
Posts: n/a
#5: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"Jerry Coffin" <jcoffin@taeus.comwrote:
Quote:
bilgekhan@bilgekhanbilgekhan.net.tr says...
Quote:

After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)

Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
>
your_iterator = yourset.find(whatever);
std::distance(yourset.begin(), your_iterator);
Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
That's unfortunately too inefficient for my needs.
Isn't there a better (a direct) method without traversing the set ?

FYI: If I program this without STL (and then even using an unordered array
--ie. each time doing a sequential search) then it is about 7 times
faster than the STL version that uses set<Tand distance() !
How come? :-)
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.

A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly;
ie. STL needs to extend these 2 functions with a direct 'distance' option.

bilgekhan
Guest
 
Posts: n/a
#6: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
>
Quote:
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?

(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)

Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.
>
set<int>::iterator it = s.insert( x ).first;
// or = s.find( x );
>
int seqNumb = distance( s.begin(), it );

Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
That's unfortunately too inefficient for my needs.
Isn't there a better (a direct) method without traversing the set ?

FYI: If I program this without STL (and then even using an unordered array
--ie. each time doing a sequential search) then it is about 7 times
faster than the STL version that uses set<Tand distance() !
How come? :-)
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.

A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly;
ie. STL needs to extend these 2 functions with a direct 'distance' option.

Jerry Coffin
Guest
 
Posts: n/a
#7: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


In article <g1di7e$8ib$1@aioe.org>, bilgekhan@bilgekhanbilgekhan.net.tr
says...

[ ... ]
Quote:
Yes, distance() does what I wanted, but it is VERY slow!
That's not terribly surprising.
Quote:
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
Yes, that's correct.
Quote:
That's unfortunately too inefficient for my needs.
Isn't there a better (a direct) method without traversing the set ?
None of which I'm aware.
Quote:
FYI: If I program this without STL (and then even using an unordered array
--ie. each time doing a sequential search) then it is about 7 times
faster than the STL version that uses set<Tand distance() !
How come? :-)
A set is composed of nodes connected by pointers. Each node is
separately allocated via the allocator for the set (which will use new,
unless you supply an allocator of your own). Those nodes will usually
have fairly poor locality of reference, which leads to poor cache usage
(among other things). The result is relatively slow traversal.

An array is required to be allocated as a single, contiguous block of
storage. This leads to relatively good locality of reference, better
cache usage, and (generally) relatively fast traversal.
Quote:
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.
>
A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly;
ie. STL needs to extend these 2 functions with a direct 'distance' option.
This requirement is sufficiently unusual that I doubt that'll happen.
Anything that supports random access iterators can do distance in
constant time.

Since a set is typically implemented as a balanced tree, you could
theoretically improve the speed of distance() by having each pointer in
a node also record the number of nodes in the subtree at which it points
(or at least in its left sub-tree). This would avoid traversing at least
part of the tree most of the time -- OTOH, maintaining those counts
would slow down insertions and deletions. TANSTAAFL.

--
Later,
Jerry.

The universe is a figment of its own imagination.
James Kanze
Guest
 
Posts: n/a
#8: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


On May 26, 8:04 am, Jerry Coffin <jcof...@taeus.comwrote:
Quote:
In article <g1di7e$8i...@aioe.org>, bilgek...@bilgekhanbilgekhan.net.tr
says...
Quote:
[ ... ]
Quote:
Quote:
Yes, distance() does what I wanted, but it is VERY slow!
Quote:
That's not terribly surprising.
Quote:
Quote:
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
Quote:
Yes, that's correct.
Quote:
Quote:
That's unfortunately too inefficient for my needs.
Isn't there a better (a direct) method without traversing the set ?
Quote:
None of which I'm aware.
Perhaps the solution to his problem would be to use an
std::vector, and keep it sorted (using std::lower_bound to find
the insertion point). If insertions aren't too frequent, or
copying is fairly cheap, this can be even faster than an
std::set. And of course, given an iterator to an element, it's
trivial and very fast to find the elements sequence number.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Pete Becker
Guest
 
Posts: n/a
#9: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


On 2008-05-26 01:34:32 -0400, "bilgekhan"
<bilgekhan@bilgekhanbilgekhan.net.trsaid:
Quote:
"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
>"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
>>
Quote:
>>After doing a succcessful insert() or find() on a set<Tcontainer
>>is it possible to get the item number of this item in the set?
>>>
>>(ie. its zero-based sequence number (position/location/rank/index)
>>within the sorted collection)
>>>
>>Additional info:
>>insert() returns a pair<Tand find() returns an iterator.
>>Using any of these two objects is it possible to get the
>>sequence number of the item these objects point to?
>>After the insert() or find() the container won't be modified.
>>
>set<int>::iterator it = s.insert( x ).first;
>// or = s.find( x );
>>
>int seqNumb = distance( s.begin(), it );
>
>
Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
That's unfortunately too inefficient for my needs.
Isn't there a better (a direct) method without traversing the set ?
I think you're asking the wrong question. What do you plan to do with
this "sequence number", and are there other ways to do the same thing?

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Daniel T.
Guest
 
Posts: n/a
#10: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?
>
(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)
>
Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.
set<int>::iterator it = s.insert( x ).first;
// or = s.find( x );

int seqNumb = distance( s.begin(), it );
>
Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
How else could it be done?
Quote:
FYI: If I program this without STL (and then even using an unordered array
--ie. each time doing a sequential search) then it is about 7 times
faster than the STL version that uses set<Tand distance() !
How come? :-)
Your comparing apples to oranges. In one implementation you are using an
unordered array where all the elements are contiguous, in the other
implementation the elements are ordered but not contiguous. But before
abandoning the set idea, make sure you have all optimizations turned on.
Many standard libraries have extensive error checking built in and make
a lot of calls that can be inlined, but the former doesn't get removed
and the latter doesn't get inlined unless optimizations are turned on.
Quote:
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.
Like how?
Quote:
A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly
They would have to traverse the tree to do that.

As others have pointed out, "sequence numbers" aren't all that useful in
sets. It sounds like you would be better served by using a std::vector
and sorting it (using std::sort,) then you can do lookups with
std::lower_bound, upper_bound, equal_range, and binary_search and
getting the "sequence number" is simply a matter of a little pointer
math.
Jerry Coffin
Guest
 
Posts: n/a
#11: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


In article <611c02c0-da39-45cc-95aa-767706d1c0a8
@a70g2000hsh.googlegroups.com>, james.kanze@gmail.com says...

[ ... ]
Quote:
Perhaps the solution to his problem would be to use an
std::vector, and keep it sorted (using std::lower_bound to find
the insertion point). If insertions aren't too frequent, or
copying is fairly cheap, this can be even faster than an
std::set. And of course, given an iterator to an element, it's
trivial and very fast to find the elements sequence number.
Right -- I'd add, however, that my mention of random access iterators
instead of vectors was intentional. Under the circumstances at hand, a
deque might well be a better choice, since it halves the number of
elements that need to be shuffled for an average insertion.


--
Later,
Jerry.

The universe is a figment of its own imagination.
bilgekhan
Guest
 
Posts: n/a
#12: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
>
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?

(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)

Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.
>
set<int>::iterator it = s.insert( x ).first;
// or = s.find( x );
>
int seqNumb = distance( s.begin(), it );
Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
>
How else could it be done?
See below.
Quote:
Quote:
FYI: If I program this without STL (and then even using an unordered array
--ie. each time doing a sequential search) then it is about 7 times
faster than the STL version that uses set<Tand distance() !
How come? :-)
>
Your comparing apples to oranges. In one implementation you are using an
unordered array where all the elements are contiguous, in the other
implementation the elements are ordered but not contiguous. But before
abandoning the set idea, make sure you have all optimizations turned on.
Many standard libraries have extensive error checking built in and make
a lot of calls that can be inlined, but the former doesn't get removed
and the latter doesn't get inlined unless optimizations are turned on.
>
Quote:
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.
>
Like how?
See below.
Quote:
Quote:
A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly
>
They would have to traverse the tree to do that.
Yes, of course! Therefore the traversion does not need
to be done again in distance() ! That's the point I'm trying
to make clear. If you reread my inital posting then you will see that
I said that I need the distance value immediately after an insert()
or find() call... Do you see what I mean? :-)
Currently the tree is traversed twice: once in insert() and find(),
and the second time in distance(). The second traversion is
IMO unneccessary if insert() and find() would store this value
somewhere internally, and distance() then just returns that value
w/o traversing the tree.
Maybe for this one can make a new function member like
"distance_last_added()" or something that. Another option would be
if insert() and find() would deliver it in a overloaded version...
Quote:
As others have pointed out, "sequence numbers" aren't all that useful in
sets. It sounds like you would be better served by using a std::vector
and sorting it (using std::sort,) then you can do lookups with
std::lower_bound, upper_bound, equal_range, and binary_search and
getting the "sequence number" is simply a matter of a little pointer
math.
I need to get the seqnbr for each item that gets added to
the sorted collection, so then especially the sorting would be an overkill.

FYI: I'm doing this for the table-lookup routine (ie. a dictionary) of
a special data compressor for data records with common structure;
ie. it's not general purpose.

I just want to make this last point clear: I just have pointed out
IMO a shortcoming, performance issues or missing features
of the language that I in the field have encountered.
It's of course up to the language and library designers to
understand the problem and to fix or implement the functionality.
I'm just informing the people who are more involved in such issues.
If I were the designer of the library I would do it the way I described above.

Kai-Uwe Bux
Guest
 
Posts: n/a
#13: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


bilgekhan wrote:
Quote:
"Daniel T." <daniel_t@earthlink.netwrote:
Quote:
>"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
"Daniel T." <daniel_t@earthlink.netwrote:
"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
>
After doing a succcessful insert() or find() on a set<Tcontainer
is it possible to get the item number of this item in the set?

(ie. its zero-based sequence number (position/location/rank/index)
within the sorted collection)

Additional info:
insert() returns a pair<Tand find() returns an iterator.
Using any of these two objects is it possible to get the
sequence number of the item these objects point to?
After the insert() or find() the container won't be modified.
>
set<int>::iterator it = s.insert( x ).first;
// or = s.find( x );
>
int seqNumb = distance( s.begin(), it );
>
Yes, distance() does what I wanted, but it is VERY slow!
It seems each time it walks all items from the beginning to the
item in question, ie. it practically does a find().
>>
>How else could it be done?
>
See below.
>
Quote:
Quote:
FYI: If I program this without STL (and then even using an unordered
array --ie. each time doing a sequential search) then it is about 7
times faster than the STL version that uses set<Tand distance() !
How come? :-)
>>
>Your comparing apples to oranges. In one implementation you are using an
>unordered array where all the elements are contiguous, in the other
>implementation the elements are ordered but not contiguous. But before
>abandoning the set idea, make sure you have all optimizations turned on.
>Many standard libraries have extensive error checking built in and make
>a lot of calls that can be inlined, but the former doesn't get removed
>and the latter doesn't get inlined unless optimizations are turned on.
>>
Quote:
The culprit is IMO the distance() call. One needs a better, a direct,
method without traversing the tree.
>>
>Like how?
>
See below.
>
Quote:
Quote:
A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly
>>
>They would have to traverse the tree to do that.
Notice the "would".
Quote:
Yes, of course! Therefore the traversion does not need
to be done again in distance() ! That's the point I'm trying
to make clear. If you reread my inital posting then you will see that
I said that I need the distance value immediately after an insert()
or find() call... Do you see what I mean? :-)
Currently the tree is traversed twice: once in insert() and find(),
No. Insert and find do _not_ traverse the tree. They only run through a path
from the root to a leave in the tree. During this process, the sequence
number is not established as a by-product.
Quote:
and the second time in distance().
It's the first time.

[snip]


Best

Kai-Uwe Bux
Daniel T.
Guest
 
Posts: n/a
#14: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"bilgekhan" <bilgekhan@bilgekhanbilgekhan.net.trwrote:
Quote:
Quote:
It sounds like you would be better served by using a std::vector
and sorting it (using std::sort,) then you can do lookups with
std::lower_bound, upper_bound, equal_range, and binary_search and
getting the "sequence number" is simply a matter of a little pointer
math.
>
I need to get the seqnbr for each item that gets added to
the sorted collection, so then especially the sorting would be an overkill.
Then don't. Use std::lower_bound to do all your insertions and you won't
have to use sort and will be able to find sequence numbers with ease.
Use the right tool for the job and the job will be much easier.
Quote:
FYI: I'm doing this for the table-lookup routine (ie. a dictionary) of
a special data compressor for data records with common structure;
ie. it's not general purpose.
Maybe you should consider a std::map then.
Quote:
I just want to make this last point clear: I just have pointed out
IMO a shortcoming, performance issues or missing features
of the language that I in the field have encountered.
It's of course up to the language and library designers to
understand the problem and to fix or implement the functionality.
I'm just informing the people who are more involved in such issues.
If I were the designer of the library I would do it the way I described above.
It is, of course, your prerogative to design classes however you want.
By all means make a container that implements a red-black tree *and*
keeps track of sequence numbers. Maybe in doing so, you will come to
understand why std::set doesn't work that way, or maybe you will build a
container that others are eager to use.
Jerry Coffin
Guest
 
Posts: n/a
#15: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


In article <g1fabi$rc$1@aioe.org>, bilgekhan@bilgekhanbilgekhan.net.tr
says...

[ ... ]
Quote:
Quote:
Quote:
A better way would have been if the insert() and find() functions would
optionally deliver the distance value on the fly
They would have to traverse the tree to do that.
>
Yes, of course! Therefore the traversion does not need
to be done again in distance() ! That's the point I'm trying
to make clear. If you reread my inital posting then you will see that
I said that I need the distance value immediately after an insert()
or find() call... Do you see what I mean? :-)
I see what you mean, but I think you're wrong. Consider (for example),
inserting an item that's going to be in the right subtree. insert()
compares the new item to the root node, finds that the new item is
larger than the root, and proceeds to the node to the right of the root.
In doing so, it has ignored the _entire_ left sub-tree.

When you use distance to find the position of the new node in the tree,
it cannot ignore the left subtree. Instead, it traverses the entire left
subtree counting each node as it goes. It then proceeds to count through
the right subtree until it reaches the point at which the new node was
inserted.
Quote:
Currently the tree is traversed twice: once in insert() and find(),
and the second time in distance(). The second traversion is
IMO unneccessary if insert() and find() would store this value
somewhere internally, and distance() then just returns that value
w/o traversing the tree.
See above. The traversals are entirely _different_ from each other. The
traversal involved in an insertion or deletion is NOT sufficient to
determine the overall position of the new node in the tree.

You _could_ determine an _extremely_ rough estimate of the overall
position in the tree with only a little extra work. A set, map, etc., is
normally implemented as a balanced binary tree (e.g. red-black or AVL
tree). To maintain balancing, such a tree stores some information about
the _relative_ depths of the left and right subtrees in each node. They
limit the difference in height between the left and right subtrees
(though r-b trees and AVL trees impose slightly different restrictions).

Based upon that height difference, and the depth of the tree you _did_
traverse, you could very quickly compute upper and lower bounds for the
overall position of the new node in the tree -- but those bounds may
easily differ from each other by a factor of 2 or so. Getting the
correct number is going to be linear.

[ ... ]
Quote:
I just want to make this last point clear: I just have pointed out
IMO a shortcoming, performance issues or missing features
of the language that I in the field have encountered.
It's of course up to the language and library designers to
understand the problem and to fix or implement the functionality.
I'm just informing the people who are more involved in such issues.
If I were the designer of the library I would do it the way I described above.
Maybe so -- but if you did, I doubt much of anybody would use your
library. In fact, except for this specific situation, even YOU probably
wouldn't use it. Making things work as you seem to want would involve a
_large_ slowdown in common operations in the hope of achieving a _small_
speedup for a relatively rare sequence of operations.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Frank Birbacher
Guest
 
Posts: n/a
#16: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


Hi!

bilgekhan schrieb:
Quote:
Quote:
>As others have pointed out, "sequence numbers" aren't all that useful in
>sets. It sounds like you would be better served by using a std::vector
>and sorting it (using std::sort,) then you can do lookups with
>std::lower_bound, upper_bound, equal_range, and binary_search and
>getting the "sequence number" is simply a matter of a little pointer
>math.
>
I need to get the seqnbr for each item that gets added to
the sorted collection, so then especially the sorting would be an overkill.
Your answer doesn't make sense to me. I guess you misunderstood the
advise. The advise propses what I had done if I were you:

DO NOT USE A STD::SET.

It doesn't fit your needs. The fact that you need the "index" of some
element shows it.

Replace the std::set<Whateverby std::vector<Whateverand try if you
can get along.
Quote:
FYI: I'm doing this for the table-lookup routine (ie. a dictionary) of
a special data compressor for data records with common structure;
ie. it's not general purpose.
[snip]
Quote:
If I were the designer of the library I would do it the way I described above.
Maybe all people writing a non general purpose data compressor with a
table-lookup routine will use your library then.

I probably wouldn't. Don't be offensed. It is just that std::set will
never have a "index_last_added" method. It's by design. If you need a
function like that you need to implement a new set class of your own.
Having done that come back and tell us how easy it was to determine the
exact size of a tree branch which is never visited or how to keep the
counts up-to-date which tell you the exact size, how you then managed to
keep the insert in O(log n). I think this is too complicated compared to
a sorted vector solution.

Frank
Daniel Pitts
Guest
 
Posts: n/a
#17: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


Frank Birbacher wrote:
Quote:
Hi!
>
bilgekhan schrieb:
Quote:
Quote:
>>As others have pointed out, "sequence numbers" aren't all that useful
>>in sets. It sounds like you would be better served by using a
>>std::vector and sorting it (using std::sort,) then you can do lookups
>>with std::lower_bound, upper_bound, equal_range, and binary_search
>>and getting the "sequence number" is simply a matter of a little
>>pointer math.
>>
>I need to get the seqnbr for each item that gets added to
>the sorted collection, so then especially the sorting would be an
>overkill.
>
Your answer doesn't make sense to me. I guess you misunderstood the
advise. The advise propses what I had done if I were you:
>
DO NOT USE A STD::SET.
>
It doesn't fit your needs. The fact that you need the "index" of some
element shows it.
>
Replace the std::set<Whateverby std::vector<Whateverand try if you
can get along.
>
Quote:
>FYI: I'm doing this for the table-lookup routine (ie. a dictionary) of
>a special data compressor for data records with common structure;
>ie. it's not general purpose.
[snip]
Quote:
>If I were the designer of the library I would do it the way I
>described above.
>
Maybe all people writing a non general purpose data compressor with a
table-lookup routine will use your library then.
>
I probably wouldn't. Don't be offensed. It is just that std::set will
never have a "index_last_added" method. It's by design. If you need a
function like that you need to implement a new set class of your own.
Having done that come back and tell us how easy it was to determine the
exact size of a tree branch which is never visited or how to keep the
counts up-to-date which tell you the exact size, how you then managed to
keep the insert in O(log n). I think this is too complicated compared to
a sorted vector solution.
>
Frank
I agree with your point, but feel like pointing out that it is possible
(almost trivial) to have every node on a tree keep track of how many
children it has. An insert operation would increment the value on the
return-trip if the insert was successful.

My question to the OP is, do you need to update the insert-position of
the elements *after* the last inserted element? say I have an existing set:

<1, 5, 6 so, 5 and 6 are at position 1 and 2 respectively...
I insert 3:
<1, 3, 5, 6 and determine that 3 is at position 1, do I need to update
the associated data with 5 and 6 to say they are now at position 2 and 3
respectively?

Also to the OP:
If you design your library for your specific use-cases, but in such a
way that it can easily be extended, you'll find yourself much happier.
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
Daniel T.
Guest
 
Posts: n/a
#18: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


Daniel Pitts <newsgroup.spamfilter@virtualinfinity.netwrote:
Quote:
...it is possible (almost trivial) to have every node on a tree
keep track of how many children it has. An insert operation would
increment the value on the return-trip if the insert was successful.
I'm not so sure if an extra unsigned long per element can be considered
"trivial".
Daniel Pitts
Guest
 
Posts: n/a
#19: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


Daniel T. wrote:
Quote:
Daniel Pitts <newsgroup.spamfilter@virtualinfinity.netwrote:
>
Quote:
>...it is possible (almost trivial) to have every node on a tree
>keep track of how many children it has. An insert operation would
>increment the value on the return-trip if the insert was successful.
>
I'm not so sure if an extra unsigned long per element can be considered
"trivial".
I said "almost trivial", which refers to difficulty, not space complexity.

Also, it could easily be a compile-time decision if you needed it or not.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
bilgekhan
Guest
 
Posts: n/a
#20: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"Daniel Pitts" wrote:
Quote:
Frank Birbacher wrote:
Quote:

Don't be offensed. It is just that std::set will never have a
"index_last_added" method. It's by design. If you need a
function like that you need to implement a new set class of your own.
Having done that come back and tell us how easy it was to determine the
exact size of a tree branch which is never visited or how to keep the
counts up-to-date which tell you the exact size, how you then managed to
keep the insert in O(log n). I think this is too complicated compared to
a sorted vector solution.
I agree with your point, but feel like pointing out that it is possible
(almost trivial) to have every node on a tree keep track of how many
children it has. An insert operation would increment the value on the
return-trip if the insert was successful.
>
My question to the OP is, do you need to update the insert-position of
the elements *after* the last inserted element? say I have an existing set:
>
<1, 5, 6 so, 5 and 6 are at position 1 and 2 respectively...
I insert 3:
<1, 3, 5, 6 and determine that 3 is at position 1, do I need to update
the associated data with 5 and 6 to say they are now at position 2 and 3
respectively?
From point of application code it is sufficient to know the position of
just only the *last* successful insert() operation and of the
*last* successful find() operation.
Ie. in the application code there is no need to keep track of the positions
of the other items; just that of the latest successful insert() or find().

bilgekhan
Guest
 
Posts: n/a
#21: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


"Daniel Pitts" wrote:
Quote:
Frank Birbacher wrote:
Quote:

Don't be offensed. It is just that std::set will never have a
"index_last_added" method. It's by design. If you need a
function like that you need to implement a new set class of your own.
Having done that come back and tell us how easy it was to determine the
exact size of a tree branch which is never visited or how to keep the
counts up-to-date which tell you the exact size, how you then managed to
keep the insert in O(log n). I think this is too complicated compared to
a sorted vector solution.
I agree with your point, but feel like pointing out that it is possible
(almost trivial) to have every node on a tree keep track of how many
children it has. An insert operation would increment the value on the
return-trip if the insert was successful.
>
My question to the OP is, do you need to update the insert-position of
the elements *after* the last inserted element? say I have an existing set:
>
<1, 5, 6 so, 5 and 6 are at position 1 and 2 respectively...
I insert 3:
<1, 3, 5, 6 and determine that 3 is at position 1, do I need to update
the associated data with 5 and 6 to say they are now at position 2 and 3
respectively?
From point of application code it is sufficient to know the position of
just only the *last* successful insert() operation and of the
*last* successful find() operation.
Ie. in the application code there is no need to keep track of the positions
of the other items; just that of the latest successful insert() or find().

In the example above, after inserting 3 into this set
it just shall be possible to get the position
of this latest new item (ie. here the position of 3, ie. 1).
There is no need for the library to keep track or to update
anything defined in the application code that references items of the set.
Ie. there are no such references (at least not in my code).

Kai-Uwe Bux
Guest
 
Posts: n/a
#22: Jun 27 '08

re: STL: how to get the sequence number of a newly added item into a set


Daniel T. wrote:
Quote:
Daniel Pitts <newsgroup.spamfilter@virtualinfinity.netwrote:
>
Quote:
>...it is possible (almost trivial) to have every node on a tree
>keep track of how many children it has. An insert operation would
>increment the value on the return-trip if the insert was successful.
>
I'm not so sure if an extra unsigned long per element can be considered
"trivial".
It's not really "extra": Any balanced tree has to store some additional
information. This information can be _traded_ for the unsigned long since
size information can be used to rebalance. Then, alignment and memory
quantization from the allocator will very likely imply that

2 pointers + 1 unsigned long + user data

is simply indistinguishable from

2 pointers + balancing info + user data



Best

Kai-Uwe Bux
Closed Thread