440,727 Members | 766 Online
Need help? Post your question and get tips & solutions from a community of 440,727 IT Pros & Developers. It's quick & easy.

# fastest navigation in a list

 P: n/a hello all, I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"... greetings, D. Jul 23 '05 #1
9 Replies

 P: n/a danny van elsen wrote: hello all, I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"... greetings, D. Perhaps a better data structure? std::set is already ordered, giving O(log n) search times. Jul 23 '05 #2

 P: n/a danny van elsen scribbled on the stall wall: hello all, I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"... If you have an index and that index is suitable to use as a "key" then create a map or multimap instead of a list. ----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==---- http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups ----= East and West-Coast Server Farms - Total Privacy via Encryption =---- Jul 23 '05 #3

 P: n/a "danny van elsen" wrote in message news:pa***************************@hotmail.com... hello all, I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? If you want to keep that structure, instead of using one of the alternatives mentioned by others, then a binary search is probably your best bet. That's what it's designed for. I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"... (Sorry, I don't know what an "advance" method is.) -Howard Jul 23 '05 #4

 P: n/a In message , Howard writes"danny van elsen" wrote in messagenews:pa***************************@hotmail.com. .. hello all, I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? Other posters have suggested using map, but that's not necessarily any better. There are too many unanswered questions: Is there a particular reason for choosing a list as the basic storage structure? Are the index values actually in sorted order? Are elements inserted all at once or sporadically? Are elements ever removed? If you want to keep that structure, instead of using one of the alternativesmentioned by others, then a binary search is probably your best bet. That'swhat it's designed for. Except that std::list only has bidirectional iterators. binary_search will work, but it may not be as fast as you expect. I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. It's sometimes known as "indexed sequential" in database circles. But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"...(Sorry, I don't know what an "advance" method is.) It's what you use to move non-random-access iterators by more than one step ;-) -- Richard Herring Jul 23 '05 #5

 P: n/a "Richard Herring" wrote in message news:yz**************@baesystems.com... If you want to keep that structure, instead of using one of thealternativesmentioned by others, then a binary search is probably your best bet.That'swhat it's designed for. Except that std::list only has bidirectional iterators. binary_search will work, but it may not be as fast as you expect. Ah. That would tend to make the search less efficient, eh? :-) I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is veru fast. It's sometimes known as "indexed sequential" in database circles.But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"...(Sorry, I don't know what an "advance" method is.) It's what you use to move non-random-access iterators by more than one step ;-) Ok, thanks for the info. -Howard Jul 23 '05 #6

 P: n/a >>> I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? Other posters have suggested using map, but that's not necessarily any better. There are too many unanswered questions: Is there a particular reason for choosing a list as the basic storage structure? Are the index values actually in sorted order? Are elements inserted all at once or sporadically? Are elements ever removed? The list will be the result of an analysis phase in which I start out with perhaps tens of thousands of nodes; each node will be looked at and most probably split up in several new nodes. So insertion is certainly not all at once, and 'a little bit random', because not all nodes will be treated sequentially. I will certainly end up with a multitude of the original number of nodes. The indexes will always remain ordered, however. list is certainly better than vector, because of the inserts and removals; but my artificial way of looking up a node made me think that perhaps there should be better predefined methods available? If you want to keep that structure, instead of using one of thealternatives mentioned by others, then a binary search is probably yourbest bet. That's what it's designed for. Except that std::list only has bidirectional iterators. binary_search will work, but it may not be as fast as you expect. Do I understand correctly that what makes you say this is that there are no random iterators for a list? so every next step in the binary search is necessarily a sequence of increments or decrements? Perhaps then my way of searching is not that bad at all? I now have a second list in which I keep bookmarks, i.e. I run through the first list and every 100 or so nodes, I copy that node into the second list. So that when I'm looking for the node with index 5000, I don't have to run through the entire(first) list of nodes, but only a dozen or so (in the second list). I also keep track of the last node refernced, since most lookups will be in each other's neighbourhood, and begin my next search from that node. This works very well; it is very fast. It's sometimes known as "indexed sequential" in database circles.But I was wondering if using an "advance" method would be equally fast and efficient? If yes, I would replace my list of bookmarks by a navigation based on "advance"...(Sorry, I don't know what an "advance" method is.) It's what you use to move non-random-access iterators by more than one step ;-) Jul 23 '05 #7

 P: n/a In message , danny van elsen writes I have an application in which I build a list, with potentially thousands of nodes. each node has an "index", and all nodes are ordered by this index. this index reflects a value that has been computed before, and will range from 0 to N. node 0: index 0 node 1: index 6 node 2: index 9 node 3: index 15 and so on. my question is: what would theoretically be the fastest way of finding a node by its index? Other posters have suggested using map, but that's not necessarily any better. There are too many unanswered questions: Is there a particular reason for choosing a list as the basic storage structure? Are the index values actually in sorted order? Are elements inserted all at once or sporadically? Are elements ever removed?The list will be the result of an analysis phase in which I start out withperhaps tens of thousands of nodes; each node will be looked at and mostprobably split up in several new nodes. So insertion is certainly not allat once, and 'a little bit random', because not all nodes will betreated sequentially. I will certainly end up with a multitude of theoriginal number of nodes.The indexes will always remain ordered, however.list is certainly better than vector, because of the inserts and removals; That makes sense. But if there were no removals, and the sequence didn't need to be sorted until the end of the analysis phase, you might consider just appending to a vector, sorting it once only, and then using binary_search. And for the removals you might consider using the erase(remove_if(begin, end, predicate), end) approach of copying wanted elements down over the unwanted ones rather than removing them individually. but my artificial way of looking up a node made me think that perhapsthere should be better predefined methods available?If you want to keep that structure, instead of using one of thealternatives mentioned by others, then a binary search is probably yourbest bet. That's what it's designed for. Except that std::list only has bidirectional iterators. binary_search will work, but it may not be as fast as you expect.Do I understand correctly that what makes you say this is that thereare no random iterators for a list? so every next step in the binarysearch is necessarily a sequence of increments or decrements? Correct. Under the hood std::list is probably (though the standard doesn't require it) a straightforward doubly-linked list whose nodes are randomly scattered around in memory, so the only way to traverse it is to follow the links one by one. So a binary search on a list will do O(N log N) comparisons _and_ O(N log N) iterator increments. If the cost of the comparisons dominates, that's not too bad. If the increments are expensive (probably not for std::list, as it's just an indirection) there might be a problem. Perhaps thenmy way of searching is not that bad at all? Maybe not. It has a long history (think of those dictionaries with a thumb index!). Knuth mentions something similar (The Art of Computer Programming vol. III, page 473.) If you index every M'th element, it takes O(N/M+M) comparisons and increments. Ultimately the only way to find the best strategy is to try them all and time them. -- Richard Herring Jul 23 '05 #8

 P: n/a > Thought: I'm making the assumption that once your list is built up, it's unchanging. If this is incorrect, the following suggestion no longer works. Make a std::set of iterators into your list. The sort function for the set is "*iter1 < *iter2". You have the overhead of sorting the iterators exactly once. There was an article in either CUJ or DDJ on a "view library" which used a similar mechanism to create alternate views into an STL Container. The point being that once you've built up your set of iterators, all searches are O(log n). Hell, you don't even need to use a set, you could use a vector of iterators, sort the vector once, and then use a binary search on the vector. ok, this is an interesting idea, I'll try that... thanks for all answers! Jul 23 '05 #10

### This discussion thread is closed

Replies have been disabled for this discussion.