Hi,
I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.
Does anybody know of ready data structures that can be freely used
under the .Net framwork.
Thanks in Advance 22 7560
Hi,
HashTable is most common datastructure that you can use for storing keyvalue
pairs.
also, SortedList is there which is a sort of HashTable with sorting feature.
You can look at System.Collections namespaces in MSDN to find other
collection APIs if you have some other specific need.
HTH,
"Curious" <th****@mail.global.net.mt> wrote in message
news:11*********************@g43g2000cwa.googlegro ups.com... Hi,
I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and deletion.
Does anybody know of ready data structures that can be freely used under the .Net framwork.
Thanks in Advance
Curious <th****@mail.global.net.mt> wrote: I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and deletion.
Does anybody know of ready data structures that can be freely used under the .Net framwork.
Have you tried Hashtable? I don't know how efficient it is, but it's
certainly worth trying to start with. How many pairs are we talking
here?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
I forgot to mention, that the data needs to be maintained sorted.
When removing the data, the minimum key with its data is to be removed.
Curious <th****@mail.global.net.mt> wrote: I forgot to mention, that the data needs to be maintained sorted.
I assume you mean by key?
When removing the data, the minimum key with its data is to be removed.
It sounds like you may want two maps then - one to map "key" to "data"
and one to map "data" to "set of keys". Typically one removes an entry
from a map by key, not by value.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Initially I started using the RedBlack tree downloaded from the
codeproject.com.
Then I discovered that two instances of this project(RedBlack) cannot
work together in the same application, since errors started to be
given.
I require something similar to that type of datastructure(tree).
I hope I have cleared my self better.
Curious wrote: I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and deletion.
Does anybody know of ready data structures that can be freely used under the .Net framwork.
Also: I forgot to mention, that the data needs to be maintained sorted. When removing the data, the minimum key with its data is to be removed.
You won't *necessarily* get great efficiency on both insertion and removal. But
given your description so far it sounds as if you can - I would try a
SortedList. The insertion uses .Add(object key, object value) - which *should*
be efficient, at least if the key elements arrive in anything like random order.
(Test it!) You can remove elements either with .Remove(object key) or
..RemoveAt(int index) - index being 0 in your case of removing the element with
least key. Either of these *should* be quite efficient (O(log n) at worst) *if*
the list is internally maintained sorted as per the documentation. (Again, test
it!)
HTH,
-rick-
Curious <th****@mail.global.net.mt> wrote: Initially I started using the RedBlack tree downloaded from the codeproject.com.
Then I discovered that two instances of this project(RedBlack) cannot work together in the same application, since errors started to be given.
Are you absolutely sure that that's the case? You could have had some
other error.
I require something similar to that type of datastructure(tree).
I hope I have cleared my self better.
If a red-black tree does what you want, it's probably worth
investigating the one from codeproject further. It may be easily
fixable even if it doesn't quite work now. I don't believe there's
anything similar in the framework.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Trying the following code, throws an exception. The main reason is
that a static field exists to identify leaf nodes. Now, have two
instances the same static field is being used, which would not give
correct results.
RedBlack x = new RedBlack();
RedBlack y = new RedBlack();
x.Add(1, "one");
y.Add(2, "two");
Curious <th****@mail.global.net.mt> wrote: Trying the following code, throws an exception. The main reason is that a static field exists to identify leaf nodes. Now, have two instances the same static field is being used, which would not give correct results.
RedBlack x = new RedBlack(); RedBlack y = new RedBlack(); x.Add(1, "one"); y.Add(2, "two");
There should be no need of a static field at all. I suggest you contact
the author, see if he has a later version which has fixed this problem,
and if not, look into fixing it yourself.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
out of curiosity, I took a look at the Red Black Tree implementation on the
codeproject. I could see no reason why the author used a static field since
nothing else is static. on top of that, it's also publicly exposed, means
any code can change it at will. combined with the Tree itself doesn't
implement any standard .NET collection interfaces (not even IEnumerable), I
don't really have much confidence using it in my projects.
As an alternative, if you are using 2.0, I suggest you take a look at power
collections for .NET http://www.wintellect.com/powercollections/ They also
happen to have a implementation of red black tree (though it's only internal
and used by other collection objects) You might find something that suits
your need.
"Curious" wrote: Trying the following code, throws an exception. The main reason is that a static field exists to identify leaf nodes. Now, have two instances the same static field is being used, which would not give correct results.
RedBlack x = new RedBlack(); RedBlack y = new RedBlack(); x.Add(1, "one"); y.Add(2, "two");
Have you noticed that MSDN for .NET 2.0 now sports the Ordos costs of
different calls.
Like 'Calling this method uses a linear search for your object and is a O(N)
operation.'
I think that is excellent.
Not often you get to give kudos to Microsoft but the new MSDN is simply
great!
- Michael S
Hi,
I agree with the others that the Red/Black tree should be easy to fix,
and the SortedList will probably do what you want, too.
But you could also try my favorite data structure, the SkipList. There
is a c# implementation on CodeProject here: http://www.codeproject.com/csharp/SkipList1.asp.
The article includes benchmarks against a SortedList, and SkipList beats
it handily. SkipList is also much simpler than a Red/Black tree, and
should give better average performance. You'll have to test this to be
sure, though.
Gabe Curious wrote:
I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and deletion.
Does anybody know of ready data structures that can be freely used under the .Net framwork.
Also:
I forgot to mention, that the data needs to be maintained sorted. When removing the data, the minimum key with its data is to be removed.
You won't *necessarily* get great efficiency on both insertion and removal. But given your description so far it sounds as if you can - I would try a SortedList. The insertion uses .Add(object key, object value) - which *should* be efficient, at least if the key elements arrive in anything like random order. (Test it!) You can remove elements either with .Remove(object key) or .RemoveAt(int index) - index being 0 in your case of removing the element with least key. Either of these *should* be quite efficient (O(log n) at worst) *if* the list is internally maintained sorted as per the documentation. (Again, test it!)
HTH, -rick-
At my office we've found SortedList to be slow on insertions.
That said, the OP never answered the question: How many items are we
talking about here? Curious, you said that you had to do a "a lot" of
insertions and removals. How many is "a lot"?
"Gabe Moothart" <ga**@imaginesystems.net> wrote in message
news:OX*************@TK2MSFTNGP12.phx.gbl... Hi, I agree with the others that the Red/Black tree should be easy to fix, and the SortedList will probably do what you want, too.
But you could also try my favorite data structure, the SkipList. There is a c# implementation on CodeProject here: http://www.codeproject.com/csharp/SkipList1.asp.
The article includes benchmarks against a SortedList, and SkipList beats it handily. SkipList is also much simpler than a Red/Black tree, and should give better average performance. You'll have to test this to be sure, though.
Gabe
I also thought of a SkipList, as it is a pretty neat alogorithm.
But then I had a look at his requirements:
1. A lot of data
2. Fast find.
2. Fast change.
Feel free to invent something that does this for him.
Really, if any one of us can do this, you better get the patent. Then you
can always call Google and license it to them. Don't think they would return
your call thou. Or just tell the whole world that every book on Ordos
(Big-O) is obsolete.
Either way: You will get a nobel price.
This is a classic problem.
Also why there is a market for high-inserts-databases (like billing systems,
Vodaphone do need to store that you made a call to charge you)... to
transactional databases (vanilla).. to
read-only-with-few-increments-databases (search-engines)
Back to SkipLists. While SkipList being smart and funny to code, they only
perform in theory. With virtual memory you get a lot of page-faults and
disk-swapping. This is why most databases work with 'blocks' and 'buckets'
to make sure that the Principle Of Locality somewhat holds.
But what I'm pondering on is; How often do it need to get sorted?
If there are 1000 changes for each read, I would consider a red-black-tree
with tree-sort on top of that.
Might involve a lot of RAM though....
- Michael S
but the requirements demands fast inserts and deletes.
The SortedDictionary<TKey, TValue> type in 2.0 is implemented as a
red-black tree. No need to use the PowerCollections.
Daniel Jin wrote: out of curiosity, I took a look at the Red Black Tree implementation on the codeproject. I could see no reason why the author used a static field since nothing else is static. on top of that, it's also publicly exposed, means any code can change it at will. combined with the Tree itself doesn't implement any standard .NET collection interfaces (not even IEnumerable), I don't really have much confidence using it in my projects.
As an alternative, if you are using 2.0, I suggest you take a look at power collections for .NET http://www.wintellect.com/powercollections/ They also happen to have a implementation of red black tree (though it's only internal and used by other collection objects) You might find something that suits your need.
"Curious" wrote:
Trying the following code, throws an exception. The main reason is that a static field exists to identify leaf nodes. Now, have two instances the same static field is being used, which would not give correct results.
RedBlack x = new RedBlack(); RedBlack y = new RedBlack(); x.Add(1, "one"); y.Add(2, "two");
Curious,
The SortedDictionary in the 2.0 Framework is implemented with a red
black tree.
I posted a red black tree at
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=6644D97B-B743-47D9-A0E7-A207EEBFB0F9>
a long time ago that you can use on the 1.1 Framework. I also provide
a splay tree which you might want to experiment around with. Splay
trees have a unique implementation which makes them faster than red
black trees in some scenarios.
You can use the SortedList, but it is horribly inefficient on inserts
and deletes since it uses O(n log n) implementations.
Brian
Curious wrote: Initially I started using the RedBlack tree downloaded from the codeproject.com.
Then I discovered that two instances of this project(RedBlack) cannot work together in the same application, since errors started to be given.
I require something similar to that type of datastructure(tree).
I hope I have cleared my self better.
Brian Gideon wrote: I posted a red black tree at <http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=6644D97B-B743-47D9-A0E7-A207EEBFB0F9> a long time ago that you can use on the 1.1 Framework. I also provide a splay tree which you might want to experiment around with. Splay trees have a unique implementation which makes them faster than red black trees in some scenarios.
I forgot to mention that I provide a skip list as well.
Rick Lones wrote: You won't *necessarily* get great efficiency on both insertion and removal. But given your description so far it sounds as if you can - I would try a SortedList. The insertion uses .Add(object key, object value) - which *should* be efficient, at least if the key elements arrive in anything like random order. (Test it!) You can remove elements either with .Remove(object key) or .RemoveAt(int index) - index being 0 in your case of removing the element with least key. Either of these *should* be quite efficient (O(log n) at worst) *if* the list is internally maintained sorted as per the documentation. (Again, test it!)
HTH, -rick-
Actually, Add and Remove on the SortedList are O(n log n)
implementations.
Michael S wrote: Back to SkipLists. While SkipList being smart and funny to code, they only perform in theory. With virtual memory you get a lot of page-faults and disk-swapping. This is why most databases work with 'blocks' and 'buckets' to make sure that the Principle Of Locality somewhat holds.
I disagree. A skip list performs quite well in practice. It can be
tuned to balance memory and speed requirements, but usually consume
less memory than balanced binary search trees. That makes it
attractive. It's the theory that make prevent some people from using
it since *most* implementations use a random number generator and only
guarentee an average case operation of O(log n).
What is very attractive to the OP is that access to the max or min keys
can be implemented so that they are O(1) operations. That makes the
skip list an ideal candidate for a priority queue as well.
Brian
Brian Gideon wrote: I forgot to mention that I provide a skip list as well.
Doh...it's at
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=3F773D8A-7E1A-40F2-8EC0-48FB811989AA>.
Brian Gideon wrote: Actually, Add and Remove on the SortedList are O(n log n) implementations.
Correction. It's O(n) according to the documentation.
Brian Gideon wrote: You can use the SortedList, but it is horribly inefficient on inserts and deletes since it uses O(n log n) implementations.
Correction. It's O(n) according to the documentation. This discussion thread is closed Replies have been disabled for this discussion. Similar topics
16 posts
views
Thread by laclac01 |
last post: by
|
4 posts
views
Thread by Thomas Paul Diffenbach |
last post: by
|
3 posts
views
Thread by mrhicks |
last post: by
|
1 post
views
Thread by pedagani |
last post: by
|
16 posts
views
Thread by Christian Christmann |
last post: by
|
11 posts
views
Thread by efrat |
last post: by
|
5 posts
views
Thread by sql_er |
last post: by
|
4 posts
views
Thread by Luna Moon |
last post: by
|
82 posts
views
Thread by Bill David |
last post: by
| | | | | | | | | | |