472,144 Members | 2,014 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,144 software developers and data experts.

Efficient Data Structures

Hi,

I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.

Does anybody know of ready data structures that can be freely used
under the .Net framwork.

Thanks in Advance

Dec 13 '05 #1
22 7560
Hi,
HashTable is most common datastructure that you can use for storing keyvalue
pairs.
also, SortedList is there which is a sort of HashTable with sorting feature.

You can look at System.Collections namespaces in MSDN to find other
collection APIs if you have some other specific need.

HTH,

"Curious" <th****@mail.global.net.mt> wrote in message
news:11*********************@g43g2000cwa.googlegro ups.com...
Hi,

I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.

Does anybody know of ready data structures that can be freely used
under the .Net framwork.

Thanks in Advance

Dec 13 '05 #2
Curious <th****@mail.global.net.mt> wrote:
I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.

Does anybody know of ready data structures that can be freely used
under the .Net framwork.


Have you tried Hashtable? I don't know how efficient it is, but it's
certainly worth trying to start with. How many pairs are we talking
here?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 13 '05 #3
I forgot to mention, that the data needs to be maintained sorted.

When removing the data, the minimum key with its data is to be removed.

Dec 13 '05 #4
Curious <th****@mail.global.net.mt> wrote:
I forgot to mention, that the data needs to be maintained sorted.
I assume you mean by key?
When removing the data, the minimum key with its data is to be removed.


It sounds like you may want two maps then - one to map "key" to "data"
and one to map "data" to "set of keys". Typically one removes an entry
from a map by key, not by value.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 13 '05 #5
Initially I started using the RedBlack tree downloaded from the
codeproject.com.

Then I discovered that two instances of this project(RedBlack) cannot
work together in the same application, since errors started to be
given.

I require something similar to that type of datastructure(tree).

I hope I have cleared my self better.

Dec 13 '05 #6
Curious wrote:
I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.

Does anybody know of ready data structures that can be freely used
under the .Net framwork.
Also: I forgot to mention, that the data needs to be maintained sorted.
When removing the data, the minimum key with its data is to be removed.


You won't *necessarily* get great efficiency on both insertion and removal. But
given your description so far it sounds as if you can - I would try a
SortedList. The insertion uses .Add(object key, object value) - which *should*
be efficient, at least if the key elements arrive in anything like random order.
(Test it!) You can remove elements either with .Remove(object key) or
..RemoveAt(int index) - index being 0 in your case of removing the element with
least key. Either of these *should* be quite efficient (O(log n) at worst) *if*
the list is internally maintained sorted as per the documentation. (Again, test
it!)

HTH,
-rick-
Dec 13 '05 #7
Curious <th****@mail.global.net.mt> wrote:
Initially I started using the RedBlack tree downloaded from the
codeproject.com.

Then I discovered that two instances of this project(RedBlack) cannot
work together in the same application, since errors started to be
given.
Are you absolutely sure that that's the case? You could have had some
other error.
I require something similar to that type of datastructure(tree).

I hope I have cleared my self better.


If a red-black tree does what you want, it's probably worth
investigating the one from codeproject further. It may be easily
fixable even if it doesn't quite work now. I don't believe there's
anything similar in the framework.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 13 '05 #8
Trying the following code, throws an exception. The main reason is
that a static field exists to identify leaf nodes. Now, have two
instances the same static field is being used, which would not give
correct results.

RedBlack x = new RedBlack();
RedBlack y = new RedBlack();
x.Add(1, "one");
y.Add(2, "two");

Dec 13 '05 #9
Curious <th****@mail.global.net.mt> wrote:
Trying the following code, throws an exception. The main reason is
that a static field exists to identify leaf nodes. Now, have two
instances the same static field is being used, which would not give
correct results.

RedBlack x = new RedBlack();
RedBlack y = new RedBlack();
x.Add(1, "one");
y.Add(2, "two");


There should be no need of a static field at all. I suggest you contact
the author, see if he has a later version which has fixed this problem,
and if not, look into fixing it yourself.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 13 '05 #10
out of curiosity, I took a look at the Red Black Tree implementation on the
codeproject. I could see no reason why the author used a static field since
nothing else is static. on top of that, it's also publicly exposed, means
any code can change it at will. combined with the Tree itself doesn't
implement any standard .NET collection interfaces (not even IEnumerable), I
don't really have much confidence using it in my projects.

As an alternative, if you are using 2.0, I suggest you take a look at power
collections for .NET http://www.wintellect.com/powercollections/ They also
happen to have a implementation of red black tree (though it's only internal
and used by other collection objects) You might find something that suits
your need.

"Curious" wrote:
Trying the following code, throws an exception. The main reason is
that a static field exists to identify leaf nodes. Now, have two
instances the same static field is being used, which would not give
correct results.

RedBlack x = new RedBlack();
RedBlack y = new RedBlack();
x.Add(1, "one");
y.Add(2, "two");

Dec 13 '05 #11
Have you noticed that MSDN for .NET 2.0 now sports the Ordos costs of
different calls.
Like 'Calling this method uses a linear search for your object and is a O(N)
operation.'

I think that is excellent.

Not often you get to give kudos to Microsoft but the new MSDN is simply
great!

- Michael S
Dec 13 '05 #12
Hi,
I agree with the others that the Red/Black tree should be easy to fix,
and the SortedList will probably do what you want, too.

But you could also try my favorite data structure, the SkipList. There
is a c# implementation on CodeProject here:
http://www.codeproject.com/csharp/SkipList1.asp.

The article includes benchmarks against a SortedList, and SkipList beats
it handily. SkipList is also much simpler than a Red/Black tree, and
should give better average performance. You'll have to test this to be
sure, though.

Gabe
Curious wrote:
I am searching for a data structure that stores key-value pairs in it.
This data structure is to hold large amounts of key-value pairs, and so
needs to be efficient both in insertion and deletion.

Does anybody know of ready data structures that can be freely used
under the .Net framwork.

Also:
I forgot to mention, that the data needs to be maintained sorted.
When removing the data, the minimum key with its data is to be removed.

You won't *necessarily* get great efficiency on both insertion and
removal. But given your description so far it sounds as if you can - I
would try a SortedList. The insertion uses .Add(object key, object
value) - which *should* be efficient, at least if the key elements
arrive in anything like random order. (Test it!) You can remove
elements either with .Remove(object key) or .RemoveAt(int index) - index
being 0 in your case of removing the element with least key. Either of
these *should* be quite efficient (O(log n) at worst) *if* the list is
internally maintained sorted as per the documentation. (Again, test it!)

HTH,
-rick-

Dec 13 '05 #13
At my office we've found SortedList to be slow on insertions.

That said, the OP never answered the question: How many items are we
talking about here? Curious, you said that you had to do a "a lot" of
insertions and removals. How many is "a lot"?

Dec 13 '05 #14

"Gabe Moothart" <ga**@imaginesystems.net> wrote in message
news:OX*************@TK2MSFTNGP12.phx.gbl...
Hi,
I agree with the others that the Red/Black tree should be easy to fix, and
the SortedList will probably do what you want, too.

But you could also try my favorite data structure, the SkipList. There is
a c# implementation on CodeProject here:
http://www.codeproject.com/csharp/SkipList1.asp.

The article includes benchmarks against a SortedList, and SkipList beats
it handily. SkipList is also much simpler than a Red/Black tree, and
should give better average performance. You'll have to test this to be
sure, though.

Gabe


I also thought of a SkipList, as it is a pretty neat alogorithm.
But then I had a look at his requirements:

1. A lot of data
2. Fast find.
2. Fast change.

Feel free to invent something that does this for him.

Really, if any one of us can do this, you better get the patent. Then you
can always call Google and license it to them. Don't think they would return
your call thou. Or just tell the whole world that every book on Ordos
(Big-O) is obsolete.

Either way: You will get a nobel price.

This is a classic problem.

Also why there is a market for high-inserts-databases (like billing systems,
Vodaphone do need to store that you made a call to charge you)... to
transactional databases (vanilla).. to
read-only-with-few-increments-databases (search-engines)

Back to SkipLists. While SkipList being smart and funny to code, they only
perform in theory. With virtual memory you get a lot of page-faults and
disk-swapping. This is why most databases work with 'blocks' and 'buckets'
to make sure that the Principle Of Locality somewhat holds.

But what I'm pondering on is; How often do it need to get sorted?

If there are 1000 changes for each read, I would consider a red-black-tree
with tree-sort on top of that.

Might involve a lot of RAM though....

- Michael S




but the requirements demands fast inserts and deletes.
Dec 13 '05 #15
The SortedDictionary<TKey, TValue> type in 2.0 is implemented as a
red-black tree. No need to use the PowerCollections.

Daniel Jin wrote:
out of curiosity, I took a look at the Red Black Tree implementation on the
codeproject. I could see no reason why the author used a static field since
nothing else is static. on top of that, it's also publicly exposed, means
any code can change it at will. combined with the Tree itself doesn't
implement any standard .NET collection interfaces (not even IEnumerable), I
don't really have much confidence using it in my projects.

As an alternative, if you are using 2.0, I suggest you take a look at power
collections for .NET http://www.wintellect.com/powercollections/ They also
happen to have a implementation of red black tree (though it's only internal
and used by other collection objects) You might find something that suits
your need.

"Curious" wrote:

Trying the following code, throws an exception. The main reason is
that a static field exists to identify leaf nodes. Now, have two
instances the same static field is being used, which would not give
correct results.

RedBlack x = new RedBlack();
RedBlack y = new RedBlack();
x.Add(1, "one");
y.Add(2, "two");

Dec 13 '05 #16
Curious,

The SortedDictionary in the 2.0 Framework is implemented with a red
black tree.

I posted a red black tree at
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=6644D97B-B743-47D9-A0E7-A207EEBFB0F9>
a long time ago that you can use on the 1.1 Framework. I also provide
a splay tree which you might want to experiment around with. Splay
trees have a unique implementation which makes them faster than red
black trees in some scenarios.

You can use the SortedList, but it is horribly inefficient on inserts
and deletes since it uses O(n log n) implementations.

Brian

Curious wrote:
Initially I started using the RedBlack tree downloaded from the
codeproject.com.

Then I discovered that two instances of this project(RedBlack) cannot
work together in the same application, since errors started to be
given.

I require something similar to that type of datastructure(tree).

I hope I have cleared my self better.


Dec 13 '05 #17

Brian Gideon wrote:
I posted a red black tree at
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=6644D97B-B743-47D9-A0E7-A207EEBFB0F9>
a long time ago that you can use on the 1.1 Framework. I also provide
a splay tree which you might want to experiment around with. Splay
trees have a unique implementation which makes them faster than red
black trees in some scenarios.


I forgot to mention that I provide a skip list as well.

Dec 13 '05 #18

Rick Lones wrote:
You won't *necessarily* get great efficiency on both insertion and removal. But
given your description so far it sounds as if you can - I would try a
SortedList. The insertion uses .Add(object key, object value) - which *should*
be efficient, at least if the key elements arrive in anything like random order.
(Test it!) You can remove elements either with .Remove(object key) or
.RemoveAt(int index) - index being 0 in your case of removing the element with
least key. Either of these *should* be quite efficient (O(log n) at worst) *if*
the list is internally maintained sorted as per the documentation. (Again, test
it!)

HTH,
-rick-


Actually, Add and Remove on the SortedList are O(n log n)
implementations.

Dec 13 '05 #19

Michael S wrote:
Back to SkipLists. While SkipList being smart and funny to code, they only
perform in theory. With virtual memory you get a lot of page-faults and
disk-swapping. This is why most databases work with 'blocks' and 'buckets'
to make sure that the Principle Of Locality somewhat holds.


I disagree. A skip list performs quite well in practice. It can be
tuned to balance memory and speed requirements, but usually consume
less memory than balanced binary search trees. That makes it
attractive. It's the theory that make prevent some people from using
it since *most* implementations use a random number generator and only
guarentee an average case operation of O(log n).

What is very attractive to the OP is that access to the max or min keys
can be implemented so that they are O(1) operations. That makes the
skip list an ideal candidate for a priority queue as well.

Brian

Dec 13 '05 #20

Brian Gideon wrote:
I forgot to mention that I provide a skip list as well.


Doh...it's at
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=3F773D8A-7E1A-40F2-8EC0-48FB811989AA>.

Dec 13 '05 #21

Brian Gideon wrote:
Actually, Add and Remove on the SortedList are O(n log n)
implementations.


Correction. It's O(n) according to the documentation.

Dec 14 '05 #22

Brian Gideon wrote:
You can use the SortedList, but it is horribly inefficient on inserts
and deletes since it uses O(n log n) implementations.


Correction. It's O(n) according to the documentation.

Dec 14 '05 #23

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

4 posts views Thread by Thomas Paul Diffenbach | last post: by
1 post views Thread by pedagani | last post: by
16 posts views Thread by Christian Christmann | last post: by
5 posts views Thread by sql_er | last post: by
82 posts views Thread by Bill David | last post: by
reply views Thread by Saiars | last post: by
reply views Thread by leo001 | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.