C# Dictionary size limit

Hi Guys,

I have a class A composed of several string and int members.
I would like to manage a large number (several thousand) of A objects
in a dictionary where each object has its own unique key.

Is there any known limitation or best practice that you think I should
consider?

Thanks,
Orsula.z
Aug 12 '08 #1
"Several thousand" is actually reasonably small. Basically if you've
got enough room in memory for the objects and a bit of overhead, the
dictionary should be fine.
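
For illustration, a minimal sketch along the lines the question describes (the fields of A and the key format here are made up, not taken from the original post):

using System;
using System.Collections.Generic;

// Hypothetical stand-in for the poster's class A.
class A
{
    public string Name;
    public string Description;
    public int Value;
}

class Program
{
    static void Main()
    {
        var items = new Dictionary<string, A>();

        // Several thousand entries is small for Dictionary; this uses 10,000.
        for (int i = 0; i < 10000; i++)
        {
            string key = "item-" + i;   // each object gets its own unique key
            items[key] = new A { Name = key, Description = "example", Value = i };
        }

        Console.WriteLine(items.Count);            // 10000
        Console.WriteLine(items["item-42"].Value); // near O(1) lookup by key
    }
}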

Jon
Aug 12 '08 #2
Orsula,

Be aware that, as with every collection in the framework, you are only storing
references to the objects, not the values themselves, so the dictionary itself
can in fact be quite small.
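
A small sketch of what that means in practice (the class and its members are made up for illustration):

using System;
using System.Collections.Generic;

class A { public string Name; public int Value; }

class ReferenceDemo
{
    static void Main()
    {
        var a = new A { Name = "first", Value = 1 };

        var dict = new Dictionary<string, A>();
        dict["key1"] = a;            // stores a reference to the same object, not a copy

        dict["key1"].Value = 99;     // mutates the single shared A instance
        Console.WriteLine(a.Value);  // prints 99: dict["key1"] and a are the same object
    }
}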

Cor

"orsula" <or******@gmail.comschreef in bericht
news:59**********************************@k13g2000 hse.googlegroups.com...
Hi Guys,

I have a class A composed of several string and int members.
I would like to manage a huge amount (several thousands) of A objects
in a dictionary where each object has its unique key.

Is there any known limitation or best practice that you think I should
consider?

Thanks,
Orsula.z
Aug 12 '08 #3
On Aug 12, 7:11 pm, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
"Several thousand" is actually reasonably small. Basically if you've
got enough room in memory for the objects and a bit of overhead, the
dictionary should be fine.

Jon
Thanks (again :) ) Jon
Aug 12 '08 #4
orsula wrote:
I have a class A composed of several string and int members.
I would like to manage a large number (several thousand) of A objects
in a dictionary where each object has its own unique key.

Is there any known limitation or best practice that you think I should
consider?
No.

My understanding is that you would be able to stuff 2 GB / 16 B = 125 million
entries into a Dictionary<string, A> on 32-bit Windows.

Arne
Aug 13 '08 #5
On Aug 13, 1:21 am, Arne Vajhøj <a...@vajhoej.dk> wrote:
No.

My understanding is that you would be able to stuff 2 GB / 16 B = 125 million
entries into a Dictionary<string, A> on 32-bit Windows.
Not quite. For one thing I think there's an upper limit of about 1GB
per object (including the entries array) in the CLR. I've also found
that for some reason (using a Dictionary<int, int> to avoid creating
any extra objects) the memory per entry averages at around 40 bytes
rather than 16. No idea why at the moment - 16 would make sense to me
as well, although there's bound to be *some* extra overhead for the
buckets etc. Possibly alignment issues? Hmm...
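
A rough sketch of the kind of measurement being described here, using GC.GetTotalMemory around a Dictionary<int, int> fill; the figure it prints depends on the runtime, the bitness, and where the last resize happened:

using System;
using System.Collections.Generic;

class PerEntryMemoryProbe
{
    static void Main()
    {
        const int count = 1000000;

        long before = GC.GetTotalMemory(true);

        var dict = new Dictionary<int, int>();
        for (int i = 0; i < count; i++)
        {
            dict.Add(i, i);
        }

        long after = GC.GetTotalMemory(true);

        // Average bytes per entry. Includes the buckets array, any unused
        // capacity left over from the last resize, and allocator overhead.
        Console.WriteLine((after - before) / (double)count);

        GC.KeepAlive(dict);
    }
}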

Jon
Aug 13 '08 #6
That is a little bit difficult, because then all the pairs would have to
reference null, and that is not possible.

Cor

"Arne Vajhøj" <ar**@vajhoej.dkschreef in bericht
news:48***********************@news.sunsite.dk...
No.

My understanding is that you would be able to stuff 2 GB / 16 B = 125 million
entries into a Dictionary<string, A> on 32-bit Windows.

Arne

Aug 13 '08 #7
On Aug 13, 1:18 am, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
Not quite. For one thing I think there's an upper limit of about 1GB
per object (including the entries array) in the CLR. I've also found
that for some reason (using a Dictionary<int, int> to avoid creating
any extra objects) the memory per entry averages at around 40 bytes
rather than 16. No idea why at the moment - 16 would make sense to me
as well, although there's bound to be *some* extra overhead for the
buckets etc. Possibly alignment issues? Hmm...

Jon
I just Reflector'd it. In addition to the key and the value the hash
code (int) and a next pointer (int) are also stored on each entry.
That would account for 16B. There is also an int[] array that is used
for indexing the entries array. That gets us to 20B. The resizing
algorithm uses the first prime number greater than twice the
hashtable's current size and uses that for the new size. That would
easily get you to 40.
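
In other words, the per-entry storage is roughly the following shape (a simplified illustration of what is described above, not the actual framework source; field names are approximate):

// Each slot in Dictionary's internal entries array carries four things:
struct Entry<TKey, TValue>
{
    public int hashCode;    // cached hash code of the key (4 bytes)
    public int next;        // index of the next entry in the same bucket chain, or -1 (4 bytes)
    public TKey key;        // 4 bytes for an int key, or a 4-byte reference on 32-bit
    public TValue value;    // likewise 4 bytes in the Dictionary<int, int> case
}

// In addition there is a separate int[] buckets array of the same length,
// mapping (hashCode % capacity) to an index into the entries array - the
// extra ~4 bytes per slot that take the total from 16B to 20B.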

Since we're discussing very large dictionaries I noticed that the
prime table is preloaded up to 7,199,369 (which is not to say that all
primes up to that value are included). Once it reaches that size the
next resize operation will start using the naive algorithm for
primality testing, but will *probably* yield 14,398,753, 28,797,523,
and 57,595,063. That gives us jumps to 288, 576, and 1152 MB
respectively. If the 1GB limit is indeed in play for arrays then we
might expect ~28 million to be the upper limit. Pure speculation.
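
A rough sketch of the growth rule being described; the real implementation consults its precomputed prime table first, so the exact sequence it produces may differ from this naive version:

using System;

class CapacityGrowthSketch
{
    static bool IsPrime(int candidate)
    {
        if (candidate < 2) return false;
        for (int d = 2; (long)d * d <= candidate; d++)
        {
            if (candidate % d == 0) return false;
        }
        return true;
    }

    // First prime greater than twice the current size, as described above.
    static int NextCapacity(int current)
    {
        int candidate = current * 2 + 1;
        while (!IsPrime(candidate)) candidate++;
        return candidate;
    }

    static void Main()
    {
        int capacity = 3;
        while (capacity < 60000000)
        {
            capacity = NextCapacity(capacity);
            // ~20 bytes per entry: 16B in the entries array plus 4B in buckets.
            Console.WriteLine("{0,12:N0} entries  ~{1:N0} MB",
                capacity, capacity * 20L / 1000000);
        }
    }
}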
Aug 13 '08 #8
On Aug 13, 4:25 pm, Brian Gideon <briangid...@yahoo.com> wrote:
I just Reflector'd it. In addition to the key and the value the hash
code (int) and a next pointer (int) are also stored on each entry.
That would account for 16B. There is also an int[] array that is used
for indexing the entries array. That gets us to 20B.
For some reason I didn't think that the int[] was the same length as
the main array - but I see that it is.
The resizing
algorithm uses the first prime number greater than twice the
hashtable's current size and uses that for the new size. That would
easily get you to 40.
Ah, I've just worked out a flaw in my test - I was only outputting the
result when the amount of memory changed, which would always give the
worst case. Printing out the best case as well shows around 20 bytes,
which now makes sense.
Since we're discussing very large dictionaries I noticed that the
prime table is preloaded up to 7,199,369 (which is not to say that all
primes up to that value are included). Once it reaches that size the
next resize operation will start using the naive algorithm for
primality testing, but will *probably* yield 14,398,753, 28,797,523,
and 57,595,063. That gives us jumps to 288, 576, and 1152 MB
respectively. If the 1GB limit is indeed in play for arrays then we
might expect ~28 million to be the upper limit. Pure speculation.
Don't forget that the 1GB limit applies to individual objects AFAIK -
so only the 16 bytes per entry of the "main" array. The int[] is
separate. 57,595,063 * 16 = 921,521,008 bytes, which should be okay. However,
you've also got to have enough room for the old version as well, so
your whole process needs 1.5x as much memory, unless you preallocate
that capacity.
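
For example, passing the expected size to the constructor (the figure below is illustrative only) means the entries array is allocated once at roughly the right size, so there is never a moment when both the old and the new arrays have to coexist:

using System.Collections.Generic;

class PreallocationSketch
{
    static void Main()
    {
        const int expectedEntries = 10000000; // illustrative figure only

        // Capacity is rounded up to a suitable prime internally;
        // no resize-and-copy cycles happen while filling it.
        var map = new Dictionary<int, int>(expectedEntries);

        for (int i = 0; i < expectedEntries; i++)
        {
            map.Add(i, i);
        }
    }
}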

Unfortunately I don't know the exact details of all the limits, so I
won't analyse this any further, for fear of giving the wrong
information.

Jon
Aug 13 '08 #9
On Aug 13, 11:22 am, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
Don't forget that the 1GB limit applies to individual objects AFAIK -
so only the 16 bytes per entry of the "main" array. The int[] is
separate. 57,595,063 * 16 = 921,521,008 bytes, which should be okay. However,
you've also got to have enough room for the old version as well, so
your whole process needs 1.5x as much memory, unless you preallocate
that capacity.
Ah yes. Good points. Also, in the interest of full disclosure, that
57,595,063 value was based on a quick statistical test for primality, so
it may not even be prime for all I know. I'm sure there is a prime
somewhere in that ballpark so it's probably not going to skew the
numbers too much.
Unfortunately I don't know the exact details of all the limits, so I
won't analyse this any further, for fear of giving the wrong
information.
Understood.

Aug 13 '08 #10
On Aug 13, 1:37 am, "Cor Ligthert [MVP]" <notmyfirstn...@planet.nl> wrote:
That is a little bit difficult, because then all the pairs would have to
reference null, and that is not possible.

Cor
Well, as long as TValue is a reference type then the value could be
null. I haven't actually tested that because I've never done it, but
it should be possible according to the documentation.
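
A quick check along those lines, using string as the reference-type TValue:

using System;
using System.Collections.Generic;

class NullValueCheck
{
    static void Main()
    {
        var map = new Dictionary<int, string>();

        // string is a reference type, so null is a legal value (unlike a null key).
        map[1] = null;

        Console.WriteLine(map.ContainsKey(1)); // True: the key exists
        Console.WriteLine(map[1] == null);     // True: its value is null
    }
}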
Aug 13 '08 #11
