473,890 Members | 1,369 Online

# about Equals and GetHashCode

Hello!

I can't figure out what point it is to use GetHashCode. I know that this
GetHashCode is used for obtaining a unique integer
value.
Can somebody give me an example that prove the usefulness of this
GetHashCode or it it of no use at all?

A mean that if I get an integer from current time in some way what should I
use it for?

One more question should these Equals and GetHashCode be used in pair?

//Tony

Jun 27 '08
28 3797
On Jun 12, 6:51*pm, "Peter Duniho" <NpOeStPe...@nn owslpianmk.com>
wrote:
You could still multiply the 0 by 37 in the first field incorporated, *
preserving the symmetry aspect. *I just don't see the point in starting *
with 17 versus 0. *It has the sense of "magic" about it, but unlike the *
multiplication by 37, I don't see any practical advantage. *A less *
"magical" initializer of 0 would be just as good.

I think. *:)

Pete
It does seem like magic doesn't it? But, then again mathematics is
full intriguing quirks so it wouldn't surprise me if there was a valid
reason.

It reminds of me of the section in Knuth Vol 2 where he talks about
random number generators and warns that using one random number
generator to seed another generator may actually make the end result
*less* random! I know...complete ly off topic. My mind works in
mysterious ways :)

Jun 27 '08 #21
On Thu, 12 Jun 2008 19:41:32 -0700, Brian Gideon <br*********@ya hoo.com>
wrote:
It does seem like magic doesn't it? But, then again mathematics is
full intriguing quirks so it wouldn't surprise me if there was a valid
reason.
Well, in this case it's pretty much just algebra. We're not in the realm
of "intriguing quirks of mathematics". It's proveable that the
initializer always winds up just affecting the hash code by a fixed
constant. It's hard to see how that fixed constant could be significant.
At best, it would just shift all of the hashed codes by the same amount
along the bins for the hash table. It won't change their relative
arrangement, and thus wouldn't affect the actual operation.

Pete
Jun 27 '08 #22
On Jun 13, 12:51 am, "Peter Duniho" <NpOeStPe...@nn owslpianmk.com>
wrote:
No, my point is that you could just initialize the local hash variable to
0, rather than 17.

You could still multiply the 0 by 37 in the first field incorporated,
preserving the symmetry aspect. I just don't see the point in starting
with 17 versus 0. It has the sense of "magic" about it, but unlike the
multiplication by 37, I don't see any practical advantage. A less
"magical" initializer of 0 would be just as good.

I think. :)
Okay, I'll have to consult Effective Java, which is where the advice
originally came from. I believe there's some benefit to having two
magic numbers which are coprime to each other. Wikipedia doesn't show
this as one of the example hash functions, unfortunately, so there's
no discussion there to help :(

Jon
Jun 27 '08 #23
On Thu, 12 Jun 2008 22:54:40 -0700, Jon Skeet [C# MVP] <sk***@pobox.co m>
wrote:
Okay, I'll have to consult Effective Java, which is where the advice
originally came from.
For what it's worth, the last discussion I was in when this came up,
Bloch's book was also quoted. So at least I'm seeing a consistent source
here. :)
I believe there's some benefit to having two
magic numbers which are coprime to each other.
As I mentioned, my inspection of the problem concluded that no matter what
number you pick for the initial value, it factors out to a single constant
added to the final result. Straight algebraic manipulation. And since
adding a single constant to a hash code isn't going to affect its
usefulness, selection of some specific initial value should not either.

I don't have a copy of "Effective Java", but if you do and can go back to
look to see if he offered any sort of justification for the initializer, I
suppose I'd be interested in hearing that. I've been wrong before, but
usually there's someone around who can explain how or why. Or, I suppose,
maybe that's not true and I've actually been wrong a lot more than I or
anyone else realized. :) But assuming the latter isn't actually the
case, I remain relatively confident that I've figured this one correctly.
Until proven otherwise, of course. :)

Pete
Jun 27 '08 #24
On Jun 13, 3:28 am, Brian Gideon <briangid...@ya hoo.comwrote:
To be perfectly honest, I'm not sure how important it is for the hash
code to be absolutely random as opposed to just good.
I think to know that we'd have to know exactly what you mean by
"random". Completely evenly distributed?

I suspect it's beneficial because whatever the hashtable does to
process hashcodes, if you have some pattern/clustering to your data
you're more likely to get a poor distribution in terms of buckets.
However, I would hope that a good hashtable implementation would make
it *unlikely* that you'd see a problem. Then again, my understanding
of the *exact* implementation of hashtables is somewhat lacking.
I think that would be an interesting topic for a blog. I do try to
read your blog posts so I'll catch it if you do decide to post on the
topic.
I'll put it on my mental list of future posts :)

Jon
Jun 27 '08 #25
Re adding a seed; actually, it has one advantage... a lot of default
instances get created just with new Foo() or
Activator.Creat eInstance()
Without a seed, this type of hash algorithm (using + and *) will
return 0 for every default instance. Which is fine in itself, since
every default instance is broadly comparable... but! Now imagine that
we are comparing tuples of values, or lists of values - where the
tuple's hashcode is a composite (again using + and *) of the contained
values' hashcodes.
It is nice (from a purely theoretical standpoint) to be able to
immediately distinguish a set of 2 (default instances) from a set of 8
(default instances) - or the sets {A,B} and {[default],A,B}.

OK; in reality, this is an unlikely scenario - but it does still seem
desirable to avoid any "obvious" causes of zero. Yes, even with a seed
you will still get zero from some values, but then as a genuine
exception, and so it is not an issue.

Re random - "uniform" would be a better term; Int32.GetHashCo de() is
uniform but not random.

Marc
Jun 27 '08 #26
On Thu, 12 Jun 2008 23:36:38 -0700, Marc Gravell <ma**********@g mail.com>
wrote:
[...]
It is nice (from a purely theoretical standpoint) to be able to
immediately distinguish a set of 2 (default instances) from a set of 8
(default instances) - or the sets {A,B} and {[default],A,B}.
Um, okay. I'll buy that. Still, it seems like a pretty esoteric scenario
to drive a hash code implementation. Especially since it still doesn't
address sets of identical lengths. :)
[...]
Re random - "uniform" would be a better term; Int32.GetHashCo de() is
uniform but not random.
Agreed. I was going to pick that nit myself, but then I didn't. :)

Pete
Jun 27 '08 #27
Um, okay. I'll buy that. Still, it seems like a pretty esoteric scenario
to drive a hash code implementation. Especially since it still doesn't
address sets of identical lengths. :)
Re identical lengths - if the lengths and contents match, I'm happy
for the hashcode to match under that approach, since they sound like
the same set (assuming the definition of equality ties with
GetHashCode(), which it should).

Marc
Jun 27 '08 #28
On Jun 13, 12:54*am, "Jon Skeet [C# MVP]" <sk...@pobox.co mwrote:
Okay, I'll have to consult Effective Java, which is where the advice
originally came from. I believe there's some benefit to having two
magic numbers which are coprime to each other. Wikipedia doesn't show
this as one of the example hash functions, unfortunately, so there's
no discussion there to help :(

Jon
Okay, I'm looking through Knuth Vol 3 (those who have read the books
know that it isn't an easy read) on hashing and he talks about
choosing a prime number as the multiplier for the "scrambling
function" (aka GetHashCode). However, I don't see anything about the
initial seed being a prime number.

My best guess right now is definitely along the lines of what Marc
suggested. It has more to do with the end result you get from
combining the values of more than one call to GetHashCode (via
combinations of + and *).

From clues in Knuth this may be related to random number generation
theory. Since the scrambling function needs to be random it's easy to
see how that works in. Seeding the hash code with a number that is
coprime to the multiplier may be an attempt to prevent unwanted
periods from developing in the final hash code when combinging results
from multiple calls to GetHashCode.
Jun 27 '08 #29

This thread has been closed and replies have been disabled. Please start a new discussion.