473,385 Members | 1,486 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

GetHashCode()

Hi
I have question for GetHashCode() function, Is it correct in following code or there is more efficient way to implement GetHashCode() function

class IntArray
public int[] data

public override int GetHashCode()
int hash = 0
for (int i = 0; i < data.Length; i++

hash ^= data[i]

return hash
public override bool Equals(System.Object cs_

if (obj == null || GetType() != cs_.GetType()) return false

IntArray cs = (IntArray) cs_
if (data.Length != cs.data.Length) return false
for (int i = 0; i < data.Length; i++

if (data[i] != cs.data[i]) return false

return true

As documentation says
If two objects of the same type represent the same value, the hash function must return the same constant value for either object.

Is CLI library is viewable or open source? So I can take reference. Can any one point where to get it

Thank you
Avin Patel
Nov 16 '05 #1
7 6549
Avin Patel wrote:
Hi,
I have question for GetHashCode() function, Is it correct in following code or there is more efficient way to implement GetHashCode() function.

class IntArray {
public int[] data;

public override int GetHashCode() {
int hash = 0;
for (int i = 0; i < data.Length; i++)
{
hash ^= data[i];
}
return hash;
}

public override bool Equals(System.Object cs_)
{
if (obj == null || GetType() != cs_.GetType()) return false;

IntArray cs = (IntArray) cs_;
if (data.Length != cs.data.Length) return false;
for (int i = 0; i < data.Length; i++)
{
if (data[i] != cs.data[i]) return false;
}
return true;
}
}

As documentation says:
If two objects of the same type represent the same value, the hash function must return the same constant value for either object.
In my opinion, if the GetHashCode() function takes about as much
processing as Equals(), then there's not much point to it.

However, only if you have a decent idea of what type of data will be in
your class can you make a really good determination of what the hashcode
function should do.

For example, if your array is only ever going to have at most 10
elements, then there's not much you can do to make the hash function
better - it's going to be pretty fast no matter what.

However, if your arrays are going to routinely have 100000 elements or
more, then it probably makes sense to have a different hash function.
Suppose that for any 2 different arrays, you can expect that 99.9% of
the time that they will also be different in the first 10 elements. if
that's the case, you can hash over only the 1st 10 elements and you'd
have a good hash function.

However, if the 1st 10 elements are likely to be the same (maybe they're
a fileheader), then that would not be a good hash function... but maybe
hashing over the last 10 elements would work.

If you have no idea what kind of data will be in the array, then I'd
probably hash over the 1st X number of elements, where X is a number
that seems to work well after experimentation.

Is CLI library is viewable or open source? So I can take reference. Can any one point where to get it?


See Microsoft's 'Shared Source CLI':
http://www.microsoft.com/downloads/d...6-8DD34C6292F0

or Mono:

http://www.go-mono.com/

--
mikeb
Nov 16 '05 #2
mikeb <ma************@nospam.mailnull.com> wrote:
In my opinion, if the GetHashCode() function takes about as much
processing as Equals(), then there's not much point to it.


Why not? Suppose you have 100 keys in a hashtable, and you know the
hashcodes of all of them, and they're all distinct.

Finding the matching object when you're presented with a new key (to
retrieve with) only requires the hashcode of the new key to be
calculated, and then Equals to be called. Compare that with calling
Equals on *every* key within the table.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #3
Hi
So now as I understood, objects are stored in Hash Table. For Equals() function call, First it checks the matching Hash Key, If it is than it checks for matching value. If hashkey doesn't match than it will call Equals() function further

So I guess I should be calculating GetHashKey() function value for first 20 elements in array. As array size is not known to me

Thank you
Avin Patel
Nov 16 '05 #4
Avin Patel <an*******@discussions.microsoft.com> wrote:
So now as I understood, objects are stored in Hash Table. For
Equals() function call, First it checks the matching Hash Key, If it
is than it checks for matching value. If hashkey doesn't match than
it will call Equals() function further.
No, it only calls Equals if the hash *does* match. My point was that it
would have to call Equals on everything if hashing didn't exist at all.
So I guess I should be calculating GetHashKey() function value for
first 20 elements in array. As array size is not known to me.


Well, you could - but that would give an awful hash function if the
first 20 elements were the same for all instances, for example.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
Jon Skeet [C# MVP] wrote:
mikeb <ma************@nospam.mailnull.com> wrote:
In my opinion, if the GetHashCode() function takes about as much
processing as Equals(), then there's not much point to it.

Why not? Suppose you have 100 keys in a hashtable, and you know the
hashcodes of all of them, and they're all distinct.

Finding the matching object when you're presented with a new key (to
retrieve with) only requires the hashcode of the new key to be
calculated, and then Equals to be called. Compare that with calling
Equals on *every* key within the table.


True - I went overboard. However, if you can generate a good hash (no or
very rare collisions) by iterating over 10 elements (or 20 or whatever),
then it's not really necessary to iterate over the entire object (if
it's large).

Note that I'm hiding behind a lot of caveats there. As with so many
things it can all depend on what you know about the data (if anything),
and there's no reason to optimize before knowing there's a problem.
Especially a hash function, since it's generally so easy to speed up if
it did turn out to be a problem.

--
mikeb
Nov 16 '05 #6
The requirement is that a.GetHashCode() and b.GetHashCode() be equal if
a.Equals(b) is true, but not in the other direction: a.Equals(b) may be
false, although a.GetHashCode() and b.GetHashCode() are equal.

So, you don't need to be as precise in your GetHashCode() as in your Equals
method. If your arrays are likely to contain lots of elements, I would use a
"sampling" of the elements to compute GetHashCode(). Something like:

const int HashSampleLen = 20; // adjust as necessary

public override int GetHashCode()
{
int hash = 0;
int step = Max(data.Length / HashSampleLen, 1);
for (int i = 0; i < data.Length; i += step)
hash ^= data[i];
return hash;
}

The System.String class uses a similar strategy to compute fast hash codes
even on long strings.

Bruno.

"Avin Patel" <an*******@discussions.microsoft.com> a écrit dans le message
de news:BF**********************************@microsof t.com...
Hi,
I have question for GetHashCode() function, Is it correct in following code or there is more efficient way to implement GetHashCode() function.
class IntArray {
public int[] data;

public override int GetHashCode() {
int hash = 0;
for (int i = 0; i < data.Length; i++)
{
hash ^= data[i];
}
return hash;
}

public override bool Equals(System.Object cs_)
{
if (obj == null || GetType() != cs_.GetType()) return false;

IntArray cs = (IntArray) cs_;
if (data.Length != cs.data.Length) return false;
for (int i = 0; i < data.Length; i++)
{
if (data[i] != cs.data[i]) return false;
}
return true;
}
}

As documentation says:
If two objects of the same type represent the same value, the hash function must return the same constant value for either object.
Is CLI library is viewable or open source? So I can take reference. Can any one point where to get it?
Thank you,
Avin Patel

Nov 16 '05 #7
Hi Bruno,

The requirement is that a.GetHashCode() and b.GetHashCode() be equal if
a.Equals(b) is true, but not in the other direction: a.Equals(b) may be
false, although a.GetHashCode() and b.GetHashCode() are equal.

This is very good point. I have done sampling for first 20 elements.

Thank You,
Avin Patel
Nov 16 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Fuzzy | last post by:
I have defined a struct in my application that contains 3 integers that maintain state information. I have a lot of these structs in time-critical portions of my app, so they must be as fast as...
5
by: Stoyan | last post by:
Hi All, I don't understand very well this part of MSDN: "Derived classes that override GetHashCode must also override Equals to guarantee that two objects considered equal have the same hash code;...
8
by: Matthias Kientz | last post by:
I have a class which should override the GetHashcode() function, which look like this: public class A { public string str1; public string str2; //other methods...
2
by: perspolis | last post by:
Hi I have a question about GetHashCode for strings. Is it unique for all string in diffrent OS? for example "test".GetHashCode () is the same for all of OS? thanks in advance
1
by: MariusI | last post by:
I have some business objects which overrides Equals to provide syntax equality instead of just reference equality. Overriding equals gives me a warning that i must override GetHashcode(). Msdn says...
3
by: MuZZy | last post by:
Hi, Is there any guarantee that MD5 hashing algorithm implementation will not change in the next .NET version unlike what's happened to String.GetHashcode? Thank you, MuZZy
5
by: Metaman | last post by:
I'm trying to write a generic method to generate Hashcodes but am having some problems with (generic) collections. Here is the code of my method: public static int GetHashCode(object input) {...
8
by: Ashish Khandelwal | last post by:
-----See below code, string str = "blair"; string strValue = "ABC"; string str1 = "brainlessness"; string strValue1 = "XYZ"; int hash = str.GetHashCode() ; // Returns 175803953 int hash1 =...
28
by: Tony Johansson | last post by:
Hello! I can't figure out what point it is to use GetHashCode. I know that this GetHashCode is used for obtaining a unique integer value. Can somebody give me an example that prove the...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.