473,396 Members | 1,879 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

Generate hash code

Hi,

How do I write a GenerateHashcode function that will generate guaranteed
unique hashcodes for my classes?

cheers,
mortb
Jan 18 '06 #1
8 13346
If you mean unique per instance, then the only guaranteed way that I know of
is to have something unique in each instance - a simple static int counter
(along with an instance field) would suffice, where each instance (in the
instance ctor) gets the next value and increments the static counter (pref.
using thread-safe increment).

*however* Assuming your hash would be Int32, you are then limited to 2^32
objects - otherwise then can't be unique! (assuming no complex code to
recycle numbers)

You could implement this either as an abstract base class, but then your
*total* instances (of subtypes of this) would be limited to 2^32.

Is it actually required that they be unique? Although not unknown, this is
not (to my mind) recommended usage for a hash value?

Marc
Jan 18 '06 #2
As Marc alluded to, you might be better off looking at variations of
Guid.NewGuid
to ensure "Absolute" uniqeness. Unless of course you are travelling to Pluto
on the probe, in which case there's still a very remote possibilty there
could be a duplicate before you arrive there...
Peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com


"mortb" wrote:
Hi,

How do I write a GenerateHashcode function that will generate guaranteed
unique hashcodes for my classes?

cheers,
mortb

Jan 18 '06 #3
I've thought a bit about the issue with hashcodes.

I use hashtables and I suppose that they use the GetHashcode function to
determine the identity of the objects stored. Then I thought that the
GetHashcode would have to return an unique number for every possible object
if I am to use it as key in the hashtable.

I've seen a few implementations on the internet looking like this:

Class C
{
public int propertyA;
public int propertyB;

public override GetHashcode()
{
return propertyA.GetHashcode ^ propertyB.GetHashcode;
}
}

Whould the bitwize OR generate unique hashcodes for objects with all
possible combinations in propertyA and propertyB?

In my current application there are several types of objects that inherit
from a base class that we've written. They are stored in a database and have
a primary key combined from two values in the database. The two values are
object id and version number.Then they have sveral other properties such as
name, createdByUserId, createTimeStamp, updateTimeStamp.

My guess is that I could write a GetHashcode function in the base class
looking like this:

GetHashcode()
{
return (typeof(this)).GetHashcode() ^ this.objectId ^
this.versionNumber();
}

Would this be a good implementation?
Are there any resources which sumarize how to quickly write *good*
GetHashcode() functions?

thanks for the help!
/mortb

"Marc Gravell" <mg******@rm.com> wrote in message
news:%2****************@TK2MSFTNGP11.phx.gbl...
If you mean unique per instance, then the only guaranteed way that I know
of is to have something unique in each instance - a simple static int
counter (along with an instance field) would suffice, where each instance
(in the instance ctor) gets the next value and increments the static
counter (pref. using thread-safe increment).

*however* Assuming your hash would be Int32, you are then limited to 2^32
objects - otherwise then can't be unique! (assuming no complex code to
recycle numbers)

You could implement this either as an abstract base class, but then your
*total* instances (of subtypes of this) would be limited to 2^32.

Is it actually required that they be unique? Although not unknown, this is
not (to my mind) recommended usage for a hash value?

Marc

Jan 18 '06 #4

mortb wrote:
I've thought a bit about the issue with hashcodes.

I use hashtables and I suppose that they use the GetHashcode function to
determine the identity of the objects stored.
Well, to determine a hash key for the objects, rather than the unique
identity of the objects.
Then I thought that the
GetHashcode would have to return an unique number for every possible object
if I am to use it as key in the hashtable.
No; the point of a hash function is to provide a hash key that will
*usually* be different. Hash collisions aren't the end of the world.
Suppose I have a hashtable that stores books, using the 4th digit of
the ISBN as a hash key. If I get lucky, I will be able to store 10
books before getting a hash collision. But at some point, my hash
buckets are going to start having more than one book in. That's not an
error, it's part of the design of a hashtable. The point is that by
storing the books in ten different buckets, the list of books I have to
search to find a particular one is typically ten times shorter than if
I just kept them all in one list. That's the point of a hash table.

If hashcodes had to be unique, then no hashtable could ever store more
than (the number of possible hashcodes) items, which wouldn't be
useful; and furthermore every object that ever wanted to be stored in a
hashtable would somehow have to obtain a unique hashcode, which would
be extremely unpleasant.

I've seen a few implementations on the internet looking like this:

Class C
{
public int propertyA;
public int propertyB;

public override GetHashcode()
{
return propertyA.GetHashcode ^ propertyB.GetHashcode;
}
}

Whould the bitwize OR generate unique hashcodes for objects with all
possible combinations in propertyA and propertyB?
It wouldn't; the purpose is to create some number that partakes of the
hashcodes for both propertyA and propertyB.

In my current application there are several types of objects that inherit
from a base class that we've written. They are stored in a database and have
a primary key combined from two values in the database. The two values are
object id and version number.Then they have sveral other properties such as
name, createdByUserId, createTimeStamp, updateTimeStamp.

My guess is that I could write a GetHashcode function in the base class
looking like this:

GetHashcode()
{
return (typeof(this)).GetHashcode() ^ this.objectId ^
this.versionNumber();
}

Would this be a good implementation?
Sure.
Are there any resources which sumarize how to quickly write *good*
GetHashcode() functions?


Not that I've seen but I'm sure someone has, so watch this space.

--
Larry Lard
Replies to group please

Jan 18 '06 #5

"mortb" <mo****@hotmail.com> wrote in message
news:ex**************@TK2MSFTNGP10.phx.gbl...
I've thought a bit about the issue with hashcodes.
.... I've seen a few implementations on the internet looking like this:

Class C
{
public int propertyA;
public int propertyB;

public override GetHashcode()
{
return propertyA.GetHashcode ^ propertyB.GetHashcode;
}
}

Whould the bitwize OR generate unique hashcodes for objects with all
possible combinations in propertyA and propertyB?

....

No, it would not. Btw, it not bitwise or, it's XOR (exclusive or). For
example, 1 ^ 0 yields the same result as 0 ^ 1. If hashcode is 32 bit wide
int, there are 2^32 possible hashcodes while there are 2^32 * 2^32 (2 ^ 64)
possible combinations of 2 ints. So they must collide. But, hashcode does
not have to be unique. You can even do:

public override int GetHashcode()
{
return 0;
}

and HashTable should be working ok (with the performance about the same as
simple list). Though I don't know what for you need hash.

Regards,
Goran
Jan 18 '06 #6
The hashcode is not the same as the key; the hashcode is just used to
partition the items (fairly arbitrarily) - so that it can jump to a subset
of the larger list before doing any real work. This is why IEqualityComparer
requests both an equality method and a hash method.

I've never written a hashtable from scratch, but I *believe* it tends to
work something like the following:

* The hashtable has a number of buckets; lets say 20 (although this is often
dynamic based on the amount of data contained)
* When adding items, it gets the hashcode (lets say it return 342545), and
then takes the mod (remainder on division) of this with the bucket count to
choose a bucket (bucket 5 in this case)
* When you query the hashtable, it
1: gets the hashcode of your argument and does the same operation to
pick the bucket to look within
2: scans the contents of this bucket (a much shorter list), possibly
checking Equals, or possibly first checking the hash of each item and then
checking Equals if the hash matched (i.e. something in bucket 5 could have a
hash of 342545, or it could have a hash of 25)

This raises another implementation point:

If 2 instances share a hash-code, it is *not* necessary for them to count as
equal (via Equals)
HOWEVER: if two instances *should* count as equal, then the hash codes
should be identical

Of course, if you use any of the inbuilt primatives (int, string, etc) for
your key you will be pretty safe

Marc
Jan 18 '06 #7
Thanks Marc, Larry, Goran and Peter for explaining things to me!

I conclude:

* GetHashcode has nothing to do whith uniqueness except that two identical
objects should generate the same hashcode. Instead GetHashcode is used to
provide a good spreeding in the hashtable. It should just do that and do it
quick.

* Hash table only allows one unique key. The key's uniqueness in
Hashtable.Add(key, object) is not determined - as I thougth - by using the
GetHashcode function. I suppose it uses object.Equals(x) or its override.

What think I should do is to override both GetHashcode and Equals(x) :)

thanks!
mortb

"Peter Bromberg [C# MVP]" <pb*******@yahoo.nospammin.com> wrote in message
news:DF**********************************@microsof t.com...
As Marc alluded to, you might be better off looking at variations of
Guid.NewGuid
to ensure "Absolute" uniqeness. Unless of course you are travelling to
Pluto
on the probe, in which case there's still a very remote possibilty there
could be a duplicate before you arrive there...
Peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com


"mortb" wrote:
Hi,

How do I write a GenerateHashcode function that will generate guaranteed
unique hashcodes for my classes?

cheers,
mortb

Jan 18 '06 #8
To have guaranteed unique IDs, he needs a static counter and use this
(incrementing it for each new object, thread-safe, storing the "current"
value persistently if the objects are stored and need their ID stored also).

It's not a hash, but a hash is usually not guaranteed to be unique.

Unless for a counter (with sufficient range!), there's no uniqueness in the
world by any random value.

Ch.

"Peter Bromberg [C# MVP]" <pb*******@yahoo.nospammin.com> wrote in message
news:DF**********************************@microsof t.com...
As Marc alluded to, you might be better off looking at variations of
Guid.NewGuid
to ensure "Absolute" uniqeness. Unless of course you are travelling to
Pluto
on the probe, in which case there's still a very remote possibilty there
could be a duplicate before you arrive there...
Peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com


"mortb" wrote:
Hi,

How do I write a GenerateHashcode function that will generate guaranteed
unique hashcodes for my classes?

cheers,
mortb

Jan 18 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
by: Mamuninfo | last post by:
Hello, Have any function in the DB2 database that can generate unique id for each string like oracle, mysql,sybase,sqlserver database. In mysql:- select md5(concat_ws("Row name")) from...
16
by: dutche | last post by:
Hi, this is my first post here and I'm very very new in 'C' programming. And I began to 'translate' my PHP program, that beyond a lot of things generates a md5 hash, and I must generate a md5 hash...
2
by: Henry | last post by:
Hi, How can I generate an eight digit random? Can I use the staff name to generate it? May I ask is there any sample c# code to see? Thanks
12
by: Jim Michaels | last post by:
I need to generate 2 random numbers in rapid sequence from either PHP or mysql. I have not been able to do either. I get the same number back several times from PHP's mt_rand() and from mysql's...
3
by: RossettoeCioccolato | last post by:
Is there a brief tutorial somewhere on how to use the VC8 linker to generate a manifest for an isolated application with a dependency section for an arbitrary dll? There are some implementation...
0
by: rayapati | last post by:
dear all, i want to know how to generate a hash key in c language please can you tell me about that code or logic how to implement that i need it urgently, thank you,
9
by: Omatase | last post by:
I have a set of about 6 or so strings that I need to use to generate a unique hash. This hash will become the unique key in a database so the hash has to be the same each time I gen it for any 1...
15
by: Ashish Khandelwal | last post by:
As MSDN is not giving us guarantee upon uniqueness of Hash Code, so could any one suggest me that how to generate a unique Hash Code for same string always, and generate different-2 Hash Code...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.