Overriding the equals and hashCode methods

The Object class has five non-final methods, namely equals, hashCode, toString, clone, and finalize.
These were designed to be overridden according to specific general contracts. Other classes that use these methods assume that the contracts are obeyed, so if your classes override these methods, they must do so correctly.

In this article I'll take a look at the equals and hashCode methods.

Overriding the equals method

Override the equals method when you want to specify the rules of logical equality of objects. What is logical equality? A simple definition is that two objects are logically equal if they have the same values for the same uniqueness attributes.

The contract

The equals method actually implements an equivalence relation between (non null) object references. Here are the rules that the equals method must obey:


  • Symmetry: For any reference values x and y, if x.equals(y) returns true then y.equals(x) must also return true.
  • Reflexivity: For any reference value x, x.equals(x) must always return true.
  • Consistency: For any reference values x and y, repeated calls to x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • Transitivity: For any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
Note that all the reference values referred to above must be non-null. The rule for the null reference is:
  • For any non-null reference value x, x.equals(null) must return false.
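
If you want to see these rules in action, here is a rough sketch of a helper that spot-checks them over a handful of sample objects. The class name EqualsContractCheck and the method holdsFor are made up for this illustration and are not part of the JDK. It cannot prove that an equals method is correct, it can only catch violations among the samples you give it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a JDK class): spot-checks the equals contract
// over a small list of non-null sample objects.
final class EqualsContractCheck {

    static boolean holdsFor(List<?> samples) {
        for (Object x : samples) {
            if (!x.equals(x)) return false;                   // reflexivity
            if (x.equals(null)) return false;                 // null rule
            for (Object y : samples) {
                if (x.equals(y) != x.equals(y)) return false; // consistency (two calls agree)
                if (x.equals(y) != y.equals(x)) return false; // symmetry
                for (Object z : samples) {
                    // transitivity
                    if (x.equals(y) && y.equals(z) && !x.equals(z)) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Strings, Integers and Longs all have well-behaved equals methods.
        System.out.println(holdsFor(Arrays.asList("a", new String("a"), "b", 1, 1L))); // true
    }
}
```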
It's already there?

The Object class already provides an implementation of the equals method. Here it is:

    public boolean equals(Object obj) {
        return (this == obj);
    }

Not much, is it?
The method simply tests for equality of object references. This is not always the desired behavior, particularly when comparing Strings. That's why the String class provides its own implementation of the equals method.
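
To make the difference concrete, here is a small sketch comparing reference equality with logical equality. The class and helper names are invented for the example.

```java
final class ReferenceVsLogicalEquality {

    // True only when both variables point at the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects holding the same characters.
        String first = new String("bytes");
        String second = new String("bytes");

        System.out.println(sameObject(first, second));  // false: two different references
        System.out.println(first.equals(second));       // true: String overrides equals

        Object plainA = new Object();
        Object plainB = new Object();
        System.out.println(plainA.equals(plainB));      // false: Object.equals compares references
        System.out.println(plainA.equals(plainA));      // true
    }
}
```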

Personality test

Suppose you are creating a Person class with each person having a title, fullName and age. You could decide that two Person objects must be referring to the same person if they have exactly the same fullName, title and age fields, i.e. your set of uniqueness attributes is (fullName, title, age). Your class could look like this:

    public class Person {
        String title;
        String fullName;
        int age;

        public Person(String title, String fullName, int age) {
            this.title = title;
            this.fullName = fullName;
            this.age = age;
        }

        String getFullName() {
            return fullName;
        }

        int getAge() {
            return age;
        }

        String getTitle() {
            return title;
        }
    }

Now if you create two Person objects with the same attributes, you'd want those two objects to be the same person.
If you do

    Person admin1 = new Person("Admin", "r035198x", 2); // I'm not giving away my age that easily
    Person admin2 = new Person("Admin", "r035198x", 2);
    System.out.println(admin1.equals(admin2));

you will get false, because admin1 and admin2 are different references and the equals method being used here (from the Object class) compares references. So here we need to override the equals method to check our uniqueness attributes for our comparisons to work.

There are a few things you need to be wary of when providing your own equals method.

Silly mistake

A silly mistake is to overload the equals method instead of overriding it!
The signature of the equals method is

    public boolean equals(Object obj)

So having

    public boolean equals(Person person)

does not override the equals method. It simply creates another equals method (an overload of the equals method). The argument to the equals method must be an Object.
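
Here is a short sketch of how the overload bites. OverloadedName is a made-up class that makes exactly this mistake. Marking the method with @Override would turn the mistake into a compile error, which is a cheap way to guard against it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class that overloads equals instead of overriding it.
final class OverloadedName {
    private final String name;

    OverloadedName(String name) {
        this.name = name;
    }

    // Overload, NOT an override: the parameter type is OverloadedName, not Object.
    public boolean equals(OverloadedName other) {
        return other != null && name.equals(other.name);
    }
}

final class OverloadDemo {
    public static void main(String[] args) {
        OverloadedName a = new OverloadedName("r035198x");
        OverloadedName b = new OverloadedName("r035198x");
        Object bAsObject = b;

        System.out.println(a.equals(b));          // true: the compiler picks the overload
        System.out.println(a.equals(bAsObject));  // false: Object.equals(Object) is used

        List<OverloadedName> names = new ArrayList<>();
        names.add(a);
        System.out.println(names.contains(b));    // false: collections call equals(Object)
    }
}
```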

Optimizing it

Another important issue is optimization. Remember that reflexivity says that an object is always equal to itself (x.equals(x) should always return true). We can take advantage of this and test for this case first. A “clever” implementation of equals should thus start as

    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        ...

There is no need to trouble the processor with further tests if the object passed is itself.

Killing many birds with one instanceof

Next you need to compare the actual field values to determine equality based on your logic. Because the argument passed to the equals method is always of type Object, you'll always need to cast it to your class type before comparing the fields. In our example we'll need to cast to type Person.

What if we are passed a String (or an object of any other type) instead of a Person object?
This is where the instanceof operator comes in handy. The instanceof operator actually does a lot more than ensure that our cast does not fail. It provides some optimization as well, because if obj instanceof Person is false then obj cannot be logically equal to a Person and equals must return false. We can thus use it to eliminate any objects passed which cannot be logically equal to our object by virtue of their class type. instanceof also eliminates the trivial case when the object passed is null, because it returns false if its first operand is null. So we have

    if (!(obj instanceof Person)) {
        return false;
    }

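A quick sketch of what that guard lets through. The nested Person here is only an empty stand-in for the article's class, since just the type matters for instanceof, and passesGuard is a name invented for the example.

```java
final class InstanceofGuardDemo {

    // Empty stand-in for the article's Person class; only the type matters here.
    static final class Person {}

    // Mirrors the guard used at the top of equals.
    static boolean passesGuard(Object obj) {
        return obj instanceof Person;
    }

    public static void main(String[] args) {
        System.out.println(passesGuard(new Person()));  // true
        System.out.println(passesGuard("r035198x"));    // false: a String is not a Person
        System.out.println(passesGuard(null));          // false: null is never an instance of anything
    }
}
```
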
Putting it together

We now need to add our own logic for the equality test.

    @Override
    public boolean equals(Object obj) {
        // Same reference: nothing more to check.
        if (this == obj) {
            return true;
        }
        // Rejects null and objects of any other type.
        if (!(obj instanceof Person)) {
            return false;
        }
        Person person = (Person) obj;
        return age == person.getAge() && fullName.equals(person.getFullName())
                && title.equals(person.getTitle());
    }

This time

    System.out.println(admin1.equals(admin2));

will print true.

Inheritance may break the contract

Now suppose we extend the Person class with, say, a ChessPlayer class that adds ranking and nationality fields to those already inherited from Person. Suppose further that we want uniqueness of ChessPlayers to be determined by title, fullName, age, nationality and ranking. Now our equals method from the Person class no longer suffices.
The trouble is that a Person compared with a ChessPlayer ignores nationality and ranking, while a ChessPlayer compared with a plain Person cannot ignore them, so symmetry (or, if you try to patch that, transitivity) is lost. In fact it is impossible to extend a concrete class, adding an aspect that takes part in equality, without violating the equals contract. Think about this every time you extend a class. The case when an aspect is being added to an abstract class is best left for researching by the reader.
The equals method thus needs to be rewritten in the ChessPlayer class to cater for the nationality and ranking aspects, and while I'm at it I might as well leave that as an exercise to the reader.
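
To see the problem rather than just take my word for it, here is a sketch of one naive way to write ChessPlayer.equals. It is only an illustration of the symmetry violation, not a recommended solution, and the sample values are invented.

```java
class Person {
    final String title;
    final String fullName;
    final int age;

    Person(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Person)) return false;
        Person other = (Person) obj;
        return age == other.age && fullName.equals(other.fullName)
                && title.equals(other.title);
    }

    @Override
    public int hashCode() {
        return 31 * (31 * age + fullName.hashCode()) + title.hashCode();
    }
}

// Naive subclass: adds two fields and compares them in equals.
class ChessPlayer extends Person {
    final String nationality;
    final int ranking;

    ChessPlayer(String title, String fullName, int age, String nationality, int ranking) {
        super(title, fullName, age);
        this.nationality = nationality;
        this.ranking = ranking;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof ChessPlayer)) return false;
        ChessPlayer other = (ChessPlayer) obj;
        return super.equals(other) && ranking == other.ranking
                && nationality.equals(other.nationality);
    }

    @Override
    public int hashCode() {
        return 31 * (31 * super.hashCode() + nationality.hashCode()) + ranking;
    }
}

final class SymmetryDemo {
    public static void main(String[] args) {
        Person person = new Person("GM", "A. Player", 30);
        ChessPlayer player = new ChessPlayer("GM", "A. Player", 30, "Nowhere", 1);

        System.out.println(person.equals(player));  // true: Person ignores the extra fields
        System.out.println(player.equals(person));  // false: a Person is not a ChessPlayer
    }
}
```

The two calls disagree, so symmetry is broken.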

Simple ones first


Note that I deliberately compared the ages (integers) first. The && operator has short-circuit behavior, meaning that if the age comparison fails the rest of the comparison is abandoned and false is returned. It is therefore a performance advantage to have the cheaper tests (such as a primitive comparison) first and the more expensive tests (such as String comparisons) last.

Why bother?

There are thankfully some instances when you don't have to override the equals method.
  • When the references check is sufficient. This is when each instance of the class is unique. The Thread class is an example.
  • A parent class already has implemented the desired behavior. You have to be careful with this and make sure that the parent class' equals method really is sufficient for the subclass.
Guarding against stupid programmers

Note that if it does not make sense to define logical equality for a class, then you can override the equals method and simply throw an UnsupportedOperationException. If you don't, anyone who invokes equals will get a superclass implementation and get results from a comparison that is not supposed to be allowed.

An Exception

From the specification of the Timestamp class we find

Note: This type is a composite of a java.util.Date and a separate nanoseconds value. Only integral seconds are stored in the java.util.Date component. The fractional seconds - the nanos - are separate. The Timestamp.equals(Object) method never returns true when passed an object that isn't an instance of java.sql.Timestamp, because the nanos component of a date is unknown. As a result, the Timestamp.equals(Object) method is not symmetric with respect to the java.util.Date.equals(Object) method. Also, the hashcode method uses the underlying java.util.Date implementation and therefore does not include nanos in its computation.
Bad, isn't it? Well, that's the way it is. The reason is explained there as well:

The inheritance relationship between Timestamp and java.util.Date really denotes implementation inheritance, and not type inheritance.
In other words they are saying, “We did it, but don't do it”. Who are we to argue?
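
You can watch the asymmetry happen with a few lines. This is just a sketch; the class name and the symmetric helper are invented, and the millisecond value is arbitrary.

```java
import java.sql.Timestamp;
import java.util.Date;

final class TimestampAsymmetryDemo {

    // True when both objects agree on whether they are equal to each other.
    static boolean symmetric(Object a, Object b) {
        return a.equals(b) == b.equals(a);
    }

    public static void main(String[] args) {
        long millis = 1_000L;  // arbitrary instant, whole seconds so nanos are zero
        Date date = new Date(millis);
        Timestamp timestamp = new Timestamp(millis);

        System.out.println(date.equals(timestamp));     // true: Date only compares getTime()
        System.out.println(timestamp.equals(date));     // false: a Date is not a Timestamp
        System.out.println(symmetric(date, timestamp)); // false
    }
}
```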

Now whenever you override the equals method, you must also override the hashCode method.


We've seen the equals method so now let's proceed to ...

Overriding the hashCode method

The contract for the equals method should really have another line saying you must proceed to override the hashCode method after overriding the equals method. The hashCode method is supported for the benefit of hash based collections.

The contract
Again from the specs:

  • Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
So equal objects must have equal hashCodes. An easy way to ensure that this condition is always satisfied is to use the same attributes used in determining equality in determining the hashCode. You should now see why it is important to override hashCode every time you override equals.

The old hashtable and buckets example

Picture a hash table as a group of buckets. When you add a key-value pair, the key's hashCode is used to determine in which bucket to put the mapping.
Similarly, when you call the get method with a key on the hash table, the key's hashCode is used to determine in which bucket the mapping was stored. This is the bucket that is searched (sequentially) for the mapping. If you have two “equal” objects but with different hashCodes, then the hash table will see them as different objects and put them in different buckets. Similarly, you can only retrieve a mapping from a hash table by passing a key that has the same hashCode as, and is equal to, the key it was stored under. If no matching key is found, null is returned.
So let's say it again, “Equal objects must have equal hashCodes”.
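
Here is a small sketch of what goes wrong otherwise. BrokenKey is a made-up class whose hashCode is handed in through the constructor, so that two equal keys can be forced to report different values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class that deliberately breaks the hashCode contract.
final class BrokenKey {
    private final String name;
    private final int hash;

    BrokenKey(String name, int hash) {
        this.name = name;
        this.hash = hash;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BrokenKey && name.equals(((BrokenKey) obj).name);
    }

    @Override
    public int hashCode() {
        return hash;  // equal keys may report different values: contract broken
    }
}

final class BrokenHashDemo {

    // Stores a value under one key and looks it up with an equal key
    // that reports a different hashCode.
    static String lookUpWithEqualKey() {
        Map<BrokenKey, String> members = new HashMap<>();
        members.put(new BrokenKey("admin", 1), "r035198x");
        return members.get(new BrokenKey("admin", 2));
    }

    public static void main(String[] args) {
        System.out.println(lookUpWithEqualKey());  // null: the mapping cannot be found
    }
}
```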

A lazy hashCode

Now if your hashCode method returns the same constant value, then every mapping is stored in the same bucket and you have a hash table reduced to a (God forbid) LinkedList. This is where the third part of the contract comes in. It's allowed to have two unequal objects having the same hashCode but it makes the hash table very slow.

The best hashCode

The opposite case is to make all unequal objects have unequal hashCodes. Ideally this means each mapping is stored in its own bucket. This is the optimal case for the hash table and results in constant search times, because only the correct bucket needs to be searched. Once the correct bucket is found, the search is complete. That's why the API docs said:
However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Alright, now we know what's desirable and what's undesirable in a hashCode. Let's see how to create it.

We want it to be linked to the equals method in some way and so it must use the same attributes as the equals method.

It's just an int

The signature is

    public native int hashCode()

The key thing to note here is that the method returns an integer. This means that we should try to get an integer representation of all the attributes that were used to determine equality in the equals method. The trick is that we should get this integer representation in a way that ensures that we always get the same int value for the same attribute value.
Once we have the integers, it's up to us to find a way of combining them into one integer that represents the hashCode for our object.

One way of doing it

A common approach is to choose a multiplier and then compute an int value by applying the following formula for each of the attributes in turn:
hashCode = multiplier * hashCode + attribute's hashCode

For three attributes (a1, a2, a3), the hashCode would be computed in the following steps:

    hashCode = multiplier * hashCode + a1's hashCode // step 1
    hashCode = multiplier * hashCode + a2's hashCode // step 2
    hashCode = multiplier * hashCode + a3's hashCode // step 3

You should initialize hashCode to some non-zero value so that the first multiplication does not give 0.
Your multiplier should preferably be a prime number for reasons best left for the amusement of the reader.
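
The steps above can be sketched as a small loop. HashCombiner and combine are names I made up for the illustration, and the multiplier and starting value in main are just example choices.

```java
final class HashCombiner {

    // Applies hashCode = multiplier * hashCode + attributeHash for each attribute in turn.
    static int combine(int multiplier, int seed, int... attributeHashes) {
        int hashCode = seed;
        for (int attributeHash : attributeHashes) {
            hashCode = multiplier * hashCode + attributeHash;
        }
        return hashCode;
    }

    public static void main(String[] args) {
        // 17 * 31 + 1 = 528, then 528 * 31 + 2 = 16370, then 16370 * 31 + 3 = 507473
        System.out.println(combine(31, 17, 1, 2, 3));  // 507473

        // The order of the attributes matters, which helps tell (1, 2) apart from (2, 1).
        System.out.println(combine(31, 17, 1, 2));     // 16370
        System.out.println(combine(31, 17, 2, 1));     // 16400
    }
}
```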

If you think about it you'll see that it's possible for two unequal objects to still have the same hashCode even if the method above was used. It's no mean feat to ensure that hashCodes are always unequal for unequal objects. Whichever algorithm you choose, however, make sure the result is always an integer and that the same integer is returned for equal objects.
So how do we determine the hashCodes for the attributes themselves?

For the individual attribute values, you can use the following popular approach:
  • For boolean variables use 0 if it's true and 1 if it's false.
  • Converting byte, char or short to int is easy. Just cast to int. The result is always the same for the same value of the attribute.
  • A long is bigger than an int. You can use (int)(value ^ (value >>> 32)). This is the method used by the java.lang.Long class.
  • If the field is a float, use Float.floatToIntBits(value).
  • If the field is a double, use Double.doubleToLongBits(value), and then hash the resulting long using the method above for the long type.
  • If the field is an object reference and this class's equals method compares the field by recursively invoking equals, then recursively invoke hashCode on the field as well.
  • If the value of the field is null, use 0 (or some other constant; 0 is more common but you might want to distinguish it from the boolean case).
  • Finally, if the field is an array, go through each element and compute each element's hashCode value. Use the sum of the hashCodes as the hashCode for the array attribute.
This is not difficult. It's just cumbersome and usually boring.
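
To take some of the boredom out of it, the rules above can be collected into a few helper methods. This is only a sketch: AttributeHashes and its method names are invented, and the array helper handles int arrays only, where each element is its own hash.

```java
// Hypothetical helper class: one method per rule in the list above.
final class AttributeHashes {

    static int ofBoolean(boolean value) {
        return value ? 0 : 1;               // 0 for true, 1 for false, as in the list
    }

    static int ofLong(long value) {
        // The parentheses matter: XOR the two halves first, then cast to int.
        return (int) (value ^ (value >>> 32));
    }

    static int ofFloat(float value) {
        return Float.floatToIntBits(value);
    }

    static int ofDouble(double value) {
        return ofLong(Double.doubleToLongBits(value));
    }

    static int ofObject(Object value) {
        return value == null ? 0 : value.hashCode();
    }

    // Sum of the element hashes; note that a sum does not depend on element order.
    static int ofIntArray(int[] values) {
        int sum = 0;
        for (int element : values) {
            sum += element;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(ofLong(5L) == Long.valueOf(5L).hashCode());        // true
        System.out.println(ofDouble(1.5) == Double.valueOf(1.5).hashCode());  // true
        System.out.println(ofIntArray(new int[] {1, 2, 3}));                  // 6
    }
}
```

Because a plain sum gives the same result for {1, 2} and {2, 1}, you may prefer an order-sensitive combination such as the one java.util.Arrays.hashCode uses.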

Hate Maths?

Another less mathematical method is to convert each of the attributes to Strings, concatenate the Strings and use the hashCode of the resultant String. You need to be careful with how you convert the attributes to Strings with this one.
In particular calling the toString method on the attributes may not be the correct way of doing it because the toString method may return a different string for equal objects!
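
A rough sketch of the String approach for our Person attributes follows. StringBasedHash is a made-up name, and the '|' separator is an assumption on my part: it is there so that ("ab", "c") and ("a", "bc") do not concatenate to the same String, and it only helps if the separator never occurs in the fields themselves.

```java
final class StringBasedHash {

    // Builds one String from the uniqueness attributes and hashes that.
    static int hashOf(String title, String fullName, int age) {
        String joined = title + '|' + fullName + '|' + age;
        return joined.hashCode();
    }

    public static void main(String[] args) {
        int first = hashOf("Admin", "r035198x", 2);
        int second = hashOf(new String("Admin"), new String("r035198x"), 2);
        System.out.println(first == second);  // true: equal attributes give equal hashCodes
    }
}
```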


Optimizing it


Calculating an object's hashCode can become quite complicated and sometimes time consuming, especially when you have many array attributes.

It turns out that for immutable classes you can avoid unnecessarily recomputing the same hashCode by calculating it once and storing it in a variable. Subsequent calls to the hashCode method would then just return that (cached) value. No need for computing it again.
Putting this in our Person class and using the first method for the hashCode gives

Finally some code

    public class Person {
        // Final fields keep the class immutable, which the cached hashCode relies on.
        private final String title;
        private final String fullName;
        private final int age;
        // Cached hash code; 0 means "not computed yet".
        private volatile int hashCode = 0;

        public Person(String title, String fullName, int age) {
            this.title = title;
            this.fullName = fullName;
            this.age = age;
        }

        String getFullName() {
            return fullName;
        }

        int getAge() {
            return age;
        }

        String getTitle() {
            return title;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Person)) {
                return false;
            }
            Person person = (Person) obj;
            return age == person.getAge() && fullName.equals(person.getFullName())
                    && title.equals(person.getTitle());
        }

        @Override
        public int hashCode() {
            final int multiplier = 23;
            if (hashCode == 0) {
                // Same attributes as equals, combined with the multiplier formula.
                int code = 133;
                code = multiplier * code + age;
                code = multiplier * code + fullName.hashCode();
                code = multiplier * code + title.hashCode();
                hashCode = code;
            }
            return hashCode;
        }
    }


The hashCode is only computed once for this object. Notice that the hashCode field starts at zero and the real value is computed lazily, on the first call to hashCode(). This works only if the class is immutable, because once an object is created its uniqueness attribute values never change, so the hashCode is always the same. If the age were allowed to change, the hashCode would also change but the value returned would still be the cached value. That would be a disaster.

Smart Alec?

You could use this to try and optimize even for mutable classes by recomputing the cached value every time a uniqueness attribute is changed. This is only possible if you are in control of all the possible ways in which the uniqueness attributes can be changed. Careful, though, that you don't outdo yourself and end up creating more work for the computer while trying to create less work for it!
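
Here is a minimal sketch of that idea, assuming age is the only attribute that may change and that setAge is the only way to change it. MutablePerson is a made-up variant of our class. Instead of recomputing eagerly, the setter just throws the cached value away so that the next hashCode call recomputes it.

```java
// Hypothetical mutable variant of Person with a cache that is reset on change.
final class MutablePerson {
    private final String title;
    private final String fullName;
    private int age;
    private int cachedHashCode;  // 0 means "not computed yet"

    MutablePerson(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    void setAge(int age) {
        this.age = age;
        cachedHashCode = 0;  // invalidate: the old value no longer matches the fields
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MutablePerson)) return false;
        MutablePerson other = (MutablePerson) obj;
        return age == other.age && fullName.equals(other.fullName)
                && title.equals(other.title);
    }

    @Override
    public int hashCode() {
        if (cachedHashCode == 0) {
            int code = 133;
            code = 23 * code + age;
            code = 23 * code + fullName.hashCode();
            code = 23 * code + title.hashCode();
            cachedHashCode = code;
        }
        return cachedHashCode;
    }

    public static void main(String[] args) {
        MutablePerson admin = new MutablePerson("Admin", "r035198x", 2);
        int before = admin.hashCode();
        admin.setAge(3);
        System.out.println(before == admin.hashCode());  // false: the cache was refreshed
    }
}
```

Keep in mind that changing an object while it is a key in a hash-based collection still leaves it in the wrong bucket, cache or no cache.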

Conclusion

There are lots of other ways of doing this, some of which are much better and more thorough than the ones described above. You should certainly make an effort to learn another method.
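
One such alternative, assuming you are on Java 7 or later, is the java.util.Objects utility class. The sketch below uses Objects.equals and Objects.hash on a made-up NullTolerantPerson class. Both methods cope with null fields, so equals does not throw when fullName or title is missing.

```java
import java.util.Objects;

// Hypothetical variant of Person whose String fields are allowed to be null.
final class NullTolerantPerson {
    private final String title;
    private final String fullName;
    private final int age;

    NullTolerantPerson(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof NullTolerantPerson)) return false;
        NullTolerantPerson other = (NullTolerantPerson) obj;
        // Objects.equals treats two nulls as equal and never dereferences null.
        return age == other.age
                && Objects.equals(fullName, other.fullName)
                && Objects.equals(title, other.title);
    }

    @Override
    public int hashCode() {
        // Same attributes as equals; null fields contribute 0.
        return Objects.hash(age, fullName, title);
    }

    public static void main(String[] args) {
        NullTolerantPerson first = new NullTolerantPerson("Admin", null, 2);
        NullTolerantPerson second = new NullTolerantPerson("Admin", null, 2);
        System.out.println(first.equals(second));                   // true
        System.out.println(first.hashCode() == second.hashCode());  // true
    }
}
```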
Don't forget the important contract of the hashCode. Equal objects must have equal hashCodes.

That's all I have for the hashCode. Hope you'll find it useful one day.
Oct 15 '07 #1
10 Comments


A good article.

I have a question about the equals override in the Person class.

public boolean equals(Object obj)
{
...
return age == person.getAge() && fullName.equals(person.getFullName())
&& title.equals(person.getTitle());
}

If fullName is null, the equals override would throw a NullPointerException. What is the best way to handle this?

Test all nullable values for null in Equals()?
Ensure fullName can never be set to null in the property (And hope it never gets set to null in the class implementation)?
Something else?

Your thoughts would be appreciated.

Kev
Jul 29 '08 #2

Ensuring fullName will never be null (or, more generally, ensuring that the elements of the uniqueness attribute set will never be null) should not be enforced by the need to define an equals method. Rather, it must be dictated by the meaning of that element. In this case, if a Person can be created before knowing their full name, then we shouldn't try to ensure that the full name is never null just so we can define an equals method.

That leaves only two options. Either return false if any of the uniqueness attributes is null or permit them to be null but check them before dereferencing. Either way requires that they be checked for null first.
Jul 29 '08 #3

I like this article, It covers all the aspects of hashcode and equals
Apr 3 '09 #4

@kuldeep s
Glad you liked it. Not too many people read articles(or anything at all) these days.
Apr 3 '09 #5

dmjpro
@r035198x
Great point you mentioned. All these are performance concerned ;)
So can you figure out more performance concern or can you give me some links or article on this? That would be very helpful :)
Apr 17 '09 #6

Ashu Sharma
Very good article .. precise, crisp and hits the nail right on the head
Nov 7 '10 #7

Sarvesh N
Its more explanatory. I really liked the article. Keep posting
Nov 23 '10 #8

Jay Perera
Fantastic article. All I wanted were included. thank you for presenting this.
Nov 25 '10 #9

OneNeo
Nice article...simple explanation of the basics...
Nov 26 '10 #10

Its Very very good article.. its cleared all my confusion about equal and hashcode method Thanks a lot...
Feb 2 '12 #11