The Object class has five non-final methods, namely equals, hashCode, toString, clone, and finalize.
These were designed to be overridden according to specific general contracts. Other classes that make use of these methods assume that the methods obey these contracts, so if your classes override these methods, they must do so correctly.
In this article I'll take a look at the equals and hashCode methods.
Overriding the equals method
Override the equals method when you want to specify the rules of logical equality of objects. What is logical equality? A simple definition is that two objects are logically equal if they have the same values for the same uniqueness attributes.
The contract
The equals method implements an equivalence relation between (non-null) object references. Here are the rules that the equals method must obey:
- Symmetry: For any reference values x and y, x.equals(y) is true implies that y.equals(x) is also true.
- Reflexivity: For any reference value x, x.equals(x) must always return true.
- Consistency: For any reference values x and y, x.equals(y) consistently returns true or consistently returns false, provided no information used in equals comparisons on the object is modified.
- Transitivity: For any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
- For any non-null reference value x, x.equals(null) must return false.
The Object class already provides an implementation of the equals method. Here it is:
```java
public boolean equals(Object obj) {
    return (this == obj);
}
```
Not much, is it?
The method simply tests for equality of object references. This is not always the desired behavior, particularly when comparing Strings. That's why the String class provided its own implementation of the equals method.
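To make the difference between reference equality and String's logical equality concrete, here is a minimal sketch. The class name StringEqualityDemo is made up for this illustration; new String(...) is used only to force two distinct objects.

```java
// Hypothetical demo class, not part of the original article.
class StringEqualityDemo {
    public static void main(String[] args) {
        // new String(...) guarantees two distinct objects holding the same characters.
        String a = new String("chess");
        String b = new String("chess");

        System.out.println(a == b);      // false: two different references
        System.out.println(a.equals(b)); // true: same character sequence
    }
}
```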
Personality test
Suppose you are creating a Person class, with each person having a title, fullName and age. You might decide that two Person objects refer to the same person if they have exactly the same fullName, title and age fields, i.e. your set of uniqueness attributes is (fullName, title, age). Your class could look like this:
```java
public class Person {
    String title;
    String fullName;
    int age;

    public Person(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    String getFullName() {
        return fullName;
    }

    int getAge() {
        return age;
    }

    String getTitle() {
        return title;
    }
}
```
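Now if you create two Person objects with the same attributes, you'd want those two objects to be the same person.
If you do
```java
Person admin1 = new Person("Admin", "r035198x", 2); // I'm not giving away my age that easily
Person admin2 = new Person("Admin", "r035198x", 2);
System.out.println(admin1.equals(admin2));
```
you will get false because admin1 and admin2 are different references and the equals method being used here (from the Object class) compares references. So here we need to override the equals method to check our uniqueness attributes for our comparisons to work.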
There are a few things you need to be wary of when providing your own equals method.
Silly mistake
A silly mistake is to overload the equals method instead of overriding it!
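The signature of the equals method is
```java
public boolean equals(Object obj)
```
So having
```java
public boolean equals(Person person)
```
does not override the equals method. It simply creates another equals method (an overload of the equals method). The argument to the equals method must be an Object.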
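The effect of overloading instead of overriding is easy to miss, so here is a small sketch of it. The Tag class and the OverloadPitfallDemo wrapper are hypothetical names invented for this example. The sketch also notes where the @Override annotation would have caught the mistake at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class OverloadPitfallDemo {

    // Hypothetical class that only OVERLOADS equals: the parameter is Tag, not Object.
    static class Tag {
        final String name;

        Tag(String name) {
            this.name = name;
        }

        // Putting @Override on this method would be a compile-time error,
        // which is how the annotation exposes the mistake.
        public boolean equals(Tag other) {
            return other != null && name.equals(other.name);
        }
    }

    public static void main(String[] args) {
        Tag a = new Tag("admin");
        Tag b = new Tag("admin");

        // Static type Tag: the overload is selected.
        System.out.println(a.equals(b));        // true

        // Static type Object: Object.equals(Object) is selected and compares references.
        Object asObject = b;
        System.out.println(a.equals(asObject)); // false

        // Collections call equals(Object), so they never see the overload.
        List<Tag> tags = new ArrayList<>();
        tags.add(a);
        System.out.println(tags.contains(b));   // false
    }
}
```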
Optimizing it
Another important issue is optimization. Remember that reflexivity says that an object is always equal to itself (x.equals(x) should always return true). We can take advantage of this and test for this case first. A “clever” implementation of equals should thus start as
```java
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    ...
```
There is no need to trouble the processor with further tests if the object passed is the object itself.
Killing many birds with one instanceof
Next you need to compare the actual field values to determine equality based on your logic. Because the argument passed to the equals method is always of type Object, you'll always need to cast it to your class type before comparing the fields. In our example we'll need to cast to type Person.
What if we are passed a String (or an object of any other type) instead of a Person object?
This is where the instanceof operator comes in handy. The instanceof operator actually does a lot more than ensure that our cast does not fail. It provides some optimization as well, because if obj instanceof Person is false then obj cannot be logically equal to a Person and equals can return false straight away. We can thus use it to eliminate any objects passed which cannot be logically equal to our object by virtue of their class type. instanceof also eliminates the trivial case when the object passed is null, because it evaluates to false if its left operand is null. So we have
```java
if (!(obj instanceof Person)) {
    return false;
}
```
Putting it together
We now need to add our own logic for the equality test.
```java
@Override
public boolean equals(Object obj) {
    // An object is always equal to itself.
    if (this == obj) {
        return true;
    }
    // Rejects null and objects of unrelated types.
    if (!(obj instanceof Person)) {
        return false;
    }
    Person person = (Person) obj;
    return age == person.getAge() && fullName.equals(person.getFullName())
            && title.equals(person.getTitle());
}
```
This time
```java
System.out.println(admin1.equals(admin2));
```
will print true.
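One caveat with the version above: it calls fullName.equals(...) and title.equals(...) directly, so it would throw a NullPointerException if either field were null. What follows is a minimal sketch of a null-tolerant variant, assuming Java 7 or later for java.util.Objects. The class name NullSafePerson is made up for this illustration and is not the author's code. hashCode is overridden as well, for the reasons covered in the second half of this article.

```java
import java.util.Objects;

// Hypothetical null-tolerant variant of the article's Person class.
class NullSafePerson {
    private final String title;
    private final String fullName;
    private final int age;

    NullSafePerson(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof NullSafePerson)) {
            return false;
        }
        NullSafePerson other = (NullSafePerson) obj;
        // Objects.equals(a, b) is true when both are null and false when only one is.
        return age == other.age
                && Objects.equals(fullName, other.fullName)
                && Objects.equals(title, other.title);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates null fields and uses the same attributes as equals.
        return Objects.hash(title, fullName, age);
    }

    public static void main(String[] args) {
        NullSafePerson a = new NullSafePerson(null, "r035198x", 2);
        NullSafePerson b = new NullSafePerson(null, "r035198x", 2);
        NullSafePerson c = new NullSafePerson("Admin", "r035198x", 2);

        System.out.println(a.equals(b)); // true: both titles are null
        System.out.println(a.equals(c)); // false: null title vs "Admin"
    }
}
```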
Inheritance may break the contract
Now suppose we extend the Person class with say a ChessPlayer class to have ranking and nationality as additional fields to those already inherited from the Person class. Suppose further that we want uniqueness of ChessPlayers to be determined by title, fullName, age, nationality and ranking. Now our equals method from the Person class no longer suffices.
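Here lies a trap. If ChessPlayer overrides equals so that it also compares nationality and ranking, then person.equals(chessPlayer) (which ignores the new fields) can return true while chessPlayer.equals(person) returns false, which breaks symmetry. Making the ChessPlayer version ignore the extra fields whenever it is handed a plain Person restores symmetry but breaks transitivity instead. So, with an instanceof based equals like ours, it is impossible to extend a concrete class adding an aspect without violating the equals contract. Think about this every time you extend a class. The case when an aspect is being added to an abstract class is best left for researching by the reader.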
The equals method thus needs to be rewritten in the ChessPlayer class to cater for the nationality and ranking aspects and while I'm at it I might as well leave that as an exercise to the reader.
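The symmetry problem can be reproduced in a few lines. This is only a sketch under simplifying assumptions: the nested Person here is cut down to a single fullName field, ChessPlayer adds only ranking, and the wrapper class SymmetryDemo is a made-up name.

```java
// Hypothetical demo class; Person and ChessPlayer are simplified stand-ins.
class SymmetryDemo {

    static class Person {
        final String fullName;

        Person(String fullName) {
            this.fullName = fullName;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Person)) {
                return false;
            }
            return fullName.equals(((Person) obj).fullName);
        }

        @Override
        public int hashCode() {
            return fullName.hashCode();
        }
    }

    static class ChessPlayer extends Person {
        final int ranking;

        ChessPlayer(String fullName, int ranking) {
            super(fullName);
            this.ranking = ranking;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            // A plain Person is rejected here, but Person.equals accepts a ChessPlayer.
            if (!(obj instanceof ChessPlayer)) {
                return false;
            }
            return super.equals(obj) && ranking == ((ChessPlayer) obj).ranking;
        }

        @Override
        public int hashCode() {
            return 31 * super.hashCode() + ranking;
        }
    }

    public static void main(String[] args) {
        Person person = new Person("r035198x");
        ChessPlayer player = new ChessPlayer("r035198x", 1);

        System.out.println(person.equals(player)); // true
        System.out.println(player.equals(person)); // false: symmetry is violated
    }
}
```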
Simple ones first
Note that I deliberately compared the ages (integers) first. The && operator has short-circuit behavior, meaning that if the age comparison fails the rest of the comparison is abandoned and false is returned. It is therefore a performance advantage to put the cheaper tests (such as primitive comparisons) first and the more expensive tests (such as String comparisons) last.
Why bother?
There are thankfully some instances when you don't have to override the equals method.
- When the references check is sufficient. This is when each instance of the class is unique. The Thread class is an example.
- A parent class already has implemented the desired behavior. You have to be careful with this and make sure that the parent class' equals method really is sufficient for the subclass.
Note that if it does not make sense to define logical equality for a class, then you should include the equals method and simply throw an UnsupportedOperationException. If you don't include it and someone invokes it, they will get a superclass implementation and get a result for a comparison that is not supposed to be allowed.
An Exception
From the specification of the Timestamp class we find
“Note: This type is a composite of a java.util.Date and a separate nanoseconds value. Only integral seconds are stored in the java.util.Date component. The fractional seconds - the nanos - are separate. The Timestamp.equals(Object) method never returns true when passed an object that isn't an instance of java.sql.Timestamp, because the nanos component of a date is unknown. As a result, the Timestamp.equals(Object) method is not symmetric with respect to the java.util.Date.equals(Object) method. Also, the hashcode method uses the underlying java.util.Date implementation and therefore does not include nanos in its computation.”
Bad, isn't it? Well, that's the way it is. The reason is explained there as well:
“The inheritance relationship between Timestamp and java.util.Date really denotes implementation inheritance, and not type inheritance.”
In other words they are saying, “We did it, but don't do it”. Who are we to argue?
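The asymmetry described in that note can be observed directly. A minimal sketch, assuming the java.sql module is available (it is part of a standard Java SE runtime); the class name TimestampAsymmetryDemo and the chosen instant are arbitrary.

```java
import java.sql.Timestamp;
import java.util.Date;

// Hypothetical demo class, not part of the original article.
class TimestampAsymmetryDemo {
    public static void main(String[] args) {
        long millis = 1_000L; // arbitrary instant chosen for the demo
        Date date = new Date(millis);
        Timestamp timestamp = new Timestamp(millis);

        // Date.equals accepts any Date (including a Timestamp) with the same getTime().
        System.out.println(date.equals(timestamp)); // true

        // Timestamp.equals(Object) rejects anything that is not a Timestamp.
        System.out.println(timestamp.equals(date)); // false
    }
}
```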
Now whenever you override the equals method, you must also override the hashCode method.
We've seen the equals method so now let's proceed to ...
Overriding the hashCode method
The contract for the equals method should really have another line saying you must proceed to override the hashCode method after overriding the equals method. The hashCode method is supported for the benefit of hash based collections.
The contract
Again from the specs:
- Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
- It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
The old hashtable and buckets example
Picture a hash table as a group of buckets. When you add a key-value pair, the key's hashCode is used to determine in which bucket to put the mapping.
Similarly when you call the get method with a key on the hash table, the key's hashCode is used to determine in which bucket the mapping was stored. This is the bucket that is searched (sequentially) for the mapping, using equals to compare keys. If you have two “equal” objects but with different hashCodes, then the hash table will see them as different objects and will usually put them in different buckets. Similarly you can only retrieve a mapping from a hash table by passing a key that has the same hashCode as, and is equal to, the key it was stored under. If no matching key is found, null is returned.
So let's say it again, “Equal objects must have equal hashCodes”.
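Here is a sketch of what goes wrong in practice, using java.util.HashMap. BrokenKey and GoodKey are hypothetical classes invented for the example: the first overrides only equals, the second overrides both methods. The lookup with BrokenKey is not commented with an expected result because it depends on identity hash codes, though it will almost always print null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
class HashLookupDemo {

    // Overrides equals but inherits hashCode from Object (the mistake).
    static class BrokenKey {
        final String name;

        BrokenKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof BrokenKey && name.equals(((BrokenKey) obj).name);
        }
    }

    // Overrides both, using the same attribute in each.
    static class GoodKey {
        final String name;

        GoodKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof GoodKey && name.equals(((GoodKey) obj).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<BrokenKey, String> broken = new HashMap<>();
        broken.put(new BrokenKey("admin"), "found");
        // Equal keys, but (almost certainly) different hash codes, so the lookup misses.
        System.out.println(broken.get(new BrokenKey("admin")));

        Map<GoodKey, String> good = new HashMap<>();
        good.put(new GoodKey("admin"), "found");
        System.out.println(good.get(new GoodKey("admin"))); // found
    }
}
```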
A lazy hashCode
Now if your hashCode method returns the same constant value, then every mapping is stored in the same bucket and you have a hash table reduced to a (God forbid) LinkedList. This is where the third part of the contract comes in. It's allowed to have two unequal objects having the same hashCode but it makes the hash table very slow.
The best hashCode
The opposite case is to make all unequal objects have unequal hashCodes. Ideally this means each mapping is stored in its own bucket. This is the optimal case for the hash table and results in constant search times, because only the correct bucket needs to be located. Once the correct bucket is found, the search is complete. That's why the API docs said
“However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.”
Alright, now we know what's desirable and what's undesirable in a hashCode. Let's see how to create it.
We want it to be linked to the equals method in some way and so it must use the same attributes as the equals method.
It's just an int
The signature is
```java
public native int hashCode()
```
The result is just an int, so the first step is to reduce each uniqueness attribute to an integer. Once we have the integers, it's up to us to find a way of combining them into one integer that represents the hashCode for our object.
One way of doing it
A common approach is to choose a multiplier, start with some non-zero initial value for hashCode, and then compute an int value by applying the following formula for each of the attributes in turn:
hashCode = multiplier * hashCode + attribute's hashCode
For three attributes (a1, a2, a3), the hashCode would be computed in the following steps
```text
hashCode = multiplier * hashCode + a1's hashCode //step 1
hashCode = multiplier * hashCode + a2's hashCode //step 2
hashCode = multiplier * hashCode + a3's hashCode //step 3
```
Your multiplier should preferably be a prime number for reasons best left for the amusement of the reader.
If you think about it you'll see that it's possible for two unequal objects to still have the same hashCode even if the method above was used. It's no mean feat to ensure that hashCodes are always unequal for unequal objects. Whatever algorithm you decide on, however, make sure the result is always an integer and is the same integer for equal objects.
So how do we determine the hashCodes for the attributes themselves?
For the individual attribute values, you can use the following popular approach:
- For boolean variables use 0 if it's true and 1 if it's false.
- Converting byte, char or short to int is easy. Just cast to int. The result is always the same for the same value of the attribute.
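- A long is bigger than an int. You can use (int)(value ^ (value >>> 32)). Note the parentheses around the whole XOR: the cast must apply to the combined value, not to value alone. This is the method used by the java.lang.Long class.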
- If the field is a float, use Float.floatToIntBits(value).
- If the field is a double, use Double.doubleToLongBits(value), and then hash the resulting long using the method above for long type.
- If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, then recursively invoke hashCode on the field as well.
- If the value of the field is null, return 0 (or some other constant, 0 is more common but you might want to distinguish it from the boolean case).
- Finally, if the field is an array, go through each element and compute each element's hashCode value. Use the sum of the hashCodes as the hashCode for the array attribute.
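The rules in the list above can be sketched as a handful of small helper methods. The class FieldHashDemo and all the method names are invented for this illustration. The array helper uses the plain sum described above, which means that reordering the elements does not change the result.

```java
// Hypothetical helpers, one per rule in the list above.
class FieldHashDemo {

    static int hashBoolean(boolean value) {
        return value ? 0 : 1; // convention used in the list above
    }

    static int hashLong(long value) {
        return (int) (value ^ (value >>> 32)); // fold the upper 32 bits into the lower 32
    }

    static int hashFloat(float value) {
        return Float.floatToIntBits(value);
    }

    static int hashDouble(double value) {
        return hashLong(Double.doubleToLongBits(value));
    }

    static int hashObject(Object value) {
        return value == null ? 0 : value.hashCode();
    }

    static int hashIntArray(int[] values) {
        if (values == null) {
            return 0;
        }
        int sum = 0;
        for (int element : values) {
            sum += element; // an int serves as its own hash
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(hashBoolean(true));                 // 0
        System.out.println(hashLong(1L << 32));                // 1
        System.out.println(hashObject(null));                  // 0
        System.out.println(hashIntArray(new int[] {1, 2, 3})); // 6
        // The long and double rules agree with the wrapper classes.
        System.out.println(hashDouble(1.5) == Double.valueOf(1.5).hashCode()); // true
    }
}
```

If element order should matter for an array attribute, java.util.Arrays.hashCode is a standard alternative to the sum; that is a different technique from the one in the list.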
Hate Maths?
Another less mathematical method is to convert each of the attributes to Strings, concatenate the Strings and use the hashCode of the resultant String. You need to be careful with how you convert the attributes to Strings with this one.
In particular calling the toString method on the attributes may not be the correct way of doing it because the toString method may return a different string for equal objects!
Optimizing it
Calculating an object's hashCode can become quite complicated and sometimes time consuming, especially when you have many array attributes.
It turns out that, for immutable classes, you can avoid unnecessarily recomputing the same hashCode by calculating it once and storing it in a variable. Subsequent calls to the hashCode method then just return that (cached) value. No need to compute it again.
Putting this in our Person class and using the first method for the hashCode gives
Finally some code
```java
public class Person {
    // Uniqueness attributes are final: the cached hashCode relies on them never changing.
    private final String title;
    private final String fullName;
    private final int age;
    // Cached hash value; 0 means "not computed yet".
    private volatile int hashCode = 0;

    public Person(String title, String fullName, int age) {
        this.title = title;
        this.fullName = fullName;
        this.age = age;
    }

    String getFullName() {
        return fullName;
    }

    int getAge() {
        return age;
    }

    String getTitle() {
        return title;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Person)) {
            return false;
        }
        Person person = (Person) obj;
        return age == person.getAge() && fullName.equals(person.getFullName())
                && title.equals(person.getTitle());
    }

    @Override
    public int hashCode() {
        final int multiplier = 23;
        if (hashCode == 0) {
            // Combine the same attributes that equals compares.
            int code = 133;
            code = multiplier * code + age;
            code = multiplier * code + fullName.hashCode();
            code = multiplier * code + title.hashCode();
            hashCode = code;
        }
        return hashCode;
    }
}
```
The hashCode is only computed once for this object. Notice that the hashCode field starts at zero and the real value is computed lazily, on the first call. This works only if the class is immutable, because once an object is created the values of its uniqueness attributes never change, so the hashCode is always the same. If the age were allowed to change, the hashCode should change too, but the value returned would still be the cached value. That would be a disaster.
Smart Alec?
You could use this to try and optimize even for mutable classes by recomputing the cached value every time a uniqueness attribute is changed. This is only possible if you are in control of all the possible ways in which the uniqueness attributes can be changed. Careful though that you don't outdo yourself and end up creating more work for the computer while trying to create less work for it!
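As a rough sketch of that idea, the class below recomputes its cached value inside every setter. MutablePerson, its two fields and the computeHash helper are all assumptions made for this illustration, and the class is not thread-safe. Note also that changing an object while it is in use as a key in a hash table still breaks lookups, whether or not the cache is refreshed.

```java
// Hypothetical mutable class that refreshes its cached hash in every setter.
class MutablePerson {
    private String fullName;
    private int age;
    private int cachedHash; // always kept in sync with fullName and age

    MutablePerson(String fullName, int age) {
        this.fullName = fullName;
        this.age = age;
        this.cachedHash = computeHash();
    }

    void setFullName(String fullName) {
        this.fullName = fullName;
        this.cachedHash = computeHash(); // a uniqueness attribute changed
    }

    void setAge(int age) {
        this.age = age;
        this.cachedHash = computeHash(); // a uniqueness attribute changed
    }

    private int computeHash() {
        final int multiplier = 23;
        int code = 133;
        code = multiplier * code + age;
        code = multiplier * code + (fullName == null ? 0 : fullName.hashCode());
        return code;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof MutablePerson)) {
            return false;
        }
        MutablePerson other = (MutablePerson) obj;
        return age == other.age
                && (fullName == null ? other.fullName == null : fullName.equals(other.fullName));
    }

    @Override
    public int hashCode() {
        return cachedHash;
    }

    public static void main(String[] args) {
        MutablePerson a = new MutablePerson("r035198x", 2);
        MutablePerson b = new MutablePerson("r035198x", 3);
        a.setAge(3); // a now has the same uniqueness attributes as b

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```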
Conclusion
There are lots of other methods of doing this, some of which are much better and more thorough than the ones described above. You should certainly make an effort to learn another method.
Don't forget the important contract of the hashCode. Equal objects must have equal hashCodes.
That's all I have for the hashCode. Hope you'll find it useful one day.