Why is the concept of Equals and operator== implemented this way?

cody

Why can I overload operator== and operator!= separately, with different
implementations, and additionally override Equals() with yet another
implementation?

Why not forbid overloading of == and != and instead translate each call
of objA==objB automatically into System.Object.Equals(objA, objB)?

This would remove inconsistencies like

myString1==myString2

and

(object)myString1==(object)myString2

having different results.
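
A minimal sketch of the inconsistency described above. The class and method names are made up for illustration; the second string is built at run time so that it is a distinct instance rather than an interned literal.

```csharp
using System;

static class StringEqualityDemo
{
    // Uses the overloaded string '==' (value equality), chosen at compile time
    // because both operands are statically typed as string.
    public static bool ValueEqual(string a, string b)
    {
        return a == b;
    }

    // Casting to object makes the compiler pick the built-in reference
    // comparison instead, even though both objects are still strings.
    public static bool ReferenceEqual(string a, string b)
    {
        return (object)a == (object)b;
    }

    static void Main()
    {
        string s1 = "hello";
        // Built from a char array at run time, so it is a separate instance.
        string s2 = new string(new char[] { 'h', 'e', 'l', 'l', 'o' });

        Console.WriteLine(ValueEqual(s1, s2));     // True
        Console.WriteLine(ReferenceEqual(s1, s2)); // False
    }
}
```

The two results differ only because of the static types of the operands, not because of anything that happens at run time.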

Also, shouldn't operator!= always return the negated value of operator==?
So what is a separate overload good for?

I would like to understand the technical reason for this, if there was one.

Jun 17 '06 #1

Barry Kelly

cody <de********@gmx.de> wrote:

Why can I overload operator== and operator!= separately having different
implementations and additionally I can override equals() also having a
different implementation.

Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).

'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.

This would remove inconsistencies like

myString1==myString2

and

(object)myString1==(object)myString2

having different results.

It would also make the '==' operator always perform a dynamic dispatch
(i.e. a virtual method call). Not only would it be slower, but it would
also make it harder to (for example) program safely with multithreaded
locks.

When you call a virtual method, you've got no idea what code is going to
be called. That's one of the reasons why it isn't recommended to call
virtual methods while holding a lock. If things were to work this way,
no one could safely compare object references while holding locks,
unless they used the cumbersome object.ReferenceEquals(object,object)
method.

Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?

also, should operator!= not always return the negated value of operator==?
So what is it good for?

This question I cannot answer, since I've never had a reason to have a
different implementation.

-- Barry

--
http://barrkel.blogspot.com/

Jun 17 '06 #2

cody

Barry Kelly wrote:

cody <de********@gmx.de> wrote:
Why can I overload operator== and operator!= separately having different
implementations and additionally I can override equals() also having a
different implementation.

Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).
'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.

Even in unrealistic micro-benchmarks with empty method bodies, there is
almost no difference between normal methods and virtual ones.

When you call a virtual method, you've got no idea what code is going to
be called. That's one of the reasons why it isn't recommended to call
virtual methods while holding a lock. If things were to work this way,
no one could safely compare object references while holding locks,
unless they used the cumbersome object.ReferenceEquals(object,object)
method.

This sounds logical at first, but looking into Microsoft's Shared Source
CLI implementation, almost all implementations of operator== call
Equals(), and the very few that don't access properties of the object,
which in some cases are also virtual or call virtual methods internally.

Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?

Well, I see the problem. But in reality I've never seen an
implementation of operator== for different types. Have you?
I cannot imagine a scenario where this would be necessary, given that
objects of different types should by definition never be considered
equal, and even if they should, you can implement an implicit
conversion operator for that.

also, should operator!= not always return the negated value of operator==?
So what is it good for?

This question I cannot answer, since I've never had a reason to have a
different implementation.

Indeed very strange. Even in comparisons involving NaN or nullable
types, operator!= always yields the opposite of the corresponding
operator==:

Console.WriteLine(double.NaN == double.NaN); // false
Console.WriteLine(double.NaN != double.NaN); // true
Console.WriteLine(float.NaN == float.NaN); // false
Console.WriteLine(float.NaN != float.NaN); // true
Console.WriteLine((float?)1 == (float?)null); // false
Console.WriteLine((float?)1 != (float?)null); // true

In conclusion, I still feel that the concepts of ==, != and Equals could
have been implemented in a simpler and more logical way in .NET.
The current implementation may be 1% faster than the simpler one, and it
lets you do very strange things like making == and != return the same
value or making objects of different types equal, but if that makes 99.9%
of all normal cases harder to write and maintain, the price is too high.

Jun 17 '06 #3

Barry Kelly

cody <de********@gmx.de> wrote:

Barry Kelly wrote:
cody <de********@gmx.de> wrote:
Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).
'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.

even if making unrealistic micro benchmarks with empty method bodies
there is almost no difference between normal methods and virtual ones.

Unless it's been overloaded, '==' isn't a method call.

When you call a virtual method, you've got no idea what code is going to
be called. That's one of the reasons why it isn't recommended to call
virtual methods while holding a lock. If things were to work this way,
no one could safely compare object references while holding locks,
unless they used the cumbersome object.ReferenceEquals(object,object)
method.

This sounds logical at first, but in fact, but looking into microsofts
shared source cli implementation, almost all implementations of
operator== is calling Equals() and the very few ones not doing this are
accessing properties of the object which are in some cases also virtual
or calling virtual methods internally.

What you seem to be advocating is to turn *every* *usage* of '==' with
reference types into a method call, but it isn't currently. What you've
been looking up in the SSCLI is the overloaded '==' operator on various
types. The built-in '==' operator for reference types has different
semantics.

Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?

Well, I see the problem. But in reality I've never seen an
implementation of operator== for different types. Did you?

Yes. I've implemented them, for my own Date type.

I cannot imagine a scenario where this would be necessary, given the
fact that objects of different types should per definition never be
considered to be equal, and even if, you can implement an implicit
conversion operator for that.

There was a conversion operator too, but why convert when you can
compare as-is?

In conclusion, I still feel that the concepts of ==,!= and equals could
have been implemented in a simpler and more logical way in .NET.
Maybe the current implementation may be 1% faster than the simpler one
and you can do very strange stuff like == and != returning the same
value or making objects of different types equal but if that makes 99.9%
of all normal cases harder to write and to maintain this is too much to pay.

I will submit this contrived micro-benchmark in favour of the current
situation, if only to point out some of the overhead of virtual method
calls on Equals etc.:

---8<---
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;
using System.Runtime.CompilerServices;

class SomeObject
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public override bool Equals(object obj)
    {
        // This would be the only way to perform reference equality
        // checks if the proposed idea was implemented.
        return object.ReferenceEquals(this, obj);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool operator==(SomeObject left, SomeObject right)
    {
        return object.ReferenceEquals(left, right);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool operator!=(SomeObject left, SomeObject right)
    {
        return !object.ReferenceEquals(left, right);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

class App
{
    delegate void Method();

    static void Benchmark(int iterations, string label, Method method)
    {
        method(); // warmup

        Stopwatch start = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
            method();
        Console.WriteLine("{0,20} : {1,6:f3} ({2} iterations)",
            label,
            start.ElapsedTicks / (double) Stopwatch.Frequency,
            iterations);
    }

    static void Main()
    {
        const int iterCount = 30;
        const int objectCount = 10000000;

        // To eliminate "unused value" optimizations.
        int equalCount = 0;

        SomeObject[] list = new SomeObject[objectCount];
        for (int i = 0; i < objectCount; ++i)
            list[i] = new SomeObject();

        Benchmark(iterCount, "Overloaded '=='", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (list[i] == null)
                    ++equalCount;
        });

        Benchmark(iterCount, "Overridden 'Equals'", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (list[i].Equals(null))
                    ++equalCount;
        });

        Benchmark(iterCount, "Object '=='", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if ((object) list[i] == null)
                    ++equalCount;
        });

        Benchmark(iterCount, "RefEquals", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (object.ReferenceEquals(list[i], null))
                    ++equalCount;
        });

        Console.WriteLine("Total EqualCount: {0}", equalCount);
    }
}
--->8---

On my system:

---8<---
Overloaded '==' : 1.582 (30 iterations)
Overridden 'Equals' : 2.662 (30 iterations)
Object '==' : 0.609 (30 iterations)
RefEquals : 0.896 (30 iterations)
Total EqualCount: 0
--->8---

Of limited applicability, very contrived, usual disclaimers, etc. etc...

-- Barry

--
http://barrkel.blogspot.com/

Jun 17 '06 #4

cody

Barry Kelly wrote:

cody <de********@gmx.de> wrote:
Barry Kelly wrote:
cody <de********@gmx.de> wrote:

Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).
'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.

even if making unrealistic micro benchmarks with empty method bodies
there is almost no difference between normal methods and virtual ones.

Unless it's been overloaded, '==' isn't a method call.

Yes, that is true. If == couldn't be overloaded, the system would always
have to call Object.Equals(), and this could not be optimized to a single
IL instruction as it is now. One would either have to call
Object.ReferenceEquals() explicitly (which could then be inlined),
or an operator=== like the one in PHP would have to be introduced.

Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?

System.Object.Equals(objA, objB) only calls Equals on the first
argument, so you would always have to write if (myComplexNumber==1.5).
Having your own or the most specialized type on the left side seems very
intuitive, at least to me. Sure, if the call is made the other way
around, false is always returned. But when the types are known at
compile time, no appropriate Equals method exists for the given order,
and the compiler finds one for the reverse order, a warning could be
issued.

Well, I see the problem. But in reality I've never seen an
implementation of operator== for different types. Did you?

Yes. I've implemented them, for my own Date type.

Which type does it compare with? Your own type with System.DateTime?
Now that you mention it, I remember doing exactly the same thing because
we needed a Date class for our project that could be null. It was so long
ago that I had forgotten it, sorry :)
But this is an interesting case. We have different types
(System.DateTime and our own), but each represents exactly the same
entity, therefore users expect that they are comparable with each other.
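
A rough sketch of what such mixed-type comparison operators can look like. The `Date` struct here is hypothetical and is not the type either poster wrote; comparing only the date part of the `DateTime` is an assumption made for the example.

```csharp
using System;

// Hypothetical date-only wrapper, for illustration only.
struct Date
{
    private readonly DateTime value;

    public Date(int year, int month, int day)
    {
        value = new DateTime(year, month, day);
    }

    // Same-type comparison.
    public static bool operator ==(Date left, Date right) { return left.value == right.value; }
    public static bool operator !=(Date left, Date right) { return !(left == right); }

    // Mixed-type comparisons, declared in both operand orders so that
    // 'date == dateTime' and 'dateTime == date' both resolve at compile time.
    // Assumption: the time-of-day part of the DateTime is ignored.
    public static bool operator ==(Date left, DateTime right) { return left.value == right.Date; }
    public static bool operator !=(Date left, DateTime right) { return !(left == right); }
    public static bool operator ==(DateTime left, Date right) { return right == left; }
    public static bool operator !=(DateTime left, Date right) { return !(right == left); }

    // Equals stays same-type only, so that it remains symmetric.
    public override bool Equals(object obj) { return obj is Date && this == (Date)obj; }
    public override int GetHashCode() { return value.GetHashCode(); }
}

static class Program
{
    static void Main()
    {
        Date d = new Date(2006, 6, 17);
        DateTime dt = new DateTime(2006, 6, 17, 18, 32, 13);

        Console.WriteLine(d == dt); // True: only the date part is compared
        Console.WriteLine(dt != d); // False
    }
}
```

No conversion operator is involved, and each != is written as the negation of the matching ==, which is the pairing the thread asks about.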

I cannot imagine a scenario where this would be necessary, given the
fact that objects of different types should per definition never be
considered to be equal, and even if, you can implement an implicit
conversion operator for that.

There was a conversion operator too, but why convert when you can
compare as-is?

Well, I can see there can indeed be a big performance difference between
always creating a new object just for equality testing and just
comparing some fields against each other, but in the date case you just
encapsulate a DateTime, so no creation of a new object is necessary.

In conclusion, I still feel that the concepts of ==,!= and equals could
have been implemented in a simpler and more logical way in .NET.
Maybe the current implementation may be 1% faster than the simpler one
and you can do very strange stuff like == and != returning the same
value or making objects of different types equal but if that makes 99.9%
of all normal cases harder to write and to maintain this is too much to pay.

I will submit this contrived micro-benchmark in favour of the current
situation, if only to point out some of the overhead of virtual method
calls on Equals etc.:

[benchmark code and results quoted in full from post #4 snipped]

Having a closer look at my benchmark and yours, I noticed that I
was always calling the virtual methods on the same object in a loop, so
the JIT could cache the method pointer in a register. No wonder
virtual methods were nearly as fast as normal ones :)

Running the release exe outside the IDE (using .NET 2.0),
ReferenceEquals always runs faster than Object== on my computer.
But if I remove the NoInlining attribute, ReferenceEquals is slower than
Object==.

overridden : 3,401 (30 iterations)
normal : 2,725 (30 iterations)
static : 1,212 (30 iterations)
Overloaded '==' : 0,716 (30 iterations)
Overridden 'Equals' : 3,428 (30 iterations)
Object '==' : 1,104 (30 iterations)
RefEquals : 0,898 (30 iterations)
Total EqualCount: 0

Jun 18 '06 #5

Chris Nahr

On Sat, 17 Jun 2006 18:32:13 +0200, cody <de********@gmx.de> wrote:

Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).

1. Upcasting to object involves boxing & unboxing for value types, and
that's a very expensive operation. You absolutely want to avoid that
for something as frequently used as an equality test.

2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.

Now that we have generics, both issues could be avoided with a generic
Equals method; unfortunately that was not available back when the CLR
was designed. So strongly-typed equality tests were necessary back
then, and strongly-typed methods can't be inherited from Object.
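
One way a generic, strongly typed equality test can avoid both issues is sketched here, assuming the .NET 2.0 `IEquatable<T>` interface and `EqualityComparer<T>.Default`. The `ComplexNumber` struct and `AreEqual` helper are hypothetical names invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type used only for illustration.
struct ComplexNumber : IEquatable<ComplexNumber>
{
    public readonly double Re;
    public readonly double Im;

    public ComplexNumber(double re, double im) { Re = re; Im = im; }

    // Strongly typed: the argument is not boxed and needs no type check.
    public bool Equals(ComplexNumber other)
    {
        return Re == other.Re && Im == other.Im;
    }

    // Untyped overload inherited from System.Object; this one does box.
    public override bool Equals(object obj)
    {
        return obj is ComplexNumber && Equals((ComplexNumber)obj);
    }

    public override int GetHashCode()
    {
        return Re.GetHashCode() ^ Im.GetHashCode();
    }
}

static class GenericEquality
{
    // EqualityComparer<T>.Default uses IEquatable<T>.Equals when T implements it.
    public static bool AreEqual<T>(T left, T right)
    {
        return EqualityComparer<T>.Default.Equals(left, right);
    }

    static void Main()
    {
        Console.WriteLine(AreEqual(new ComplexNumber(1, 2), new ComplexNumber(1, 2))); // True
        Console.WriteLine(AreEqual(new ComplexNumber(1, 2), new ComplexNumber(2, 1))); // False
    }
}
```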

also, should operator!= not always return the negated value of operator==?

Yeah, that's a good point. operator!= could be auto-generated.
--
http://www.kynosarges.de

Jun 18 '06 #6

cody

Chris Nahr wrote:

On Sat, 17 Jun 2006 18:32:13 +0200, cody <de********@gmx.de> wrote:
Why not forbid overloading of == and != but instead translate each call
of objA==objB automatically in System.Object.Equals(objA, objB).
1. Upcasting to object involves boxing & unboxing for value types, and
that's a very expensive operation. You absolutely want to avoid that
for something as frequently used as an equality test.

Structs do not have a default implementation of operator==. If you want
one, you could implement a suitable Equals method taking the appropriate
value type as its parameter, and the compiler would translate a call of
myStruct1==myStruct2 into a call to myStruct1.Equals(myStruct2), so that
MyStruct.Equals(MyStruct obj) is called.

2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.

Sure it will have to, but honestly, how often do you use == on objects
that are neither value types nor strings (which are sealed)? Both
require no type checks. In collection methods like IndexOf the
non-type-safe version of Equals is used and type checks have to be
performed anyway.
For most other cases you want reference equality anyway, for which an
operator=== could be used.

And as I said, nobody prevents you from implementing a strongly typed
version of Equals, which will then be picked by the compiler, and then
no type check is necessary either.

Now that we have generics, both issues could be avoided with a generic
Equals method; unfortunately that was not available back when the CLR
was designed. So strongly-typed equality tests were necessary back
then, and strongly-typed methods can't be inherited from Object.

I am not sure in what way generics can help here.

also, should operator!= not always return the negated value of operator==?

Yeah, that's a good point. operator!= could be auto-generated.

The same applies to operator< and operator>= vs operator> and
operator<= or operator==(Y a,X b) vs operator==(X b,Y a) or
operator'true' vs operator'false'.

Jun 18 '06 #7

Barry Kelly

cody <de********@gmx.de> wrote:

2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.
Sure it will have to, but honestly how often do you use == for objects
that are *no* value types and *no* strings (which are sealed) both
require no type checks. In collection methods like IndexOf the non-type
safe version of Equals is used and type checks have to be performed anyway.

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.

For most other cases you want reference equality anyway for which an
operator=== could be used.

Here, it seems to me, you are introducing a new operator in order to do
what most people already use '==' for.

And as I said, nobody prevents you from implementing a strongly typed
version of Equals which will then be picked by the compiler and here
also is no type check necessary.

But: this is advocating making an expensive boxing operation the
default. That approach might be more reasonable in a more dynamic
language like Lisp or Python, but it isn't right in a language with a
C-based history.

-- Barry

--
http://barrkel.blogspot.com/

Jun 18 '06 #8

Jon Skeet [C# MVP]

Barry Kelly <ba***********@gmail.com> wrote:

Sure it will have to, but honestly how often do you use == for objects
that are *no* value types and *no* strings (which are sealed) both
require no type checks. In collection methods like IndexOf the non-type
safe version of Equals is used and type checks have to be performed anyway.

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.

Actually, I rarely compare things which *don't* overload equality. I
rarely check for references being identical, but I often check for
strings being equal, for instance.

I often compare value types for equality, however.

Out of interest, what do you tend to use reference identity tests for?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Jun 18 '06 #9

Mark Wilden

"Barry Kelly" <ba***********@gmail.com> wrote in message
news:oi********************************@4ax.com...

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings.

Hmmm...I thought all identical strings were folded to the same reference. Or
am I thinking of a completely different language?

///ark

Jun 18 '06 #10

cody

Barry Kelly wrote:

cody <de********@gmx.de> wrote:
2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.

Sure it will have to, but honestly how often do you use == for objects
that are *no* value types and *no* strings (which are sealed) both
require no type checks. In collection methods like IndexOf the non-type
safe version of Equals is used and type checks have to be performed anyway.

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.

For most other cases you want reference equality anyway for which an
operator=== could be used.

Here, it seems to me, you are introducing a new operator in order to do
what most people already use '==' for.

Which IMO is bad style. You never know whether the given objects have an
overloaded operator or not; one may be added later, which would change the
semantics of your code. Currently you have to use
Object.ReferenceEquals or (object)obj1==(object)obj2 to ensure that
reference equality is used.
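
To illustrate the point, here is a small sketch with a hypothetical `Money` class whose == has been overloaded for value equality. Both workarounds named above still give a reference comparison.

```csharp
using System;

// Hypothetical class with value-style equality, for illustration only.
sealed class Money
{
    public readonly decimal Amount;

    public Money(decimal amount) { Amount = amount; }

    public static bool operator ==(Money left, Money right)
    {
        // ReferenceEquals is used for the null checks so that this
        // operator does not call itself recursively.
        if (object.ReferenceEquals(left, right)) return true;
        if (object.ReferenceEquals(left, null) || object.ReferenceEquals(right, null)) return false;
        return left.Amount == right.Amount;
    }

    public static bool operator !=(Money left, Money right) { return !(left == right); }

    public override bool Equals(object obj) { return this == (obj as Money); }
    public override int GetHashCode() { return Amount.GetHashCode(); }
}

static class Program
{
    static void Main()
    {
        Money a = new Money(5m);
        Money b = new Money(5m);

        Console.WriteLine(a == b);                       // True: overloaded operator
        Console.WriteLine((object)a == (object)b);       // False: built-in reference comparison
        Console.WriteLine(object.ReferenceEquals(a, b)); // False
    }
}
```

If the overload were added to `Money` after the calling code was written, the first comparison would silently change from False to True, while the other two would stay the same.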

And as I said, nobody prevents you from implementing a strongly typed
version of Equals which will then be picked by the compiler and here
also is no type check necessary.

But: this is advocating making an expensive boxing operation the
default. That approach might be more reasonable in a more dynamic
language like Lisp or Python, but it isn't right in a language with a
C-based history.

In its newer guidelines, Microsoft recommends always adding a strongly
typed Equals to classes anyway.

The current implementation of operator overloading doesn't allow generic
code to make use of operators, since static methods, which operators are,
cannot be specified in interfaces.

public T Add<T>(T a, T b)
    where T : IAddable<T> // proposed: the interface may declare + and -
{
    return a+b; // internally calls a.Add(b)
}

This way, users could create interfaces that declare the operators they
need for specific operations. In languages like C++, D (Digital Mars),
Python and Ruby, operators are also normal methods (C++ also allows
non-member functions for operators).
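
For comparison, a sketch of how close the C# of this thread can get when the operation is an ordinary instance method. `IAddable<T>`, `Meters` and `Calculator.Sum` are hypothetical names; the generic code has to call `Add` directly because the + operator itself cannot be part of the interface.

```csharp
using System;

// Hypothetical interface: the operation is an instance method, so it can
// be declared in an interface and used as a generic constraint.
interface IAddable<T>
{
    T Add(T other);
}

struct Meters : IAddable<Meters>
{
    public readonly int Value;

    public Meters(int value) { Value = value; }

    public Meters Add(Meters other) { return new Meters(Value + other.Value); }

    // The operator simply forwards to the instance method.
    public static Meters operator +(Meters left, Meters right) { return left.Add(right); }
}

static class Calculator
{
    // Works for any T that implements IAddable<T>; 'a + b' would not
    // compile here, only the interface method is visible through T.
    public static T Sum<T>(T a, T b) where T : IAddable<T>
    {
        return a.Add(b);
    }

    static void Main()
    {
        Meters total = Sum(new Meters(1), new Meters(2));
        Console.WriteLine(total.Value); // 3
    }
}
```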

Jun 18 '06 #11

Jon Skeet [C# MVP]

Mark Wilden <Ma********@newsgroups.nospam> wrote:

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings.

Hmmm...I thought all identical strings were folded to the same reference. Or
am I thinking of a completely different language?

That's interning you're thinking of, and while it automatically applies
to string *literals* it certainly doesn't apply to strings in general.
For instance:

using System;

class Test
{
    static void Main()
    {
        string x = "hello".Substring (0, 1);
        string y = "hi".Substring (0, 1);
        Console.WriteLine (x==y);                         // True: same characters
        Console.WriteLine (object.ReferenceEquals(x, y)); // False: distinct instances
    }
}

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Jun 18 '06 #12

Mark Wilden

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...

That's interning you're thinking of, and while it automatically applies
to string *literals* it certainly doesn't apply to strings in general.

Thanks. I knew I was thinking of something...

Jun 18 '06 #13
