Mystery of Equality in C#: IEquatable<T>, IEqualityComparer<T>, IComparable<T>, IComparer<T>
What is the difference between these confusing words? They are very similar to each other. In this article, we will learn everything about them. But first, you need to learn two important prerequisites: Equals and GetHashCode.
So our topics are:
- Equals
- GetHashCode
- IEquatable<T>
- IEqualityComparer<T>
- IComparable<T>
- IComparer<T>
Equals
Let's start with Equals, which is the natural place to begin. Equals is a virtual function of the object class:
public virtual bool Equals(Object obj)
If instances of your class need to be compared somewhere, you should override this function. What you write in the overridden Equals depends on what equality means for your class. Suppose you have a Student class (we use this class in all of our examples):
class Student
{
public string Name { get; set; }
public int Number { get; set; }
public int SerialNo { get; set; }
public int Grade { get; set; }
}
When do you think two objects of this class are equal? When their SerialNos are equal, or Numbers, or both?
It depends on the equality concept of your class. If only the SerialNo distinguishes one student from another, we can write the Equals function like this:
public override bool Equals(object obj)
{
    // Two students are equal when their SerialNo values match.
    if (obj is Student student)
        return student.SerialNo == SerialNo;

    return false;
}
Question: Why should we override a function of the object class? We could create another function (for example, an IsMatch() function) without overriding the base one. What is the benefit of overriding?
Answer: Yes, you can. But you are the only one who knows the name of that function. Built-in functions don't know it, and many built-in functions and third-party libraries need access to the equality concept of your class. Because every class derives from the base object class, every object is guaranteed to have an Equals function, so they call Equals on the object they are given.
For example, in System.Collections.Hashtable, the Contains function calls the Equals of the key's class to find the requested key:
Hashtable hashtable = new Hashtable();
hashtable.Add("MyFirstKey", 1);
hashtable.Add("MySecondKey", true);
if(hashtable.Contains("MySecondKey")) //it calls Equals
Console.WriteLine("I found it");
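Point: We should override the Equals function whenever our class's equality matters. If we don't, the base object class's Equals is called. For reference types it only checks whether both variables refer to the same instance, so two different Student objects with identical data are reported as not equal.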
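To make the point concrete, here is a minimal sketch. PlainStudent, KeyedStudent, and EqualsDemo are hypothetical names invented for this illustration; they are trimmed-down stand-ins for the Student class above. List<T>.Contains falls back to the object's Equals when the type offers nothing more specific, so only the class that overrides Equals is found.

```csharp
using System.Collections.Generic;

// Hypothetical class: does NOT override Equals, so reference equality is used.
class PlainStudent
{
    public int SerialNo { get; set; }
}

// Hypothetical class: overrides Equals so that SerialNo decides equality.
class KeyedStudent
{
    public int SerialNo { get; set; }

    public override bool Equals(object obj) =>
        obj is KeyedStudent other && other.SerialNo == SerialNo;

    // Kept consistent with Equals; GetHashCode is discussed in the next section.
    public override int GetHashCode() => SerialNo.GetHashCode();
}

static class EqualsDemo
{
    // Returns false: the list holds a different instance with the same data.
    public static bool FindsPlain()
    {
        var list = new List<PlainStudent> { new PlainStudent { SerialNo = 7 } };
        return list.Contains(new PlainStudent { SerialNo = 7 });
    }

    // Returns true: Contains ends up calling the overridden Equals.
    public static bool FindsKeyed()
    {
        var list = new List<KeyedStudent> { new KeyedStudent { SerialNo = 7 } };
        return list.Contains(new KeyedStudent { SerialNo = 7 });
    }
}
```

Calling EqualsDemo.FindsPlain() and EqualsDemo.FindsKeyed() from your own Main shows the difference between the two classes.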
GetHashCode
There is another function of the base object class that should be overridden if equality is important for our class:
public virtual int GetHashCode()
It helps our code test equality faster. Before talking about overriding and performance, let's see what the hashcode returned by this function is.
Every value, whether of a value type or a reference type, can produce a number (an int) called its hashcode. Note that this number isn't the address of the variable; it is just a number derived for it. To see a variable's hashcode, call its GetHashCode function:
int a = 145;
Console.WriteLine(a.GetHashCode()); // 145
Look at the hashcodes of other types:
long a1 = 23; // a1.GetHashCode() => 23
long a2 = 23; // a2.GetHashCode() => 23

float f1 = 213.453f; // f1.GetHashCode() => 1129673720
float f2 = 213.453f; // f2.GetHashCode() => 1129673720

bool b1 = true; // b1.GetHashCode() => 1
bool b2 = true; // b2.GetHashCode() => 1
As you can see, the hashcodes of int, long, and bool are directly related to their values. float is different: its hashcode comes from the binary representation of the value, so it doesn't resemble the number you typed. In general, .NET does not guarantee that hashcodes stay the same across runtime versions or program runs (string hashcodes, for example, can change from run to run), so don't store them.
But here it doesn't matter how a hashcode is generated or how it relates to the value. What matters is that equal variables have equal hashcodes. Therefore, the hashcode is another tool that helps us test for equality.
But sometimes it doesn't work out of the box. Look at another example, in which all the properties of student1 and student2 are exactly equal:
Student student1 = new Student()
{
Name = "Joe",
Grade = 3,
Number = 10,
SerialNo = 3213214
};
Student student2 = new Student()
{
Name = "Joe",
Grade = 3,
Number = 10,
SerialNo = 3213214
};
What are their hashcodes?
Console.WriteLine(student1.GetHashCode()); // 58225482
Console.WriteLine(student2.GetHashCode()); // 54267293
Oops. Their hashcodes don’t match.
The reason is the difference in the kind of type. Our Student class is a reference type, but the variables in the first example are value types. With reference types, every variable holds a reference to an object on the heap. So student1 and student2 actually refer to two separate objects in different parts of the heap, and the default GetHashCode is based on the object's identity rather than on its contents. Thus the hashcodes of student1 and student2 are different. (If you don't know about the heap and didn't understand this paragraph, don't worry; just continue with the rest of the article.) That is why reference-type variables don't behave the same way as value types.
Let's get to the point.
How can the GetHashCode function help us test equality for reference types?
Exactly as with the Equals function, for reference types we should override the GetHashCode() of the base object and express the equality concept of our Student class, as follows:
public override int GetHashCode()
{
return SerialNo.GetHashCode();
}
Because our student's hashcode is now the SerialNo's hashcode, this time student1.GetHashCode() and student2.GetHashCode() are equal.
Or, if the student's equality concept is the equality of both SerialNo and Number, we should have something like this:
public override int GetHashCode()
{
return Number.GetHashCode() ^ SerialNo.GetHashCode();
}
Now our class’s GetHashCode(), like Equals, knows the equality concept of our class.
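As a side note on combining fields, XOR is symmetric, so swapping the two values produces the same hashcode and therefore more collisions. The sketch below shows one alternative, System.HashCode.Combine, under the assumption that you target .NET Core 2.1 or later (or otherwise have System.HashCode available). HashedStudent and HashCombineDemo are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical class: equality is defined by Number AND SerialNo together.
class HashedStudent
{
    public int Number { get; set; }
    public int SerialNo { get; set; }

    public override bool Equals(object obj) =>
        obj is HashedStudent other
        && other.Number == Number
        && other.SerialNo == SerialNo;

    // HashCode.Combine mixes the fields in an order-sensitive way.
    // The exact numbers it returns are not stable across program runs.
    public override int GetHashCode() => HashCode.Combine(Number, SerialNo);
}

static class HashCombineDemo
{
    // XOR gives the same result when the two values are swapped.
    public static bool XorIsSymmetric() =>
        (1.GetHashCode() ^ 2.GetHashCode()) == (2.GetHashCode() ^ 1.GetHashCode());

    // Equal objects must still produce equal hashcodes.
    public static bool EqualStudentsShareHash()
    {
        var a = new HashedStudent { Number = 10, SerialNo = 3213214 };
        var b = new HashedStudent { Number = 10, SerialNo = 3213214 };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }
}
```

Either approach satisfies the rule that equal objects have equal hashcodes; the difference is only in how often unequal objects collide.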
How does GetHashCode help Equals test equality faster?
When working with GetHashCode, there are two rules:
- When two variables' hashcodes are equal, their values may be equal
- When two variables' hashcodes are not equal, their values are certainly not equal
Comparing two hashcodes is a single int comparison, which is usually cheaper than a full Equals, and hash-based collections such as Hashtable, Dictionary<TKey,TValue>, and HashSet<T> rely on the hashcode to skip most candidates altogether. So before you call the Equals of an object, call GetHashCode on the two objects and compare the results. If they are equal, call Equals to confirm. If they don't match, the comparison is done: we know the objects are not equal and we don't need to call Equals:
bool result = false;
if (student1.GetHashCode() == student2.GetHashCode())
result = student1.Equals(student2);
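In practice you rarely write this check by hand, because hash-based collections do it for you. Here is a minimal sketch with HashSet<T>; EnrolledStudent and HashSetDemo are hypothetical names for this example, and the class simply repeats the SerialNo-based Equals and GetHashCode shown above.

```csharp
using System.Collections.Generic;

// Hypothetical class: SerialNo decides both Equals and GetHashCode.
class EnrolledStudent
{
    public string Name { get; set; }
    public int SerialNo { get; set; }

    public override bool Equals(object obj) =>
        obj is EnrolledStudent other && other.SerialNo == SerialNo;

    public override int GetHashCode() => SerialNo.GetHashCode();
}

static class HashSetDemo
{
    // The set uses GetHashCode to locate candidates and Equals to confirm,
    // so the second "Joe" is treated as a duplicate and is not added.
    public static int CountUniqueStudents()
    {
        var set = new HashSet<EnrolledStudent>
        {
            new EnrolledStudent { Name = "Joe", SerialNo = 3213214 },
            new EnrolledStudent { Name = "Joe (duplicate)", SerialNo = 3213214 },
            new EnrolledStudent { Name = "Ann", SerialNo = 42 },
        };
        return set.Count; // 2
    }
}
```

If GetHashCode were left un-overridden here, the two Joe objects would get different hashcodes and the set would keep both.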
IEquatable<T>
As you can see, the overridden Equals takes an object and has to cast it, so it isn't type-safe. With the IEquatable<T> interface, we get a type-safe Equals function. Avoiding the cast can also improve performance.
First, let’s look at the IEquatable<T> interface:
public interface IEquatable<T>
{
bool Equals(T other);
}
It has only a single function. The Equals function of the base object takes the object type, but this new Equals function takes the T type.
class Student : IEquatable<Student>
{
    public string Name { get; set; }
    public int Number { get; set; }
    public int SerialNo { get; set; }
    public int Grade { get; set; }

    // Type-safe equality; a null argument is never equal to this instance.
    public bool Equals(Student other)
    {
        return other != null && other.SerialNo == SerialNo;
    }
}
The body of this Equals is like the previous overridden Equals, but here there is no cast. After implementing IEquatable<T>, the class has a second Equals function that accepts the Student type.
Question 1: How can an interface help with performance? It's just an interface. You could just drop IEquatable<Student> from the class definition and the methods would still work the same way. So I don't see how IEquatable<Student> in itself reduces casting here.
Answer: Without the IEquatable<Student> interface, your typed Equals function still works when you call it directly. But the purpose of declaring IEquatable<Student> on our class is to announce to built-in functions and third-party libraries that our Student class has a type-safe Equals function. IEquatable<T> is a well-known .NET interface: any code, anywhere, can check whether a class implements it and, if so, be sure that the class has a type-safe Equals to call. (To understand the answer completely, you need to understand the interface concept and how it helps us in programming.)
Microsoft:
The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey,TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.
Question 2: In the Equals section, you said that many functions (built-in or from third-party libraries) use the base object's Equals function, and that we should override it to get the correct result. But if we only implement IEquatable<T> as in this section, those functions will get into trouble.
Answer: Great question. That problem will occur. To resolve it, exactly as in the Equals section, we should also override the Equals and GetHashCode functions of the base object.
Finally, here is our complete Student class:
class Student : IEquatable<Student>
{
    public string Name { get; set; }
    public int Number { get; set; }
    public int SerialNo { get; set; }
    public int Grade { get; set; }

    // Type-safe equality; a null argument is never equal to this instance.
    public bool Equals(Student other)
    {
        return other != null && other.SerialNo == SerialNo;
    }

    // Untyped equality delegates to the type-safe version.
    public override bool Equals(object obj)
    {
        if (obj is Student student)
            return Equals(student);
        else
            return false;
    }

    // Must agree with Equals: equal students produce equal hashcodes.
    public override int GetHashCode()
    {
        return SerialNo.GetHashCode();
    }
}
Note that IEquatable<T> should also be implemented by structs. For structs, relying on the base Equals creates two further performance problems: boxing and, in the default implementation, the use of reflection.
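Here is a minimal sketch of what that looks like for a struct. StudentId and StructEqualityDemo are hypothetical names invented for this example, and the readonly struct syntax assumes C# 7.2 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type that wraps a serial number.
readonly struct StudentId : IEquatable<StudentId>
{
    public StudentId(int serialNo) { SerialNo = serialNo; }

    public int SerialNo { get; }

    // Type-safe Equals: no boxing and no reflection.
    public bool Equals(StudentId other) => SerialNo == other.SerialNo;

    // Untyped Equals still has to exist for callers that only know object.
    public override bool Equals(object obj) => obj is StudentId other && Equals(other);

    public override int GetHashCode() => SerialNo.GetHashCode();

    // Structs get no == operator by default, so it is common to add one.
    public static bool operator ==(StudentId left, StudentId right) => left.Equals(right);
    public static bool operator !=(StudentId left, StudentId right) => !left.Equals(right);
}

static class StructEqualityDemo
{
    // List<T>.Contains picks up IEquatable<StudentId> through the default comparer.
    public static bool ListContainsId()
    {
        var ids = new List<StudentId> { new StudentId(12), new StudentId(34) };
        return ids.Contains(new StudentId(34));
    }
}
```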
IEqualityComparer<T>
Suppose our Student class has different equality concepts in different situations. For example, in one case SerialNo determines equality, and in another, Number. How should we implement that? By creating Equals1 and Equals2? That's impossible, because the base object and IEquatable<T> each have only one Equals function.
That is what IEqualityComparer<T> is for. But notice that the Student class won't implement it. We should create other classes that hold Student's different equality concepts and pass them to every function that needs to know how to compare students.
public interface IEqualityComparer<in T>
{
bool Equals(T x, T y);
int GetHashCode(T obj);
}
Let’s create two classes that hold Student’s equality concepts:
class SerialNoEqualityComparer : IEqualityComparer<Student>
{
public bool Equals(Student s1, Student s2)
{
if (s1 == null && s2 == null)
return true;
else if (s1 == null || s2 == null)
return false;
else if (s1.SerialNo == s2.SerialNo)
return true;
else
return false;
}
public int GetHashCode(Student student)
{
return student.SerialNo.GetHashCode();
}
}
and
class NumberEqualityComparer : IEqualityComparer<Student>
{
public bool Equals(Student s1, Student s2)
{
if (s1 == null && s2 == null)
return true;
else if (s1 == null || s2 == null)
return false;
else if (s1.Number == s2.Number)
return true;
else
return false;
}
public int GetHashCode(Student student)
{
return student.Number.GetHashCode();
}
}
And we can use these classes in Contains (add using System.Linq to get this overload for List):
List<Student> students = new List<Student>();
// Add some students to this List

var specificStudent = new Student()
{ SerialNo = 12, Number = 34, Grade = 3, Name = "Joe" };

bool exist1 = students
    .Contains(specificStudent, new NumberEqualityComparer());
bool exist2 = students
    .Contains(specificStudent, new SerialNoEqualityComparer());
exist1 will be true if the students list contains at least one student with Number=34, and exist2 will be true if it contains at least one student with SerialNo=12.
It would be even better to use nested classes and write SerialNoEqualityComparer and NumberEqualityComparer inside the Student class, because no other class uses them. Look at the final Student class on GitHub.
IEqualityComparer<T> is also useful when a class is closed to us, for example when it lives in a third-party library and we can't make it implement IEquatable<T>.
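The nested-class idea can be sketched as follows. Pupil, Pupil.NumberComparer, and NestedComparerDemo are hypothetical names for this illustration and are not the code from the author's repository. The sketch also shows that the same comparer can be handed to other consumers, such as LINQ's Distinct or the HashSet<T> constructor.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Student with a nested comparer.
class Pupil
{
    public int Number { get; set; }
    public int SerialNo { get; set; }

    // Nested so that the equality rule lives next to the class it describes.
    public sealed class NumberComparer : IEqualityComparer<Pupil>
    {
        public bool Equals(Pupil x, Pupil y)
        {
            if (ReferenceEquals(x, y)) return true;     // same instance, or both null
            if (x is null || y is null) return false;   // exactly one is null
            return x.Number == y.Number;
        }

        public int GetHashCode(Pupil pupil) => pupil.Number.GetHashCode();
    }
}

static class NestedComparerDemo
{
    static List<Pupil> SamplePupils() => new List<Pupil>
    {
        new Pupil { Number = 1, SerialNo = 100 },
        new Pupil { Number = 1, SerialNo = 200 }, // same Number as the first
        new Pupil { Number = 2, SerialNo = 300 },
    };

    // Distinct keeps one pupil per Number: returns 2.
    public static int DistinctByNumber() =>
        SamplePupils().Distinct(new Pupil.NumberComparer()).Count();

    // A HashSet built with the comparer applies the same rule: returns 2.
    public static int HashSetByNumber() =>
        new HashSet<Pupil>(SamplePupils(), new Pupil.NumberComparer()).Count;
}
```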
IComparable<T>
Do you want to order your objects? To do that, you need to know which of two objects is bigger. This is done with the help of IComparable<T> and its single function:
public interface IComparable<in T>
{
int CompareTo(T other);
}
As you can see, the CompareTo function returns an int. Only the sign of the result matters to callers, and by convention we return one of three numbers: 1, 0, or -1. The rule for choosing the return value is:
If the given object is smaller than the current object, return 1
If the given object is equal to the current object, return 0
If the given object is bigger than the current object, return -1
We decide that SerialNo determines which student is bigger. So we implement the IComparable<T> interface like this:
class Student : IComparable<Student>
{
public string Name { get; set; }
public int Number { get; set; }
public int SerialNo { get; set; }
public int Grade { get; set; }
public int CompareTo(Student other)
{
if (other.SerialNo < SerialNo)
return 1;
else if (other.SerialNo == SerialNo)
return 0;
else
return -1;
}
}
The simple built-in value types already implement the IComparable<T> interface, so we can simplify CompareTo like this:
public int CompareTo(Student other)
{
return SerialNo.CompareTo(other.SerialNo);
}
Now we can use the Sort function of the List class (The Sort function calls the CompareTo function inside itself):
List<Student> students = new List<Student>();
// add some students
students.Sort();
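A small end-to-end sketch of the above, using hypothetical names (RankedStudent and SortDemo) rather than the article's Student class. The null check reflects the usual .NET convention that any instance sorts after null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class ordered by SerialNo.
class RankedStudent : IComparable<RankedStudent>
{
    public int SerialNo { get; set; }

    public int CompareTo(RankedStudent other)
    {
        if (other is null) return 1; // convention: an instance is greater than null
        return SerialNo.CompareTo(other.SerialNo);
    }
}

static class SortDemo
{
    // Sort() calls CompareTo internally and orders the list ascending.
    public static int[] SortedSerialNos()
    {
        var students = new List<RankedStudent>
        {
            new RankedStudent { SerialNo = 30 },
            new RankedStudent { SerialNo = 10 },
            new RankedStudent { SerialNo = 20 },
        };
        students.Sort();
        return students.Select(s => s.SerialNo).ToArray(); // 10, 20, 30
    }
}
```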
IComparer<T>
Just as IEqualityComparer<T> helps us inject an equality concept into functions, IComparer<T> does the same for ordering. So we should create another class that holds the greater-than logic for our class. First, look at the IComparer<T> interface:
public interface IComparer<in T>
{
int Compare(T x, T y);
}
Now with this interface, we can create a SerialNoComparer class like this:
class SerialNoComparer : IComparer<Student>
{
public int Compare(Student s1, Student s2)
{
return s1.SerialNo.CompareTo(s2.SerialNo);
}
}
And its usage is here:
students.Sort(new SerialNoComparer());
Don't forget to do the same as with IEqualityComparer<T> and use a nested class for a cleaner design.
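For one-off orderings, a full comparer class is not always necessary. As a sketch, Comparer<T>.Create builds an IComparer<T> from a lambda. GradedStudent and ComparerFactoryDemo are hypothetical names for this example, and the ordering rule (highest Grade first, ties broken by SerialNo) is just an assumption chosen to show a two-level comparison.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Student.
class GradedStudent
{
    public string Name { get; set; }
    public int Grade { get; set; }
    public int SerialNo { get; set; }
}

static class ComparerFactoryDemo
{
    // Highest Grade first; students with the same Grade are ordered by SerialNo.
    public static readonly IComparer<GradedStudent> ByGradeDescending =
        Comparer<GradedStudent>.Create((a, b) =>
        {
            int byGrade = b.Grade.CompareTo(a.Grade); // operands swapped: descending
            return byGrade != 0 ? byGrade : a.SerialNo.CompareTo(b.SerialNo);
        });

    public static string[] OrderedNames()
    {
        var students = new List<GradedStudent>
        {
            new GradedStudent { Name = "Ann", Grade = 3, SerialNo = 2 },
            new GradedStudent { Name = "Bob", Grade = 5, SerialNo = 9 },
            new GradedStudent { Name = "Cid", Grade = 5, SerialNo = 1 },
        };
        students.Sort(ByGradeDescending);
        return students.Select(s => s.Name).ToArray(); // Cid, Bob, Ann
    }
}
```

A named comparer class is still the better choice when the rule is reused in many places or deserves its own tests.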
Thank you for reading. I will be happy to know your opinion. You can comment on my LinkedIn post of this article.