Pranay Rana: DistinctBy in Linq (Find Distinct object by Property)

Friday, January 25, 2013

In this post I discuss how to get distinct objects from a collection based on one of their properties, and show three different ways to do it easily. These approaches do more than the Distinct method currently available in the .NET Framework.

Here is how the Distinct method of LINQ works right now.
public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}
Consider a Product class that has Code and Name properties.
The requirement is to find all products with distinct Code values.
Product[] products = { new Product { Name = "apple", Code = 9 }, 
                       new Product { Name = "orange", Code = 4 }, 
                       new Product { Name = "apple", Code = 10 }, 
                       new Product { Name = "lemon", Code = 9 } };
var lstDistProduct = products.Distinct();
foreach (Product p in lstDistProduct)
{
     Console.WriteLine(p.Code + " : " + p.Name);
}
Output

It returns all four products even though two of them have the same Code value, so this does not meet the requirement of getting objects with distinct Code values.
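The reason is that Distinct() with no arguments uses the default equality comparer, and for a class like Product, which does not override Equals and GetHashCode, that means reference equality. Here is a small sketch of that behaviour. The DefaultDistinctDemo class and its CountDistinct helper are names made up for this illustration only.

```csharp
using System;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class DefaultDistinctDemo
{
    // Counts how many elements Distinct() keeps when no comparer is supplied.
    public static int CountDistinct(Product[] items)
    {
        return items.Distinct().Count();
    }

    public static void Main()
    {
        var a = new Product { Name = "apple", Code = 9 };
        var b = new Product { Name = "apple", Code = 9 }; // same values, different object

        // Two separate instances: reference equality treats them as different.
        Console.WriteLine(CountDistinct(new[] { a, b })); // 2

        // The same instance twice: only here does Distinct() drop an element.
        Console.WriteLine(CountDistinct(new[] { a, a })); // 1
    }
}
```

So even two products with identical Name and Code survive Distinct() as long as they are separate objects.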

Way 1: Use the MoreLinq library
The first way to meet the requirement is to use the MoreLinq library, which provides a function called DistinctBy that lets you specify the property by which you want to find distinct objects.
The code below shows how to use the function.
var list1 = products.DistinctBy(x => x.Code);

foreach (Product p in list1)
{
     Console.WriteLine(p.Code + " : " + p.Name);
}
Output

As you can see in the output, only three objects are returned, one for each distinct Code value, which is what I want.
If you want to pass more than one property, you can do it like this: var list1 = products.DistinctBy(a => new { a.Name, a.Code });
You can read about MoreLinq and download the DLL from here: http://code.google.com/p/morelinq/. The library also contains a number of other functions that you can check out.
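To give an idea of what a DistinctBy-style function has to do, here is a minimal sketch built on a HashSet of the keys seen so far. This is not MoreLinq's source code; the DistinctSketch class and the DistinctByKey method are hypothetical names, chosen so that they do not clash with any real DistinctBy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Hypothetical helper class, written only to illustrate the idea.
public static class DistinctSketch
{
    // Yields the first element seen for each distinct key, in source order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (TSource item in source)
        {
            // HashSet.Add returns false when the key is already in the set.
            if (seenKeys.Add(keySelector(item)))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        Product[] products = { new Product { Name = "apple", Code = 9 },
                               new Product { Name = "orange", Code = 4 },
                               new Product { Name = "apple", Code = 10 },
                               new Product { Name = "lemon", Code = 9 } };

        foreach (Product p in products.DistinctByKey(x => x.Code))
        {
            Console.WriteLine(p.Code + " : " + p.Name);
        }
        // Prints:
        // 9 : apple
        // 4 : orange
        // 10 : apple
    }
}
```

Because the first product seen for each key wins, "lemon" is dropped in favour of the earlier "apple" with Code 9.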

Way 2: Implement IEqualityComparer
The second way to achieve the same functionality is to use the overload of Distinct that takes a comparer as an argument.
Here is the MSDN documentation on it: Enumerable.Distinct<TSource> Method (IEnumerable<TSource>, IEqualityComparer<TSource>)

For that I implemented IEqualityComparer<Product> in a new class, ProductComparare, which you can see in the code below.
   
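    // Compares products by the key that the supplied projection function returns.
    class ProductComparare : IEqualityComparer<Product>
    {
        private readonly Func<Product, object> _funcDistinct;

        public ProductComparare(Func<Product, object> funcDistinct)
        {
            this._funcDistinct = funcDistinct;
        }

        // Two products are equal when their projected keys are equal.
        public bool Equals(Product x, Product y)
        {
            return _funcDistinct(x).Equals(_funcDistinct(y));
        }

        // The hash code comes from the same key, so equal products hash alike.
        public int GetHashCode(Product obj)
        {
            return this._funcDistinct(obj).GetHashCode();
        }
    }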
In the ProductComparare constructor I pass a function as an argument, so whenever I create an object of this class I have to pass my projection function.
In the Equals method I compare the objects returned by the projection function.
The following shows how I used this comparer implementation to satisfy my requirement.
var list2 = products.Distinct(new ProductComparare(a => a.Code));

foreach (Product p in list2)
{
     Console.WriteLine(p.Code + " : " + p.Name);
}
Output

So this approach also satisfies my requirement easily. I have not looked at the code of the MoreLinq library, but I think it works in a similar way. If you want to pass more than one property, you can do it like this: var list1 = products.Distinct(new ProductComparare(a => new { a.Name, a.Code }));
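The multi-property case works because anonymous types compare by the values of their properties, so two keys with the same Name and Code are equal. A small sketch of this follows. It repeats the comparer with the generic IEqualityComparer<Product> interface, and the CompositeKeyDemo class, its helper method and the extra duplicate product in the data are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Same comparer as above, compared by the key the projection function returns.
public class ProductComparare : IEqualityComparer<Product>
{
    private readonly Func<Product, object> _funcDistinct;

    public ProductComparare(Func<Product, object> funcDistinct)
    {
        _funcDistinct = funcDistinct;
    }

    public bool Equals(Product x, Product y)
    {
        return _funcDistinct(x).Equals(_funcDistinct(y));
    }

    public int GetHashCode(Product obj)
    {
        return _funcDistinct(obj).GetHashCode();
    }
}

// Hypothetical demo class, not part of the original post.
public static class CompositeKeyDemo
{
    // Keeps one product per distinct (Name, Code) pair.
    public static List<Product> DistinctByNameAndCode(IEnumerable<Product> items)
    {
        return items.Distinct(new ProductComparare(a => new { a.Name, a.Code })).ToList();
    }

    public static void Main()
    {
        Product[] products = { new Product { Name = "apple", Code = 9 },
                               new Product { Name = "orange", Code = 4 },
                               new Product { Name = "apple", Code = 10 },
                               new Product { Name = "lemon", Code = 9 },
                               // Extra entry added for this sketch: duplicates the first pair.
                               new Product { Name = "apple", Code = 9 } };

        Console.WriteLine(DistinctByNameAndCode(products).Count); // 4
    }
}
```

Only the added duplicate is removed here, because "apple" with Code 9 and "lemon" with Code 9 differ in Name.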

Way 3: Easy GroupBy way
The third and easiest way, which avoids both the MoreLinq dependency and the comparer implementation used above, is to use GroupBy as shown below.
List<Product> list = products
                   .GroupBy(a => a.Code)
                   .Select(g => g.First())
                   .ToList();

foreach (Product p in list)
{
     Console.WriteLine(p.Code + " : " + p.Name);
}
In the code above I group the objects by the property and then, in the Select function, pick the first one of each group, which does the job for me.
Output

So this approach also satisfies my requirement easily, and the output is the same as in the two approaches above. If you want to pass more than one property, you can do it like this: .GroupBy(a => new { a.Name, a.Code }).

So this is a very easy trick to achieve the functionality I want without using anything extra in my code.
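One thing the GroupBy approach makes easy is choosing which object represents each group. In LINQ to Objects, groups come out in the order their keys first appear and the elements inside a group keep their source order, so First() gives the earliest product and Last() the latest. A small sketch follows; the GroupByDemo class and its two helper methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class GroupByDemo
{
    // Keeps the earliest product for each Code.
    public static List<Product> FirstPerCode(IEnumerable<Product> items)
    {
        return items.GroupBy(a => a.Code).Select(g => g.First()).ToList();
    }

    // Keeps the latest product for each Code.
    public static List<Product> LastPerCode(IEnumerable<Product> items)
    {
        return items.GroupBy(a => a.Code).Select(g => g.Last()).ToList();
    }

    public static void Main()
    {
        Product[] products = { new Product { Name = "apple", Code = 9 },
                               new Product { Name = "orange", Code = 4 },
                               new Product { Name = "apple", Code = 10 },
                               new Product { Name = "lemon", Code = 9 } };

        foreach (Product p in FirstPerCode(products))
        {
            Console.WriteLine(p.Code + " : " + p.Name);
        }
        // Prints:
        // 9 : apple
        // 4 : orange
        // 10 : apple

        foreach (Product p in LastPerCode(products))
        {
            Console.WriteLine(p.Code + " : " + p.Name);
        }
        // Prints:
        // 9 : lemon
        // 4 : orange
        // 10 : apple
    }
}
```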

Conclusion
So above are the ways you can easily get the distinct objects of a collection by a property of the object.

Leave a comment if you have any questions or if you liked it.
