Monday, May 27, 2019

Removing All Duplicate Values from ArrayList including Java 8 Streams

1. Overview

In this tutorial, we'll learn how to remove duplicate elements from an ArrayList. This need comes up often in real projects, and there are several ways to handle it.

We will start with a Set, then build a new List, calling the built-in contains() method before adding each value to newList.

Next, we'll remove duplicates with the Stream API introduced in Java 8.

Finally, we'll look at a list of Employee objects. Each Employee has an id and a name. After adding some Employee objects to the list, we realize that it contains duplicate employees with the same id. How do we remove duplicate custom objects from an ArrayList?

We'll see how to clean up duplicate custom objects at the end of this post.

In this post, we will run all our code on the input list below.

List<String> fewMonths = new ArrayList<>();
fewMonths.add("JAN");
fewMonths.add("FEB");
fewMonths.add("MAR");
fewMonths.add("APR");
fewMonths.add("FEB");

Output:

Printing the duplicate list.

Duplicate List : [JAN, FEB, MAR, APR, FEB]

After removing duplicates, the list should look like this:

After removing duplicates : [JAN, FEB, MAR, APR]

2. Using Set

Set is an interface in Java that does not allow duplicates. Two of its implementations are HashSet and LinkedHashSet.

HashSet doesn't preserve insertion order, whereas LinkedHashSet does.

We just need to pass the duplicate fewMonths list to the set's constructor when creating the instance.

Set<String> afterRemovingDuplicates = new LinkedHashSet<>(fewMonths);

Output:

After removing duplicates : [JAN, FEB, MAR, APR]

The order is the same as in the input list.
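As a minimal, self-contained sketch of the Set approach, the class below wraps the one-liner in a helper and copies the result back into an ArrayList for callers that need a List again. The class name RemovalUsingLinkedHashSet and the method name removeDuplicates are illustrative choices, not part of the original example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RemovalUsingLinkedHashSet {

    // Returns a new list holding the first occurrence of each value, in input order.
    static List<String> removeDuplicates(List<String> input) {
        // LinkedHashSet drops repeated values and remembers insertion order
        Set<String> unique = new LinkedHashSet<>(input);
        // Copy back into an ArrayList when the caller needs a List again
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> fewMonths = new ArrayList<>(Arrays.asList("JAN", "FEB", "MAR", "APR", "FEB"));
        System.out.println("Duplicate List : " + fewMonths);
        System.out.println("After removing duplicates : " + removeDuplicates(fewMonths));
    }
}
```

Swapping LinkedHashSet for HashSet in this sketch would still remove the duplicates, but the resulting order would no longer be guaranteed to match the input.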

3. For Loop - with new List


In this approach:

A) Create a new List.
B) Iterate over the duplicate list.
C) For each value, call contains() on newList to check whether the value is already present.
D) If it is not present, add it to newList.
E) If it is present, skip it.
F) When the loop finishes, newList is free of duplicates.

List<String> newList = new ArrayList<>();

for (String month : fewMonths) {

 if (!newList.contains(month)) {
  newList.add(month);
 }
}
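The loop above calls contains() on an ArrayList, which scans newList once for every input element. As a variation that is not part of the original example, the sketch below keeps the same loop but tracks already-seen values in a HashSet, relying on the fact that Set.add() returns false when the value is already present. The names RemovalUsingSeenSet, removeDuplicates and seen are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemovalUsingSeenSet {

    // Same idea as the contains() loop, but membership checks go through a HashSet.
    static List<String> removeDuplicates(List<String> input) {
        List<String> newList = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String month : input) {
            // add() returns true only the first time a value is inserted
            if (seen.add(month)) {
                newList.add(month);
            }
        }
        return newList;
    }

    public static void main(String[] args) {
        List<String> fewMonths = Arrays.asList("JAN", "FEB", "MAR", "APR", "FEB");
        System.out.println("After removing duplicates : " + removeDuplicates(fewMonths));
    }
}
```

For a five-element list the difference is negligible; the variation is only likely to matter for large inputs.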


4. Java 8 - Stream.distinct()

In this approach, we'll use the Java 8 Stream API. It provides a distinct() method, which returns a stream of distinct elements, comparing them with equals(). For an ordered stream, such as one created from a List, the first occurrence of each element is kept.

List<String> newMonthsListWithoutDuplicates = fewMonths.stream()
                .distinct()
                .collect(Collectors.toList());  
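For reference, here is a minimal runnable sketch of the distinct() approach with the imports it needs. The class name RemovalUsingStreamDistinct and the helper removeDuplicates are illustrative names, not taken from the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RemovalUsingStreamDistinct {

    // Streams the input, keeps the first occurrence of each value, and collects into a new List.
    static List<String> removeDuplicates(List<String> input) {
        return input.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fewMonths = Arrays.asList("JAN", "FEB", "MAR", "APR", "FEB");
        System.out.println("After removing duplicates : " + removeDuplicates(fewMonths));
    }
}
```

The input list is left untouched; the stream pipeline produces a separate list.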

5. Custom Objects

Let's see what happens when the List holds user-defined objects rather than String or wrapper objects.

For example:

Employee.java

class Employee {
 private int id;
 private String name;
 
 public Employee(int id, String name) {
  this.id = id;
  this.name = name;
 }
 
 // setters and getters
 
 @Override
 public String toString() {
  return "Employee [id=" + id + ", name=" + name + "]";
 }
 
}

Removing duplicates using Java 8:

Now let's add 5 Employee objects, all with id 100. After cleaning up the ArrayList, it should contain only one Employee object. The program below uses the distinct() method.

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class RemovalUsingDistinct {

 public static void main(String[] args) {

  // Five Employee objects that all share id 100
  List<Employee> employees = new ArrayList<>();
  employees.add(new Employee(100, "Jhon"));
  employees.add(new Employee(100, "Jhon"));
  employees.add(new Employee(100, "Jhon"));
  employees.add(new Employee(100, "Jhon"));
  employees.add(new Employee(100, "Jhon"));

  System.out.println("Duplicate List size: " + employees.size());

  // distinct() compares elements with equals()
  List<Employee> employeesWithoutDuplicates = employees.stream().distinct().collect(Collectors.toList());
  System.out.println("After removing duplicates list size : " + employeesWithoutDuplicates.size());
 }

}

Output:

Duplicate List size: 5
After removing duplicates list size : 5

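Observe that the list size is the same before and after removal.

All of the solutions shown above give the same result here: the output has the same size as the input.

In scenarios like this, we need to know how the API works internally. LinkedHashSet, contains() and distinct() all rely on equals(), and the hash-based ones also rely on hashCode(). Employee inherits the default implementations from Object, which treat two separately created objects as different.

The solution is to override the equals() and hashCode() methods in the Employee class as below.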
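The behaviour can be checked directly. The sketch below uses a stand-in Employee class without any equals() or hashCode() override (nested inside a hypothetical DefaultEqualsDemo class so the example is self-contained) and shows that the inherited Object.equals() only matches the very same reference.

```java
import java.util.ArrayList;
import java.util.List;

class DefaultEqualsDemo {

    // Stand-in for an Employee class that does NOT override equals()/hashCode()
    static class Employee {
        final int id;
        final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Employee first = new Employee(100, "Jhon");
        Employee second = new Employee(100, "Jhon");

        // Object.equals() compares references, not field values
        System.out.println(first.equals(second)); // false
        System.out.println(first.equals(first));  // true

        List<Employee> list = new ArrayList<>();
        list.add(first);
        // contains() uses equals(), so the "same" employee is not found
        System.out.println(list.contains(second)); // false
    }
}
```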

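@Override
public int hashCode() {
 // Employees with the same id must produce the same hash code
 return Integer.hashCode(id);
}

@Override
public boolean equals(Object obj) {
 if (this == obj)
  return true;
 // Also rejects null, so the cast below is safe
 if (!(obj instanceof Employee))
  return false;
 Employee other = (Employee) obj;
 return id == other.id;
}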
 
Output:

Now the duplicates are removed:

Duplicate List size: 5
After removing duplicates list size : 1
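To tie the approaches together, the following sketch runs LinkedHashSet, the contains() loop and distinct() against an Employee class that compares by id. The outer class RemovalWithEqualsAndHashCode and the three sizeWith... helpers are hypothetical names introduced for this illustration, and Employee is nested only to keep the example in one file.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

class RemovalWithEqualsAndHashCode {

    static class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Employee)) return false;
            return id == ((Employee) obj).id; // equality is based on id only
        }
    }

    static int sizeWithSet(List<Employee> input) {
        return new LinkedHashSet<>(input).size();
    }

    static int sizeWithLoop(List<Employee> input) {
        List<Employee> newList = new ArrayList<>();
        for (Employee employee : input) {
            if (!newList.contains(employee)) {
                newList.add(employee);
            }
        }
        return newList.size();
    }

    static int sizeWithDistinct(List<Employee> input) {
        return input.stream().distinct().collect(Collectors.toList()).size();
    }

    public static void main(String[] args) {
        List<Employee> employees = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            employees.add(new Employee(100, "Jhon"));
        }
        System.out.println("LinkedHashSet : " + sizeWithSet(employees));      // 1
        System.out.println("contains loop : " + sizeWithLoop(employees));     // 1
        System.out.println("distinct      : " + sizeWithDistinct(employees)); // 1
    }
}
```

Because equality here depends only on id, two employees with the same id but different names would also be treated as duplicates.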

6. Conclusion

In this tutorial, we've seen how to remove duplicates from an ArrayList using LinkedHashSet, a new list with the contains() method, and the Java 8 Stream API's distinct() method.

We also demonstrated removing duplicate user-defined objects from a List.
All code snippets shown here are available on GitHub.
