Streamline your Excel data efficiently by quickly removing duplicate values. This article demonstrates a straightforward approach to cleaning data, from simple lists to complex multi-column spreadsheets.
Key Insights
- Use Excel's "Remove Duplicates" feature by selecting any cell within your data, navigating to the Data tab, and clicking the Remove Duplicates button in the Data Tools group.
- When removing duplicates, Excel retains the original value and eliminates subsequent identical entries, as demonstrated by the removal of duplicate names "Mabel" and "Maria," reducing a list to eight unique values.
- In more complex data sets with multiple columns, ensure that all columns are selected in the Remove Duplicates dialog box to verify entire rows for duplication, illustrated through an example where four duplicate entries were identified and removed, leaving 1,189 unique values.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Remove duplicates. In this section, we'll use remove duplicates to eliminate duplicate data. In this first exercise, we have two lists that are exactly the same.
We're going to remove duplicates from one of the lists and then compare it with the original list to see what was removed. The process of removing duplicates is very simple. You're going to click anywhere within the data where you would like to remove duplicates.
I'm going to click on this second list anywhere within the data. Then I'll navigate over to the data tab on the ribbon, the data tools group, and then look for remove duplicates. If the width of your application is wide enough, you'll actually see the text remove duplicates or you'll look for this icon.
If you hover your mouse over any of the icons on the ribbon, usually a pop-up will come up letting you know that you're in the right place. So with the cell selected within the data where I'd like to remove duplicates, I'm simply going to click remove duplicates. The column that needs to be selected is already selected.
I see that it's checked here. My data does have headers, so I'm not going to include the headers as part of the criteria for removing duplicates. Excel is going to look throughout the entire list and when it finds the second value that is the duplicate, it will remove it but keep the original value.
I'll click okay. Excel displays a progress report letting me know that two duplicate values have been found and removed and now I have eight unique values remaining. I can simply click okay and if I compare my unique list with the original list, I'll see that Mabel and Maria were the duplicates that were removed from the list, the second list.
So let's take a look at an exercise that we would practice in class related to this. If I go to the next tab, now we've increased the level of difficulty. We have multiple columns and multiple rows.
The process though is pretty much the same. I'm going to select a cell anywhere within the data. There's no need to select the entire region.
I'll go to the data tab. I'll head over to the data tools group. I'll click remove duplicates.
Immediately, I can see Excel did grab all the information. Also, my data has headers. If that was not checked, then my data would look like this.
I want to exclude the headers from the remove duplicates process. So I'll click the check box there and best bet is to make sure all the columns are selected because you may want to check each and every value in one row to see if it's duplicated by each and every value in another row. In this case, the entire row is a unique value and that's your best bet.
I'll now click okay. Excel gives me a progress report, tells me that four duplicate values were found and removed and 1189 unique values remain. I'll click okay.
I can go into the spreadsheet, type four, press ENTER and that solves that particular task. So that's how you remove duplicates. Simply select the cell anywhere within the data.
Go to the data tab in the data tools group. You'll click on remove duplicates and click okay.