Excel Tips and Tricks for Data Analysts - Create a copy of a table without duplicates


To validate some of my skills, I am studying for the Excel MOS 200 exam. Since I work in data processing a lot, it seemed like a good time to document some features in Excel that may not necessarily be that apparent to the casual Excel user. 

Next up - recreate a table of data without duplicate records. This is very similar to the previous post where I showed how to delete duplicate records in a table in Excel, but in this case, you may want to keep the original table. In photo-editing, its called non-destructive editing.

Let's start with the sample Financial Excel sheet from Microsoft. Its available here:

The sheet has 701 rows in it (including the header row), and as I promised, I am reusing it for this post and maybe a few more as well. To start, we'll copy the first four rows of data at the top, and paste them into the bottom of the sheet.

Then select all 705 rows, including the header.

Click on the DATA tab in the menu bar, and then click the Advanced option in the Sort and Filter section.

The Advanced Filter window will appear. At the top is an Action option. Change it to Copy to another location. The rows you have selected will be listed in the List Range field. if you only wanted to filter based on a specific column, you could do that by selecting it as part of the Criteria range field. For our purposes, we will leave that blank, since we want to make sure all fields are unique. Finally, check the Unique records only box. The box should look like the one below.


Now click OK. The Excel sheet will update, and you'll see that the duplicated rows are removed in the copy of the data. 

You can also copy the data to another tab in your workbook, if you start on the destination tab when clicking the Advanced button. Create that tab before you start if there is no other tab.

For this example, I renamed the first tab to 1 so the entire list range would show up in my screenshot. From the destination tab, click the Advanced option to repeat the above process. Switch the Action option to Copy to another location. In the list option, you can enter the relative selection information proceeded by the tab name, as I show here, or you can click the arrow button at the end of the row, and then switch to the source tab and make your selection. Once you have done that, click the down arrow as show here, and the entire Advanced Filter window will reappear.


Click in the upper left cell where you want the data to be pasted, and then check  the Unique records only button. Then click OK.
 


The Excel sheet will update, and you'll see that the duplicated rows are removed in the copy of the data on a new tab. 


Very handy when bringing records into a database or an analytics process.

Comments