How to remove duplicates from a list

There are tons of ways to remove duplicates from a list of items, but most of them are way too complicated and technical for a noob (if you don’t know what that is, then that’s you) to perform.

Let’s look at some of them, and let me know in the comments if you found them useful. Here they come, in order of increasing difficulty (geekness):

TextPad

TextPad is a free (well, nagware) text editor with so many built-in tools that I cannot talk about it without crying…

  • open the file in TextPad
  • select Tools > Sort
  • check the ‘remove duplicate lines’ box
  • click OK

Excel

Another program abused by millions (billions?) to do stuff that could be done with a 10-year-old cellphone. The method the bozos at Microsoft tell you to use is the dumbest way to do it, because you’re overwriting your original list. Here’s the smart way:

  • Select the data
  • Click Data > PivotTable… (Office 2003) or Insert > PivotTable (Office 2007)
  • Click ‘Finish’ to create a new sheet with an empty pivot table
  • Drag the column for which you need to remove duplicates into the left part of the pivot table
  • adding flavor: you can now sort, filter, group, analyze, you name it.

SQL

If your data is in a database, and you have access to SQL, perform the following query:

  • SELECT DISTINCT column FROM table
  • in place of ‘column’ you type what you need to be unique
  • in place of ‘table’ you type the table name
  • tip: you can combine more columns by listing them separated by commas, without parentheses: SELECT DISTINCT column1, column2, … FROM table

Linux

If you have the file on a Linux or Unix system, type the following in the terminal (command line):

  • sort -u file > output
  • where ‘file’ is the name of your file and ‘output’ is the name of the output file. Note that the result comes out sorted, so the original line order is lost.
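
If you want to try this out before touching your real data, here’s a minimal sketch. The file names (fruits.txt and friends) and the sample lines are made up for illustration. It also shows an awk one-liner, which is not part of the recipe above but a common alternative when you want to keep the lines in their original order instead of sorted.

```shell
# Work in a throwaway directory so no real files get overwritten.
workdir=$(mktemp -d)

# Hypothetical sample list with a few repeated lines.
printf 'pear\napple\npear\nbanana\napple\n' > "$workdir/fruits.txt"

# The recipe from this post: sort the lines and keep one copy of each.
sort -u "$workdir/fruits.txt" > "$workdir/unique_sorted.txt"

# Alternative (an addition, not from the post): keep only the first
# occurrence of each line, preserving the original order.
awk '!seen[$0]++' "$workdir/fruits.txt" > "$workdir/unique_ordered.txt"

echo "sorted:";  cat "$workdir/unique_sorted.txt"
echo "ordered:"; cat "$workdir/unique_ordered.txt"
```

The awk version remembers every line it has already seen, so it needs no sorting, but it does hold all the unique lines in memory.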

I’d like to know if you can come up with new and maybe even faster ways!
