45

I need to perform two operations in Sublime Text: extract unique lines and extract duplicate lines. For example, given the input:

a
b
a

Extract duplicates should result in:

a

and Extract unique should result in:

b

Is there a built-in operation or a plugin to do that?

karel
Poma

5 Answers

62

You can find duplicate lines easily by running Sort Lines and then searching for this regex, which uses the line boundary markers ^ and $ and the backreference \1:

^(.+)$\n^\1$

Follow that with Find All, Copy, Paste into a new tab, and Permute Lines | Unique, and you've extracted the duplicates.
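
Outside the editor, the same sort-then-match idea can be sketched in Python. This is only an illustration under assumptions: the function name `extract_duplicates_via_regex` is made up here, and Python's plain `sorted` stands in for Sort Lines (it compares by code point, which may order lines differently from the editor, but identical lines still end up adjacent).

```python
import re

# Same pattern as above; re.MULTILINE makes ^ and $ match at line boundaries.
DUPLICATE_PAIR = re.compile(r"^(.+)$\n^\1$", re.MULTILINE)


def extract_duplicates_via_regex(text):
    """Return each duplicated non-empty line once (hypothetical helper)."""
    # Stand-in for Sort Lines: identical lines become adjacent.
    sorted_text = "\n".join(sorted(text.splitlines()))
    duplicates = []
    # findall returns the captured line for every adjacent identical pair.
    for line in DUPLICATE_PAIR.findall(sorted_text):
        # Stand-in for Permute Lines | Unique: keep the first copy only.
        if line not in duplicates:
            duplicates.append(line)
    return duplicates


print(extract_duplicates_via_regex("a\nb\na"))
```

Because the pattern uses `.+`, empty lines are never reported as duplicates in this sketch.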

twamley
12

Unfortunately I don't have access to Sublime Text at the moment, so I'm not able to test this, but I believe something like the following might work for you:

  1. Sort the lines via the Edit -> Sort Lines command
  2. Install the Highlight Duplicates plugin, and use it to highlight all the duplicate lines
  3. Cut the highlighted lines to the Clipboard, and paste them into a New File
  4. The lines that remain in the original file are your Extract Unique lines
  5. In the New File, select all the text, and remove duplicate lines via the Edit -> Permute Lines -> Unique command
  6. The lines that remain in the New File are your Extract Duplicates lines

I'm not entirely sure that step #1 is actually necessary, but I included it just in case.
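
For comparison, the two results these steps aim for can be computed directly by counting lines. The sketch below is not the plugin's behavior, just a plain-Python illustration; `split_unique_and_duplicates` is a hypothetical name. Since it counts occurrences rather than comparing neighbors, it needs no sorting.

```python
from collections import Counter


def split_unique_and_duplicates(text):
    """Split lines into (unique, duplicates); hypothetical helper."""
    lines = text.splitlines()
    counts = Counter(lines)
    # "Extract unique": lines that occur exactly once, in original order.
    unique = [line for line in lines if counts[line] == 1]
    # "Extract duplicates": one copy of each repeated line, in order of
    # first appearance (dict.fromkeys removes repeats but keeps order).
    duplicates = [line for line in dict.fromkeys(lines) if counts[line] > 1]
    return unique, duplicates


unique, duplicates = split_unique_and_duplicates("a\nb\na")
print(unique, duplicates)
```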

MJH
2

Had the same problem (show me the dupes)... didn't find an easy Sublime-based answer and fell back to using Unix commands (my file had the data I wanted to find the duplicates of in columns 11-56):

cut -c 11-56 myfile.dat | sort | uniq -d

Posted here as an FYI to others.
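
For anyone without those tools at hand, a rough Python equivalent of that pipeline might look like the following. It is a sketch under assumptions: `duplicated_columns` is a hypothetical name, the data is assumed to be plain ASCII (so characters and bytes coincide), and Python's `sorted` orders by code point, which may differ from `sort` under some locales.

```python
from collections import Counter


def duplicated_columns(lines, first=11, last=56):
    """Return each repeated column slice once, sorted (hypothetical helper)."""
    # cut -c 11-56: columns are 1-based and inclusive, hence first - 1.
    fields = [line.rstrip("\n")[first - 1:last] for line in lines]
    counts = Counter(fields)
    # sort | uniq -d: one copy of every value seen more than once.
    return sorted(field for field, n in counts.items() if n > 1)


sample = [
    "0123456789KEY-A tail",
    "abcdefghijKEY-B tail",
    "zzzzzzzzzzKEY-A tail",
]
print(duplicated_columns(sample))
```

To run it against a real file, the lines could be read with `open("myfile.dat")` and passed in as-is.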

Tom Hundt
2

I found the easiest way to do this with Sublime Text was to just sort the lines (F5 on Mac), run Permute Lines > Unique, then view the diff with git.

0

Slightly modified @MJH's answer above to get the duplicated lines with Sublime 3 and DiffMerge, without using the Highlight Duplicates plugin.

  1. Sort the lines via Sublime 3 Edit -> Sort Lines command
  2. Save original file as sorted_orig.txt
  3. Select all the text, and remove duplicate lines via Sublime 3 Edit -> Permute Lines -> Unique command
  4. Save modified file as no_dup_sorted.txt
  5. Diff sorted_orig.txt against no_dup_sorted.txt with the DiffMerge tool.
  6. Use Export -> File Diffs in DiffMerge to get a list of duplicates in clipboard or save to another file.
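
The diff step in this recipe can be approximated with Python's standard `difflib` module, which may help if DiffMerge is not available. This is a sketch, not a reproduction of DiffMerge's export; `duplicates_from_diff` is a hypothetical name, and the two lists stand in for the contents of sorted_orig.txt and no_dup_sorted.txt.

```python
import difflib


def duplicates_from_diff(sorted_lines, deduped_lines):
    """Return lines present in sorted_lines but dropped from deduped_lines."""
    matcher = difflib.SequenceMatcher(
        a=sorted_lines, b=deduped_lines, autojunk=False
    )
    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" and "replace" ranges cover lines that exist only in the
        # sorted original, i.e. the extra copies removed by Unique.
        if tag in ("delete", "replace"):
            removed.extend(sorted_lines[i1:i2])
    return removed


sorted_orig = sorted(["a", "b", "a", "c", "c", "c"])
no_dup_sorted = sorted(set(sorted_orig))
print(duplicates_from_diff(sorted_orig, no_dup_sorted))
```

A line that occurs three times shows up twice in the result, because the diff lists every extra copy; running the output through a unique pass gives one entry per duplicated line.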