18

I have two music libraries, one a newer version of the other. I would like to compare them to figure out which files I need to copy from the new music tree to the old one.

I tried diff --brief -r /oldmusicdir/ /newmusicdir/ based on another user's suggestion, but interrupted it with ^C after fifteen minutes (I'm guessing diff is reading the music files themselves, which is not necessary).

Then I tried find /oldmusicdir/ -type d | sort > oldmusicdir for both old and new, then ran diff oldmusicdir newmusicdir. However, since the music directories are stored in separate locations, every single entry was flagged.

Next I tried running find /musicdir/ -type d | basename -s - | sort > musicdir, but then my musicdir file simply read "-".

Does anyone know how to get basename to accept input from STDIN? Or does anyone have a better way of quickly comparing two music directories?

Thanks!

curios
  • 390

5 Answers

29

The rsync utility first popped into my mind when I saw your question. Doing something like below could quickly show what files are in directory a but not in b:

$ rsync -rcnv a/* b/

-r will recurse into the directories
-c will compare based on file checksum
-n will run it as a "dry run" and make no changes, but just print out the files 
   that would be updated
-v will print the output to stdout verbosely

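This is a good option because -c also compares the contents of the files, so you can make sure they really match. The trade-off is that -c has to read every file on both sides in full, which is slow for a large library; without -c, rsync compares only file size and modification time. rsync's delta algorithm is optimized for this type of use case. Then, if you want to make b match the contents of a, just remove the -n option to perform the actual sync.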


3

How about generating a recursive directory listing of each directory into separate files and then using diff on those two files?
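A minimal sketch of that idea follows. The directory names and files are made up for the demo; the one detail that matters is running find from inside each root, so that both listings contain relative paths and can be compared even though the trees live in different places.

```shell
#!/bin/sh
# Hypothetical demo trees; substitute your real library roots.
work=$(mktemp -d)
mkdir -p "$work/old/Artist A" "$work/new/Artist A" "$work/new/Artist B"
touch "$work/old/Artist A/one.mp3" \
      "$work/new/Artist A/one.mp3" \
      "$work/new/Artist A/two.mp3" \
      "$work/new/Artist B/three.mp3"

# List each tree from inside its own root so every path starts with "./".
# LC_ALL=C keeps the sort order identical for both listings.
(cd "$work/old" && find . -type f | LC_ALL=C sort) > "$work/old.list"
(cd "$work/new" && find . -type f | LC_ALL=C sort) > "$work/new.list"

# Lines starting with ">" exist only in the new tree.
# diff exits with status 1 when the listings differ, which is expected here.
diff "$work/old.list" "$work/new.list" > "$work/listing.diff" || true
cat "$work/listing.diff"
```

Because only names are compared, no file contents are read, so this stays fast even on a large library.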

Jason Aller
  • 2,360
1

If you prefer a GUI-based utility, you can try a comparison tool. Under Windows, my favorite is Total Commander (homepage: https://www.ghisler.com): it has a quite flexible feature that lets you compare and sync entire directory trees. A quick diff tool is also included, which lets you inspect the differences encountered (useful for lines of code, less so for binary files).

On *nix, I know there's Midnight Commander, but I have never tried it and I'm not sure whether the same feature is present.

Anyway, Wikipedia has a page listing the most widely used comparison tools, where you may find some useful information: https://en.wikipedia.org/wiki/Comparison_of_file_comparison_tools

Max

1

Another option is comm:

comm -3 <(cd oldmusicdir/ && find . | sort) <(cd newmusicdir/ && find . | sort)

Note the cd, which makes sure the listed paths are relative to each base folder. Using && instead of ; means find does not run in the wrong directory if the cd fails.

From the comm manpage:

NAME
       comm - compare two sorted files line by line

SYNOPSIS
       comm [OPTION]... FILE1 FILE2

DESCRIPTION
       Compare sorted files FILE1 and FILE2 line by line.
       <snip>
       -3     suppress column 3 (lines that appear in both files)
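Since the goal in the question is the list of files to copy from new to old, combining -1 with -3 narrows the output to exactly that. This is a small sketch with made-up demo directories, using temporary listing files so it runs in a plain POSIX shell:

```shell
#!/bin/sh
# Hypothetical demo trees; substitute your real library roots.
work=$(mktemp -d)
mkdir -p "$work/old" "$work/new"
touch "$work/old/a.mp3" "$work/old/gone.mp3" "$work/new/a.mp3" "$work/new/b.mp3"

# comm needs both inputs sorted the same way, so pin the locale.
(cd "$work/old" && find . -type f | LC_ALL=C sort) > "$work/old.list"
(cd "$work/new" && find . -type f | LC_ALL=C sort) > "$work/new.list"

# -1 hides lines unique to the first file, -3 hides lines common to both,
# leaving only the paths that exist in the new tree.
only_in_new=$(LC_ALL=C comm -13 "$work/old.list" "$work/new.list")
echo "$only_in_new"
```

Swapping -13 for -23 would instead list the paths that exist only in the old tree.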

localhost
  • 695
0

Another approach would be the use of process substitution to pass to diff two "on demand" file descriptors that each hold stdout from the commands.

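Example comparing only the leaf-node file names in each tree (-n1 runs basename once per path, since basename otherwise treats a second argument as a suffix to strip; -print0 and -0 keep file names containing spaces intact):

diff -y \
  <(find /oldmusicdir -type f -print0 | xargs -0 -n1 basename | sort -V) \
  <(find /newmusicdir -type f -print0 | xargs -0 -n1 basename | sort -V)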

Of course you can do whatever parsing/formatting of the output of find to match them up as you see fit.
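On the basename part of the question: basename works on its command-line arguments and does not read paths from STDIN, so something else has to turn each line into an argument or strip the directory part itself. Here is a rough sketch of two ways to do that, using a made-up demo tree; the sed variant assumes no file name contains a newline.

```shell
#!/bin/sh
# Hypothetical demo tree; substitute your real library root.
work=$(mktemp -d)
mkdir -p "$work/lib/Artist A/Album"
touch "$work/lib/Artist A/Album/01 Intro.mp3" "$work/lib/Artist A/cover.jpg"

# Option 1: let find hand each path to basename as an argument.
names_exec=$(find "$work/lib" -type f -exec basename {} \; | LC_ALL=C sort)

# Option 2: stay in the pipeline and delete everything up to the last slash.
names_sed=$(find "$work/lib" -type f | sed 's|.*/||' | LC_ALL=C sort)

echo "$names_exec"
```

Option 2 starts a single sed process rather than one basename per file, which tends to matter on a library with many thousands of tracks.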