1

When I download archives, they often contain images I already have in another folder, but usually under a different name or in a different file format.

My question: What script or software [1] do you know of that can recognize [2] duplicate images?

It should let you decide whether they really are the same (and possibly remove the images of inferior quality).

[1] Ideally cross-platform, but Linux would be sufficient.

[2] E.g. by their color difference or something.

Joschua
  • 223

3 Answers

4

For Windows there is a small freeware program called VisiPics. It can search for similar pictures and gives you the option to delete or move the files according to their quality.

Download VisiPics: http://www.visipics.info/index.php?title=Download

kaykay
  • 891
1

A bit of Python magic can help you here. Make sure that PIL (or its maintained fork, Pillow) is installed:

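import glob
import os
import sys

from PIL import Image, ImageChops


def equal(im1, im2):
    # difference() needs matching mode and size, so rule those out first
    if im1.mode != im2.mode or im1.size != im2.size:
        return False
    # getbbox() is None when the difference image is entirely zero
    return ImageChops.difference(im1, im2).getbbox() is None


dir1 = sys.argv[1]
dir2 = sys.argv[2]

for path1 in glob.glob(os.path.join(dir1, "*.jpg")):
    for path2 in glob.glob(os.path.join(dir2, "*.jpg")):
        if path1 != path2 and equal(Image.open(path1), Image.open(path2)):
            print(path1, "==", path2)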

Assuming the script is saved as image-diff.py:

$ python image-diff.py dir1 dir2

It looks for all JPG images in dir1 and dir2 and compares every image in one folder with every image in the other. That is O(N*M) comparisons (O(N^2) for folders of similar size), not counting the time ImageChops.difference takes, so it may not be suitable for large image archives. But it gives you the idea. Modify and hack as you please.
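If the pairwise loop gets too slow, one possible variation is to hash the decoded pixels of each image once and group files by that hash, which needs a single pass over the files. The following is only a sketch, assuming Pillow is installed; `pixel_digest` and `group_duplicates` are names made up for this example. Like the script above, it only matches images whose pixels are exactly equal, so a picture re-saved with lossy compression will most likely not be found.

```python
import hashlib
from collections import defaultdict


def pixel_digest(path):
    """Hash the decoded pixels of an image, ignoring file name and container format."""
    from PIL import Image  # assumption: Pillow is installed

    with Image.open(path) as im:
        rgb = im.convert("RGB")  # normalise the mode so digests are comparable
        header = "{}x{}".format(*rgb.size).encode()  # keep the dimensions in the hash
        return hashlib.sha256(header + rgb.tobytes()).hexdigest()


def group_duplicates(paths, digest=pixel_digest):
    """Return lists of paths that share a digest (one pass, no pairwise loop)."""
    groups = defaultdict(list)
    for path in paths:
        groups[digest(path)].append(path)
    # Only groups with at least two members are duplicates
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Calling `group_duplicates(glob.glob("dir1/*") + glob.glob("dir2/*"))` (after `import glob`, and assuming both folders contain only image files) would return one list per set of pixel-identical images, regardless of whether they are stored as JPG or PNG.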

Shuaib
  • 96
0

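For byte-for-byte copies of images, you can use the diff command in a terminal (man diff). There's also a program named fdupes that I used to use in Ubuntu. Note that fdupes looks for files with identical contents, so it will not match the same picture saved under a different format or at a different quality.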
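To illustrate what byte-for-byte duplicate detection amounts to, here is a rough sketch using only the Python standard library. It is not how diff or fdupes is implemented; `file_digest` and `find_identical_files` are hypothetical names chosen for this example. Files are first bucketed by size, and only files sharing a size are hashed.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's raw bytes, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_files(*dirs):
    """Walk the given directories and group files whose contents are identical."""
    by_size = defaultdict(list)
    for top in dirs:
        for root, _subdirs, names in os.walk(top):
            for name in names:
                path = os.path.join(root, name)
                by_size[os.path.getsize(path)].append(path)

    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a byte-identical twin
        for path in paths:
            groups[(size, file_digest(path))].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For example, `find_identical_files("dir1", "dir2")` would return a list of groups, each group being the paths of files with exactly the same bytes.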

nopcorn
  • 16,982