After moving and backing up my photo collection a few times, I have ended up with several duplicate photos, with different filenames, in various folders scattered across my PC. So I thought I would write a quick ColdFusion 9 page to find the duplicates (I can add code later to delete them).
I have a few questions:
- At the moment I am matching image files on file size alone, but I presume matching EXIF data or a hash of the file's binary content would be more reliable?
- The code I lashed together sort of works, but how could it be made to search outside the web root?
- Is there a better way? 
<cfdirectory 
name="myfiles" 
directory="C:\ColdFusion9\wwwroot\images\photos" 
filter="*.jpg"
recurse="true"
sort="size DESC"
type="file" >
<cfset matchingCount=0>
<cfset duplicatesFound=0>
<table border=1>
<!--- loop over every row so a run of matches ending on the last file is still displayed --->
<cfloop query="myfiles">
    <cfif currentrow lt myfiles.recordcount and myfiles.size is myfiles.size[currentrow + 1]>
        <!---this file is the same size as the next row--->
        <cfset matchingCount = matchingCount + 1>
        <cfset duplicatesFound=1>
    <cfelse>
        <!--- the next file is a different size --->
        <!--- if there have been matches, display them now ---> 
        <cfif matchingCount gt 0>   
            <cfset sRow = currentrow - matchingCount>
            <cfoutput><tr>
            <cfloop index="i" from="#sRow#" to="#currentrow#"> 
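                    <!--- build a URL from the file path: swap the web root for the site address and use forward slashes --->
                    <cfset imgURL = replace(replace(directory[i], "C:\ColdFusion9\wwwroot\", "http://localhost:8500/"), "\", "/", "all") & "/" & name[i]>
                    <td><a href="#imgURL#"><img height="200" width="200" src="#imgURL#"></a></td>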
            </cfloop></tr><tr>
            <cfloop index="i" from="#sRow#" to="#currentrow#"> 
                <td width=200>#name[i]#<br>#directory[i]#</td>
            </cfloop>
            </tr>
            </cfoutput>
            <cfset matchingCount = 0>
        </cfif> 
    </cfif>
</cfloop>
</table>
<cfif duplicatesFound is 0><cfoutput>No duplicate jpgs found</cfoutput></cfif>
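
For the hash idea in my first question, this is roughly what I have in mind. It is only a sketch and I have not run it against the whole collection: it reuses the `myfiles` query from above, and the names `filesByHash`, `fullPath`, `fileHash` and `h` are just placeholders I made up. Since `hash()` takes a string, the binary is base64-encoded first, which means each file is read fully into memory.

```cfml
<!--- Sketch: group files by an MD5 hash of their contents --->
<cfset filesByHash = structNew()>
<cfloop query="myfiles">
    <cfset fullPath = myfiles.directory & "\" & myfiles.name>
    <!--- hash() expects a string, so base64-encode the binary first --->
    <cfset fileHash = hash(toBase64(fileReadBinary(fullPath)), "MD5")>
    <cfif not structKeyExists(filesByHash, fileHash)>
        <cfset filesByHash[fileHash] = arrayNew(1)>
    </cfif>
    <cfset arrayAppend(filesByHash[fileHash], fullPath)>
</cfloop>

<!--- Any hash with more than one path is a set of identical files --->
<cfoutput>
<cfloop collection="#filesByHash#" item="h">
    <cfif arrayLen(filesByHash[h]) gt 1>
        <p>#arrayToList(filesByHash[h], "<br>")#</p>
    </cfif>
</cfloop>
</cfoutput>
```

Hashing every photo would presumably be slow, so it might make sense to hash only the files that already share a size with another file.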
 
     
     
    