It depends on what type of data you're comparing or analyzing.
The basic solution is to read each file into a string with file_get_contents() and then run strcmp() on the two strings. strcmp() does a "binary-safe compare" of the data and returns 0 when the strings are identical.
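A minimal sketch of that basic approach, assuming both files are small enough to hold in memory. The helper name filesAreIdentical() is made up for illustration; only file_get_contents() and strcmp() come from the answer itself.

```php
<?php
// Hypothetical helper: true when both files hold exactly the same bytes.
function filesAreIdentical(string $pathA, string $pathB): bool
{
    $a = file_get_contents($pathA);
    $b = file_get_contents($pathB);
    if ($a === false || $b === false) {
        throw new RuntimeException('Could not read one of the files.');
    }
    // strcmp() returns 0 when the two strings are byte-for-byte equal.
    return strcmp($a, $b) === 0;
}
```

This only tells you whether the files match or not, which is why splitting the data up is usually the next step.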
You will probably want to explode() your data on some delimiter (a newline, for example) and compare the resulting sections of the data.
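Something along these lines might work for comparing sections, assuming a newline is a sensible delimiter for your data. differingSections() is a hypothetical name, and comparing by position is just one way to pair the sections up.

```php
<?php
// Hypothetical helper: returns the sections that differ, keyed by position.
function differingSections(string $a, string $b, string $delimiter = "\n"): array
{
    $partsA = explode($delimiter, $a);
    $partsB = explode($delimiter, $b);
    $count = max(count($partsA), count($partsB));
    $diff = [];
    for ($i = 0; $i < $count; $i++) {
        $left = $partsA[$i] ?? null;  // null marks a section missing on this side
        $right = $partsB[$i] ?? null;
        if ($left !== $right) {
            $diff[$i] = [$left, $right];
        }
    }
    return $diff;
}

$diff = differingSections("a\nb\nc", "a\nx\nc\nd");
// $diff is [1 => ['b', 'x'], 3 => [null, 'd']]
```

Note that a positional comparison like this reports everything after an inserted line as different; it is not a real diff.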
Another option is to delimit, loop through, and compute a "comparison coefficient" (cc) that scores how closely each file matches a norm. For example, if File 1 has cc=3 and File 4 has cc=8, File 4 would be the closer match.
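Here is one possible way to compute such a coefficient, assuming a higher score means a closer match (as in the File 1 / File 4 example). The scoring rule, counting sections that equal the reference at the same position, is an assumption, and comparisonCoefficient() is a hypothetical name.

```php
<?php
// Hypothetical scoring: count sections equal to the reference at the same position.
function comparisonCoefficient(string $reference, string $candidate, string $delimiter = "\n"): int
{
    $ref = explode($delimiter, $reference);
    $cand = explode($delimiter, $candidate);
    $score = 0;
    foreach ($ref as $i => $section) {
        if (isset($cand[$i]) && $cand[$i] === $section) {
            $score++;
        }
    }
    return $score;
}

$reference = "a\nb\nc\nd";
$scores = [
    'file1' => comparisonCoefficient($reference, "a\nx\ny\nz"), // 1
    'file4' => comparisonCoefficient($reference, "a\nb\nc\nz"), // 3
];
arsort($scores);                  // highest score first, keys preserved
$best = array_key_first($scores); // 'file4'
```

What counts as "close" depends on your data, so you may want a different rule than exact equality per section.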
A final problem you'll run into is PHP's memory limit on the server, since file_get_contents() loads the whole file into memory. You can raise it with the memory_limit setting in php.ini.
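If raising memory_limit isn't an option, reading both files in chunks is one way around it. This is a sketch of a different technique from file_get_contents(), assuming plain local files; filesAreIdenticalChunked() and the 8192-byte chunk size are arbitrary choices for illustration.

```php
<?php
// Hypothetical helper: compares two files chunk by chunk instead of loading them whole.
function filesAreIdenticalChunked(string $pathA, string $pathB, int $chunkSize = 8192): bool
{
    clearstatcache(); // make sure filesize() is not served from a stale cache
    if (filesize($pathA) !== filesize($pathB)) {
        return false; // files of different sizes can never match
    }
    $a = fopen($pathA, 'rb');
    $b = fopen($pathB, 'rb');
    if ($a === false || $b === false) {
        throw new RuntimeException('Could not open one of the files.');
    }
    try {
        while (!feof($a)) {
            // Only one chunk per file is held in memory at a time.
            if (fread($a, $chunkSize) !== fread($b, $chunkSize)) {
                return false;
            }
        }
        return true;
    } finally {
        fclose($a);
        fclose($b);
    }
}
```

Memory use stays around two chunks regardless of file size, at the cost of only answering "same or different".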
//EDIT
Just noticed the diff tag, but I'll leave this up anyway in case it helps somehow.