I have around 15,000 music files stored on an Ubuntu server (16.04): around 50% FLAC, and 25% each mp3 and m4a (aac).
I estimate that 3-5% are corrupted due to HDD hardware failure. The problems accumulated gradually for some time before I noticed. The files have now been recovered to new drives using ddrescue.
The original storage held two copies of each file on separate devices. Both drives failed gradually, but independently, so a file that is bad in one copy may be OK in the other.
I am trying to find a command-line validation method that I can use in a script to identify which titles have at least one good copy. Where both copies are bad, I will need to re-rip from CD.
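To make the goal concrete, the final step I have in mind looks roughly like the sketch below. It is Python rather than shell, and the function name and the "A"/"B" labels are placeholders of mine. It assumes the validation step has already produced, for each copy, a set of relative paths that passed.

```python
def pick_good_copies(titles, good_a, good_b):
    """Map each title to "A", "B", or None (no good copy, so re-rip).

    titles : iterable of relative paths, one per title
    good_a : set of relative paths that validated OK on copy A
    good_b : set of relative paths that validated OK on copy B
    """
    choice = {}
    for title in titles:
        if title in good_a:
            choice[title] = "A"      # copy A is usable
        elif title in good_b:
            choice[title] = "B"      # copy A is bad, fall back to copy B
        else:
            choice[title] = None     # both copies bad: re-rip from CD
    return choice
```

Titles that map to None would form the re-rip list.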
For FLAC, I run flac -t in a loop in a script, which generates one list of good files and one list of bad files. I believe flac -t decodes the file without sending audio to any playback device, calculates an MD5 hash of the decoded audio, and compares it to the original hash stored in the file’s metadata. This is pretty fast and works fine.
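For reference, the FLAC pass amounts to something like the following. This is a minimal Python sketch rather than my exact script; the names flac_ok and sort_files are made up for illustration, and it assumes the flac binary is on the PATH and exits with a non-zero status when the test fails.

```python
import subprocess


def flac_ok(path):
    """Return True if `flac -t` exits with status 0 for this file."""
    # -t: test mode (decode only, write no output); -s: silent.
    result = subprocess.run(
        ["flac", "-t", "-s", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sort_files(paths, check=flac_ok):
    """Split paths into (good, bad) lists according to check(path)."""
    good, bad = [], []
    for path in paths:
        if check(path):
            good.append(path)
        else:
            bad.append(path)
    return good, bad
```

The check argument is there so that a different test could be plugged in for mp3 and m4a, if a suitable one exists.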
I would like to achieve similar validation for the mp3 and m4a files, but have not been able to find a suitable tool. I have looked at mp3val, but when I tested it against an mp3 in which I had deliberately damaged the audio data, it did not report an error.
From what I can find researching mp3 and m4a, it seems neither format stores a hash of the audio, so I am not sure what other approaches to validation might be possible.
Ideally I would like to sort into definitely good / definitely bad. If this can't be done, I would still benefit from sorting into possibly good / definitely bad, or definitely good / possibly bad.
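One format-independent fallback I can think of, given that the two copies failed independently, is to compare checksums of the two copies. If they match, the file is probably good; if they differ, at least one copy is definitely bad, but this does not say which. The sketch below is only an illustration under those assumptions: the function names are hypothetical, and it would misreport any file whose tags were edited in one copy only.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(str(path), "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(root_a, root_b, rel_paths):
    """Sort relative paths into (agree, differ, missing) lists.

    agree   : both copies exist and have identical contents
    differ  : both copies exist but their contents differ
    missing : the file is absent from at least one copy
    """
    agree, differ, missing = [], [], []
    for rel in rel_paths:
        a = Path(str(root_a)) / rel
        b = Path(str(root_b)) / rel
        if not (a.is_file() and b.is_file()):
            missing.append(rel)
        elif sha256_of(a) == sha256_of(b):
            agree.append(rel)
        else:
            differ.append(rel)
    return agree, differ, missing
```

This only gives probably good / at least one bad, so I would still need a per-file test to decide which copy to keep when they differ.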
Can anyone suggest a Linux solution that could achieve this, for either or both of mp3 and m4a/aac?