Check for damaged mp3 and m4a (AAC) files in Linux


I have around 15,000 music files stored on an Ubuntu server (16.04): around 50% FLAC, and 25% each mp3 and m4a (AAC).

I think maybe 3-5% are corrupted due to HDD hardware failure. The problems accumulated gradually for some time before I noticed. The files have now been recovered to new drives using ddrescue.

The original storage was two copies of each file on separate devices. Both drives gradually failed, but independently, so a file that is bad in one copy may be OK in the other.

I am trying to find a command-line validation method to use in a script that identifies which titles have at least one good copy. In cases where both copies are bad, I will need to re-rip from CD.
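To make the goal concrete, here is a minimal sketch of the "at least one good copy" logic, separate from any particular validator. It assumes both copies use the same relative paths under two root directories; the names `find_good_copies`, `needs_rerip` and the `is_good` callback are hypothetical and only for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def find_good_copies(
    relative_paths: Iterable[str],
    copy_roots: Iterable[Path],
    is_good: Callable[[Path], bool],
) -> Dict[str, List[Path]]:
    """Map each title (relative path) to the copies that pass the check.

    `is_good` is whatever validator is available for the format; it is
    expected to return False for a missing or unreadable file.
    """
    roots = list(copy_roots)
    result: Dict[str, List[Path]] = {}
    for rel in relative_paths:
        candidates = [root / rel for root in roots]
        result[rel] = [path for path in candidates if is_good(path)]
    return result


def needs_rerip(good_copies: Dict[str, List[Path]]) -> List[str]:
    """Titles for which no copy passed the check."""
    return sorted(rel for rel, good in good_copies.items() if not good)
```

Any per-format check (for example a wrapper around `flac -t`) can be passed in as `is_good`, so the same bookkeeping works for all three formats.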

For FLAC, I have looped the command flac -t in a script which generates a list of good files and a list of bad files. I believe the flac -t command decodes without sending audio to any playback device, calculates an MD5 hash of the decoded audio, and compares this to the original hash stored in the file’s metadata. This is pretty fast and works fine.
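For reference, the FLAC loop described above could look roughly like the following. This is a sketch rather than my exact script: it assumes the `flac` binary is on the PATH and that a non-zero exit status from `flac -t` means the file failed the test. The function names are made up for this example.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def flac_test(path: Path) -> bool:
    """Return True if `flac -t` exits with status 0 for this file."""
    proc = subprocess.run(
        ["flac", "-t", str(path)],
        stdout=subprocess.DEVNULL,  # discard progress output
        stderr=subprocess.DEVNULL,  # only the exit status is used here
    )
    return proc.returncode == 0


def split_good_bad(
    paths: Iterable[Path],
    tester: Callable[[Path], bool] = flac_test,
) -> Tuple[List[Path], List[Path]]:
    """Run the tester on every path and return (good, bad) lists."""
    good: List[Path] = []
    bad: List[Path] = []
    for path in paths:
        (good if tester(path) else bad).append(path)
    return good, bad
```

For example, `split_good_bad(Path("/music").rglob("*.flac"))` would walk a hypothetical `/music` tree; the `tester` argument exists so a different check can be swapped in for other formats.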

I would like to achieve similar validation for the mp3 and m4a files, but have not been able to find a suitable tool. I have looked at mp3val, but testing it against an mp3 in which I deliberately damaged the audio data did not show an error.

From what I can find researching mp3 and m4a, it seems neither format stores a hash of the audio, so I am not sure what other approaches to validation might be possible.

Ideally I would like to sort into definitely good / definitely bad. If this can't be done, I would still benefit from sorting into possibly good / definitely bad, or definitely good / possibly bad.
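As an illustration of the weaker "possibly good / suspect" split, one option would be a full decode with output discarded, treating any decoder error as a red flag. The sketch below assumes `ffmpeg` is installed and uses it purely as a decoder; this is an assumption, not something I have validated against my damaged files. Without a stored hash, a clean decode can only mean "possibly good", since damaged audio data may still decode without any message.

```python
import subprocess
from pathlib import Path


def classify_decode(returncode: int, stderr_text: str) -> str:
    """Classify one decode run from its exit status and error output."""
    if returncode != 0 or stderr_text.strip():
        return "suspect"
    return "possibly good"


def decode_check(path: Path) -> str:
    """Decode the audio streams of a file with ffmpeg, discarding output.

    `-v error` limits logging to errors, `-map 0:a` selects only audio
    streams (skipping embedded cover art), and `-f null -` throws the
    decoded audio away.
    """
    proc = subprocess.run(
        ["ffmpeg", "-v", "error", "-nostdin", "-i", str(path),
         "-map", "0:a", "-f", "null", "-"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",  # tolerate odd bytes in ffmpeg's messages
    )
    return classify_decode(proc.returncode, proc.stderr)
```

Files reported as "suspect" would be the ones to check by ear or compare against the second copy first.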

Can anyone suggest a Linux solution that could achieve this, for either or both of mp3 and m4a/AAC?

BobM

Posted 2017-11-14T07:04:36.247

Reputation: 41

Perhaps you could provide examples of damaged files or how to create one that resembles damaged files you have? – slhck – 2017-11-15T08:49:50.350

Short answer: it will take a bit of work to identify damaged file examples by hand. I will try, but it may take a day or so. I estimate around 3-5% are bad. The files reside on a server and are played via a Sonos system. I suspect the underlying issue is lost or damaged blocks at the hardware or filesystem level, sometimes affecting metadata and sometimes audio data. I am now running a script on the FLAC files, which will take a few hours. After this, I will try to find sample bad mp3 or m4a files for a closer look. Really, I just want to do a preliminary screen, so I can focus on the likely bad files. – BobM – 2017-11-15T09:29:30.217

Detect and fix mp3 errors – Tom Hale – 2019-09-16T02:51:21.400

No answers