I've modified the code from galacticninja's answer to do exactly what the OP wanted. It is run the same way, but instead of just listing the corrupt images in the command prompt, it moves them to a Catch folder in the root C:\ directory.
You can find my modified code on Pastebin or below:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# vi:ts=4 sw=4 et
# This program scans a directory and all its subdirectories for corrupted jpg, png, gif, and bmp images and collects them in a Catch folder.
# To run this program you will need to install Python 2.7 and Pillow.
# Once they are installed, save this file (for example, in Notepad) with the .py extension.
# Then run cmd.exe and type the following: C:\Python27\python.exe "C:\Directory this is saved in\this.py" "C:\Directory to be scanned"
# You must first create a folder called Catch in your root C:\ directory for the corrupted images to be collected in.
# Okay, this code is a bit ugly, with a few "anti-patterns" and "code smell".
# But it works and I don't want to refactor it *right now*.
# TODO:
# * Refactor it a little
# * Add support for custom filename filter (instead of the hardcoded one)
#Big thanks to denilsonsa for writing most of this code at https://bitbucket.org/denilsonsa/small_scripts/src/542edd54d290d476603e939027ca654b25487d85/jpeg_corrupt.py?at=default
import getopt
import fnmatch
import re
import os
import os.path
import sys
import PIL.Image
available_parameters = [
    ("h", "help", "Print help"),
    ("v", "verbose", "Also print clean files"),
]

class ProgramOptions(object):
    """Holds the program options, after they are parsed by parse_options()"""
    def __init__(self):
        self.globs = ['*.jpg', '*.jpe', '*.jpeg', '*.gif', '*.png', '*.bmp']
        self.glob_re = re.compile('|'.join(
            fnmatch.translate(g) for g in self.globs
        ), re.IGNORECASE)
        self.verbose = False
        self.args = []
def print_help():
    global opt
    scriptname = os.path.basename(sys.argv[0])
    print "Usage: {0} [options] files_or_directories".format(scriptname)
    print "Recursively checks for corrupt image files"
    print ""
    print "Options:"
    long_length = 2 + max(len(long) for x,long,y in available_parameters)
    for short, long, desc in available_parameters:
        if short and long:
            comma = ", "
        else:
            comma = " "
        if short == "":
            short = " "
        else:
            short = "-" + short[0]
        if long:
            long = "--" + long
        print " {0}{1}{2:{3}} {4}".format(short,comma,long,long_length, desc)
    print ""
    print "Currently (it is hardcoded), it only checks for these files:"
    print " " + " ".join(opt.globs)
def parse_options(argv, opt):
    """argv should be sys.argv[1:]
    opt should be an instance of ProgramOptions()"""
    try:
        opts, args = getopt.getopt(
            argv,
            "".join(short for short,x,y in available_parameters),
            [long for x,long,y in available_parameters]
        )
    except getopt.GetoptError as e:
        print str(e)
        print "Use --help for usage instructions."
        sys.exit(2)
    for o,v in opts:
        if o in ("-h", "--help"):
            print_help()
            sys.exit(0)
        elif o in ("-v", "--verbose"):
            opt.verbose = True
        else:
            print "Invalid parameter: {0}".format(o)
            print "Use --help for usage instructions."
            sys.exit(2)
    opt.args = args
    if len(args) == 0:
        print "Missing filename"
        print "Use --help for usage instructions."
        sys.exit(2)
def is_corrupt(imagefile):
    """Returns None if the file is okay, returns an error string if the file is corrupt."""
    #http://stackoverflow.com/questions/1401527/how-do-i-programmatically-check-whether-an-image-png-jpeg-or-gif-is-corrupted/1401565#1401565
    try:
        im = PIL.Image.open(imagefile)
        im.verify()
    except Exception as e:
        return str(e)
    return None
def check_files(files):
    """Receives a list of files and checks each one."""
    global opt
    for f in files:
        # Filtering JPEG, GIF, PNG, and BMP images
        if opt.glob_re.match(f):
            status = is_corrupt(f)
            if status is None:
                # Clean file: report it only in verbose mode and never move it
                if opt.verbose:
                    print "{0}: Ok".format(f)
                continue
            print "{0}: {1}".format(f, status)
            # Move the corrupt file into C:\Catch, keeping its original file name.
            # Note: os.rename fails if C:\Catch does not exist or is on another drive.
            destination = os.path.join("C:\\Catch", os.path.basename(f))
            os.rename(f, destination)
def main():
    global opt
    opt = ProgramOptions()
    parse_options(sys.argv[1:], opt)
    for pathname in opt.args:
        if os.path.isfile(pathname):
            check_files([pathname])
        elif os.path.isdir(pathname):
            for dirpath, dirnames, filenames in os.walk(pathname):
                check_files(os.path.join(dirpath, f) for f in filenames)
        else:
            print "ERROR: '{0}' is neither a file nor a dir.".format(pathname)

if __name__ == "__main__":
    main()
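One thing to watch out for: because the script walks subdirectories, two corrupt files with the same name in different folders would both be sent to the same path in the Catch folder, and the move also depends on that folder already existing on the same drive. Below is a minimal sketch of a more forgiving move step. It is not part of the original script; the helper names `unique_destination` and `move_to_catch` are my own, and the `_1`, `_2` suffix scheme is just one possible convention. It uses only the standard library and should run on both Python 2.7 and Python 3.

```python
import os
import shutil


def unique_destination(catch_dir, filename):
    """Return a path inside catch_dir that does not exist yet.

    Hypothetical helper: appends _1, _2, ... before the extension
    until the name is free.
    """
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(catch_dir, filename)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(
            catch_dir, "{0}_{1}{2}".format(stem, counter, ext))
        counter += 1
    return candidate


def move_to_catch(path, catch_dir):
    """Move path into catch_dir and return the new location.

    Hypothetical helper: creates catch_dir if it is missing and never
    overwrites a file that is already there.
    """
    if not os.path.isdir(catch_dir):
        os.makedirs(catch_dir)
    destination = unique_destination(catch_dir, os.path.basename(path))
    # shutil.move falls back to copy-and-delete when a plain rename
    # is not possible, e.g. when source and target are on different drives.
    shutil.move(path, destination)
    return destination
```

If you want to try it, the `os.rename(...)` call in `check_files` could be swapped for something like `move_to_catch(f, "C:\\Catch")`, in which case the Catch folder no longer has to be created by hand.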
@Synetech Any update on that PHP script? – Hashim – 2020-01-02T00:49:48.237
You mean visual corruption, I assume? I'd LOVE this...finally I could stop eyeballing the thumbnails of my comic books for broken jpgs. – Shinrai – 2011-04-27T19:24:27.527
Visual or structural. I found one app that supposedly did this, but it missed lots of files that didn’t even have the header! – Synetech – 2011-04-27T19:27:19.993
Oh, that stuff didn't even occur to me. Yes, please...this has to exist SOMEWHERE right? – Shinrai – 2011-04-27T19:55:37.387
Can you upload one or more examples of such a broken file and link to them in your question? – slhck – 2011-04-27T20:19:58.710
@Shinrai, examining the thumbnails is not reliable because many picture formats include a separate thumbnail version embedded in the picture, and that may be intact. That’s why sometimes a picture whose thumbnail looks fine, is corrupt when opened. – Synetech – 2011-08-17T00:41:17.660
@Synetech - You're exactly right, of course, but in practice I never actually run into that due to the way these are generally scanned and stored. It could certainly be an issue for People Who Are Not Me, though! – Shinrai – 2011-08-17T14:11:37.027
(I’m still trying to bring myself to work on this, but since there are almost 9,000 files to fix/check, I keep putting it off. What really annoys me is that the stupid for command didn’t work correctly. What’s even worse, is that I could/would have sworn that the volume had 64KB clusters, not 8KB because it was originally supposed to be just for cloned image backups, which of course means multi-GB files, so small clusters are pointless. If it had been 64KB like I remembered making it, the recovery process would have been drastically easier. sigh) – Synetech – 2011-09-08T21:56:58.877
I recently (re-)wrote a PHP script to scan graphics files. It is extremely promising and seems to give the most accurate results of all of my tests (other tools give lots of false positives and negatives). Once I work out the kinks, I’ll clean it up and post a version that supports graphics and archive files here. (I’ll figure something out for other types like executables later.) – Synetech – 2013-07-20T22:20:37.443