Recursively fixing image file extensions in Linux

5

3

I have a bunch of misnamed image files from scans/faxes that I need to fix for our Linux users. It turns out we have a bunch of scans that are PNG files labeled as *.jpg, and vice versa. Under Windows this was never an issue, as Explorer/Office would just ignore the extension. But under Linux, Eye of GNOME etc. end up refusing to open the file because the contents don't match the extension.

Does anyone have any recommendations for a tool or a little piece of script that could do this? I could write a C program to do this, but that seems a bit overkill. Sitting down and renaming them manually isn't an option; there are thousands.

Edit: I see the file command will look at the actual contents of the file and display what it is. I'm not quite sure how to use the information from it though.

BlamKiwi

Posted 2014-09-16T12:51:27.233

Reputation: 193

Answers

9

You'll want to iterate over the files that look like image files, call file to see what they really are, then rename them appropriately:

for f in *.{jpg,JPG,png,PNG,jpeg,JPEG}; do 
    type=$( file "$f" | grep -oP '\w+(?= image data)' )
    case $type in  
        PNG)  newext=png ;; 
        JPEG) newext=jpg ;; 
        *)    echo "??? what is this: $f"; continue ;; 
    esac
    ext=${f##*.}   # remove everything up to and including the last dot
    if [[ $ext != $newext ]]; then
        # remove "echo" if you're satisfied it's working
        echo mv "$f" "${f%.*}.$newext"
    fi
done

glenn jackman

Posted 2014-09-16T12:51:27.233

Reputation: 18 546

That looks pretty close to the script I started working on. But when I try to run that script I get ??? what is this: *.{jpg,JPG,png,PNG,jpeg,JPEG} – BlamKiwi – 2014-09-16T13:24:11.690

@MorphingDragon What shell or version of Bash are you using? Did you disable brace expansion? Or is there no such file in your directory? Maybe run shopt -s nullglob first, so that a glob with no matches expands to nothing instead of being passed through literally. – slhck – 2014-09-16T13:27:47.287

4.3.11 (Mint 17) – BlamKiwi – 2014-09-16T13:28:54.360
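
A minimal sketch of the nullglob suggestion above, assuming Bash (shopt and brace expansion are not available in plain sh). The function name list_images is made up for illustration. It prints each file the loop would visit, and prints nothing when no pattern matches, instead of looping over the literal pattern text.

```bash
#!/usr/bin/env bash
# list_images is a hypothetical helper: print the image files found in directory "$1".
# The body runs in a subshell, so the cd and shopt do not leak into the caller.
list_images() (
    cd "$1" || exit 1
    shopt -s nullglob                      # patterns with no match expand to nothing
    for f in *.{jpg,JPG,png,PNG,jpeg,JPEG}; do
        printf '%s\n' "$f"
    done
)
```

Calling list_images . in an empty directory prints nothing, so a loop built this way never reaches the "??? what is this" branch with a literal pattern.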

Instead of the braces you can use find commands, so the first script line would be for f in $(find -iname "*.jpg") $(find -iname "*.png") $(find -iname "*.jpeg"); do. This also allows the files to be in subdirectories, though it would be a lot more complicated if there are blanks in any of the directory or file names. – AFH – 2014-09-16T14:11:38.690

This is bash syntax, so do not run it with sh script.sh; use bash script.sh – glenn jackman – 2014-09-16T14:32:15.840

@AFH yes and no. That will break if the file names contain whitespace, which is why a glob is preferred. If you use find, you should at least combine it with -print0 and while IFS= read -d ''. – terdon – 2014-09-16T16:03:44.470

@terdon read -r actually :) (to be super precise) – slhck – 2014-09-16T17:09:48.743

@slhck true, I keep forgetting backslashes. – terdon – 2014-09-16T17:10:43.627
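
The find -print0 and read -r -d '' combination discussed above could look roughly like this in Bash. The helper name each_image is hypothetical. It runs a command of your choice once per matching file, including files in subdirectories, and is meant to cope with blanks in directory and file names. It assumes a find that supports -iname and -print0 (GNU and BSD find do).

```bash
#!/usr/bin/env bash
# each_image is a hypothetical helper.
# Usage: each_image DIR COMMAND [ARGS...]
# Runs COMMAND [ARGS...] FILE for every *.jpg, *.jpeg or *.png (any case) below DIR.
each_image() {
    local dir=$1 f
    shift
    # find separates names with NUL bytes; read -d '' splits on NUL,
    # IFS= keeps leading/trailing blanks, -r keeps backslashes literal.
    while IFS= read -r -d '' f; do
        "$@" "$f"
    done < <(find "$dir" -type f \
                 \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) -print0)
}
```

For example, each_image . file prints what file reports for every image below the current directory. The command should not read standard input, because inside the loop stdin is the list of file names.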

This almost worked for me in Bash 4.3.39 with GNU grep 2.21 on Chakra Linux. I had to change the regex after it kept failing on, e.g., JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, Exif Standard: [TIFF image data, little-endian, direntries=2, software=Google], baseline, precision 8, 1467x1586, frames 3. I don't know Perl regex, so I used two regexes in sequence: egrep -o '(JPEG|PNG) image' | egrep -o 'JPEG|PNG'. Is there a way for a single regex to handle "JPEG TIFF" cases like this, without hardcoding the allowed formats? – Mark Haferkamp – 2016-02-05T10:24:22.240

As given, the routine failed for PNG files using file 5.0.3 on Ubuntu 10.04: Most image files return "image data" when tested by file, but PNG images return just "image". By removing "data" in the second line, grep finds all images. – Andrew P. – 2017-01-20T17:40:31.190
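
One way to sidestep the regex problems described in the last two comments, offered as a sketch rather than as a fix to the answer's script: ask file for the MIME type instead of parsing its free-text description. This assumes a file version that supports --brief and --mime-type. real_ext is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# real_ext is a hypothetical helper: print the extension that matches the
# actual content of "$1", or return non-zero if it is neither PNG nor JPEG.
real_ext() {
    case $(file --brief --mime-type -- "$1") in
        image/png)  echo png ;;
        image/jpeg) echo jpg ;;
        *)          return 1 ;;
    esac
}
```

Inside a rename loop this could be used as newext=$(real_ext "$f") || continue. Because the MIME type is a single token such as image/jpeg, an embedded "TIFF image data" thumbnail note or a missing "data" suffix in the description should no longer matter.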

-3

ls | grep "png" | awk 'BEGIN{FS="\."}{print $1".png " $1".jpg"}' | xargs -n 2 mv

Obviously you have to change the first part before the awk to find your png files. And this won't work if you have more than one . in the path.

This probably is not a solution that will work in your specific case right away, but there are all the pieces you need. If you give me more details or better yet a printout of a list of all the files you need to be changed, then I could change the command to suit you exactly.

EDIT #1: I didn't realize that he needed to examine the file contents to know which are wrong; I thought he knew the list of incorrectly named files.

EDIT #2: Commenters seem to be obsessing over the part of the code before the awk. I thought I was clear enough in my first sentence, but since I wasn't: you will have to provide the code before the awk that appropriately lists your files. Furthermore, this was intended as a basic example of how you could get started accomplishing a task like this yourself. I admitted it almost certainly wasn't going to work by copying and pasting.

You will learn more by completing and working through the actual solution yourself than by blindly applying lines from the internet. If you are still stuck from this point feel free to comment and I can try to help.

Carlos Bribiescas

Posted 2014-09-16T12:51:27.233

Reputation: 185

1. Do not parse ls! 2. You completely misunderstood the question. – jimmij – 2014-09-16T13:16:26.497

I was assuming he knew which needed to be switched, not that he needed to open them to know. What do you mean, don't parse ls? – Carlos Bribiescas – 2014-09-16T13:18:08.313

http://mywiki.wooledge.org/ParsingLs – jimmij – 2014-09-16T13:18:47.160

Yes, this does break if you put new lines in your filenames... – Carlos Bribiescas – 2014-09-16T13:23:22.753

Or spaces, or backslashes, or tabs, or file names that contain png... – terdon – 2014-09-16T15:30:00.850

... Yup. He will have to construct his own set of files to pass in. – Carlos Bribiescas – 2014-09-16T16:59:01.653
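
For reference, a glob-based sketch of the same rename that avoids parsing ls, as the linked wiki page recommends. Like the answer, it assumes the wrongly named files are already known to be every *.png in one directory. preview_png_to_jpg is a made-up name, and the function only prints the mv commands.

```bash
#!/usr/bin/env bash
# preview_png_to_jpg is a hypothetical helper: print the mv command for each
# *.png in directory "$1". Remove "echo" to actually rename.
preview_png_to_jpg() (
    cd "$1" || exit 1
    for f in ./*.png; do
        [ -e "$f" ] || continue            # the glob matched nothing
        # ${f%.png} strips only the final ".png", so extra dots, spaces,
        # or "png" elsewhere in the name are left alone.
        echo mv -- "$f" "${f%.png}.jpg"
    done
)
```

The shell expands the glob into one argument per file, so names containing spaces, tabs, backslashes or newlines are passed to mv unchanged.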