Multimedia files changed in size for no apparent reason


I synchronize my data between my laptop's SSD and a pendrive. A few days ago my sync software (FreeFileSync) reported that hundreds of files have changed in size, even though I haven't done anything with them.

The changed files are mostly multimedia: *.jpg, *.png and a few *.mp3 files. While binary comparison confirms that the files are in fact different, the actual images appear to be identical (Beyond Compare confirms that).

The size difference varies, but there are some patterns: mp3s are ~4 KB bigger on my laptop, and jpgs are quite consistently ~6.5-7 KB bigger on my laptop. On the other hand, pngs differ wildly: the differences range from a few dozen bytes to over 100 KB (with no fixed ratio between file size and the size difference), and in most cases the pendrive copies are bigger.

I thought that maybe there was just a difference in metadata, and indeed the EXIF information was slightly different, but the difference in EXIF data was smaller than the difference in file size.

I've uploaded a few of these files to virustotal.com, and it didn't detect any virus. What can this be, and what should I do? Should I simply overwrite the bigger files with the smaller ones? Nothing like this has ever happened to me before.

I'm using Ubuntu 12.04; my laptop copy of the data is on an NTFS partition, and the pendrive is FAT32. I don't think the difference in default cluster allocation size matters here, though, because I would have noticed it before.

Jan Warchoł

Posted 2013-08-02T23:37:26.170

Reputation: 210

On binary comparison, is there a big section of the files that matches, with just a small section that does not? Or is the entire file completely different? That is, do the files contain a copy of the image with what appears to be some extra data at the beginning or end, or are they completely different across the whole file? For example, file one might be "thisisthefilea" and file two might be "thisisthefiledfmnmi", or is it more like "thisisthefile" and "secondfilecompletelydiff"? – Damon – 2013-08-04T03:54:13.310

The first case would indicate some extra metadata; the question is then what metadata, and whether it is harmful. The second case would indicate that the files were re-saved and/or re-compressed with similar settings, so they look nearly identical but are actually different. (For PNG the cause would have to be a different or missing compression setting, since the format is lossless.) – Damon – 2013-08-04T03:58:24.740

@damon Hex comparison tells me that the png files are totally different. The jpgs and mp3s on my laptop have some extra data at the beginning (consisting mostly of null bytes and spaces), and the rest of the file is the same. You can compare one pair here. The extra data at the beginning contains one somewhat meaningful fragment, which you can see here: it contains the words "adobe" and "MicrosoftPhoto". Strangely, I very rarely use Windows or Adobe software.

– Jan Warchoł – 2013-08-04T10:59:27.120
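
The "extra block versus completely different" check discussed in these comments can also be done without a hex viewer. Below is a minimal Python sketch, not something used in the thread: the function name `common_affixes` is made up for illustration. It counts how many leading and trailing bytes two byte strings share, so a file that is another file plus one inserted block shows up as prefix + suffix equal to the length of the shorter file.

```python
from __future__ import annotations


def common_affixes(a: bytes, b: bytes) -> tuple[int, int]:
    """Return (prefix_len, suffix_len): bytes shared at the start and at the end.

    The suffix is never allowed to overlap the prefix, so
    prefix_len + suffix_len <= min(len(a), len(b)).
    """
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix


# In-memory demo; for real files, read them with open(path, "rb").read().
small = b"thisisthefile"
large = b"thisis" + b"XXXX" + b"thefile"  # same content plus an inserted block
prefix, suffix = common_affixes(small, large)
if prefix + suffix == min(len(small), len(large)):
    print(f"inserted block of {abs(len(large) - len(small))} bytes at offset {prefix}")
else:
    print("files differ in more than one contiguous block")
```

If the two counts together cover only a small part of the shorter file, the re-saved or re-compressed explanation is the more likely one.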

Answers

2

Your metadata has been changed. When you compare the metadata of the two files using this online metadata viewer, you find that the larger file has more metadata (approx. 4.2 KB), which accounts for the bulk of the file size difference. The bigger file has added padding, its thumbnail size is different, and there is simply more EXIF data available to view. The point is that something has modified your EXIF data, probably a legitimate program either doing something someone told it to do, like saving, or doing something someone didn't realize they were telling it to do. (Rotating an image in Microsoft's picture viewer saves the file even though you simply wanted to rotate the view. I'm sure other programs have similar pitfalls.)

There is still roughly 2 KB of additional metadata unaccounted for, but I would bet that with some research you could figure out what it is. Identifying which program did what would be difficult in hindsight, but to answer your question: I would say it is safe to overwrite the larger files with the smaller ones, unless you think you will need the extra metadata in the future. Given that padding was added, though, I would bet that the extra space will come back if whatever happened before happens again.

So many multimedia formats carry metadata that I would assume this is the case for all the file types you listed.
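
To see how much of a JPEG is metadata without an online viewer, one option is to walk the header segments that precede the compressed image data. The following is a rough Python sketch under stated assumptions: the function names `jpeg_segments` and `metadata_bytes` are made up for illustration, "metadata" is taken to mean the APPn segments (markers 0xE0 to 0xEF, where EXIF and XMP normally live) plus comment segments (0xFE), and the file is assumed to be a well-formed JPEG.

```python
import struct


def jpeg_segments(data: bytes):
    """Return [(marker, length)] for each header segment up to and including SOS.

    `length` is the value stored in the file, which counts its own two bytes
    but not the two marker bytes.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, length))
        if marker == 0xDA:  # SOS: compressed image data follows
            break
        pos += 2 + length
    return segments


def metadata_bytes(data: bytes) -> int:
    """Total size on disk of APPn and comment segments, marker bytes included."""
    return sum(
        length + 2
        for marker, length in jpeg_segments(data)
        if 0xE0 <= marker <= 0xEF or marker == 0xFE
    )
```

Running `metadata_bytes(open(path, "rb").read())` on both copies of a file and comparing the two totals with the overall size difference should show whether padded or added APPn segments explain it.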

Damon

Posted 2013-08-02T23:37:26.170

Reputation: 1 789

I don't think this is the reason: when I copy a file from one location to the other, the difference remains. Besides, the files are actually different: the md5sums are not the same. Also, if it worked like you said, I would have encountered this every time I sync. – Jan Warchoł – 2013-08-03T06:55:36.153

Is the file size difference the same across file types? Or different per file? – Damon – 2013-08-03T16:56:26.403

Different, but there are some patterns. mp3s are ~4 KB bigger on my laptop, and jpgs are quite consistently ~6.5-7 KB bigger on my laptop. On the other hand, pngs differ wildly: the differences range from a few dozen bytes to over 100 KB (with no fixed ratio between file size and the size difference). – Jan Warchoł – 2013-08-03T19:50:58.593

Ok, this makes sense in the case of jpgs and mp3s. As for pngs, I've checked one pair with that metadata viewer (you can see these files here): there is a difference in metadata, but it is quite small. The only explanation that comes to my mind is that they were recompressed with different settings. Anyway, thanks for your help!

– Jan Warchoł – 2013-08-05T08:40:14.153
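
The recompression guess for the pngs can be checked roughly with the Python standard library. This is a hedged sketch, not a tool used in the thread, and the names `iter_chunks`, `chunk_sizes` and `filtered_stream` are made up for illustration. It adds up the payload size per chunk type, to show whether the size difference sits in the IDAT image data or in metadata chunks, and it decompresses the IDAT data so the two copies can be compared independently of the zlib compression level.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (chunk_type, payload) for every chunk in a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG: bad signature")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype.decode("ascii"), data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def chunk_sizes(data: bytes) -> dict:
    """Total payload bytes per chunk type, e.g. {'IHDR': 13, 'IDAT': ...}."""
    sizes = {}
    for name, payload in iter_chunks(data):
        sizes[name] = sizes.get(name, 0) + len(payload)
    return sizes


def filtered_stream(data: bytes) -> bytes:
    """Decompress the concatenated IDAT payloads (still PNG-filtered scanlines)."""
    idat = b"".join(p for name, p in iter_chunks(data) if name == "IDAT")
    return zlib.decompress(idat)
```

If `filtered_stream` returns the same bytes for both copies and the IHDR payloads match, the pixels are identical and only the compression differs. The reverse does not hold: encoders may pick different scanline filters, so differing streams do not prove differing pixels, and a full decode with an image library would be needed to settle that case.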