Analyze disk usage based on file type

4

I want to analyze disk usage by file type and, if possible, also sort and find files by type. In other words, I want to use the output of the file utility as the criterion for the analysis or the sort.

So regardless of file name, the command or script should look into the file, determine its type, and make the sort or disk usage analysis based on the result.

Is there an easy way to do this?

emk2203

Posted 2014-10-04T10:25:31.597

Reputation: 594

Answers

3

JDiskReport has a tab to display disk usage by file type, but the type data is based on file extensions, not actual content.

Otherwise, here's a script that uses file to determine types:

$ ./disk_usage_by_file_type -c /dir/to/analyze
Collecting file type data, please wait ... 
Done. Now run 'disk_usage_by_file_type -s' to print disk usage.

(This will take a while if the directory is big.)

$ ./disk_usage_by_file_type -s
...
154 Mb : application|pdf; charset=binary
170 Mb : video|x-msvideo; charset=binary
227 Mb : application|x-iso9660-image; charset=binary
690 Mb : application|octet-stream; charset=binary
810 Mb : audio|mpeg; charset=binary
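
For readers who want to see the idea without the full script, here is a minimal sketch of how such a per-type summary could be built. It is not the script used above: the function names collect_types and sum_by_type are made up for illustration, sizes are reported in bytes rather than Mb, and types appear as file prints them (for example application/pdf). It assumes bash, find with -print0, and a file(1) that supports --mime-type.

```shell
#!/usr/bin/env bash
# Sketch only: collect_types and sum_by_type are hypothetical names,
# not part of the disk_usage_by_file_type script shown above.

# Print "<size in bytes><TAB><mime type>" for every regular file under $1.
collect_types() {
    find "$1" -type f -print0 |
    while IFS= read -r -d '' path; do
        size=$(wc -c < "$path")                 # byte count of the file
        mime=$(file -b --mime-type "$path")     # type from content, not name
        printf '%s\t%s\n' "$((size))" "$mime"   # $((...)) strips padding from wc
    done
}

# Read "<size><TAB><type>" lines and print one total per type, smallest first.
sum_by_type() {
    awk -F '\t' '{ total[$2] += $1 }
        END { for (t in total) printf "%d\t%s\n", total[t], t }' |
    sort -n
}

# Usage: collect_types /dir/to/analyze | sum_by_type
```

Splitting collection from summing mirrors the -c and -s steps above: the slow part (running file on every file) can be saved to a file once and summarized as often as needed.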

To list all files of a given type (or types) together with their sizes, sorted by file size:

$ ./disk_usage_by_file_type -d 'image|jpeg' | sort -n
...
590: /share/pictures/screenshot.jpg
1017: /share/pictures/cd_cover/Wheel cutout+drop.jpg
16496: /share/pictures/photos/landscape.jpg
17642: /share/pictures/photos/contrast.jpg
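
A per-type listing like this can be approximated with a small index-and-filter sketch. Again, this is an assumption about one way to do it, not the script's actual implementation: index_files and files_of_type are hypothetical names, the type is given as file reports it (image/jpeg rather than image|jpeg), and paths containing tabs or newlines are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: index_files and files_of_type are hypothetical names.

# Emit "<bytes><TAB><mime type><TAB><path>" for each regular file under $1.
index_files() {
    find "$1" -type f -print0 |
    while IFS= read -r -d '' path; do
        printf '%s\t%s\t%s\n' "$(( $(wc -c < "$path") ))" \
            "$(file -b --mime-type "$path")" "$path"
    done
}

# Read such an index on stdin and print "<bytes>: <path>" for every file
# whose type equals $1, sorted by size (smallest first).
files_of_type() {
    awk -F '\t' -v want="$1" '$2 == want { print $1 ": " $3 }' | sort -n
}

# Usage: index_files /share/pictures | files_of_type image/jpeg
```

Because the filter only reads text, the index can be written to a file once and queried for different types without rescanning the directory.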

lemonsqueeze

Posted 2014-10-04T10:25:31.597

Reputation: 1 151

Looks useful and works perfectly. I'd love to upvote you, but I am not allowed yet. – emk2203 – 2014-10-11T11:27:02.380

Don't worry, you'll be able to soon; you just need a little more rep. You should be able to mark the answer as accepted though, since it's your question. I think that even earns you some rep :) – lemonsqueeze – 2014-10-11T19:48:14.670

It's been a while, but now that I have enough reputation, I've upvoted you. :) – emk2203 – 2017-09-18T20:06:33.920