
I want to analyze disk usage by file type and, if possible, also sort and find files by type. In other words, I want to use the output of the file utility as the criterion for the analysis or the sort.

So, regardless of the file name, the command or script should look into each file, determine its type, and base the sort or disk usage analysis on the result.

Is there an easy way to do this?

fixer1234
emk2203

1 Answer


JDiskReport has a tab to display disk usage by file type, but the type data is based on file extensions, not actual content.

Otherwise, here's a script that uses the file utility to determine types:

$ ./disk_usage_by_file_type -c /dir/to/analyze
Collecting file type data, please wait ... 
Done. Now run 'disk_usage_by_file_type -s' to print disk usage.

(This will take a while if the directory is big.)

$ ./disk_usage_by_file_type -s
...
154 Mb : application|pdf; charset=binary
170 Mb : video|x-msvideo; charset=binary
227 Mb : application|x-iso9660-image; charset=binary
690 Mb : application|octet-stream; charset=binary
810 Mb : audio|mpeg; charset=binary
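
For anyone curious about the idea behind this kind of report, here is a minimal sketch in Python. It is not the script used above, only an illustration under some assumptions: the file utility is on the PATH and accepts -b and --mime-type, sizes are taken from os.path.getsize (apparent size in bytes, not blocks on disk), and the names detect_mime, usage_by_type and print_report are made up for this example. Types are reported as plain MIME strings such as application/pdf, so the labels differ slightly from the output shown above.

```python
import os
import subprocess
from collections import defaultdict


def detect_mime(path):
    """Ask the file utility for a file's MIME type (assumes file is on PATH)."""
    result = subprocess.run(
        ["file", "-b", "--mime-type", "--", path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or "unknown"


def usage_by_type(root, detect=detect_mime):
    """Return {type: total size in bytes} for regular files under root."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            totals[detect(path)] += os.path.getsize(path)
    return dict(totals)


def print_report(totals):
    """Print one line per type, smallest total first."""
    for mime, size in sorted(totals.items(), key=lambda kv: kv[1]):
        print(f"{size / 2**20:8.1f} MiB : {mime}")
```

Calling print_report(usage_by_type("/dir/to/analyze")) would print one line per detected type. The detect parameter exists only so the type lookup can be swapped out, for example for a cached lookup on large trees, since running file once per file is the slow part.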

To list all files and their sizes for one or more given types, sorted by file size:

$ ./disk_usage_by_file_type -d 'image|jpeg' | sort -n
...
590: /share/pictures/screenshot.jpg
1017: /share/pictures/cd_cover/Wheel cutout+drop.jpg
16496: /share/pictures/photos/landscape.jpg
17642: /share/pictures/photos/contrast.jpg
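
A per-type listing like the one above can be sketched the same way. Again, this is only an illustration and not the script itself: it assumes the file utility is on the PATH, matches plain MIME strings such as image/jpeg, reports sizes in bytes, and uses the made-up names detect_mime and files_of_type.

```python
import os
import subprocess


def detect_mime(path):
    """Ask the file utility for a file's MIME type (assumes file is on PATH)."""
    result = subprocess.run(
        ["file", "-b", "--mime-type", "--", path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or "unknown"


def files_of_type(root, wanted, detect=detect_mime):
    """Return [(size in bytes, path)] for files whose type is in wanted, smallest first."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            if detect(path) in wanted:
                matches.append((os.path.getsize(path), path))
    return sorted(matches)
```

For example, files_of_type("/share/pictures", {"image/jpeg"}) would return the JPEG files under that directory as (size, path) pairs, ordered by size.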
lemonsqueeze