I would like a BASH command to list just the count of files in each subdirectory of a directory.
E.g. in directory /tmp there are dir1, dir2, ... I'd like to see :
`dir1` : x files
`dir2` : x files ...
Assuming you want a recursive count of files only, not directories and other types, something like this should work:
find . -maxdepth 1 -mindepth 1 -type d | while IFS= read -r dir; do
    printf "%-25.25s : " "$dir"
    find "$dir" -type f | wc -l
done
This task fascinated me so much that I wanted to figure out a solution myself. It doesn't even take a while loop and MAY be faster in execution speed. Needless to say, Thor's efforts helped me a lot to understand things in detail.
So here's mine:
find . -maxdepth 1 -mindepth 1 -type d -exec sh -c 'echo "$1 : $(find "$1" -type f | wc -l) file(s)"' sh {} \;
It looks modest for a reason, for it's way more powerful than it looks. :-)
However, if you intend to put this in your .bash_aliases file, it must look like this:
alias somealias='find . -maxdepth 1 -mindepth 1 -type d -exec sh -c '\''echo "$1 : $(find "$1" -type f | wc -l) file(s)"'\'' sh {} \;'
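Note the very tricky handling of nested single quotes. And no, simply switching to double quotes for the sh -c argument does not work: the calling shell would then expand $(...) itself before find ever runs, unless every $ is escaped.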
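If the quoting gets unwieldy, a shell function is one way to sidestep it, because the body of a function is not wrapped in another layer of quotes. The following is only a sketch: the name `countfiles` is made up for illustration, it assumes a `find` that supports `-maxdepth` (GNU and BSD do), and it hands each directory to `sh -c` as `$1` instead of splicing the name into the script text.

```shell
# countfiles [DIR]: list each immediate subdirectory of DIR (default: .)
# together with its recursive count of regular files.
# The function name is hypothetical, not part of the original answer.
countfiles() {
    find "${1:-.}" -maxdepth 1 -mindepth 1 -type d -exec sh -c '
        n=$(find "$1" -type f | wc -l)
        # $((n)) drops the padding that some wc implementations add
        echo "$1 : $((n)) file(s)"
    ' sh {} \;
}
```

Put in `~/.bashrc` (or wherever you keep your aliases), it can be called as `countfiles` or `countfiles /tmp`.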
Using find is definitely the way to go if you want to count recursively, but if you just want a count of the files directly under a certain directory:
ls dir1 | wc -l
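To get the same kind of direct, non-recursive count for every subdirectory at once, a loop along these lines might do. This is a sketch under a few assumptions: `count_direct` is a made-up name, `find` supports `-maxdepth`, and, unlike plain `ls`, it counts only regular files and includes hidden ones.

```shell
# count_direct [DIR]: for each immediate subdirectory of DIR (default: .),
# print how many regular files sit directly inside it (no recursion).
# The function name is hypothetical.
count_direct() {
    for d in "${1:-.}"/*/; do
        [ -d "$d" ] || continue              # the glob matched nothing
        n=$(find "$d" -maxdepth 1 -type f | wc -l)
        printf '%s : %d\n' "$(basename "$d")" "$((n))"
    done
}
```

Hidden subdirectories are skipped by the `*/` glob, so they would need separate handling.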
find . -mindepth 1 -type d -print0 | xargs -0 -I{} sh -c 'printf "%4d : %s\n" "$(find "$1" -type f | wc -l)" "$1"' sh {}
I often need to count the number of files in my subdirectories, and this is the command I use. I prefer the count to appear first.
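Having the count first also makes the list easy to sort numerically. A minimal sketch, with some assumptions: `count_sorted` is a hypothetical name, `find` supports `-maxdepth`, and only immediate subdirectories are listed (the command above descends into every level).

```shell
# count_sorted [DIR]: recursive file count per immediate subdirectory of DIR
# (default: .), largest count first. The function name is hypothetical.
count_sorted() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d -exec sh -c '
        n=$(find "$1" -type f | wc -l)
        printf "%4d : %s\n" "$((n))" "$1"
    ' sh {} \; | sort -rn
}
```

Replacing `sort -rn` with `sort -n` lists the smallest directories first.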
You could use this python code. Boot up the interpreter by running python3 and paste this:
import os, glob
folder_path = '.'
for folder in sorted(glob.glob('{}/*'.format(folder_path))):
    print('{:}: {:>8,}'.format(os.path.split(folder)[-1], len(glob.glob('{}/*'.format(folder)))))
Or a recursive version for nested counts:
import os, glob
def nested_count(folder_path, level=0):
    for folder in sorted(glob.glob('{}/'.format(os.path.join(folder_path, '*')))):
        print('{:}{:}: {:,}'.format(' '*level, os.path.split(os.path.split(folder)[-2])[-1], len(glob.glob(os.path.join(folder, '*')))))
        nested_count(folder, level+1)
nested_count('.')
Example output:
>>> figures: 5
>>> misc: 1
>>> notebooks: 5
>>> archive: 65
>>> html: 12
>>> py: 12
>>> src: 14
>>> reports: 1
>>> content: 6
>>> src: 1
>>> html_download: 1
For reliable methods of counting files in a directory, see this answer of mine: What is a reliable code to count files?
In your case we additionally need to iterate over subdirectories. A for loop is good for this:
for d in */; do
    c="$(cd -- "$d" && find . ! -name . -exec printf a \; | wc -c)"
    if [ "$?" -ne 0 ]; then c="?"; fi
    printf '%-25s : %s\n' "${d%/}" "$c"
done
Notes:
Normally `*/` does not match directories with names starting with a dot ("hidden" directories). In Bash, `shopt -s dotglob` changes this.
`! -name .` is responsible for not counting the respective subdirectory itself.
`${d%/}` removes the trailing `/`. If we used `*` instead of `*/` in the `for` loop, there would be nothing to remove, but the loop would iterate over non-directories as well.
The double dash (`--`) is useful in case a name starts with `-`. If we used `./*/` in the `for` loop, there would be no need for `--`, but you would probably want to remove the leading `./` when printing the output, which would complicate the code.
In the output, `?` appears in case of a problem. A failing `cd` (probably because of insufficient permissions) is the problem I had in mind. `find` being unable to descend into some (sub-…)subdirectory does not qualify as a problem in this context. If you see a number, it means `find` found that many files there, so there are at least as many files in the directory. If you see `?`, it means `find` most likely didn't run because `cd` had failed.
See the already linked answer, it will help you tailor the find command to your needs (e.g. non-recursive solution, counting files of a specific type, some optimizations).
Multi-byte characters in names will confuse printf and the output may appear misaligned.
The format that prints names is `%-25s`, so any name is printed as-is (plus padding). Newline characters, carriage returns, or escape sequences in names may cause results you don't expect. With the `printf` builtin in Bash, use `%q` instead of `%s` (`%-25q` in our case) to mitigate the problem.
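The loop above counts the bytes written by `printf a` rather than lines of `find` output. A small sketch of why that is more robust, using two hypothetical helper functions (the names are mine, not from the linked answer): a file name may contain a newline, which inflates a line-based count but not a byte-based one.

```shell
# Two ways to count the entries below a directory.
# Both function names are hypothetical and exist only for this comparison.

count_by_lines() {   # one line per name: miscounts names that contain newlines
    n=$(cd -- "$1" && find . ! -name . | wc -l)
    echo "$((n))"
}

count_by_bytes() {   # one byte per entry, whatever the name looks like
    n=$(cd -- "$1" && find . ! -name . -exec printf a \; | wc -c)
    echo "$((n))"
}
```

For a directory holding one ordinary file and one file whose name contains a newline, `count_by_lines` reports 3 while `count_by_bytes` reports the correct 2.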
Output in CSV format for folders:
find . -mindepth 1 -type d | while IFS= read -r f; do echo "${f#./},$(ls "$f" | wc -l)"; done
output:
aFolder,30
bFolder,20
cFolder,10
This is what I use. The script builds an array of all the subdirectories of the directory you give as a parameter, then prints each subdirectory followed by the number of entries in it, until all the subdirectories are processed.
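#!/bin/bash
# Collect the subdirectories of the directory given as $1 (default: .) into an array.
dir="${1:-.}"
directories=("${dir%/}"/*/)
for item in "${directories[@]}"
do
    if [ -d "$item" ]; then
        # Print the subdirectory, then the number of entries directly inside it.
        echo "${item%/}"
        ls "$item" | wc -l
    fi
done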