I just installed ClamAV on my Linux system.

I intend to do a manual scan of the whole system from root (/). But there are directories I know should be skipped (e.g. /proc, /sys, /dev).

The clamscan online documentation only gives an abbreviated list of options. clamscan --help gives much more but it ONLY says:

--exclude=REGEX                      Don't scan file names matching REGEX
--exclude-dir=REGEX                  Don't scan directories matching REGEX
--include=REGEX                      Only scan file names matching REGEX
--include-dir=REGEX                  Only scan directories matching REGEX

Nowhere can I find any description of the specific REGEX syntax to use.

In particular I want to know if I should use grep basic or extended regex syntax or perhaps some other dialect.

I also found a post where someone was using --exclude to exclude directories instead of --exclude-dir and would like to know if that should work.

1 Answer

I've done some experimentation and discovered that you should use extended regex syntax.

So the characters ‘?’, ‘+’, ‘{’, ‘|’, ‘(’, and ‘)’ have their special meaning and must be escaped with a backslash ('\') to be taken literally.
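One quick way to try a pattern before handing it to clamscan is bash's own [[ string =~ regex ]] test, which also uses POSIX extended syntax. This is only a sketch: ClamAV has its own regex code, so treat a result here as a good approximation rather than a guarantee. The function name re_matches is made up for this example.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed if the path in $2 matches the extended regex in $1.
# The match is unanchored unless the regex itself uses ^ or $.
re_matches () {
    local re=$1 path=$2
    [[ $path =~ $re ]]
}

# '+' is a quantifier in extended syntax, so it must be escaped to match literally.
re_matches '/lost\+found' '/home/lost+found' && echo 'escaped + matches lost+found'
re_matches '/lost+found'  '/home/lost+found' || echo 'unescaped + does not'

# '^' anchors the alternation at the start of the path.
re_matches '^/(proc|sys|dev)' '/proc/cpuinfo'   && echo '/proc/cpuinfo matches'
re_matches '^/(proc|sys|dev)' '/home/me/proc/x' || echo '/home/me/proc/x does not'
```

Note that '^/(proc|sys|dev)' is a prefix match, so it would also match a path such as /devel.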

When --exclude is used, any file whose path matches the regex (anywhere in the path) will not be scanned. So if the regex matches a directory the file is in, the file will not be scanned, but clamscan still visits the file to check whether it matches. When --exclude-dir is used and a directory matches, none of its contents are scanned or even examined for matches.

The difference shows up in the log (if one is specified with --log): --exclude-dir produces a single entry saying the directory is excluded, whereas --exclude produces an entry for every excluded file in that directory.
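Before a long scan it can help to preview which directories a given --exclude-dir regex would hit. The sketch below does that with find and grep -E rather than with clamscan itself, so it assumes that the two regex engines agree on the pattern and that clamscan tests the same path strings that find prints. The name preview_excluded_dirs is hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: list directories under ROOT ($1) whose path matches
# the extended regex RE ($2). -xdev stops find from descending into other
# filesystems mounted below ROOT.
preview_excluded_dirs () {
    local root=$1 re=$2
    find "$root" -xdev -type d 2>/dev/null | grep -E -- "$re"
}

# Example on a throwaway tree (directory names made up for illustration).
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/.git/objects" "$tmp/proj/src" "$tmp/lost+found"
preview_excluded_dirs "$tmp" '/(lost\+found|.git)'
rm -rf "$tmp"
```

On the example tree this should list the .git directory, its objects subdirectory and lost+found, but not src.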

Here is a bash script I created for myself that builds a couple of --exclude-dir options to exclude root directories and subdirectories. I'm sure it can be improved, but I hope it proves a useful example of the regex and the various options I thought I'd want for running a scan.

#! /usr/bin/env bash

# bool function to test if the user is root or not
is_user_root () { [ "${EUID:-$(id -u)}" -eq 0 ]; }

is_user_root || {
    echo 'You are just an ordinary user. Run as root.' >&2
    exit 1
}

LOG_FILE=/var/log/clamscan.log
# Entries are regex fragments, so regex metacharacters such as + must be escaped
EXCLUDE_ROOT_DIRS=(proc sys dev media mnt data/Downloads)
EXCLUDE_SUBDIRS=('lost\+found' .git)

declare -a EXCLUDE_DIRS
if [[ ${#EXCLUDE_ROOT_DIRS[@]} -ne 0 ]]; then
    ED_RE="^/("; for xrd in "${EXCLUDE_ROOT_DIRS[@]}"; do ED_RE+="$xrd|"; done; ED_RE="${ED_RE%|})"
    EXCLUDE_DIRS+=("--exclude-dir=$ED_RE")
    #EXCLUDE_DIRS+="--exclude-dir=^/("; for xrd in ${EXCLUDE_ROOT_DIRS[@]}; do EXCLUDE_DIRS+="$xrd|"; done; EXCLUDE_DIRS="${EXCLUDE_DIRS%|})"; echo $EXCLUDE_DIRS
fi
if [[ ${#EXCLUDE_SUBDIRS[@]} -ne 0 ]]; then
    ED_RE="/("; for xsd in "${EXCLUDE_SUBDIRS[@]}"; do ED_RE+="$xsd|"; done; ED_RE="${ED_RE%|})"
    EXCLUDE_DIRS+=("--exclude-dir=$ED_RE")
    #EXCLUDE_DIRS+=" --exclude-dir=/("; for xsd in ${EXCLUDE_SUBDIRS[@]}; do EXCLUDE_DIRS+="$xsd|"; done; EXCLUDE_DIRS="${EXCLUDE_DIRS%|})"; echo $EXCLUDE_DIRS
fi

# Adding --verbose will write Scanning messages to stdout e.g.
# Scanning /data/Games/henry/Steam/ubuntu12_32/steam-runtime/usr/share/doc/libglib2.0-0/README.gz

# Show the command, then run it
echo clamscan --suppress-ok-results --log="$LOG_FILE" --max-filesize=100M --recursive "${EXCLUDE_DIRS[@]}" /
clamscan --suppress-ok-results --log="$LOG_FILE" --max-filesize=100M --recursive "${EXCLUDE_DIRS[@]}" /

# NOTE: may want to edit the log file and remove the reports on Symbolic links and Empty files
sed -i -E -e '/: (Symbolic link|Empty file)$/d' "$LOG_FILE"

echo 'List of any FOUND infected files'
grep 'FOUND$' "$LOG_FILE"

The command line the above script executes is (the regex options are quoted here so that it can be pasted into a shell):

clamscan --suppress-ok-results --log=/var/log/clamscan.log --max-filesize=100M --recursive '--exclude-dir=^/(proc|sys|dev|media|mnt|data/Downloads)' '--exclude-dir=/(lost\+found|.git)' /
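As one possible tidy-up, the two regex-building loops in the script could be folded into a small function that joins its arguments with '|'. This is a sketch of that idea, not a tested replacement for the script; build_exclude_re is a name invented here, and the array contents are copied from the script.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print PREFIX ($1) followed by the remaining arguments
# joined with '|' inside parentheses, e.g.  '^/' a b  ->  ^/(a|b)
build_exclude_re () {
    local prefix=$1; shift
    local IFS='|'          # "$*" joins the arguments with the first char of IFS
    printf '%s(%s)\n' "$prefix" "$*"
}

EXCLUDE_ROOT_DIRS=(proc sys dev media mnt data/Downloads)
EXCLUDE_SUBDIRS=('lost\+found' .git)

EXCLUDE_DIRS=()
if (( ${#EXCLUDE_ROOT_DIRS[@]} )); then
    EXCLUDE_DIRS+=("--exclude-dir=$(build_exclude_re '^/' "${EXCLUDE_ROOT_DIRS[@]}")")
fi
if (( ${#EXCLUDE_SUBDIRS[@]} )); then
    EXCLUDE_DIRS+=("--exclude-dir=$(build_exclude_re '/' "${EXCLUDE_SUBDIRS[@]}")")
fi

# Print one option per line to check the result before running a scan.
printf '%s\n' "${EXCLUDE_DIRS[@]}"
```

Run as is, this prints the same two --exclude-dir options that appear in the command line above. Passing the array to clamscan as "${EXCLUDE_DIRS[@]}" (with the quotes) keeps each regex intact as a single argument.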