I have a bunch of zipped up log files and I want to search them all for a string. I tried this but it's not working:

find ./ -name "*.log.zip" -exec gzip -dc {} | grep ERROR \;

It's giving me:

find: incomplete statement
grep: can't open ;

What I want is, for each .log.zip file, unzip it and grep the output for "ERROR". Doing this on AIX, for what it's worth.

2 Answers

If you don't need to know which zipped log files contain the string:

find ./ -name "*.log.zip" -type f -exec gzip -dc {} + | grep ERROR

If you do want to know which files contain the string:

find ./ -name "*.log.zip" -type f -exec sh -c 'gzip -dc -- "$1" | grep -q ERROR' findsh {} \; -print

The first command finds the files and passes their names to -exec. I added the -type f restriction to be sure that we're only matching regular files -- imagine someone running "mkdir foo.log.zip". gzip decompresses each file to stdout, and the stdout of the whole find command is then piped through grep. If you also want to drop any find or gzip error messages, add 2>/dev/null just before the pipe. The + at the end of -exec passes as many filenames as will fit on each command line, which minimizes the number of calls to gzip. Because gzip sends all the file contents to stdout, grep just sees an incoming stream of bytes -- without filenames -- and prints any matching lines.
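As a rough illustration of that first form, here is a minimal sketch run against a scratch directory. The directory, the file names a.log.zip and b.log.zip, and their contents are made up for the demo, and it assumes the files are really gzip-compressed, as in the question.

```shell
# Build a scratch directory with two gzip-compressed logs (hypothetical names).
demo_dir=$(mktemp -d)
printf 'ok\nERROR disk full\n' | gzip -c > "$demo_dir/a.log.zip"
printf 'all good\n' | gzip -c > "$demo_dir/b.log.zip"

# gzip decompresses the batch of files to one stream; grep never sees a filename.
# 2>/dev/null before the pipe drops any find or gzip error messages.
matches=$(find "$demo_dir" -name "*.log.zip" -type f -exec gzip -dc {} + 2>/dev/null | grep ERROR)
printf '%s\n' "$matches"

rm -rf "$demo_dir"
```

The output is the matching line only, with nothing to say which file it came from.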

On the other hand, if you need to know the matching filenames, you have to capture that sooner in the pipeline.

On a GNU/Linux system (which has zgrep), you could do it directly:

find . -name "*.log.zip" -type f -exec zgrep -l ERROR {} +

This passes as many filenames as will fit to zgrep, which we ask to print only the names of matching files (the -l option).

On an AIX system, you can recreate that functionality with a small shell script. The syntax can be a little intimidating, but let's break it down from the outside-in:

find ... -exec sh -c ' ... ' findsh {} \; -print

The above statement gathers one matching file at a time (\;) and sends it as an argument to the given sh script; if the script returns success, the filename is printed, otherwise it is not. The findsh portion is arbitrary text; it becomes the $0 argument to sh, giving the inline shell script a name.
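To see how the arguments after the script text are assigned, you can call sh -c by hand. This is only a sketch; ./app.log.zip is a placeholder name and no such file needs to exist.

```shell
# The first word after the script becomes $0 (the script's name),
# and the next word becomes $1, which is where find puts the filename.
result=$(sh -c 'printf "%s|%s\n" "$0" "$1"' findsh ./app.log.zip)
printf '%s\n' "$result"
```

If you left out findsh, the filename itself would land in $0 and $1 would be empty.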

Note:

The {} syntax needs to be outside of the shell script; otherwise, it could lead to arbitrary command execution. On AIX, the braces are not substituted inside the -exec parameter, so you would see "gzip: {}.gz: No such file or directory" errors if you tried it. On GNU/Linux, find does substitute the filename inside the shell script, which means if someone created a file named $(touch foo).log.zip, you would end up with a file named "foo" because the shell script initiates another layer of parsing on the filenames. See more at this UNIX & Linux question: Is it possible to use find -exec sh -c safely?
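A small experiment can make the safe form concrete. The sketch below creates a deliberately hostile file name in a scratch directory (the name and the "pwned" marker file are invented for the demo) and checks that passing {} as an argument does not execute it.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# A hostile file name containing a command substitution.
printf 'ERROR boom\n' | gzip -c > './$(touch pwned).log.zip'

# Safe form: the name arrives as data in "$1" and is never re-parsed as shell code.
found=$(find . -name "*.log.zip" -type f -exec sh -c 'gzip -dc -- "$1" | grep -q ERROR' findsh {} \; -print)
printf '%s\n' "$found"

# If the name had been executed, a file called "pwned" would now exist.
if [ -e pwned ]; then injected=yes; else injected=no; fi
printf 'injected: %s\n' "$injected"

cd - >/dev/null
rm -rf "$demo_dir"
```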

Once the filenames have been passed in one-by-one, the shell script is:

gzip -dc -- "$1" | grep -q ERROR

The filename is in $1, so we call gzip -dc on it. Out of habit, I try to mark the end of options before an arbitrary filename, just in case that filename begins with a hyphen -- or any other character -- that could be misinterpreted by the command as an option. Since our find command specifically starts the search with ./, all the matching filenames will begin with that string, so they'll never look like options to gzip, but it's better to have safe habits. Once gzip has piped the contents, grep searches quietly for the string. If grep finds the string, the shell will exit successfully, allowing the subsequent printing; otherwise, it will cause the -exec to return a false/failure exit code, so the filename would not print.
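The exit status that find acts on is the status of the last command in the pipeline, which is grep. A quick sketch, using two invented files (hit.log.zip and miss.log.zip) in a scratch directory:

```shell
demo_dir=$(mktemp -d)
printf 'ERROR one\n' | gzip -c > "$demo_dir/hit.log.zip"
printf 'fine\n' | gzip -c > "$demo_dir/miss.log.zip"

# Run the inline script by hand, exactly as find would for each file.
# The pipeline's status is grep's status: success on a match, failure otherwise.
sh -c 'gzip -dc -- "$1" | grep -q ERROR' findsh "$demo_dir/hit.log.zip" && hit=printed || hit=skipped
sh -c 'gzip -dc -- "$1" | grep -q ERROR' findsh "$demo_dir/miss.log.zip" && miss=printed || miss=skipped
printf 'hit.log.zip: %s\nmiss.log.zip: %s\n' "$hit" "$miss"

rm -rf "$demo_dir"
```

Only the file whose script succeeds would reach the -print that follows -exec.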

0

There is an error in your syntax. The shell splits the command line at the | before find ever runs, so find sees an -exec with no terminating \; or + (hence "incomplete statement"), and grep receives ";" as the name of a file to open. The difference between the two terminators is that + runs the command as few times as possible, passing many files at once, while \; runs the command once for every file.
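One way to picture the failure is to print the words that end up on each side of the pipe in the original command. This sketch uses printf as a stand-in for find and grep, and shows only the tail of find's argument list.

```shell
# Left of the pipe: the end of what find receives. No \; or + terminator is present.
find_args=$(printf '[%s]' -exec gzip -dc '{}')
# Right of the pipe: what grep receives. The pattern, then a literal ";" as a file name.
grep_args=$(printf '[%s]' ERROR \;)
printf 'find side: %s\ngrep side: %s\n' "$find_args" "$grep_args"
```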

Try this:

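# gzip -dc is used instead of zcat, which on some non-GNU systems only handles compress (.Z) files
find ./ -name "*.log.zip" -exec gzip -dc {} + | grep ERROR
# or, with the pipe inside a per-file shell; {} is passed as "$1" rather than embedded in the script
find ./ -name "*.log.zip" -exec sh -c 'gzip -dc -- "$1" | grep ERROR' sh {} \;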