One way would be to use this quick hack:
#!/usr/bin/ruby
=begin
Quick-and-dirty way to grep in *.tar.gz archives

Assumption: each and every file read from any of the supplied tar
archives will fit into memory. If not, the data reading has to be
rewritten (a proxy that reads line-by-line would have to be inserted).
=end
require 'rubygems'
gem 'minitar'
require 'zlib'
require 'archive/tar/minitar'

if ARGV.size < 2
  STDERR.puts "#{File.basename($0)} <regexp> <file>+"
  exit 1
end

regexp = Regexp.new(ARGV.shift, Regexp::IGNORECASE)

for file in ARGV
  zr = Zlib::GzipReader.new(File.open(file, 'rb'))
  begin
    Archive::Tar::Minitar::Reader.new(zr).each do |e|
      next unless e.file?
      data = e.read
      if regexp =~ data
        data.split(/\n/).each_with_index do |l, i|
          puts "#{file},#{e.full_name}:#{i + 1}:#{l}" if regexp =~ l
        end
      end
    end
  ensure
    zr.close  # also closes the underlying File
  end
end
Which is not to say I'd recommend it for bigger archives, as each file from the archive is read into memory (twice, actually: once by e.read and again when the data is split into lines).
If you want a more memory-efficient version, you'd either have to go with a different implementation of the e.read loop... or, perhaps, with a different language altogether. ;)
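To give an idea of what the streaming alternative looks like, here's a minimal stdlib-only sketch that scans a plain .gz file line by line with Zlib::GzipReader (no minitar, and the sample data is made up): memory use stays bounded by the longest line rather than the whole file. Doing the same for tar entries would mean buffering each entry in chunks instead of calling e.read.

```ruby
require 'zlib'
require 'tempfile'

# Build a small gzipped sample file to scan (hypothetical contents).
sample = Tempfile.new(['sample', '.gz'])
Zlib::GzipWriter.open(sample.path) do |gz|
  gz.write("foo line\nbar line\nanother foo\n")
end

regexp = /foo/i

# Stream the decompressed data one line at a time instead of
# slurping it whole; each_line yields lines as they are inflated.
matches = []
Zlib::GzipReader.open(sample.path) do |gz|
  gz.each_line.with_index(1) do |line, lineno|
    matches << "#{sample.path}:#{lineno}:#{line.chomp}" if regexp =~ line
  end
end
puts matches
```

The trade-off is the same as with grep itself: line-oriented streaming only works once you have a decompressed byte stream, which is why the tar layer is the awkward part.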
I could make it a bit more efficient if you're really interested... but it will definitely not compare with C or other compiled languages in terms of raw speed.