
Gzipping a tar file as a whole is dead easy and even built into tar as an option. So far, so good. However, from an archiver's point of view, it would be better to tar the gzipped individual files. (The rationale is that data loss is limited to a single corrupt gzipped file, rather than losing the whole tarball to a gzip or copy error.)
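
For reference, the whole-archive route is a one-liner using tar's built-in gzip option:

tar czf folder.tar.gz folder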

Does anyone have experience with this? Are there drawbacks? Are there more solid/tested solutions for this than

find folder -type f -exec gzip '{}' \;
tar cf folder.tar folder
Boldewyn

4 Answers


The key disadvantage is reduced compression, especially if your archive contains many small files: gzip builds its dictionary per stream, so compressing each file separately throws away the redundancy between files that compressing the whole tar would exploit.

You might be better off compressing the data the usual way (or, if you have CPU cycles to spare, with the slower but more space-efficient 7zip) and then wrapping the result in a parity-based fault-tolerant format such as http://en.wikipedia.org/wiki/Parchive. This gives you a much better chance of complete recovery after corruption from media failure or problems in transit over the network, while not compromising too much on the size of the resulting archives.
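
A sketch of that pipeline, assuming the par2cmdline tool is installed (the archive name and the 10% redundancy level are just examples):

7z a folder.7z folder                       # or: tar czf folder.tar.gz folder
par2 create -r10 folder.7z.par2 folder.7z   # generate ~10% parity data
par2 verify folder.7z.par2                  # check integrity later
par2 repair folder.7z.par2                  # reconstruct after corruption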


If you're going to do it this way, then use the tried-and-true method:

zip -r folder.zip folder
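
zip compresses each entry independently, so one damaged member doesn't take the rest of the archive with it, which is exactly the property you're after. You can test an archive afterwards with:

unzip -t folder.zip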

Why not just toss the --verify (or -W) flag at tar? This makes tar read the archive back after writing and check that the contents match the source.
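
A minimal sketch (note that GNU tar refuses to verify compressed archives, so compress in a second step):

tar -cWf folder.tar folder   # write the archive, then read it back and compare
gzip folder.tar              # compress afterwards; -W can't be combined with -z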

Jack M.

What do you want to back up? If permissions don't matter (e.g. these aren't system files), I'd go with 7zip. It gives much better performance (multi-core/CPU) with much better compression.
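
A sketch of that route, assuming the 7z command-line tool (the flag values are just examples):

7z a -mx=9 -mmt=on folder.7z folder   # -mx=9: maximum compression, -mmt=on: use all cores

This is also why the permissions caveat above matters: the 7z format does not store Unix ownership or permissions.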

Apache