161

I've known gzip for years, and recently I saw bzip being used at work. Are they basically equivalent, or does one have significant pros and cons compared with the other?

ripper234

7 Answers

200

Gzip and bzip2, as well as xz and lzop, are functionally equivalent. (There once was a bzip, but it seems to have completely vanished off the face of the world.) Other common compression formats are zip, rar and 7z; these three do both compression and archiving (packing multiple files into one). Here are some typical ratings in terms of speed, availability and typical compression ratio (note that these ratings are somewhat subjective, don't take them as gospel):

decompression speed (fast > slow): lzop > gzip, zip > xz > 7z > rar > bzip2
compression speed (fast > slow): lzop > gzip, zip > xz > bzip2 > 7z > rar
compression ratio (better > worse): xz > 7z > rar, bzip2 > gzip > zip > lzop
availability (unix): gzip > bzip2 > xz > lzop > zip > 7z > rar
availability (windows): zip > rar > 7z > gzip > bzip2, lzop, xz

As you can see, there isn't a clear winner. If you want to rely on programs that are likely to be installed already, use zip on Windows (or if possible, self-extracting archives, as Windows doesn't ship with any of these) and gzip on unix. If you want maximum compression, use 7z or xz.
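For a rough feel of the speed versus ratio trade-off on your own data, here is a minimal sketch using the gzip, bz2 and lzma (xz) modules from Python's standard library. The `compare` helper and the sample input are made up for illustration, and the numbers depend heavily on the data and the machine, so treat the output as indicative only.

```python
import bz2
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "xz": lzma.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)  # default compression level for each codec
        elapsed = time.perf_counter() - start
        results[name] = (len(packed), elapsed)
    return results


# Hypothetical sample input: very repetitive text, which flatters every codec.
sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000

for name, (size, seconds) in compare(sample).items():
    print(f"{name:6s} {size:8d} bytes  {seconds:.4f} s")
```

Swap `sample` for the contents of a real file to see how the formats behave on the data you actually care about.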

Non-Unix native formats (zip, rar, 7z) don't preserve all Unix metadata (ownership, permissions). If you need that, use compressed tar.
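As a small illustration of the compressed tar route, the sketch below uses Python's `tarfile` module to pack one file and read back the permission bits stored in the archive. The file names and the `pack_and_inspect` helper are hypothetical, and it assumes a Unix-like system where `os.chmod` sets the usual permission bits.

```python
import os
import tarfile
import tempfile


def pack_and_inspect(mode: str = "w:gz") -> int:
    """Archive one file into a compressed tar; return its stored permission bits."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "script.sh")  # hypothetical file name
        with open(src, "w") as f:
            f.write("#!/bin/sh\necho hello\n")
        os.chmod(src, 0o750)

        # "w:gz", "w:bz2" and "w:xz" select gzip, bzip2 and xz compression.
        archive = os.path.join(tmp, "backup.tar")
        with tarfile.open(archive, mode) as tar:
            tar.add(src, arcname="script.sh")

        # "r:*" lets tarfile detect the compression when reading.
        with tarfile.open(archive, "r:*") as tar:
            member = tar.getmember("script.sh")
            return member.mode & 0o777


print(oct(pack_and_inspect("w:bz2")))
```

The tar member also records owner and group information (`member.uid`, `member.uname` and so on), which is the metadata the paragraph above refers to.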

Rar also has the downside that, as far as I know, there is no open source software that creates rar archives or that can unpack all rar archives. The other formats have free implementations and no (serious) patent claims.

32

As far as I can tell, gzip is faster overall, while bzip generally produces smaller output (a better compression ratio).

Lie Ryan
11

The algorithms have different time, memory, space tradeoffs. Bear in mind these algorithms were written quite a while back and your smartphone has many times more CPU than desktops of those days.

Your pick is between universality (.gz) and a bit more compression (.bz2). Only you can say which you care about more.

One advantage of .gz is that it can compress a stream, that is, a sequence where you can't look behind. This makes it the official compressor of HTTP streams. I needed to use gzip once because of that, but it's unlikely you'll need to think about it.
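To make "compressing a stream" concrete, here is a minimal sketch with Python's `zlib` module that emits gzip-format output chunk by chunk, without ever holding the whole input. The `gzip_stream` generator is a made-up name for illustration, and this only shows the general idea, not how any particular HTTP server does it.

```python
import gzip
import zlib


def gzip_stream(chunks):
    """Yield gzip-format bytes as input chunks arrive."""
    # wbits=31 (16 + 15) asks zlib for a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered, plus the trailer


# Hypothetical input stream: many small pieces produced one at a time.
pieces = (b"hello world\n" for _ in range(1000))
packed = b"".join(gzip_stream(pieces))

# The result is an ordinary gzip stream that any gzip reader can decode.
print(len(packed), len(gzip.decompress(packed)))
```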

Rich Homolka
4

Here is a list of sites that test compression algorithms. To find just bzip and gzip you will have to do some digging, but most sites list the characteristics of each algorithm. This way you can compare what is important to you: size (compression ratio), time, memory, or CPU.
http://www.maximumcompression.com/benchmarks/benchmarks.php
Archived copy: https://web.archive.org/web/20210126053224/https://maximumcompression.com/benchmarks/benchmarks.php

2

gzip is way faster; bzip2 makes way smaller archives.

Since memory is cheap, gzip is usually better for general usage, whereas bzip2 may be better for preserving many old files.

There's also the newer Zstandard format, which has compression ratios similar to gzip's but is even faster.

1

Per http://tukaani.org/lzma/benchmarks.html, gzip compresses twice as fast as bzip2 and decompresses ten times as fast.

For example, for S3 caching, on Travis, etc., where you want fast compression and decompression and not just small sizes, gzip might be a good trade-off.

0

In my experience, bzip has consistently offered better compression ratios than gzip. Plus, with 7zip as the archive manager and bzip as the algorithm, 7zip can make use of multi-core processors.
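The multi-core point can be sketched in Python: split the input into chunks, compress each chunk as its own bzip2 stream on a worker thread, and concatenate the results. This is only an illustration of the general idea and not a description of how 7zip implements it. The `parallel_bz2` helper, the chunk size and the worker count are assumptions, and the speed-up relies on the `bz2` module releasing the interpreter lock while it compresses.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2(data: bytes, chunk_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress chunks independently and return concatenated bzip2 streams."""
    # Fall back to a single empty chunk so empty input still yields a valid stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams line up with the chunks.
        parts = list(pool.map(bz2.compress, chunks))
    return b"".join(parts)


# Hypothetical payload of about 2.5 MB.
payload = bytes(range(256)) * 10000
packed = parallel_bz2(payload)

# bz2.decompress accepts several concatenated streams and joins their contents.
assert bz2.decompress(packed) == payload
print(len(payload), "->", len(packed))
```

Compressing chunks independently usually costs a little compression ratio compared with a single stream, which is the price of the parallelism.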