95

I have a script that extracts a tar.gz-file to a specified subdirectory mysubfolder:

mkdir mysubfolder; tar --extract --file=sourcefile.tar.gz --strip-components=1 --directory=mysubfolder;

Is there any equivalent way of doing this with a zip-file?

Run5k
Fredrik

4 Answers

41

As Mathias said, unzip has no such option. But a one-liner bash script can do the job.

Problem is: the best approach depends on your archive layout. A solution that assumes a single top-level dir will fail miserably if the content is directly in the archive root: think about an archive containing /a/foo, /b/foo, /foo, and the chaos of stripping /a and /b.

The same failure occurs with tar --strip-components. There is no one-size-fits-all solution.
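
To make the problem concrete, here is a rough sketch that mimics, on a plain list of paths, what stripping one leading component does. The helper name strip_one is made up for illustration, and it assumes that entries with too few components are simply skipped, which is how GNU tar behaves as far as I know:

```shell
# Hypothetical helper: simulate stripping one leading path component
# from a list of archive paths read on stdin (one path per line).
strip_one() {
    # Keep only entries that have something after the first slash;
    # root-level files and bare directory entries are dropped.
    grep '/.' |
        # Remove the first component and its slash.
        sed 's|^[^/]*/||'
}
```

Feeding it the layout above with printf 'a/foo\nb/foo\nfoo\n' | strip_one prints foo twice: the two nested files now collide, and the root-level foo is gone.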

So, to strip the root dir, assuming there is one (and only one):

unzip -d "$dest" "$zip" && f=("$dest"/*) && mv "$dest"/*/* "$dest" && rmdir "${f[@]}"

Just make sure that second-level files/dirs do not have the same name as the top-level parent (for example, /foo/foo). Deeper paths such as /foo/bar/foo and /foo/bar/bar are fine. If there is such a clash, or you just want to be safe, you can use a temp dir for extraction:

temp=$(mktemp -d) && unzip -d "$temp" "$zip" && mkdir -p "$dest" &&
mv "$temp"/*/* "$dest" && rmdir "$temp"/* "$temp"

If you're using Bash, you can test if top level is a single dir or not using:

f=("$temp"/*); (( ${#f[@]} == 1 )) && [[ -d "${f[0]}" ]] && echo "Single dir!"
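
If you would rather check before extracting anything, the same test can be sketched on the archive listing alone. The function names below are hypothetical, and the sketch assumes the listing has one path per line with no leading / or ./ (which is what unzip -Z1 normally prints):

```shell
# Hypothetical helper: print the distinct top-level names in a path
# listing read on stdin.
top_level_entries() {
    # Keep only the first path component, then de-duplicate.
    cut -d/ -f1 | sort -u
}

# Hypothetical helper: succeed only if every entry in the listing
# lives under one and the same top-level directory.
has_single_top_dir() {
    local listing count
    listing=$(cat)
    # An empty listing has no top-level directory at all.
    [ -n "$listing" ] || return 1
    # A path without any slash is a file sitting in the archive root.
    if printf '%s\n' "$listing" | grep -qv '/'; then
        return 1
    fi
    count=$(printf '%s\n' "$listing" | top_level_entries | wc -l)
    [ "$count" -eq 1 ]
}
```

Usage would be something like: unzip -Z1 "$zip" | has_single_top_dir && echo "Single dir!"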

Speaking of Bash, you should turn on dotglob to include hidden files, and you can wrap everything in a single, handy function:

# unzip featuring an enhanced version of tar's --strip-components=1
# Usage: unzip-strip ARCHIVE [DESTDIR] [EXTRA_cp_OPTIONS]
# If DESTDIR is omitted, it defaults to ./NAME, where NAME is the archive's
# single top-level dir or, failing that, the archive filename without extension
unzip-strip() (
    set -eu
    local archive=$1
    local destdir=${2:-}
    shift; shift || :
    local tmpdir=$(mktemp -d)
    trap 'rm -rf -- "$tmpdir"' EXIT
    unzip -qd "$tmpdir" -- "$archive"
    shopt -s dotglob
    local files=("$tmpdir"/*) name i=1
    if (( ${#files[@]} == 1 )) && [[ -d "${files[0]}" ]]; then
        name=$(basename "${files[0]}")
        files=("$tmpdir"/*/*)
    else
        name=$(basename "$archive"); name=${name%.*}
        files=("$tmpdir"/*)
    fi
    if [[ -z "$destdir" ]]; then
        destdir=./"$name"
    fi
    while [[ -f "$destdir" ]]; do destdir=${destdir}-$((i++)); done
    mkdir -p "$destdir"
    cp -ar "$@" -t "$destdir" -- "${files[@]}"
)

Now put that in your ~/.bashrc and you'll never have to worry about it again. Simply use as:

unzip-strip sourcefile.zip [mysubfolder] [OPTIONS]

This little beast will:

  • Create mysubfolder for you if it does not exist
  • Automatically detect whether your zip archive contains a single top-level directory and handle the extraction accordingly.
  • mysubfolder is optional. If blank it will extract to a subdir of the current directory (not necessarily the archive directory!), named after:
    • The single top-level directory in the archive, if there is one
    • or archive file name, without the (presumably .zip) extension
  • If destination path, given or derived, already exists as a file, increment the name until a suitable one is found (new path or existing directory).
  • By default this will:
    • be silent
    • overwrite any existing files
    • preserve links and attributes (mode, timestamps, etc)
  • You can pass extra OPTIONS to cp. Useful options are:
    • -v|--verbose: output each copied file, just like unzip does
    • -n|--no-clobber: do not overwrite existing files
    • -u|--update: only overwrite a destination file when the source file is newer (or the destination is missing)
  • Such extra OPTIONS are the 3rd argument onward. Craft a proper argument parser if needed.
  • It will use double the extracted disk space during the operation, due to "extract to temp dir and copy" approach. No way around this without losing some of its flexibility/features.
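
One caveat on the "increment the name" step: as written, the loop in the function appends each new counter to the previous result, so repeated clashes give name-1, then name-1-2, and so on. If you prefer name-1, name-2, ..., a minimal sketch could look like this (next_free_dest is a made-up name, not part of the function above):

```shell
# Hypothetical helper: print DEST if it can be used as a directory,
# otherwise the first of DEST-1, DEST-2, ... that can.
next_free_dest() {
    local base="$1" candidate="$1" i=1
    # Only a regular file blocks us; a new path or an existing
    # directory is an acceptable destination.
    while [ -f "$candidate" ]; do
        candidate="$base-$i"
        i=$((i + 1))
    done
    printf '%s\n' "$candidate"
}
```

Inside the function you would then use destdir=$(next_free_dest "$destdir") in place of the while loop.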
MestreLion
13

You can use -j to junk paths (do not make directories). This is only recommended for the fairly common case of single-level archives. Archives with multi-level directory structures will be flattened, which can lead to name clashes among the extracted files.

From the man page of unzip:

   -j     junk  paths.   The  archive's directory structure is not recreated; all files are deposited in the
          extraction directory (by default, the current one).
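
To find out in advance whether -j would cause clashes, you could check the archive listing for duplicate base names. This is only a sketch: junk_path_clashes is a hypothetical name, and it assumes a listing with one path per line where directory entries end in a slash:

```shell
# Hypothetical helper: print the file names that would collide if all
# paths read on stdin were flattened into one directory.
junk_path_clashes() {
    # Ignore directory entries, which -j does not create anyway.
    grep -v '/$' |
        # Strip everything up to the last slash, leaving the base name.
        sed 's|.*/||' |
        # Report each base name that occurs more than once.
        sort | uniq -d
}
```

For example, unzip -Z1 archive.zip | junk_path_clashes prints nothing when flattening is safe.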
10

As others have noted, unzip does not support this. However, bsdtar can extract zip files as well.

bsdtar xvf app.zip --strip-components=1 -C /opt/some-app

bsdtar can also read the archive from a stream such as standard input (which unzip does not support).

curl -sSL https://... | bsdtar xvf - --strip-components=1 -C /opt/some-app
fnkr
8

I couldn’t find such an option in the manual pages for unzip, so I’m afraid this is impossible. :(

However, (depending on the situation) you could work around it. For example, if you’re sure the only top-level directory in the zip file is named foo- followed by a version number, you could do something like this:

cd /tmp
unzip /path/to/file.zip
cd foo-*
cp -r . /path/to/destination/folder