4

I have two directories, each with thousands of files, that contain more or less the same files.

How can I copy all files from dirA to dirB that are not yet in dirB, and, if a file already exists in dirB, overwrite it only if the existing copy is smaller?

I know there are many examples that compare timestamps or detect differing file sizes, but I only want to overwrite if the destination file is smaller, and under no circumstances if it's bigger.

Background of my problem:
I've rendered a dynmap on my Minecraft server, but some of the tiles are missing or corrupted. I then did the rendering again on another machine with a faster CPU and copied all the newly rendered files (~50 GB / 6,000,000 PNGs of ~4-10 KB each) to my server. After that I noticed that there are also corrupted files in my new render.

left: old render, right: new render

[Image pair 1: old render corrupted, new render intact]

[Image pair 2: old render intact, new render corrupted]

Therefore I don't want to overwrite all files, only those where the new file is bigger than the existing one (the corrupted tiles carry less data and are smaller).

das Keks
  • 457

5 Answers

3

This may be a dirty way, but I hope it is what you are looking for:

#!/bin/bash

### Purpose:
# Copy huge amount of files from source to destination directory only if
# destination file is smaller in size than in source directory
###

src='./d1' # Source directory
dst='./d2' # Destination directory

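icp() {
  local f="$1"
  local rel="${f#"$src"}"   # path relative to the source directory
  rel="${rel#/}"
  local target="${dst}/${rel}"

  # Directories: create the counterpart in the destination and stop.
  if [ -d "$f" ]; then
    mkdir -p "$target"
    return
  fi

  # File missing in the destination: copy it.
  [ ! -f "$target" ] && { cp -a "$f" "$target"; return; }

  # File exists: overwrite only if the destination is smaller.
  local fsizeSrc fsizeDst
  fsizeSrc=$(stat -c %s "$f")
  fsizeDst=$(stat -c %s "$target")
  [ "$fsizeDst" -lt "$fsizeSrc" ] && cp -a "$f" "$target"
}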

export -f icp
export src
export dst

find "${src}" -exec bash -c 'icp "$0"' {} \;
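
With around six million files, starting one bash process per file can take a long time. A possible variant of the same rule, sketched here under the assumption that GNU stat is available, lets find pass many files to each bash process via -exec ... {} +. The function name sync_bigger is made up for this illustration and is not part of the script above.

```shell
#!/bin/bash
# Sketch: copy files from a source tree to a destination tree if they are
# missing there, or if the destination copy is smaller than the source.
# sync_bigger is a hypothetical name; GNU stat (-c %s) is assumed.
sync_bigger() {
  local src="${1%/}" dst="${2%/}"
  # "{} +" makes find hand many file names to a single bash process.
  find "$src" -type f -exec bash -c '
    src="$1"; dst="$2"; shift 2
    for f in "$@"; do
      target="$dst/${f#"$src"/}"
      if [ ! -e "$target" ]; then
        # Missing in the destination: create the directory and copy.
        mkdir -p "$(dirname "$target")"
        cp -a "$f" "$target"
      elif [ "$(stat -c %s "$target")" -lt "$(stat -c %s "$f")" ]; then
        # Destination is smaller than the source: overwrite it.
        cp -a "$f" "$target"
      fi
    done
  ' _ "$src" "$dst" {} +
}
```

It would be called as sync_bigger ./d1 ./d2. Files that are the same size or bigger in the destination are left untouched.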
Alex
  • 6,375
3

My problem was similar. I wanted to synchronize files from a remote folder to a local one, but only copy the remote files that were bigger than the corresponding local files.

My workaround with rsync looked like this, and is in fact a short bash loop:

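for x in home/me/local/folder/*
do
    eachsize=$(stat -c %s "$x")
    # --min-size skips files smaller than the given size, so the remote file
    # is transferred only if it is bigger than the local one
    rsync -avz --progress --min-size="$((eachsize + 1))" "remote:/home/you/folder/$(basename "$x")" home/me/local/folder/
done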

I think you get the point: since the filenames are the same in both folders, I go through each file in the local folder, record its size, and pass that size to rsync as the limit that decides whether the remote file of the same name should be copied.

2

You can use the rsync command.

Syntax:

-a = archive mode
-v = increase verbosity
-z = compress file data during the transfer
--progress = show progress during transfer

rsync -avz --progress <source path> <destination path>

You can use --delete to delete extraneous files from the destination directory, that is, files that do not exist in the source:

rsync -avz --delete --progress <source path> <destination path>

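So your command will be (the trailing slashes make rsync copy the contents of dirA into dirB rather than creating dirB/dirA):

rsync -avz --delete --progress dirA/ dirB/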
2
rsync -va --append source destination

I just got this from man rsync, where you can find more options for what you want.

  • --append: append data onto shorter files
  • -v: verbose
  • -a: archive mode; same as -rlptgoD (no -H)

--append was very useful for me when a cp -ru a b process had been interrupted several times and the files' timestamps had also been changed by chown user:user -R *. haha :)

Giacomo1968
  • 58,727
0

I've modified this to something like:

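# Copy src to destination if the src is larger.
function copy_if_larger() {
  local src="$1"
  local dest="$2"

  # Skip unless both source and destination are regular files.
  [ ! -f "$src" ] && return
  [ ! -f "$dest" ] && return

  local srcSize dstSize
  srcSize=$(stat -c %s "$src")
  dstSize=$(stat -c %s "$dest")

  # Overwrite only when the destination is strictly smaller.
  if [ "$dstSize" -lt "$srcSize" ]; then
    cp -a "$src" "$dest"
  fi
  return 0
}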

Then I wrote another function that adjusts the paths of the files I want to copy and feeds them into the copy_if_larger function.

function do_copy_if_larger() {
  # trim prefix
  local suffix=$(echo "$1" | cut -c 10-)
  copy_if_larger "$1" "/dest/path/$suffix"
}

# make the functions visible to the subshell.
export -f copy_if_larger
export -f do_copy_if_larger

# copy all larger jpeg files over /dest/path
find . -name '*jpg' -print0 | xargs -0 -n 1 bash -c 'do_copy_if_larger "$1"' _
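
The cut -c 10- step only works when the prefix to trim is exactly nine characters long. A minimal sketch of an alternative that strips a named prefix with parameter expansion instead; map_to_dest, src_root and dest_root are hypothetical names used only for this illustration:

```shell
# Map a source path to its destination by stripping a named prefix.
# map_to_dest, src_root and dest_root are hypothetical, not from the answer.
map_to_dest() {
  local path="$1" src_root="${2%/}" dest_root="${3%/}"
  # Remove "<src_root>/" from the front, then prepend "<dest_root>/".
  printf '%s\n' "${dest_root}/${path#"${src_root}"/}"
}
```

For example, map_to_dest ./photos/a/b.jpg ./photos /dest/path prints /dest/path/a/b.jpg, which could then be passed as the second argument to copy_if_larger.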
Lar
  • 51