Background
I ran out of space on /home/data and need to transfer /home/data/repo to /home/data2.
/home/data/repo contains 1M dirs, each of which contains 11 dirs and 10 files. It totals 2TB.
/home/data is on ext3 with dir_index enabled.
/home/data2 is on ext4.
Running CentOS 6.4.
I assume these approaches are slow because repo/ has 1 million dirs directly underneath it.
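For scale, that's on the order of 22 million inodes: the 1M top-level dirs plus 21 entries inside each. A quick sanity check of the top-level count (a plain ls would sit there sorting a million names; -f disables sorting and implies -a, so expect the count to include . and ..):
/home/data> ls -f repo | wc -l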
Attempt 1: mv is fast but gets interrupted
I could be done if this had finished:
/home/data> mv repo ../data2
But it was interrupted after 1.5TB was transferred. It was writing at about 1GB/min.
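In hindsight I should have guarded against the interruption. Assuming it was my terminal going away, a detachable session would have let the mv keep running; a minimal sketch (session name arbitrary):
/home/data> screen -S bigmove
/home/data> mv repo ../data2
Detach with Ctrl-a d, reattach later with screen -r bigmove.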
Attempt 2: rsync crawls after 8 hours of building file list
/home/data> rsync --ignore-existing -rv repo ../data2
It takes several hours to build the 'incremental file list', and then it transfers at 100MB/min.
I cancel it to try a faster approach.
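A variant I haven't tried: since the million subdirs are independent, several rsyncs could run in parallel, one per top-level dir, so no single invocation has to enumerate the whole tree up front. A rough sketch, assuming GNU xargs:
/home/data/repo> ls --color=never | xargs -P4 -I{} rsync -a --ignore-existing {} /home/data2/repo/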
Attempt 3a: mv complains
Testing it on a subdirectory:
/home/data/repo> mv -f foobar ../../data2/repo/
mv: inter-device move failed: 'foobar' to '../../data2/repo/foobar'; unable to remove target: Is a directory
I'm not sure what this error is about (presumably mv, forced to copy because the move crosses devices, refuses to replace the already-existing target directory), but maybe cp can bail me out...
Attempt 3b: cp gets nowhere after 8 hours
/home/data> cp -nr repo ../data2
It reads the disk for 8 hours and I decide to cancel it and go back to rsync.
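Part of the pain is flying blind: cp -n prints nothing while it works out what to skip. A cheap progress probe from a second terminal is to count top-level dirs as they appear at the destination:
/home/data> watch -n 60 'ls -f /home/data2/repo | wc -l'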
Attempt 4: rsync crawls after 8 hours of building file list
/home/data> rsync --ignore-existing --remove-source-files -rv repo ../data2
I used --remove-source-files thinking it might make it faster if I start cleanup now.
It takes at least 6 hours to build the file list, then it transfers at 100-200MB/min.
But the server was burdened overnight and my connection closed.
Attempt 5: THERE'S ONLY 300GB LEFT TO MOVE, WHY IS THIS SO PAINFUL
/home/data> rsync --ignore-existing --remove-source-files -rvW repo ../data2
Interrupted again. The -W almost seemed to make "sending incremental file list" faster, which to my understanding shouldn't make sense. Regardless, the transfer is horribly slow and I'm giving up on this one.
Attempt 6: tar
/home/data> nohup tar cf - . | (cd ../data2; tar xvfk -)
Basically attempting to re-copy everything but ignoring existing files (the k flag tells the extracting tar to keep existing files rather than overwrite them). It has to wade through 1.7TB of existing files, but at least it's reading at 1.2GB/min.
So far, this is the only command which gives instant gratification.
Update: interrupted again, somehow, even with nohup..
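I suspect the culprit: nohup only covers the first tar; the extracting tar on the right of the pipe is an ordinary child of my login shell and still dies with the terminal. Wrapping the whole pipeline in one shell that nohup protects should cover both ends; a sketch (log path arbitrary, v dropped to keep the log small):
/home/data> nohup sh -c 'tar cf - . | (cd ../data2; tar xfk -)' > /tmp/tarcopy.log 2>&1 &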
Attempt 7: harakiri
Still debating this one
Attempt 8: scripted 'merge' with mv
The destination dir had about 120k empty dirs (presumably the skeletons the interrupted rsync runs created ahead of the file data), so I ran
/home/data2/repo> find . -type d -empty -exec rmdir {} \;
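One caveat: a single pass only removes dirs that are already empty when find visits them, and since each repo dir holds 11 subdirs, the leftover skeletons are nested. A depth-first variant (GNU find; -delete would imply -depth too) clears parents and children in one pass:
/home/data2/repo> find . -depth -type d -empty -exec rmdir {} \;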
Ruby script:
SRC  = "/home/data/repo"
DEST = "/home/data2/repo"
MISSING = "/home/data/missing.tmp"

# Top-level listings, one name per line
`ls #{SRC} --color=never > lst1.tmp`
`ls #{DEST} --color=never > lst2.tmp`

# Lines starting with '<' are dirs present in SRC but missing from DEST
`diff lst1.tmp lst2.tmp | grep '^<' > #{MISSING}`

t = `wc -l < #{MISSING}`.to_i
puts "Todo: #{t}"

# mv each missing directory across (paths quoted in case of odd names)
File.open(MISSING).each do |line|
  dir = line.strip.sub(/^< /, '')
  puts `mv "#{SRC}/#{dir}" "#{DEST}/"`
end
DONE.
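For the record, the same diff-and-mv merge can be done in pure shell; a rough equivalent of the script above (assumes bash for process substitution and no whitespace in dir names; comm wants sorted input, which ls already provides):
/home/data> comm -23 <(ls repo --color=never) <(ls ../data2/repo --color=never) | while read d; do mv "repo/$d" ../data2/repo/; done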