For backing up my data I use rdiff-backup (2.0.5) on Ubuntu 20.10. The amount of data is only 127 GB, but it is spread across 80k files / 17k folders (mainly source code repositories and photos).
The problem I discovered is that rdiff-backup seems to be terribly slow when adding new files. I wrote a bash script to demonstrate this (see below).
What the script does is:
- Generate 1000 empty files
- Do initial backup
- Generate another 1000 empty files
- Do another backup
While the initial backup takes roughly 1 second, the run that adds 1000 new files takes 7 seconds (detailed results below). That does not sound like much, but with my real-life data I end up with several hours for very few new files.
What puzzles me is that only the "real" time seems to explode, while "user" and "sys" stay low. Does rdiff-backup get stuck waiting on something?
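As I understand it, a large gap between "real" and user+sys means the process is mostly waiting (on disk flushes, locks, or another process) rather than computing. A minimal illustration of that signature (not rdiff-backup itself, just `sleep` as a stand-in):

```shell
#!/bin/bash
# A process that waits instead of computing shows the same signature
# as the slow third run: large "real" time, almost no CPU time.
start=$(date +%s%N)                                  # wall clock, nanoseconds
sleep 1                                              # waits; burns no CPU
elapsed_ms=$(( ($(date +%s%N) - start) / 1000000 ))
echo "wall-clock: ${elapsed_ms} ms (CPU time: ~0)"
```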
I ran the example on an internal ext4 SATA SSD.
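One thing I could measure on that SSD (this is an assumption on my part, not something the script below tests) is the cost of flushing each new file to disk individually, since a tool that syncs per file would accumulate exactly this kind of wall-clock-only overhead:

```shell
#!/bin/bash
# Hedged sketch: time 100 create-plus-per-file-sync operations on the
# current filesystem. If per-file flushing is expensive here, it would
# show up as "real" time with almost no "user"/"sys" time.
mkdir -p fsync-test
start=$(date +%s%N)
for i in $(seq 1 100); do
    : > "fsync-test/$i.txt"        # create an empty file
    sync "fsync-test/$i.txt"       # flush just this file to disk (coreutils sync)
done
elapsed_ms=$(( ($(date +%s%N) - start) / 1000000 ))
echo "100 create+sync: ${elapsed_ms} ms"
rm -r fsync-test
```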
Script:
#!/bin/bash
mkdir -p src dest
files=1000
printf "Creating some dummy files.\n"
for (( i=1; i<=$files; i++ ))
do
touch "src/$i.txt"
done
printf "First run."
time rdiff-backup src/ dest/
printf "\n"
sleep 1
printf "Second run, nothing changed."
time rdiff-backup src/ dest/
printf "\n"
sleep 1
printf "Creating some more dummy files.\n"
for (( i=$files+1; i<=$files*2; i++ ))
do
touch "src/$i.txt"
done
printf "Third run, adding new files to backup."
time rdiff-backup src/ dest/
printf "\n"
sleep 1
printf "Fourth run, nothing changed."
time rdiff-backup src/ dest/
Output:
Creating some dummy files.
First run.
real 0m1,076s
user 0m0,869s
sys 0m0,157s
Second run, nothing changed.
real 0m0,511s
user 0m0,419s
sys 0m0,037s
Creating some more dummy files.
Third run, adding new files to backup.
real 0m7,460s <--- 7 times longer!
user 0m1,374s
sys 0m0,310s
Fourth run, nothing changed.
real 0m0,747s
user 0m0,645s
sys 0m0,053s
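In case it helps, here is a small helper I could use to log per-run wall-clock time programmatically instead of eyeballing `time` output (`true` stands in for the actual `rdiff-backup src/ dest/` call):

```shell
#!/bin/bash
# Log wall-clock milliseconds for each invocation of a command, so
# successive backup runs can be compared from a script.
timed() {
    start=$(date +%s%N)
    "$@" > /dev/null 2>&1
    elapsed_ms=$(( ($(date +%s%N) - start) / 1000000 ))
    echo "$1 took ${elapsed_ms} ms"
}

timed true    # stand-in for: timed rdiff-backup src/ dest/
```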