I am trying to transfer the data on a set of 99% full 4TB RAID1 BTRFS drives to a pair of 12TB RAID1 drives, also using BTRFS. Unfortunately, the vast majority of the data is millions of relatively small files (KB to low MB), each of which would have to be accessed individually with random reads/writes if the copy were done with common file tools (BTRFS snapshot, cp, rsync).
The drives I am copying from have no partition table; the whole device is allocated to the BTRFS filesystem. Source and destination are not encrypted. The filesystem is ZSTD-compressed, and the data has an uncompressed size of roughly 6TB.
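For reference, the compressed vs. uncompressed footprint of a BTRFS volume can be checked with the compsize tool (assuming it is installed; the mount point below is a placeholder):

```bash
# Report on-disk (compressed) vs. referenced (uncompressed) sizes.
# /mnt/source is a placeholder for wherever the old array is mounted.
sudo compsize /mnt/source
```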
Here are the rough numbers I have gathered from initial attempts at how long each approach would take:
- cp: 34 days
- rsync: ? -> likely slower than cp due to duplicate checking
- copy with tar: ? -> likely faster than cp, but not much improvement over 34 days (see the pipe sketch after this list)
- building a compressed archive on the new drives (can't do it on the original drives due to lack of space): 413 days
- BTRFS snapshot cloning/btrfs-clone: 9-14 days
- dd raw clone: 7 hours
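By "copy with tar" I mean a pipe between two tar processes rather than building an archive on disk, something like the following (mount points are placeholders; untested):

```bash
# Stream the source tree straight into an extracting tar on the destination,
# avoiding any intermediate archive file.
# /mnt/source and /mnt/dest are hypothetical mount points.
tar -C /mnt/source -cf - . | tar -C /mnt/dest -xpf -
```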
The obvious choice for speed is dd: it copies sequentially, at a rate hundreds of times faster than standard file-by-file copying with its random access pattern.
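A raw clone of this kind would look something like the following; /dev/sdX and /dev/sdY are placeholder device names, and the filesystem must be unmounted first:

```bash
# Raw sequential block copy from the old drive to the new drive.
# /dev/sdX = source device, /dev/sdY = destination device (placeholders).
# bs=64M keeps the copy in large sequential chunks; conv=fsync flushes at the end.
sudo dd if=/dev/sdX of=/dev/sdY bs=64M status=progress conv=fsync
```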
However, it causes UUID problems that I haven't been able to find a solution for:
To use this method, the clone's UUID and UUID_SUB values both need to change, but only the UUID is changeable with btrfstune -u. I can see that the clone still has the same label and UUID_SUB as the original drive it was cloned from, so I am too scared to mount it: the BTRFS documentation says mounting two filesystems with the same UUID will corrupt both, and I have no backups beyond the drives as they are.
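For anyone reproducing this, the identifiers can be inspected and the filesystem UUID regenerated as follows (device names are placeholders; btrfstune must be run on an unmounted filesystem):

```bash
# Show the UUID and UUID_SUB of the original and the clone (placeholder names).
sudo blkid /dev/sdX /dev/sdY

# Regenerate a new random filesystem UUID on the clone.
# This rewrites every metadata block, so it takes a while,
# and it does NOT change UUID_SUB.
sudo btrfstune -u /dev/sdY
```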
Is there a way to copy the data faster, either by using some BTRFS duplication magic or by making the dd clone safely mountable?