I'm trying to do a large copy from my hard drive using dd, and I'm trying to figure out the best block size to use, which I would assume is the hardware block size for that drive.
6 Answers
The lsblk command is great for this:
lsblk -o NAME,PHY-SEC
The results:
NAME PHY-SEC
sda 512
├─sda1 512
├─sda2 512
└─sda5 512
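If your util-linux is recent enough to have the LOG-SEC column (an assumption on my part), it can also print the logical sector size alongside, which is useful on 512e drives that report 512 bytes logically but 4096 physically:
lsblk -o NAME,PHY-SEC,LOG-SEC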
Linux exposes the physical sector size in the file /sys/block/sdX/queue/physical_block_size. However, to get the best performance you should probably do a little testing with different sizes and measure. I could not find a clear answer saying that using exactly the physical block size would give the optimal result (although I assume it cannot be a bad choice).
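As a quick sketch (assuming the drive is /dev/sda; substitute your own device), you can read those sysfs values directly:
cat /sys/block/sda/queue/physical_block_size
cat /sys/block/sda/queue/logical_block_size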
$ sudo hdparm -I /dev/sda | grep -i physical
Physical Sector size: 4096 bytes
Mine isn't intended to be a complete answer, but I hope it also helps.
Here is a little something from http://mark.koli.ch/2009/05/howto-whole-disk-backups-with-dd-gzip-and-p7zip.html
3 - Determine the Appropriate Block Size
For a quicker backup, it can help to nail down the optimal block size of the disk device you are going to back up. Assuming you are going to back up /dev/sda, here's how you can use the fdisk command to determine the best block size:
rescuecd#/> /sbin/fdisk -l /dev/sda | grep Units
Units = cylinders of 16065 * 512 = 8225280 bytes
Note the fdisk output says "cylinders of 16065 * 512". This means that there are 512 bytes per block on the disk. You can significantly improve the speed of the backup by increasing the block size by a multiple of 2 to 4. In this case, an optimal block size might be 1k (512*2) or 2k (512*4). BTW, getting greedy and using a block size of 5k (512*10) or something excessive won't help; eventually the system will bottleneck at the device itself and you won't be able to squeeze out any additional performance from the backup process. (emphasis added)
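For illustration only (the output path is a placeholder, and bs=4k simply follows the multiple-of-the-sector-size reasoning above rather than anything prescribed by the article), the backup itself might then look something like:
dd if=/dev/sda bs=4k | gzip > /mnt/backup/sda.img.gz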
I suspect the difference in performance between a near-optimal and optimal block size for a given configuration is negligible unless the data set is enormous. Indeed, a user at FixUnix (post from 2007) claimed his optimal times were only 5% faster than the sub-optimal ones. Maybe you can squeeze a little more efficiency out by using a multiple of the "cluster" size or filesystem block size.
Of course, if you move too far to either side of the optimal block size you'll run into trouble.
The bottom line is you will likely gain only around 5% in performance (i.e. 3 minutes per hour) with the absolute optimal block size, so consider whether it is worth your time and effort to research further. As long as you stay away from extreme values, you should not suffer.
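If you do decide to measure on your own hardware, here is a rough sketch (the 256 MiB test size and the list of block sizes are arbitrary choices of mine; iflag=direct bypasses the page cache so the runs stay comparable):
for bs in 512 4096 65536 1048576; do
    echo "bs=$bs"
    # read 256 MiB from the raw device at each block size and keep dd's throughput line
    sudo dd if=/dev/sda of=/dev/null bs=$bs count=$(( 268435456 / bs )) iflag=direct 2>&1 | tail -1
done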
Each disk transfer generates an interrupt that the processor must handle. A typical 50 MB/s disk will generate about 100,000 of them per second at a 512-byte block size, while a normal processor can only handle tens of thousands of those. A bigger power-of-two block size, anywhere from 4k (the default filesystem block size on most systems) up to 64k (the old ISA DMA size), would therefore be more practical...
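To put rough numbers on that (a back-of-the-envelope sketch assuming a sustained 50 MB/s transfer rate):
echo $(( 50 * 1000 * 1000 / 512 ))      # ~98,000 requests per second at 512-byte blocks
echo $(( 50 * 1000 * 1000 / 65536 ))    # ~760 requests per second at 64k blocks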
Additionally, you can look through the output of lshw to verify the other results (and also because I don't seem to have hdparm available on my distro). This might help narrow it down:
sudo lshw | awk 'BEGIN {IGNORECASE=1;} /SCSI/,!//{print}'