
I know that on ZFS, the only way to grow a RAID-Z (or RAID-Z2 or RAID-Z3) vdev is to replace the disks with larger ones; there is no way to change the geometry. However, is it possible to do so without degrading the array in the process?

As an example, suppose I have a RAID-Z array with 4 disks, 1×2 TB and 3×1 TB, which gives 3 TB of usable space (each member only contributes the capacity of the smallest disk). With all drives working, I buy 3 more 2 TB drives in order to grow the array to 6 TB usable. If I remove and replace each of the 1 TB disks with a 2 TB disk, one at a time, forcing a resilver each time, then I'd be unnecessarily and repeatedly degrading the array and increasing the risk of failure in the process.

What I hope is possible is to mirror each disk before replacing it, i.e., add the first 2 TB drive to a spare bay, mirror the first 1 TB disk onto it, then remove the 1 TB and put the 2 TB in the removed drive's slot; then repeat for the 2nd and 3rd 1 TB disks. This could potentially even obviate the need to resilver - or to recalculate parity, anyway.

Is such a thing possible?

cp.engr
  • 242

1 Answer


Your proposed solution is possible, but there are some substantial drawbacks:

  • You cannot write anything to the pool (zpool import -o readonly=on ${YOUR_ZFS_POOL}) while a disk clone is in progress.
  • You have to export the pool (zpool export ${YOUR_ZFS_POOL}) for each disk you switch out.
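
To make those drawbacks concrete, the cloning step itself might look roughly like this; this is only a sketch that assumes dd as the block-copy tool and reuses the placeholder names above:

    # Re-import the pool read-only so nothing changes while the clone runs
    zpool export ${YOUR_ZFS_POOL}
    zpool import -o readonly=on ${YOUR_ZFS_POOL}

    # Bit-for-bit copy of the old member disk onto the new, larger disk
    # (dd is only one option; exact flags differ between GNU and BSD dd)
    dd if=${OLD_DISK_DEVICE} of=${NEW_DISK_DEVICE} bs=1M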

After cloning a vdev disk, you must:

  1. Export the pool (zpool export ${YOUR_ZFS_POOL}).
  2. Zap (zpool labelclear ${OLD_DISK_DEVICE}) or physically remove the old disk.
    Warning: There's no undo if you zap the disk.
  3. If necessary, grow the vdev partition on the new disk.
  4. Import the pool (zpool import ${YOUR_ZFS_POOL}).
    Warning: There's no undo after this point. The old disk can no longer be online'd in the same pool.
  5. Expand the disk (zpool online -e ${YOUR_ZFS_POOL} ${NEW_DISK_VDEV}).

Once you've repeated these steps for each disk you are replacing, the new capacity should take effect.
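
As a rough illustration, the post-clone sequence for a single disk could look like this; the partition-resizing command is only a placeholder, since the right tool depends on your platform:

    # 1. Export the pool so no device is in use
    zpool export ${YOUR_ZFS_POOL}

    # 2. Wipe the ZFS label from the old disk (irreversible), or just pull the disk
    #    (add -f if labelclear refuses because the label belongs to an exported pool)
    zpool labelclear ${OLD_DISK_DEVICE}

    # 3. If necessary, grow the vdev partition on the new disk, e.g. with
    #    parted, gpart, or gdisk depending on your OS
    # parted ${NEW_DISK_DEVICE} resizepart 1 100%

    # 4. Import the pool again; after this the old disk can't rejoin it
    zpool import ${YOUR_ZFS_POOL}

    # 5. Let ZFS expand onto the now-larger device
    zpool online -e ${YOUR_ZFS_POOL} ${NEW_DISK_VDEV}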


ZFS's built-in replacement feature is designed to avoid all of the extra complexity described above. If you are afraid of corrupting your RAID-Z zpool during a resilver, maybe you should:

  • have created the pool with more redundancy (RAID-Z2, RAID-Z3), or
  • back up your pool somewhere else while growing your array.
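
If you can find anywhere at all to put a copy, a one-off replication is straightforward; here is a minimal sketch with zfs send/receive, assuming a second pool named backup:

    # Snapshot the whole pool recursively before touching any hardware
    zfs snapshot -r ${YOUR_ZFS_POOL}@pre-expand

    # Replicate all datasets and snapshots to the backup pool (left unmounted on arrival)
    zfs send -R ${YOUR_ZFS_POOL}@pre-expand | zfs receive -Fdu backup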

Besides, if a disk actually failed, you would have to sit through a degraded resilver anyway; would you still be content with RAID-Z then?

If you are tight on budget and really can't afford anywhere to back up your datasets, it is indeed safer to execute your plan, because the clones would be transactional, all-or-nothing operations. But remember that your pool will be read-only or offline for a while, and you also run the risk of making a mistake during the post-cloning steps.

Using zpool replace would:

  • be faster*, because only the used storage would be copied, instead of the whole block device
    * if you have the sequential scrub/resilver feature
  • have no downtime, because the pool would be fully operational while the resilver is happening
  • minimize human error, because you'd be using the workflow ZFS expects for replacing a disk
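
For comparison, the standard workflow is one command per disk, repeated after each resilver finishes (same placeholder names as above):

    # Swap one 1 TB member for a 2 TB disk; the pool stays online and writable
    zpool replace ${YOUR_ZFS_POOL} ${OLD_DISK_DEVICE} ${NEW_DISK_DEVICE}

    # Watch the resilver; start the next replacement only after it completes
    zpool status ${YOUR_ZFS_POOL}

    # Once every member has been upgraded, let the vdev grow into the new space
    # (or set autoexpand=on on the pool before replacing)
    zpool online -e ${YOUR_ZFS_POOL} ${NEW_DISK_DEVICE}
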
Deltik
  • 19,971