This question is related to Managing raid6 device with pacemaker, but nothing in that linked question is relevant here, except the fact that Pacemaker is involved.
For the purpose of this question, I will assume the array is not distributed, but is still managed by Pacemaker. I created a raid6 array consisting of 4 devices. Pacemaker manages both the array and every device that is part of it, all as resources. I added the proper constraints so that the array is not assembled until at least 2 of its devices are started and available.
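For illustration only, a simplified sketch of the kind of pcs configuration involved (the resource names are placeholders, and this does not reproduce my exact "at least 2 devices" rule, which needs extra constraints):

# hypothetical sketch -- resource names are placeholders, not my real config
pcs resource create md-disk1 ocf:heartbeat:Raid1 \
    raidconf=/etc/mdadm.conf raiddev=/dev/md/disk1
# order the array after its member devices; require-all=false relaxes the
# dependency so the array does not wait for every single member
pcs constraint order set dev-ceph01-disk1 dev-ceph02-disk1 \
    dev-ceph03-disk1 dev-ceph04-disk1 sequential=false require-all=false \
    set md-disk1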
Trouble comes when one of these devices fails. Starting from a fully clean array, if one of the devices fails, or if any of its own dependencies does, Pacemaker deactivates the corresponding device via its resource agent. However, the device itself cannot be shut off while the active raid6 array is using it. To get around this, I configured Pacemaker so that it first fails the device in its array, then removes it. Since the device is then no longer in use, Pacemaker can deactivate the resource and shut the device off.
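Done by hand, that fail-then-remove step would be the equivalent of the following (the device path here is just an example):

# fail the member in the array, then remove it so it is no longer in use
mdadm /dev/md/disk1 --fail   /dev/cluster/ceph02-disk1
mdadm /dev/md/disk1 --remove /dev/cluster/ceph02-disk1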
In one of my tests, I suspect Pacemaker failed and removed too many devices (3 out of 4) and shut the array down. Now, mdadm refuses to assemble the array again:
[root@ceph01 ~]# mdadm --assemble /dev/md/disk1 /dev/cluster/ceph0*-disk1
mdadm: /dev/md/disk1 assembled from 1 drive and 3 spares - not enough to start the array.
[root@ceph01 ~]#
If I examine each device, I find that 3 of them are marked as spares, and one as active:
[root@ceph01 ~]# mdadm --examine /dev/cluster/ceph01-disk1
/dev/cluster/ceph01-disk1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2929c57b:b3a7c8b6:36cb160e:b691c7a4
Name : ceph02:disk1
Creation Time : Wed May 10 16:19:13 2023
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 7812667392 sectors (3.64 TiB 4.00 TB)
Array Size : 7812667392 KiB (7.28 TiB 8.00 TB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=0 sectors
State : clean
Device UUID : a83c4b13:338d9a52:54863e4b:6cb96cd3
Internal Bitmap : 8 sectors from superblock
Update Time : Thu Jun 1 13:19:35 2023
Bad Block Log : 512 entries available at offset 24 sectors
Checksum : 7d767571 - correct
Events : 175
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : .... ('A' == active, '.' == missing, 'R' == replacing)
[root@ceph01 ~]# mdadm --examine /dev/cluster/ceph02-disk1
/dev/cluster/ceph02-disk1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2929c57b:b3a7c8b6:36cb160e:b691c7a4
Name : ceph02:disk1
Creation Time : Wed May 10 16:19:13 2023
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 7812667392 sectors (3.64 TiB 4.00 TB)
Array Size : 7812667392 KiB (7.28 TiB 8.00 TB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=0 sectors
State : clean
Device UUID : 6196eccc:bd4d6f31:b7f12e0a:45482818
Internal Bitmap : 8 sectors from superblock
Update Time : Thu Jun 1 13:19:35 2023
Bad Block Log : 512 entries available at offset 24 sectors
Checksum : 199889f1 - correct
Events : 175
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : .... ('A' == active, '.' == missing, 'R' == replacing)
[root@ceph01 ~]# mdadm --examine /dev/cluster/ceph03-disk1
/dev/cluster/ceph03-disk1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2929c57b:b3a7c8b6:36cb160e:b691c7a4
Name : ceph02:disk1
Creation Time : Wed May 10 16:19:13 2023
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 7812667392 sectors (3.64 TiB 4.00 TB)
Array Size : 7812667392 KiB (7.28 TiB 8.00 TB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=0 sectors
State : clean
Device UUID : f85b705c:f5b270bf:a937406f:528916df
Internal Bitmap : 8 sectors from superblock
Update Time : Thu Jun 1 13:19:35 2023
Bad Block Log : 512 entries available at offset 24 sectors
Checksum : 631d3bc1 - correct
Events : 175
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : .... ('A' == active, '.' == missing, 'R' == replacing)
[root@ceph01 ~]# mdadm --examine /dev/cluster/ceph04-disk1
/dev/cluster/ceph04-disk1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2929c57b:b3a7c8b6:36cb160e:b691c7a4
Name : ceph02:disk1
Creation Time : Wed May 10 16:19:13 2023
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 7812667392 sectors (3.64 TiB 4.00 TB)
Array Size : 7812667392 KiB (7.28 TiB 8.00 TB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=0 sectors
State : clean
Device UUID : 8e577afd:eaafa63b:fda2a699:5d55ed8b
Internal Bitmap : 8 sectors from superblock
Update Time : Thu Jun 1 13:18:40 2023
Bad Block Log : 512 entries available at offset 24 sectors
Checksum : 579e6b6b - correct
Events : 166
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : ...A ('A' == active, '.' == missing, 'R' == replacing)
[root@ceph01 ~]#
Having only 1 active device is not enough to start the raid6 array, since it means 3 devices are missing and raid6 can only handle 2 failures. I can also see that the 3 devices marked as spare all have the same Events counter (175), which is strictly higher than that of the one active device (166). I also know that devices 1, 2 and 3 occupied slots 0, 1 and 2 in the array, respectively, when they were part of it; however, mdadm --examine does not show that.
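For reference, those fields can be compared across all members at a glance with something like:

mdadm --examine /dev/cluster/ceph0*-disk1 | grep -E 'Events|Device Role|Update Time'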
Since I have 3 disks out of 4, each of which occupied a different slot of the same 4-device raid6 array, and since all 3 of them are clean and share the same Events counter value, I think it should be possible to re-assemble my array using them, but I cannot figure out how.
I tried using only the spare devices and the --run flag to re-assemble the array:
[root@ceph01 ~]# mdadm --assemble /dev/md/disk1 /dev/cluster/ceph0{1..3}-disk1 --run
mdadm: failed to RUN_ARRAY /dev/md/disk1: Invalid argument
mdadm: Not enough devices to start the array.
[root@ceph01 ~]#
I tried the same command as above, but with the --force flag added:
[root@ceph01 ~]# mdadm --assemble --force /dev/md/disk1 /dev/cluster/ceph0{1..3}-disk1 --run
mdadm: failed to RUN_ARRAY /dev/md/disk1: Invalid argument
mdadm: Not enough devices to start the array.
[root@ceph01 ~]#
I tried letting mdadm discover the devices it can use, pointing it only at the UUID of the array I want to assemble, with the --run flag, the --force flag, or both:
[root@ceph01 ~]# mdadm --assemble --scan --uuid=2929c57b:b3a7c8b6:36cb160e:b691c7a4 --run
mdadm: failed to RUN_ARRAY /dev/md/disk1: Input/output error
mdadm: Not enough devices to start the array.
mdadm: No arrays found in config file or automatically
[root@ceph01 ~]# mdadm --assemble --scan --uuid=2929c57b:b3a7c8b6:36cb160e:b691c7a4 --force
mdadm: /dev/md/disk1 assembled from 1 drive and 3 spares - not enough to start the array.
mdadm: No arrays found in config file or automatically
[root@ceph01 ~]# mdadm --assemble --scan --uuid=2929c57b:b3a7c8b6:36cb160e:b691c7a4 --force --run
mdadm: failed to RUN_ARRAY /dev/md/disk1: Input/output error
mdadm: Not enough devices to start the array.
mdadm: No arrays found in config file or automatically
[root@ceph01 ~]#
Nothing has worked so far, and I am out of ideas and out of suggestions to find on the web. Is there any way to get my array back?