I have an HP MicroServer at home for personal use (hosting music, pictures and videos that can be accessed from any device in the house), running Ubuntu Server with a software RAID. The issues started last week when, I believe, it lost power during the shutdown process, and now it is failing to boot.
A friend who is no longer around set this up for me years ago, so I am attempting to muddle through and sort it out on my own at the moment. There are four 2TB hard disks running as RAID 5 (or 6, I can't remember), giving me 6TB of usable storage across the disks.
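(Thinking about it, if I really do have 6TB usable from four 2TB disks, the arithmetic only works for RAID 5: (4 - 1) x 2TB = 6TB. RAID 6 would only give (4 - 2) x 2TB = 4TB, so I'm fairly sure it's RAID 5.)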
md/raid:md0: device sda3 operational as raid disk 0
md/raid:md0: device sdd1 operational as raid disk 3
md/raid:md0: device sdc1 operational as raid disk 2
md/raid:md0: allocated 0kB
md/raid:md0: cannot start dirty degraded array.
md/raid:md0: failed to run raid set.
md: pers->run() failed
mdadm: failed to start array /dev/md/0: Input/output error
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
mdadm: /dev/md/0 is already in use.
Could not start RAID arrays in degraded mode. Gave up waiting for root device. Common problems:
- Boot args (cat /proc/cmdline)
- Check rootdelay= (did the system wait long enough?)
- Check root= (did the system wait for the right device?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/1eb18515-c1e0-4f77-92ec-0e22d94e4803 does not exist. Dropping to a shell!
BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1.4) built-in shell (ash)
Enter 'help' for a list of built-in commands.
(initramfs)_
I found some other posts such as these:
https://forums.debian.net//viewtopic.php?f=10&t=142536
https://bbs.archlinux.org/viewtopic.php?id=193302
https://ubuntuforums.org/showthread.php?t=854528
Restart degraded RAID array after crash
I attempted some of the suggestions from those. Here are the details of the disks:
State : active, degraded, Not Started
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 0 0 1 removed
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
I'm assuming that the 'removed' disk is the one that's reserved in case a disk breaks, and that it would take over in that situation. Or is that disk potentially the cause of the problem? I have removed each disk in turn and made sure they are all connected properly.
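From what I've read, I think I could check what each disk's RAID superblock reports (including the one showing as removed) with something like the following from the initramfs shell. I haven't run it yet, and I'm only assuming the partition on the missing disk is /dev/sdb1 to match the naming of the others:
mdadm --examine /dev/sda3 /dev/sdb1 /dev/sdc1 /dev/sdd1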
When trying this command:
echo "clean" > /sys/block/md0/md/array_state
It returns this error:
/bin/sh: can't create /sys/block/md0/md/array_state: Permission denied
Trying to force reassemble the array results in:
mdadm --assemble --force /dev/md0 /dev/sda3 /dev/sdc1 /dev/sdd1
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
mdadm: /dev/sda3 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: /dev/sdd1 is busy - skipping
Though I'm wondering: should my assemble command also include /dev/sdb1?
One of the other forum posts mentions that the devices show as busy because they are already part of an array (https://bbs.archlinux.org/viewtopic.php?pid=1500593#p1500593).
Is it safe for me to run mdadm --stop /dev/md0
and then try the assemble command again, but I presume also adding the disk that is currently showing as removed?
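In other words, I'm considering something like this, though I haven't run it yet and I'm only guessing that the missing partition is /dev/sdb1:
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda3 /dev/sdb1 /dev/sdc1 /dev/sdd1
But I don't know whether it's better to include /dev/sdb1 or to leave it out and let the array start with just the three working disks.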
I don't want to go any further without some extra guidance, as I don't really understand what will happen. I've never dealt with RAID on my own, and I don't want to mess things up any further; hopefully I can still recover the data that's on it.
Any help and advice is greatly appreciated.