- To: linux-raid@xxxxxxxxxxxxxxx
- Subject: Raid5 crashed, need comments on possible repair solution
- From: Christoph Nelles <evilazrael@xxxxxxxxxxxxx>
- Date: Mon, 23 Apr 2012 15:56:16 +0200
- Openpgp: url=http://evilazrael.net/pgp.txt
- User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2.28) Gecko/20120306 Lightning/1.0b2 Thunderbird/3.1.20
Hi,
Linux RAID has worked fine for me over the last few years, but yesterday,
while I was reorganizing the hardware in my server, the RAID5 crashed. It
was a software RAID level 5 with 6x 3TB drives, with XFS on top of it. I
have no idea why it crashed, but now all superblocks are invalid (one
dump follows) and sadly I have no information on the RAID disk layout
(the sequence the drives were in). All drives from the RAID are
available and running.
As I cannot afford to buy six more drives to make a backup before
trying to fix the situation, I need a non-destructive approach to fix
the RAID configuration and the superblocks.
From my understanding of the RAID5 implementation, the correct order of
drives is important.
First question:
1) Am I right that the order is important and that I have to find the
right sequence of drives?
So I would create a loop over all permutations of the drive list and, for
each permutation:
- Scrub the superblocks: mdadm --zero-superblock /dev/sd[bcdefg]1
- Recreate the RAID5: mdadm --create /dev/md0 -c 64 -l 5 -n 6 --assume-clean <drive permutation>
- Run xfs_check to see if it recognizes the FS: xfs_check -s /dev/md0
- Stop the RAID: mdadm --stop /dev/md0
2) Is that a promising approach to repair the RAID5 array?
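The loop described above could be sketched roughly as follows. This is only a dry run under stated assumptions: it builds and prints the command lines and never executes them. The six partition names are the expansion of /dev/sd[bcdefg]1, and the chunk size of 64 is taken from the command above and is itself a guess. The helper names (commands_for, all_plans) are made up for this illustration. With six drives there are 6! = 720 orders to try.

```python
from itertools import permutations

# Assumed member partitions: the expansion of /dev/sd[bcdefg]1 from the post.
DRIVES = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1",
          "/dev/sde1", "/dev/sdf1", "/dev/sdg1"]
ARRAY = "/dev/md0"


def commands_for(order):
    """Return the four commands from the list above for one drive order.

    Nothing is executed here; each command is just a list of arguments.
    """
    return [
        ["mdadm", "--zero-superblock", *order],
        ["mdadm", "--create", ARRAY, "-c", "64", "-l", "5",
         "-n", str(len(order)), "--assume-clean", *order],
        ["xfs_check", "-s", ARRAY],
        ["mdadm", "--stop", ARRAY],
    ]


def all_plans(drives=DRIVES):
    """Yield one command plan per permutation of the drive list."""
    for order in permutations(drives):
        yield commands_for(order)


if __name__ == "__main__":
    plans = list(all_plans())
    print(f"{len(plans)} drive orders to try")
    # Show only the first plan as a sample of what would be run.
    for cmd in plans[0]:
        print(" ".join(cmd))
```

Keeping the sketch as a dry run is deliberate: the generated commands rewrite the superblocks on every pass, so they should be reviewed by hand before anything is actually run against the real drives.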
3) According to the man page, --assume-clean means that no data is affected
unless you write to the array, so this effectively prevents a rebuild?
This is important to me, as I don't want to trigger a rebuild, which
would certainly send my data to hell.
4) Any other ideas for repairing the RAID without losing user data?
Thanks in advance for any answers.
Currently the RAID superblocks on each device look like this:
/dev/sdg1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 53a294b5:975244fc:343b0f94:16652fce
Name : grml:0
Creation Time : Fri Apr 15 20:55:52 2011
Raid Level : -unknown-
Raid Devices : 0
Avail Dev Size : 5860529039 (2794.52 GiB 3000.59 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 9688dc72:02140045:c16a2123:4f6cc006
Update Time : Sun Apr 22 23:56:14 2012
Checksum : 350d8d74 - correct
Events : 1
Device Role : spare
Array State : ('A' == active, '.' == missing)
Interestingly, at the Update Time the system should already have been shut down:
Apr 22 23:55:55 router init: Switching to runlevel: 0
[...]
Apr 22 23:56:03 router exiting on signal 15
Apr 22 23:59:21 router syslogd 1.5.0: restart.
I really have no clue what happened.
Regards
Christoph Nelles
--
Christoph Nelles
E-Mail : evilazrael@xxxxxxxxxxxxx
Jabber : eazrael@xxxxxxxxxxxxxx ICQ : 78819723
PGP-Key : ID 0x424FB55B on subkeys.pgp.net
or http://evilazrael.net/pgp.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html