Re: recovering from raid5 corruption

On Sun, 29 Apr 2012 20:46:36 -0400 Shaya Potter <spotter@xxxxxxxxx> wrote:

> On 04/29/2012 07:45 PM, NeilBrown wrote:
> > On Sun, 29 Apr 2012 19:29:10 -0400 Shaya Potter<spotter@xxxxxxxxx>  wrote:
> >
> >> On 04/29/2012 06:52 PM, NeilBrown wrote:
> >>>
> >>> You've written a new superblock 4K into each device, where previously
> >>> there was something else.  So you have probably corrupted something,
> >>> though we cannot easily tell what.
> >>>
> >>> Retry your experiment with --metadata=0.90.  Hopefully one of those
> >>> combinations will work better.  If it does, make a backup of the data you
> >>> want to keep, then I would suggest rebuilding the array from scratch.
> >>
> >> ok, thanks, that was a huge help.
> >>
> >> I have it set up correctly now (obvious due to the fact that I can read
> >> the LVM configuration without any gibberish when ordered correctly).
> >
> > I should add that this only proves that you have the first device correct,
> > the rest may be wrong.
> > You need to activate the LVM, then look at the filesystem and see if it is
> > consistent before you can be sure that all devices are in the correct
> > position.
> 
> this cheat sheet came in handy
> 
> http://www.datadisk.co.uk/html_docs/redhat/rh_lvm.htm
> 
> did the method at the bottom "corrupt LVM metadata but replacing the 
> faulty disk"
> 
> copy/pasted the config file out of the beginning of the device.
> 
> pvcreate --uuid <uuid for pv0, from config file> /dev/md0
> vgcfgrestore -f <config file> <vg name>
> vgchange -a y <vg name>
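
The three steps above can be sketched as a dry run that only prints the commands, so nothing is touched until the values have been checked. This is a minimal sketch, not the poster's exact procedure: the UUID, config file path, and volume group name are hypothetical placeholders, and only /dev/md0 comes from the message. It assumes that vgcfgrestore and vgchange are given the volume group name, and it adds --restorefile, which pvcreate expects alongside --uuid but which the message does not show.

```shell
#!/bin/sh
# Dry run: prints the LVM recovery commands instead of executing them.
# PV_UUID, CFG_FILE and VG_NAME are hypothetical placeholders.
PV_UUID="REPLACE-WITH-PV-UUID-FROM-CONFIG"
CFG_FILE="/root/vg-recovered.cfg"
VG_NAME="vg0"
PV_DEV="/dev/md0"

lvm_restore_plan() {
    # 1. Recreate the PV label with its old UUID
    #    (--restorefile is an addition, not shown in the original message).
    echo "pvcreate --uuid $PV_UUID --restorefile $CFG_FILE $PV_DEV"
    # 2. Restore the volume group metadata from the saved config text.
    echo "vgcfgrestore -f $CFG_FILE $VG_NAME"
    # 3. Activate the logical volumes in the volume group.
    echo "vgchange -a y $VG_NAME"
}

lvm_restore_plan
```

Review the printed commands against the recovered config file before running any of them by hand as root.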
> 
> some cursory testing of large contiguous files that have checksumming 
> built in seems to indicate that they are all ok.  probably have other 
> corruption due to the md 0.90 to 1.2 metadata booboo, but if that's 
> only 16k-20k (4k * 4 or 5 disks) spread out over 3TB of data, I'm very 
> happy :)  and it's mostly family photo data, so not the biggest deal if 
> the large majority is ok.
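
To put that estimate in proportion, the arithmetic can be sketched as below. It takes the poster's figure of one 4 KiB overwrite per member disk at face value and reads "3tb" as 3 * 10^12 bytes; both are assumptions, and the real damage depends on what was stored at those offsets.

```shell
#!/bin/sh
# Rough scale of the damage, using the poster's own estimate.
BLOCK_BYTES=4096            # one 4 KiB overwrite per member disk (poster's figure)
ARRAY_BYTES=3000000000000   # "3tb" read as 3 * 10^12 bytes (assumption)

damage_bytes() {
    # $1 = number of member disks
    echo $(( BLOCK_BYTES * $1 ))
}

damage_ratio() {
    # Prints N such that roughly 1 byte in N is affected (integer division).
    echo $(( ARRAY_BYTES / (BLOCK_BYTES * $1) ))
}

for disks in 4 5; do
    echo "$disks disks: $(damage_bytes "$disks") bytes, about 1 in $(damage_ratio "$disks")"
done
```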
> 
> <shew> relieved.

Excellent.  Thanks for keeping us informed.

If you were running Linux 3.3.1, 3.3.2, or 3.3.3 when this happened, then I
know what caused it and suggest upgrading to 3.3.4.

NeilBrown
