On May 15, 2012, at 3:59 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Fri, 11 May 2012 12:41:16 -0700 "C.J. Adams-Collier KF7BMP"
> <cjac@xxxxxxxxxxxxxxx> wrote:
>
>> Hey all,
>>
>> I've got an array that seems to have failed while I was re-synchronizing
>> one of the disks. sde fell out when I moved six disks from one chassis
>> to another. I re-added it and it was 98.8% done with 300 minutes left
>> in the process when I went to sleep last night. When I woke up, the
>> array was in a FAILED state, sdg was marked failed and sde was marked
>> spare. I removed sdg from the array and re-booted and now the array
>> won't start.
>>
>> Is there a way to re-add sdg back in to slot 5 rather than having it
>> added as a spare? AFAICT, no writes have been made to sdg or md0 since
>> I removed it from the array, so it should be pretty close to its active
>> state. sde must be nearly ready to be added in as an active participant
>> in the array, too.
>>
>> Is there anything I can do to re-build the array at this point?
>>
>> Cheers,
>>
>> C.J.
>>
>
> From the "mdadm -E" you sent me separately:
>
> Version : 0.90.00
> Raid Level : raid5
> Used Dev Size : 972848128 (927.78 GiB 996.20 GB)
> Array Size : 4864240640 (4638.90 GiB 4980.98 GB)
> Raid Devices : 6
>
> and "grep this" shows:
>
> this 3 8 18 3 active sync /dev/sdb2
> this 4 8 34 4 active sync /dev/sdc2
> this 2 8 50 2 active sync /dev/sdd2
> this 6 8 66 6 spare /dev/sde2
> this 1 8 82 1 active sync /dev/sdf2
> this 6 8 98 6 spare /dev/sdg2
>
> "grep Events" shows:
>
> Events : 34795
> Events : 34795
> Events : 34795
> Events : 34795
> Events : 34795
> Events : 34794
>
> So you are missing device '0' and '5'.
>
> So presumably sdg reported an error before sde finished recovery, so
> sde remains a spare. I cannot see why "sdg" is marked as a spare though.
> It should still be marked as a member of the array. Maybe you tried to add
> it after removing it?
>
> What you need to do is decide which of 'e' and 'g' you trust most (probably
> g, but I don't know the full history) and which slot it should be in (0 or 5,
> you might be able to tell from a recent "RAID conf printout" in kernel logs).
> Then
> mdadm -S /dev/md0
> mdadm -C /dev/md0 -l5 -n6 -e 0.90 -c 64 /dev/sdg2 /dev/sdf2 /dev/sdd2 \
> /dev/sdb2 /dev/sdc2 missing
>
> The order of devices is important. This puts 'g2' in slot 0 and 'missing'
> in slot 5.
>
> Then 'fsck -n /dev/md0' or whatever is appropriate given what sort of data
> you have on md0. If that is happy, add the other device (g2 or e2) and let
> it recover.
>
> NeilBrown
Thanks a million. I really appreciate your help.
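
For anyone following along later, the slot bookkeeping in the quoted reply can be sketched in shell. This is only an illustration under stated assumptions: the "this" lines and sizes are copied from the output quoted above, and the names this_lines, slot_order and array_size_kib are hypothetical helpers, not part of mdadm. It shows only which slots no active device claims. Deciding whether sde2 or sdg2 goes into slot 0 or slot 5 still has to be done by hand, as described above.

```shell
#!/bin/sh
# "this" lines copied from the quoted "mdadm -E ... | grep this" output.
# Columns: this, Number, Major, Minor, RaidDevice (slot), State..., device.
this_lines='this 3 8 18 3 active sync /dev/sdb2
this 4 8 34 4 active sync /dev/sdc2
this 2 8 50 2 active sync /dev/sdd2
this 6 8 66 6 spare /dev/sde2
this 1 8 82 1 active sync /dev/sdf2
this 6 8 98 6 spare /dev/sdg2'

raid_devices=6            # "Raid Devices : 6"
used_dev_size=972848128   # "Used Dev Size", in KiB

# Hypothetical helper: print one entry per slot 0..raid_devices-1, either the
# device whose superblock claims that slot as "active" or the word "missing".
slot_order() {
    printf '%s\n' "$this_lines" | awk -v n="$raid_devices" '
        $6 == "active" { dev[$5] = $NF }
        END {
            for (s = 0; s < n; s++)
                printf "%s%s", ((s in dev) ? dev[s] : "missing"), (s < n - 1 ? " " : "\n")
        }'
}

# Hypothetical helper: RAID5 keeps one device worth of parity, so the usable
# size is (n - 1) * per-device size. Assumes the shell has 64-bit arithmetic.
array_size_kib() {
    echo $(( (raid_devices - 1) * used_dev_size ))
}

slot_order
array_size_kib
```

The first line of output lists slots 0 through 5 in order, with "missing" in the two slots (0 and 5) that no active device claims. The second line should agree with the "Array Size" reported above.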
Sent from my PDP-11
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html