Re: Btrfs scrub failure for raid 6 kernel 4.3

Christoph Anton Mitterer posted on Mon, 28 Dec 2015 02:31:16 +0100 as
excerpted:

> On Sun, 2015-12-27 at 18:23 -0700, Chris Murphy wrote:
>> I'd want scrub to immediately fail in a degraded case, because the
>> higher workload added by the scrub itself could cause additional device
>> failures sooner. And that would negatively impact the ability to get
>> the array healthy again with a rebuild - during which time you wouldn't
>> want the scrub running anyway.
> I think that should be the default behaviour, yes.
> 
> Probably there should be a --force like override switch,... again take
> my example of classic RAID1 with say 6 disks... (which is right now not
> possible in btrfs, I know).
> Now one fails... so it's degraded, but you still have 5 left and you're
> probably far away from complete loss. OTOH you still may want to scrub
> your data during that time (e.g. to catch silent block errors),
> in that specific scenario.

Scrub needs to be able to run on a degraded array (possibly with a 
--force switch, I've no real opinion on that), for a number of reasons:

1) As mentioned up-thread, btrfs scrub isn't like others; we have 
checksums, and being able to do a global checksum-verify via scrub, even 
on a degraded array where repair may not be possible, is a legitimate use.

2) With raid6 degraded by loss of only a single device, repair should 
still be possible (and checksums should counteract the 
partial-stripe-write hole that parity normally risks: verify the 
"repair", and don't write it if it still fails its checksum).

3) With the coming N-way-mirroring, repair from a good copy should still 
be possible even when an N-way mirror is degraded down to two-way, and 
indeed, that'd be the biggest reason to run N-way-mirroring with N>2 in 
the first place.
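
To make the verify-before-write idea in point 2 concrete, here is a rough 
sketch, not btrfs code. It assumes a single missing data block rebuilt 
from simple XOR (P) parity, and uses zlib.crc32 as a stand-in for the 
filesystem's block checksum. The names xor_blocks and repair_block are 
made up for illustration.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def repair_block(surviving, parity, expected_csum):
    """Rebuild the one missing block from the surviving blocks plus parity.

    Returns the rebuilt block only if it matches the stored checksum.
    Returns None otherwise, so the caller never writes a bad "repair"
    (e.g. one computed from parity left stale by a partial stripe write).
    """
    candidate = xor_blocks(list(surviving) + [parity])
    # zlib.crc32 is only a stand-in checksum for this sketch.
    if zlib.crc32(candidate) != expected_csum:
        return None
    return candidate
```

With consistent parity the missing block comes back and passes the 
checksum; with stale parity the candidate fails the checksum and nothing 
gets written.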


But regardless, agreed with everyone, simply crashing must be seen as a 
bug.  If it's not going to scrub correctly, it should exit normally but 
with an error status and printout to STDERR, not crash.
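
As a sketch of what "exit normally but with an error status" could look 
like, under assumptions: scrub_entry is a hypothetical function, and the 
force parameter mirrors the --force switch proposed in this thread, which 
is not an existing option.

```python
import sys


def scrub_entry(degraded: bool, force: bool = False, err=sys.stderr) -> int:
    """Return a process exit status instead of crashing.

    Hypothetical policy: refuse a degraded array unless forced, report
    the reason on stderr, and signal failure with a nonzero status.
    """
    if degraded and not force:
        print("scrub: array is degraded; refusing to run without --force",
              file=err)
        return 1
    # The real scrub work would go here; 0 means it completed cleanly.
    return 0
```

A caller would pass the return value to sys.exit(), so scripts can test 
the status rather than parse a crash.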

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
