On Fri, Mar 30, 2018 at 09:21:00AM +0200, Menion wrote:
> Thanks for the detailed explanation. I think that a summary of this
> should go in the btrfs raid56 wiki status page, because now it is
> completely inconsistent and if a user comes there, he may get the
> impression that the raid56 is just broken.
> Still I have the 1 billion dollar question: from your words I understand
> that even in RAID56 the metadata are spread on the devices in a complex
> way, but shall I assume that the array can survive the sudden death
> of one (two for raid6) HDD in the array?

I wouldn't assume that. There is still the write hole. The probability
of a write hole failure on any single write is small, but it applies to
*every* write made in degraded mode, and since disks can fail at any
time, the array can enter degraded mode at any time. It's similar to
lottery tickets: buy one ticket and you probably won't win, but buy
millions of tickets and you'll claim the prize eventually. The "prize"
in this case is a severely damaged, possibly unrecoverable filesystem.

If the data is raid5 and the metadata is raid1, the filesystem can
survive a single disk failure easily (see the command sketch at the end
of this mail); however, some of the data may be lost if writes to the
remaining disks are interrupted by a system crash or power failure and
the write hole issue occurs. Note that the damage is not necessarily
limited to recently written data: any data that merely happens to be
located adjacent to written data on the filesystem can be affected.

I wouldn't use raid6 until the write hole issue is resolved. There is
no configuration where two disks can fail and metadata can still be
updated reliably.

Some users use the 'ssd_spread' mount option to reduce the probability
of write hole failure. It happens to be helpful by accident on some
array configurations, but it carries a fairly high cost when the array
is not degraded, due to all the extra balancing required.

> Bye
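
For reference, a sketch of how to get the raid5-data/raid1-metadata
layout discussed above (device names and the mount point here are
placeholders, substitute your own):

    # Create a new filesystem with raid5 data and raid1 metadata
    mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd

    # Or convert the profiles of an existing mounted filesystem in place
    btrfs balance start -dconvert=raid5 -mconvert=raid1 /mnt

    # Mounting with ssd_spread, as mentioned above; whether it actually
    # helps depends on the array configuration
    mount -o ssd_spread /dev/sdb /mnt

The conversion balance rewrites every block group, so expect it to take
a long time on a large array.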