Re: [PATCH] btrfs: raid56: Use correct stolen pages to calculate P/Q

On Tue, Nov 29, 2016 at 02:52:47AM +0100, Christoph Anton Mitterer wrote:
> On Mon, 2016-11-28 at 16:48 -0500, Zygo Blaxell wrote:
> > If a drive's embedded controller RAM fails, you get corruption on the
> > majority of reads from a single disk, and most writes will be corrupted
> > (even if they were not before).
> 
> Administrating a multi-PiB Tier-2 for the LHC Computing Grid with quite
> a number of disks for nearly 10 years now, I'd have never stumbled on
> such a case of breakage so far...
> 
> Actually most cases are as simple as HDD fails to work and this is
> properly signalled to the controller.

I administer no real storage at this time, and have only 16 disks (plus a
few disk-likes) to my name right now.  Yet in a span of about two months
I've seen three cases of silent data corruption:

* a RasPi I used for DNS recursor/DHCP/aiccu started mangling some writes,
  with no notification that anything was amiss.  With ext4 being a
  silentdatalossfs, there was no clue it was a disk (ok, SD) problem at all,
  making it really "fun" to debug.  It happens with multiple SD cards, thus
  it's the machine that's at fault.

* a HDD had some link resets and silent data corruption, traced to a bad
  SATA cable; the disk has worked fine since (after extensive tests,
  obviously).

* a HDD that has link resets and silent data corruption (apparently
  write-time only(?)), Marduk knows why.  Happens with multiple cables and
  two machines, putting the blame somewhere on the disk.

Thus, the assumption that the controller will be notified about read errors
is quite invalid.  In the above cases, if recovery were possible, it'd be
beneficial to rewrite a good copy of the data.
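
To illustrate what "rewrite a good copy" could look like, here is a minimal
sketch in Python.  Everything in it is an assumption made for illustration:
the names (checksum, read_with_repair, mirror_a, mirror_b) are made up, the
"disks" are in-memory buffers, and plain CRC-32 stands in for whatever
checksum a real filesystem would store.  It is not how btrfs implements
this, only the general idea: verify each copy against a stored checksum and
overwrite any copy that fails with one that passes.

```python
import zlib


def checksum(data):
    # Stand-in checksum: plain CRC-32 from zlib (an assumption for this sketch).
    return zlib.crc32(data)


def read_with_repair(copies, expected):
    """Return data from the first copy matching `expected`.

    `copies` is a list of bytearrays acting as hypothetical mirrors.
    Any copy that fails verification is rewritten from the good one.
    """
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("no copy matches the stored checksum")

    for copy in copies:
        if checksum(bytes(copy)) != expected:
            copy[:] = good  # rewrite the bad copy in place
    return good


block = b"important data"
stored = checksum(block)      # checksum recorded at write time

mirror_a = bytearray(block)
mirror_b = bytearray(block)
mirror_a[0] ^= 0xFF           # silent corruption: no I/O error is raised

data = read_with_repair([mirror_a, mirror_b], stored)
```

Note that the corrupted mirror never reports a read error here; only the
checksum comparison catches it, which is the point of the cases above.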


Meow!
-- 
The bill declaring Jesus as the King of Poland fails to specify whether
the addition is at the top or end of the list of kings.  What should the
historians do?



