Re: btrfs and ECC RAM

On 20 Jan 2014, at 04:13, Austin S Hemmelgarn <ahferroin7@xxxxxxxxx> wrote:

> On 01/19/2014 07:17 PM, George Eleftheriou wrote:
>> I have been wondering the same thing for quite some time after
>> having read this post (which makes a pretty clear case in favour of
>> ECC RAM)...
>> 
>> http://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449/
>> 
>> ... and the ZFS on Linux FAQ 
>> http://zfsonlinux.org/faq.html#DoIHaveToUseECCMemory
>> 
>> Moreover, the ZFS community seem to cite this article quite often: 
>> http://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf
>> 
>> Without having further knowledge on that matter, I tend to
>> believe (but I hope I'm wrong) that BTRFS is as vulnerable as ZFS
>> to memory errors. Since I upgraded recently, it's a bit too late
>> for purchasing ECC-capable infrastructure (change of CPU +
>> motherboard + RAM) so I just chose to ignore this risk by
>> performing a memtest86 right before every scrub (and having my
>> regular backups ready). I've been using ZFS on Linux for almost 5
>> months (having occasional issues with kernel updates) until last
>> week that I finally switched to BTRFS and I'm happy.
> AFAIK, ZFS does background data scrubbing without user intervention
> (which on a separate note can make it a huge energy hog) to correct
> on-disk errors.  For performance reasons though, it has no built-in
> check to make sure that there really is an error, it just assumes that
> if the checksum is wrong, the data on the disk must be wrong.  This is
> fine for enterprise level hardware with ECC RAM, because the disk IS
> more likely to be wrong in that case than the RAM is.  This assumption
> falls apart though on commodity hardware (ie, no ECC RAM), hence the
> warnings about using ZFS without ECC RAM.

In http://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about reconstructing corrupted data from parity information:

> Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted bit. Remember, the checksum data was calculated after the corruption from the first memory error occurred. So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM.

i.e. the claim is that parity information is stored with every piece of data, and that ZFS will "correct" errors automatically from that parity.  I am starting to suspect that there is confusion here between checksumming for data integrity and parity information.  If this is really how ZFS works, and memory corruption interferes with the process, then I can see how a scrub could be devastating.  I don't know whether ZFS really works like this; it sounds very odd to do the reconstruction without an additional checksum check.  It also sounds very different from what you describe below for btrfs, which only checks against redundantly-stored copies, and which I agree sounds much safer.
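
To make the distinction concrete, here is a toy sketch (Python, purely illustrative; the names are mine and this is not how ZFS is actually implemented) of the two repair strategies I think are being conflated:

    import hashlib

    def checksum(block):
        return hashlib.sha256(block).digest()

    def repair_blindly(parity_rebuild):
        # The failure mode described in the forum post: whatever the parity
        # reconstruction produces (possibly already corrupted in RAM) is
        # accepted and written back as the "repaired" data.
        return parity_rebuild

    def repair_verified(parity_rebuild, stored_csum):
        # Safer variant: only accept the rebuilt block if it matches the
        # checksum recorded when the block was originally written;
        # otherwise give up and report an error rather than overwrite data.
        if checksum(parity_rebuild) == stored_csum:
            return parity_rebuild
        return None

If the repair really follows the first pattern, I can see how scrubbing with bad RAM would spread corruption; if it follows the second, a memory error should mostly produce spurious failures rather than silently "repaired" bad data.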

The second link above from the ZFS FAQ just says that if you place a very high value on data integrity, you should be using ECC memory anyway, which I'm sure we can all agree with.

http://zfsonlinux.org/faq.html#DoIHaveToUseECCMemory:

> 1.16 Do I have to use ECC memory for ZFS?
> Using ECC memory for ZFS is strongly recommended for enterprise environments where the strongest data integrity guarantees are required. Without ECC memory rare random bit flips caused by cosmic rays or by faulty memory can go undetected. If this were to occur ZFS (or any other filesystem) will write the damaged data to disk and be unable to automatically detect the corruption.

i.e. if the data is already bad in RAM, ZFS's checksumming isn't going to help you, but it also isn't going to make things worse.  It doesn't say that a scrub can kill all your data, as the previous link claims.  

In http://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf, they essentially say that memory errors can cause problems, and that it would be nice if filesystems extended their checksumming to cover data while it is held in memory, which would help to detect and mitigate such errors.
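
As I read it, the suggestion is roughly the following (an illustrative Python sketch of my understanding, with made-up names, not anything taken from the paper):

    import hashlib

    # block_id -> (data, checksum taken when the block entered the cache)
    cache = {}

    def cache_fill(block_id, data):
        cache[block_id] = (data, hashlib.sha256(data).digest())

    def write_back(block_id):
        data, csum = cache[block_id]
        # Re-verify before writing, so a bit flip that happened while the
        # block sat in RAM is at least detected rather than silently
        # checksummed and written out to disk.
        if hashlib.sha256(data).digest() != csum:
            raise IOError("block %r changed while cached" % (block_id,))
        # ... otherwise issue the real write here ...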

On 20 Jan 2014, at 04:13, Austin S Hemmelgarn <ahferroin7@xxxxxxxxx> wrote:

> BTRFS however works differently, it only scrubs data when you tell it
> to.  If it encounters a checksum or read error on a data block, it
> first tries to find another copy of that block elsewhere (usually on
> another disk), if it still sees a wrong checksum there, or gets
> another read error, or can't find another copy, then it returns a read
> error to userspace, usually resulting in the program reading the data
> crashing.  In most environments other than HA clustering, this is an
> excellent compromise that still protects data integrity.

Yes, this sounds fine.
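
If I understand it correctly, the read/scrub path you describe amounts to something like this (a simplified Python sketch of my reading of your description, not the real btrfs code):

    import hashlib

    def read_block(copies, stored_csum):
        # copies: candidate blocks holding the same logical data, e.g. read
        # from different devices; None means that read itself failed.
        for data in copies:
            if data is not None and hashlib.sha256(data).digest() == stored_csum:
                return data  # a good copy exists, so the read succeeds
        # No copy matched the stored checksum: return an error to userspace.
        raise IOError("all copies missing or failed checksum")

Since nothing is "repaired" unless a copy actually matches the stored checksum, bad RAM can make reads fail, but the repair step itself shouldn't overwrite good on-disk data with garbage.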

-- 
Ian Hinder
http://numrel.aei.mpg.de/people/hinder
