Re: btrfs and ECC RAM

I have been wondering the same thing for quite some time after having
read this post (which makes a pretty clear case in favour of ECC
RAM)...

hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449/

... and the ZFS on Linux FAQ
hxxp://zfsonlinux.org/faq.html#DoIHaveToUseECCMemory

Moreover, the ZFS community seems to cite this paper quite often:
hxxp://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf

Without knowing more about the matter, I tend to believe (but I hope
I'm wrong) that BTRFS is as vulnerable as ZFS to memory errors. Since
I upgraded recently, it's a bit too late to buy ECC-capable hardware
(that would mean a new CPU + motherboard + RAM), so I chose to accept
the risk and reduce it by running memtest86 right before every scrub
(and keeping my regular backups ready). I used ZFS on Linux for almost
5 months (with occasional issues after kernel updates) until last
week, when I finally switched to BTRFS, and I'm happy with it.
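
To make the scenario from Ian's message concrete, here is a toy Python
model of the check he suggests: on a checksum mismatch, read the block
again into a different buffer before deciding the disk copy is bad.
This is only a sketch under my own assumptions, not how ZFS or BTRFS
actually handle checksum failures. The names read_block, classify and
flip_bit are made up for illustration, and zlib.crc32 stands in for
whatever checksum the filesystem really uses.

```python
import zlib


def read_block(disk, index, flip_bit=None):
    """Copy a block from the fake disk into a fresh buffer.

    flip_bit (hypothetical knob) flips one bit in the buffer to mimic
    a RAM error that happens after the data has left the disk.
    """
    buf = bytearray(disk[index])
    if flip_bit is not None:
        buf[flip_bit // 8] ^= 1 << (flip_bit % 8)
    return buf


def classify(disk, checksums, index, first_read_flip=None):
    """Return 'ok', 'transient' or 'persistent' for one block."""
    first = read_block(disk, index, flip_bit=first_read_flip)
    if zlib.crc32(first) == checksums[index]:
        return "ok"
    # Mismatch: read again into a different buffer before blaming the disk.
    second = read_block(disk, index)
    if zlib.crc32(second) == checksums[index]:
        return "transient"   # disk copy is fine, the first buffer was bad
    return "persistent"      # both reads disagree with the checksum


if __name__ == "__main__":
    disk = [b"some file data " * 8]
    checksums = [zlib.crc32(disk[0])]
    print(classify(disk, checksums, 0))                     # clean read
    print(classify(disk, checksums, 0, first_read_flip=5))  # RAM bit flip
    disk[0] = b"X" + disk[0][1:]                            # on-disk damage
    print(classify(disk, checksums, 0))
```

The toy only covers a flip in the data buffer of the first read. A bad
cell that hits the stored checksum, or both buffers, would still fool
it, so at best this narrows the window rather than replacing ECC.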

As for ECC RAM itself, from what I've read it corrects single-bit
errors and detects double-bit errors; when it hits an error it cannot
correct, the system typically halts rather than carry on with bad
data.
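
The correct-one, detect-two behaviour can be illustrated with a tiny
SECDED code. The Python sketch below uses an extended Hamming(8,4)
code: 4 data bits, 3 Hamming parity bits and 1 overall parity bit. It
is a scaled-down illustration only; as far as I understand, real ECC
DIMMs apply the same idea to 64-bit words with 8 check bits, and the
encode/decode names and bit layout here are my own choices.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of 0/1).

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming(7,4) positions, with parity bits at 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # make total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of set-bit positions
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd total parity: exactly one bit flipped, at index `syndrome`
        # (syndrome 0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even total parity but non-zero syndrome: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    word[6] ^= 1                 # one flipped bit
    print(decode(word))          # corrected, data recovered
    word[2] ^= 1                 # a second flipped bit
    print(decode(word))          # detected but not correctable
```

Three or more flipped bits in one word can slip past a code like this
or be "corrected" to the wrong value, which is why the guarantee is
usually stated only for single and double-bit errors.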

On Sat, Jan 18, 2014 at 1:23 AM, Ian Hinder <ian.hinder@xxxxxxxxxx> wrote:
> Hi,
>
> I have been reading a lot of articles online about the dangers of using ZFS with non-ECC RAM.  Specifically, the fact that when good data is read from disk and compared with its checksum, a RAM error can cause the read data to be incorrect, causing a checksum failure, and the bad data might now be written back to the disk in an attempt to correct it, corrupting it in the process.  This would be exacerbated by a scrub, which could run through all your data and potentially corrupt it.  There is a strong current of opinion that using ZFS without ECC RAM is "suicide for your data".
>
> I have been unable to find any discussion of the extent to which this is true for btrfs.  Does btrfs handle checksum errors in the same way as ZFS, or does it perform additional checks before writing "corrected" data back to disk?  For example, if it detects a checksum error, it could read the data again to a different memory location to determine if the error existed in the disk copy or the memory.
>
> From what I've been reading, it sounds like ZFS should not be used with non-ECC RAM.  This is reasonable, as ZFS' resource requirements mean that you probably only want to run it on server-grade hardware anyway.  But with btrfs eventually being the default filesystem for Linux, that would mean that all linux machines, even cheap consumer-grade hardware, would need ECC RAM, or forego many of the advantages of btrfs.
>
> What is the situation?
>
> --
> Ian Hinder
> http://numrel.aei.mpg.de/people/hinder
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



