I was hit by this today while trying to rebalance a 16TB RAID10 into a
32TB RAID10, going from 4 to 8 WD SE 4TB drives. I cannot finish the
rebalance because of csum failures:

[10228.850910] BTRFS info (device sdq): csum failed ino 487 off 65536 csum 2566472073 private 151366068
[10228.850967] BTRFS info (device sdq): csum failed ino 487 off 69632 csum 2566472073 private 3056924305
[10228.850973] BTRFS info (device sdq): csum failed ino 487 off 593920 csum 2566472073 private 906093395
[10228.851004] BTRFS info (device sdq): csum failed ino 487 off 73728 csum 2566472073 private 2680502892
[10228.851014] BTRFS info (device sdq): csum failed ino 487 off 598016 csum 2566472073 private 1940162924
[10228.851029] BTRFS info (device sdq): csum failed ino 487 off 77824 csum 2566472073 private 2939385278
[10228.851051] BTRFS info (device sdq): csum failed ino 487 off 602112 csum 2566472073 private 645310077
[10228.851055] BTRFS info (device sdq): csum failed ino 487 off 81920 csum 2566472073 private 3600741549
[10228.851078] BTRFS info (device sdq): csum failed ino 487 off 86016 csum 2566472073 private 200201951
[10228.851091] BTRFS info (device sdq): csum failed ino 487 off 606208 csum 2566472073 private 1002916440

The system is running a scrub now and I will return with more details
later. I do not think systemd is logging to this volume, but the scrub
will probably show which files are affected.

This is a very serious issue for those hit by the corruption (it
basically makes it impossible to run a rebalance, with all the
consequences that has), so hopefully this will go upstream soon.

I am on kernel 3.11.6, by the way.

Best regards,

Hans-Kristian Bakke
Mob: 91 76 17 38

On 4 October 2013 23:19, Johannes Hirte <johannes.hirte@xxxxxxxxxxxxx> wrote:
> On Fri, 27 Sep 2013 09:37:00 -0400
> Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
>
>> A user reported a problem where they were getting csum errors when
>> running a balance and running systemd's journal.
>> This is because systemd is awesome and fallocate()'s its log space
>> and writes into it. Unfortunately we assume that when we read in all
>> the csums for an extent that they are sequential starting at the
>> bytenr we care about. This obviously isn't the case for prealloc
>> extents, where we could have written to the middle of the prealloc
>> extent only, which means the csum would be for the bytenr in the
>> middle of our range and not the front of our range. Fix this by
>> offsetting the new bytenr we are logging to based on the original
>> bytenr the csum was for. With this patch I no longer see the csum
>> errors I was seeing. Thanks,
>
> Any assessment when this goes upstream? Until it hits Linus tree it
> won't appear in stable. And this seems rather important.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
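
For anyone trying to picture the bug described in the quoted patch
message, here is a minimal sketch. It is a simplified Python model, not
the kernel code: the 4096-byte block size, the names remap_buggy and
remap_fixed, and all byte numbers are assumptions made up for
illustration. It only shows why assuming "csums are sequential from the
start of the range" goes wrong when just the middle of a preallocated
extent was written.

```python
BLOCK = 4096  # assumed block size: one checksum covers one block


def remap_buggy(csums, new_start):
    """Place checksums sequentially from the start of the new range.

    This mirrors the faulty assumption: the first checksum found is
    treated as belonging to the first block of the range.
    """
    return {new_start + i * BLOCK: csum for i, (_, csum) in enumerate(csums)}


def remap_fixed(csums, old_start, new_start):
    """Keep each checksum at its original offset within the range."""
    return {new_start + (bytenr - old_start): csum for bytenr, csum in csums}


# Hypothetical 4-block preallocated extent; only blocks 2 and 3 were
# written, so only those two blocks have checksums.
old_start = 0x100000
new_start = 0x800000
csums = [
    (old_start + 2 * BLOCK, 0xAAAA),
    (old_start + 3 * BLOCK, 0xBBBB),
]

buggy = remap_buggy(csums, new_start)
fixed = remap_fixed(csums, old_start, new_start)

# Offsets (relative to the new range) at which each version stores a checksum.
print("buggy offsets:", sorted(b - new_start for b in buggy))
print("fixed offsets:", sorted(b - new_start for b in fixed))
```

In the buggy version the checksums end up attached to blocks 0 and 1 of
the new range, so a later read of the data that really sits in blocks 2
and 3 finds no matching checksum. In this model that is the kind of
mismatch that would surface as "csum failed".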
