Mordechay Kaganer posted on Sun, 28 Jun 2015 19:31:31 +0300 as excerpted:

> On Sun, Jun 28, 2015 at 2:17 AM, Mordechay Kaganer <mkaganer@xxxxxxxxx>
> wrote:
>> B.H.
>>
>> Hello. I'm running our backup archive on btrfs. We have MD-based RAID5
>> array with 4 6TB disks then LVM on top of it, and btrfs volume on the
>> LV (we don't use btrfs's own RAID features because we want RAID5 and as
>> far as i understand the support is only partial).

(I see people already helping with the primary issue so won't address
that here.  However, addressing the above...)

FWIW... btrfs raid56 (5 and 6) support is now (from kernel 3.19) "code
complete".  However, "code complete" is far from "stable and mature", and
I (as a list regular but not a dev) have been recommending that people
continue to hold off a few kernels until it has had some time to
stabilize to more or less about the same point as btrfs itself is at,
unless of course their purpose is actually to test the code with data
they're prepared to lose, report bugs and help get them fixed, in which
case, welcome aboard! =:^)

Of course btrfs itself isn't really mature or entirely stable yet, tho
it's reasonable for ordinary use, provided the sysadmins' rule of backups
is observed: (a) If it's not backed up, by definition the data is worth
less to you than the time and media required to do the backups, despite
any claims to the contrary, and (b) for purposes of this rule, a would-be
backup that hasn't been tested restorable isn't yet a backup.

But back to raid56, my recommendation has been to wait at LEAST TWO
kernel cycles, which would be the just released 4.1, and even then,
consider it bleeding edge and be prepared to deal with bugs.  For
stability comparable to btrfs in general, my recommendation is to wait at
least a year, which happens to be about five kernel cycles, so until at
least 4.4.
At that point, either check a few weeks of list traffic and decide for
yourself based on that, or ask, but that's a reasonably educated guess.

Btrfs raid56 bottom line: 4.1 is the minimal 2 kernel cycles code
maturity I suggested; if you're prepared to be bleeding edge, try it.
Else wait the full year, kernel 4.4 or so.

(More below...)

>> I wanted to move the archive to another MD array of 4 8TB drives (this
>> time without LVM). So i did:
>>
>> btrfs replace start 1 /dev/md1 <mount_point>
>>
>> Where 1 is the only devid that was present and /dev/md1 is the new
>> array.

FWIW, I hadn't even considered the possibility of doing a replace from a
single device.  I had thought it required raid mode.  But if it appeared
to work...

>> The replace ran successfully until finished after more than 5 days.
>> The system downloaded some fresh backups and created new snapshots
>> during the ongoing replace. I've got 2 kernel warnings about replace
>> task waiting for more than 120 seconds in the middle, but process
>> seemed to go on anyway.
>>
>> After the replace had finished i did btrfs fi resize 1:max
>> <mount_point> then unmounted and mounted again using the new drive.
>>
>> Then i've run a scrub on the FS - and got a lot of checksum errors.

Had you done a pre-replace scrub on the existing device?  If not, is the
corruption actually new, or was it there before the replace and simply
transferred?  You don't know.

Meanwhile, one reason I don't particularly like the idea of btrfs over
something like mdraid is that btrfs is checksumming and operationally
verifying, while mdraid is not.  If btrfs reports an error, was it at the
media level (and if so, on which raid device), the raid level, the btrfs
level, or ??

Tho for mdraid5/6 you can do a raid scrub, and hopefully detect and
correct media and raid level errors, but you still don't have raid level
checksum verification.  And with multiple terabyte drives that's
definitely going to take a while!
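To make the ordering concrete, here is a minimal sketch of the sequence I'd suggest: scrub first so you have a baseline, then replace, resize, and scrub again.  The mount point /mnt/archive, the variable names and the migrate function are placeholders of mine, not anything from the original post; devid 1 and /dev/md1 are as quoted above.  It only prints the commands unless told otherwise, and I haven't run it against a real multi-terabyte array.

```shell
#!/bin/sh
# Sketch only: scrub before and after a btrfs device replace, so that any
# checksum errors can be attributed to "before" or "after" the replace.
# MNT is a hypothetical mount point; DEVID and NEW_DEV follow the quoted post.
MNT="${MNT:-/mnt/archive}"
NEW_DEV="${NEW_DEV:-/dev/md1}"
DEVID="${DEVID:-1}"
# Dry run by default: each command is printed instead of executed.
# Set RUN to an empty string to execute for real (needs root).
RUN="${RUN-echo}"

migrate() {
    # 1. Baseline scrub on the old device; -B keeps it in the foreground.
    $RUN btrfs scrub start -B "$MNT" &&
    # 2. Replace the old device with the new one, also in the foreground.
    $RUN btrfs replace start -B "$DEVID" "$NEW_DEV" "$MNT" &&
    # 3. Grow the filesystem to fill the (larger) new device.
    $RUN btrfs filesystem resize "$DEVID:max" "$MNT" &&
    # 4. Scrub again and compare against the baseline from step 1.
    $RUN btrfs scrub start -B "$MNT"
}
```

Calling migrate as-is prints the four commands in order.  The && chaining stops at the first step that fails, so in a real run the replace is not started if the baseline scrub itself fails.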
With btrfs raid1/10 there will be a second, hopefully checksum-valid,
copy, to use and rebuild from.  And btrfs raid56 should be able to
reconstruct a hopefully checksum-valid copy from parity, tho of course at
its maturity level one can't yet assume it's entirely bug-free.

(Again, as I observed above the problem resolution is occurring on
another subthread, so I'll leave this at the above.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
