Re: btrfs check inconsistency with raid1, part 1

Chris Murphy posted on Mon, 14 Dec 2015 00:24:21 -0700 as excerpted:

>> Personally speaking, it may be a false alert from btrfsck.
>> So in this case, I can't provide much help.
>>
>> If you're brave enough, mount it rw to see what will happen(although it
>> may mount just OK).
> 
> I'm brave enough. I'll give it a try tomorrow unless there's another
> request for more info before then.

Given the off-by-one generations and my own btrfs raid1 experience, I'm 
guessing you'll see one of two results: a good mount with no problems at 
all, or a good initial mount followed by a lockup once you try doing too 
much with the filesystem (like actually reading the affected blocks).

Looks like a normal generation-out-of-sync condition, common with forced 
shutdowns where the filesystem was never synced or remounted ro.  If so, 
btrfs should redirect reads to the device holding the current generation, 
but you'll need to do a scrub to get everything 100% back in sync.
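
For anyone wanting to confirm the generation mismatch before scrubbing, 
here's a rough sketch of how I'd compare the superblock generation on each 
raid1 member.  The device names and mount point are placeholders, and I'm 
assuming a btrfs-progs recent enough to have "btrfs inspect-internal 
dump-super" (older releases shipped that as a separate btrfs-show-super 
tool, if memory serves).  Treat it as an illustration, not a tested recipe 
for your setup.

```shell
#!/bin/sh
# Sketch only: device names and the mount point below are placeholders.

# Extract the superblock generation from dump-super output on stdin.
# Matches only the line whose first field is exactly "generation", so
# chunk_root_generation and friends are ignored.
super_generation() {
    awk '$1 == "generation" { print $2 }'
}

# Print each device with its superblock generation.
# Needs root and real btrfs member devices.
show_generations() {
    for dev in "$@"; do
        gen=$(btrfs inspect-internal dump-super "$dev" | super_generation)
        echo "$dev generation=$gen"
    done
}

# Usage (as root), with hypothetical devices and mount point:
#   show_generations /dev/sdX1 /dev/sdY1
#   btrfs scrub start -Bd /mnt   # -B: run in foreground, -d: per-device stats
#   btrfs device stats /mnt      # per-device error counters afterward
```

If the two numbers differ by one or so, that matches what I described 
above, and the scrub is what brings the stale copy back in line.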

The catch I found is that after so many errors from the failing device, 
btrfs just gives up instead of continuing to redirect reads to the good 
device, which resulted in a system crash here.  That was back when I 
still had the then-failing ssd in my raid1 (failing but not failed, it 
was just finding more and more sectors that needed to be redirected to 
spares), along with an on-boot service that read a rather large dir into 
cache.

But when there weren't that many errors on the failing device, I could 
still mount normally.  The same held when I intercepted the boot process 
and mounted everything without running the normal post-mount services 
(systemd emergency target instead of my usual default multi-user), so the 
service that cached that dir never ran and all those errors never 
triggered.  From there I could run scrub, which took care of the problem 
without triggering the usual too-many-errors crash.  After scrub, I could 
invoke normal multi-user mode, start all services including the caching 
service, and go about my usual business.
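
Roughly, that sequence looks like the sketch below.  It assumes you booted 
with systemd.unit=emergency.target on the kernel command line, that the 
affected filesystem has an fstab entry, and that multi-user.target is your 
normal target; the mount point is a placeholder.  By default it only 
prints the commands, and it runs them only if you set RUN=1.

```shell
#!/bin/sh
# Sketch only.  The mount point passed in is a placeholder.
# Boot with systemd.unit=emergency.target first so the usual services
# (including anything that reads lots of files) never start.

# Echo the command; execute it only when RUN=1 is set.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Mount, scrub in the foreground, then continue to normal multi-user mode.
# Each step runs only if the previous one succeeded.
scrub_then_resume() {
    mnt=$1
    # Relies on an fstab entry for the mount point.
    step mount "$mnt" &&
    # -B: stay in foreground until done, -d: print per-device stats.
    step btrfs scrub start -Bd "$mnt" &&
    # Only after the scrub has finished, bring up the normal services.
    step systemctl isolate multi-user.target
}

# Usage (as root): RUN=1 scrub_then_resume /mnt
```

The point of chaining the steps is that the services don't start until 
the scrub has actually completed.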

So if I'm correct, mount normally and scrub, and you should be fine, tho 
you may have to abort a normal boot if it accesses too many bad files, in 
order to be able to finish the scrub before a crash.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
