A Big Thank You, and some Notes on Current Recovery Tools.

Okay, I want to start this post with a HUGE THANK YOU THANK YOU THANK
YOU to Nikolay Borisov and most especially to Qu Wenruo!

Thanks to their tireless help in answering all my dumb questions, I
have managed to get my BTRFS filesystem working again! As I write
this, I have the full, non-degraded quad of drives mounted and am
updating my latest backup of their contents.

I had a 4-drive setup with 2x4T and 2x2T drives, and one of the 2T
drives failed. With help, I was able to make a 100% recovery of the
lost data. I do have some observations on what I went through, though.
Take these as constructive criticism, or as points for discussing
additions to the recovery tools:

1) I had a 2T drive die with exactly 3 hard-sector errors, and those 3
errors exactly coincided with the 3 superblocks on the drive. The odds
against this happening as random, independent events are mind-boggling
(something like 1 in 10^26), so I'm going to guess this wasn't random
chance. It's possible that something inside the drive's layers of
firmware is to blame, but it seems more likely to me that there is
some BTRFS code path that can, under some conditions, try to update
all superblocks in quick succession, and that a drive failure during
this window managed to corrupt all three. It may be better to perform
an update-readback-compare on each superblock before moving on to the
next, so as to avoid this particular failure in the future. I doubt
this would slow things down much, as the superblocks must be cached in
memory anyway.
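
To make the idea concrete, here is a minimal sketch of the
update-readback-compare sequence I have in mind. It is not based on
the actual BTRFS write path; the mirror offsets (64 KiB, 64 MiB,
256 GiB) are the standard superblock copy locations as I understand
them, and a real implementation would need O_DIRECT or an explicit
cache flush so the readback actually hits the platter rather than the
page cache.

import os

# Sketch only: write one superblock copy, flush, read it back and verify
# before touching the next copy. Offsets are my understanding of the usual
# btrfs superblock mirror locations; a real implementation would need
# O_DIRECT (or a cache flush) so the readback isn't served from cache.
SUPERBLOCK_OFFSETS = (64 * 1024, 64 * 1024 ** 2, 256 * 1024 ** 3)
SUPERBLOCK_SIZE = 4096

def update_superblocks_carefully(dev_path, new_block):
    assert len(new_block) == SUPERBLOCK_SIZE
    fd = os.open(dev_path, os.O_RDWR)
    try:
        dev_size = os.lseek(fd, 0, os.SEEK_END)
        for off in SUPERBLOCK_OFFSETS:
            if off + SUPERBLOCK_SIZE > dev_size:
                continue                      # smaller devices hold fewer copies
            os.pwrite(fd, new_block, off)
            os.fsync(fd)                      # push the write out before verifying
            if os.pread(fd, SUPERBLOCK_SIZE, off) != new_block:
                raise IOError("superblock at %#x failed verification; "
                              "stopping before touching the other copies" % off)
    finally:
        os.close(fd)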

2) The recovery tools seem too dumb while thinking they are smarter
than they are. There should be some way to tell the various tools to
restrict their attention to a given subset of the drives in a system.
Not knowing that a superblock was a single 4096-byte sector, I had
primed my recovery by copying a valid superblock from one drive to the
clone of my broken drive before starting the ddrescue of the failing
drive. I had hoped that I could piece together a valid superblock from
a good drive and whatever I could recover from the failing one. In the
end this turned out to be a useful strategy, but meanwhile I had two
drives that both claimed to be drive 2 of 4, and no drive claiming to
be drive 1 of 4. The tools completely failed to deal with this case
and consistently preferred to read the bogus drive 2 instead of the
real drive 2, and it wasn't until I deliberately patched over the
magic in the cloned drive that I could use the various recovery tools
without bizarre and spurious errors. I understand that this was never
an anticipated scenario for the recovery process, but if it's happened
once, it could happen again. Just having a failing drive and its clone
both available in one system could cause this.
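
For anyone in the same boat, this is roughly the "patch over the
magic" workaround written out as a minimal sketch. The magic string
and its offset within the superblock ("_BHRfS_M" at byte 64 of each
4096-byte copy) are my reading of the on-disk format, so double-check
them before pointing this at a real device, and obviously only run it
against the clone you want the tools to ignore.

import os

# Sketch: blank out the btrfs magic on each superblock copy of a CLONE so
# the recovery tools stop seeing it as a second copy of the same drive.
# The mirror offsets and the location of the magic within the superblock
# are my assumptions about the on-disk format; verify before using.
MAGIC = b"_BHRfS_M"
MAGIC_OFFSET_IN_SB = 64
SUPERBLOCK_OFFSETS = (64 * 1024, 64 * 1024 ** 2, 256 * 1024 ** 3)

def hide_clone_from_btrfs(clone_path):
    fd = os.open(clone_path, os.O_RDWR)
    try:
        dev_size = os.lseek(fd, 0, os.SEEK_END)
        for off in SUPERBLOCK_OFFSETS:
            pos = off + MAGIC_OFFSET_IN_SB
            if pos + len(MAGIC) > dev_size:
                continue
            if os.pread(fd, len(MAGIC), pos) == MAGIC:
                # superblock copy no longer scans as btrfs
                os.pwrite(fd, b"\0" * len(MAGIC), pos)
        os.fsync(fd)
    finally:
        os.close(fd)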

3) There don't appear to be any tools designed for dumping a full
superblock in hex notation, or for patching a superblock in place.
Seeing as I was forced to use a hex editor to do exactly that, and
then jump through hoops to generate a correct CSUM for the patched
block, I would certainly have preferred some sort of utility to do the
patching for me.
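
As an illustration, this is roughly what such a utility would need to
do, sketched out as a small script along the lines of what I ended up
doing by hand. It assumes the default crc32c checksum type and my
understanding of the layout: the first 32 bytes of the 4096-byte
superblock are the csum field, and the stored value is the crc32c of
everything after that field, written little-endian into the first 4
bytes.

import struct
import sys

SUPERBLOCK_SIZE = 4096
CSUM_FIELD_SIZE = 32          # the csum field occupies the first 32 bytes

def crc32c(data):
    # Bitwise CRC-32C (Castagnoli polynomial, reflected); slow but has no
    # dependencies, which is all a one-off recovery hack needs.
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def fix_superblock_csum(block):
    # Recompute the checksum over everything after the csum field and store
    # it little-endian in the first 4 bytes -- this matches my understanding
    # of the default (crc32c) csum type and is what the tools accepted.
    csum = struct.pack("<I", crc32c(block[CSUM_FIELD_SIZE:]))
    return csum + block[len(csum):]

if __name__ == "__main__":
    # Usage: fix_sb_csum.py superblock.bin > superblock.hex
    with open(sys.argv[1], "rb") as f:
        block = f.read(SUPERBLOCK_SIZE)
    print(fix_superblock_csum(block).hex())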

4) Despite having lost all 3 superblocks on one drive in a 4-drive
setup (RAID0 data with RAID1 metadata), it was possible to derive all
the information needed to rebuild the lost superblock from the
existing good drives. I don't know how often this can be done, or
whether it was due to some peculiarity of the particular RAID
configuration I was using. But seeing as it IS possible at least under
some circumstances, it would be useful to have recovery tools that
knew what those circumstances were and could make use of them.

5) Finally, I want to comment on the fact that each drive only stored
up to 3 superblocks. Knowing how important they are to system
integrity, I would have been happy to have had 5 or 10 such blocks, or
to have had each drive keep a copy of every other drive's superblock.
At 4K per superblock, this seems a trivial amount to store even in a
huge RAID with 64 or 128 drives in it. Could some method be introduced
for keeping far more of this redundant metainformation around? I admit
I'm unclear on what the optimal numbers would be; certainly, if I
hadn't lost all 3 superblocks at once, I might have thought that
number adequate.
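
To put a rough number on the cost: even under the most generous scheme
above, the overhead is tiny. A quick back-of-the-envelope calculation
(drive and copy counts are purely illustrative):

# Illustrative numbers only: 128 drives, each keeping 10 copies of every
# other drive's 4 KiB superblock.
DRIVES = 128
COPIES_PER_PEER = 10
SUPERBLOCK_SIZE = 4096  # bytes

per_drive = DRIVES * COPIES_PER_PEER * SUPERBLOCK_SIZE
print("per-drive overhead: %.1f MiB" % (per_drive / 2.0 ** 20))
# prints: per-drive overhead: 5.0 MiB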

Anyway, I hope no one takes these criticisms the wrong way. I'm a huge
fan of BTRFS and its potential, and I know it's still early days for
the code base and that its recovery and diagnostic tools have yet to
fully mature. I'm just hoping that these points can contribute in some
small way and give back some of the help I got in fixing my system!



-- 
Stirling Westrup
Programmer, Entrepreneur.
https://www.linkedin.com/e/fpf/77228
http://www.linkedin.com/in/swestrup
http://technaut.livejournal.com
http://sourceforge.net/users/stirlingwestrup