Re: Trying to rescue my data :(

On 26/06/16 02:25, Chris Murphy wrote:
> On Fri, Jun 24, 2016 at 10:19 PM, Steven Haigh <netwiz@xxxxxxxxx> wrote:
> 
>>
>> Interesting though that EVERY crash references:
>>         kernel BUG at fs/btrfs/extent_io.c:2401!
> 
> Yeah because you're mounted ro, and if this is 4.4.13 unmodified btrfs
> from kernel.org then that's the 3rd line:
> 
> if (head->is_data) {
>     ret = btrfs_del_csums(trans, root,
>        node->bytenr,
>        node->num_bytes);
> 
> So why/what is it cleaning up if it's mounted ro? Anyway, once you're
> no longer making forward progress you could try something newer,
> although it's a coin toss what to try. There are some issues with
> 4.6.0-4.6.2 but there have been a lot of changes in btrfs/extent_io.c
> and btrfs/raid56.c between 4.4.13 that you're using and 4.6.2, so you
> could try that or even build 4.7.rc4 or rc5 by tomorrowish and see how
> that fares. It sounds like there's just too much (mostly metadata)
> corruption for the degraded state to deal with so it may not matter.
> I'm really skeptical of btrfsck on degraded fs's so I don't think
> that'll help.

Well, I did end up recovering the data that I cared about. I'm not
really keen to ride the BTRFS RAID6 train again any time soon :\

I'm now back to what I've had for years - md RAID6 with XFS on top of
it. I'm still copying data back to the array from the various places I
had to spread it across to have enough space.

What I find interesting is that the corruption in the BTRFS RAID6 was
quite clustered. I have ~80GB of MP3s ripped over the years - of that,
the corruption would take out 3-4 songs in a row, then the next 10
albums or so were intact. What made recovery VERY hard is that it hit
several situations that just caused a complete system hang.

I tried it on bare metal - just in case it was a Xen thing - but that
hard hung the entire machine as well. In every case, it was a flurry of
csum error messages, then instant death. I would have been much happier
if the file had been skipped or returned as unavailable instead of
having the entire machine crash.

I ended up putting the bit of script that I posted earlier in
/etc/rc.local - then just kept doing:
	xl destroy myvm && xl create /etc/xen/myvm -c

Wait for the crash, run the above again.
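
A rough sketch of how that cycle could be automated from dom0 instead of
retyping the command each time. This is not the script that was actually
used: the function name rescue_loop, its arguments, and the fixed wait
per boot are all assumptions. In particular, a hung guest does not
announce itself, so the sketch simply gives every boot a fixed time
budget and then tears the domain down, which only makes sense if the
copy job started from /etc/rc.local can resume where it left off.

```shell
#!/bin/sh
# Hypothetical helper: reboot a Xen guest repeatedly so that a rescue
# job started from the guest's /etc/rc.local gets another attempt.
# Arguments (all assumptions, not from the original post):
#   $1 domain name, $2 config file, $3 max boots, $4 seconds per boot
rescue_loop() {
    vm=$1
    cfg=$2
    max_boots=$3
    wait_secs=$4
    boots=0
    while [ "$boots" -lt "$max_boots" ]; do
        # Tear down whatever is left of the previous boot; ignore the
        # error if the domain is already gone.
        xl destroy "$vm" 2>/dev/null || true
        # Start the guest without attaching the console (no -c), so the
        # loop is not blocked by a hung guest.
        xl create "$cfg" || return 1
        boots=$((boots + 1))
        echo "boot $boots of $max_boots"
        # Fixed time budget per boot; there is no hang detection here.
        sleep "$wait_secs"
    done
}
```

Something like `rescue_loop myvm /etc/xen/myvm 350 180` would roughly
match the numbers in this thread; watching the guest by hand with -c
gives better control over when to pull the plug.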

All in all, it took me about 350 boots with an average uptime of about 3
minutes to get out the data that I decided to keep. While not a BTRFS
loss, I decided, given how long it was going to take, not to bother
recovering ~3.5TB of other data that is easily available in other places
on the internet. If I really need the Fedora 24 KDE Spin ISO, or the
CentOS 6 Install DVD, etc., I can download it again.

-- 
Steven Haigh

Email: netwiz@xxxxxxxxx
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897


