Re: scrub: Tree block spanning stripes, ignored

Ivan P wrote on 2016/04/06 21:39 +0200:
Ok, I'm cautiously optimistic: after running btrfsck
--init-extent-tree --repair and running scrub, it finished without
errors.
Will run a file compare against my backup copy, but it seems the
repair was successful.

Better run btrfsck again to make sure there are no other problems.

As for the backref problem, did you ever mount the fs read-write with an older kernel such as 4.2?
IIRC, I introduced a delayed_ref regression in that version.
Maybe it's related to this bug.

Thanks,
Qu

Here is the btrfs-image btw:
https://dl.dropboxusercontent.com/u/19330332/image.btrfs (821Mb)

Maybe you will be able to track down whatever caused this.

Regards,
Ivan.

On Sun, Apr 3, 2016 at 3:24 AM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:


On 04/03/2016 12:29 AM, Ivan P wrote:

It's about 800Mb, I think I could upload that.

I ran it with the -s parameter, is that enough to remove all personal
info from the image?
Also, I had to run it with -w because otherwise it died on the same
corrupt node.


You can also use -c9 to further compress the data.
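
As a rough illustration of how the three options discussed above (-s, -w and -c9) combine into one btrfs-image invocation, here is a minimal sketch. The helper name, the device path and the output file name are hypothetical; the code only builds and prints the command line, it does not run anything.

```python
import shlex
from typing import List, Optional

def btrfs_image_cmd(device: str, output: str, sanitize: bool = True,
                    walk_trees: bool = True,
                    compress_level: Optional[int] = 9) -> List[str]:
    """Build a btrfs-image command line from the options named in the thread."""
    cmd = ["btrfs-image"]
    if compress_level is not None:
        if not 0 <= compress_level <= 9:
            raise ValueError("compression level must be between 0 and 9")
        cmd.append(f"-c{compress_level}")   # -c9: strongest compression
    if sanitize:
        cmd.append("-s")                    # -s: sanitize file names
    if walk_trees:
        cmd.append("-w")                    # -w: walk all trees manually
    cmd += [device, output]
    return cmd

# /dev/sdX and image.btrfs are placeholders, not values from the thread.
image_cmd = btrfs_image_cmd("/dev/sdX", "image.btrfs")
print(shlex.join(image_cmd))
```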

Thanks,
Qu


On Fri, Apr 1, 2016 at 2:25 AM, Qu Wenruo <quwenruo@xxxxxxxxxxxxxx> wrote:



Ivan P wrote on 2016/03/31 18:04 +0200:


Ok, it will take a while until I can attempt repairing it, since I
will have to order a spare HDD to copy the data to.
Should I take some sort of debug snapshot of the fs so you can take a
look at it? I think I read something somewhere about a snapshot that
contains only the fs but not the data.


That's btrfs-image.

It would be good, but if your metadata is over 3G, I think it would
take a lot of time to upload.

Thanks,
Qu


Regards,
Ivan.

On Tue, Mar 29, 2016 at 3:57 AM, Qu Wenruo <quwenruo@xxxxxxxxxxxxxx> wrote:




Ivan P wrote on 2016/03/28 23:21 +0200:



Well, the file in this inode is fine, I was able to copy it off the
disk. However, rm-ing the file causes a segmentation fault. Shortly
after that, I get a kernel oops. Same thing happens if I attempt to
re-run scrub.

How can I delete that inode? Could deleting it destroy the filesystem
beyond repair?




The kernel oops should protect you from completely destroying the fs.

However, it seems the problem is beyond what the kernel can handle
(hence the kernel oops).

So there is no safe recovery method now.

From now on, any repair advice from me *MAY* *destroy* your fs.
So please make a backup while you still can.


The best thing to try would be "btrfsck --init-extent-tree --repair".

If it works, then mount the fs and run "btrfs balance start <mnt>".
Lastly, umount and run btrfsck again to check whether the problem is fixed.
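
The sequence above can be written down as a dry-run plan. This is only a sketch: the function name, the device path and the mount point are hypothetical, and the code prints the commands instead of executing them, because the repair step may destroy the fs as warned above.

```python
import shlex
from typing import List

def repair_plan(device: str, mountpoint: str) -> List[List[str]]:
    """Return the commands from the advice above, in order, without running them."""
    return [
        ["btrfsck", "--init-extent-tree", "--repair", device],  # risky: back up first
        ["mount", device, mountpoint],
        ["btrfs", "balance", "start", mountpoint],
        ["umount", mountpoint],
        ["btrfsck", device],                                    # final re-check
    ]

# /dev/sdX and /mnt/vault are placeholders for illustration.
plan = repair_plan("/dev/sdX", "/mnt/vault")
for step, cmd in enumerate(plan, start=1):
    print(f"{step}. {shlex.join(cmd)}")
```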

Thanks,
Qu



Regards,
Ivan

On Mon, Mar 28, 2016 at 3:10 AM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:





Ivan P wrote on 2016/03/27 16:31 +0200:




Thanks for the reply,

the raid1 array was created from scratch, so not converted from ext*.
I used btrfs-progs version 4.2.3 on kernel 4.2.5 to create the array, btw.





I don't remember any strange behavior after 4.0, so no clue here.

Go to subvolume 5 (the top-level subvolume), find inode 71723 and try to
remove it.
Then use 'btrfs filesystem sync <mount point>' to sync the inode removal.

Finally, use the latest btrfs-progs to check whether the problem disappears.

This problem is quite strange and I can't locate the root cause,
but try removing the file and hopefully the kernel can handle it.
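
One possible way to locate the file behind an inode number is to walk the mounted tree and compare st_ino. The sketch below is an illustration under assumptions, not the procedure used in this thread: the function name is hypothetical, and it assumes that nested subvolumes report a different st_dev than the top-level one, so the walk skips them.

```python
import os
from typing import Iterator

def find_by_inode(root: str, inode: int) -> Iterator[str]:
    """Yield paths under root with the given inode number, staying on root's st_dev."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into directories on another device id
        # (assumed here to include nested subvolumes).
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_ino == inode and st.st_dev == root_dev:
                yield path

# Hypothetical usage with the mount point and inode from this thread:
#   for path in find_by_inode("/mnt/vault", 71723):
#       print(path)
```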

Thanks,
Qu





Is there a way to fix the current situation without taking all the
data off the disk?
I'm not familiar with file system terms, so what exactly could I have
lost, if anything?

Regards,
Ivan

On Sun, Mar 27, 2016 at 4:23 PM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:



        On 03/27/2016 05:54 PM, Ivan P wrote:

            Read the info on the wiki, here's the rest of the requested
            information:

            # uname -r
            4.4.5-1-ARCH

            # btrfs fi show
            Label: 'ArchVault'  uuid: cd8a92b6-c5b5-4b19-b5e6-a839828d12d8
                     Total devices 1 FS bytes used 2.10GiB
                     devid    1 size 14.92GiB used 4.02GiB path /dev/sdc1

            Label: 'Vault'  uuid: 013cda95-8aab-4cb2-acdd-2f0f78036e02
                     Total devices 2 FS bytes used 800.72GiB
                     devid    1 size 931.51GiB used 808.01GiB path /dev/sda
                     devid    2 size 931.51GiB used 808.01GiB path /dev/sdb

            # btrfs fi df /mnt/vault/
            Data, RAID1: total=806.00GiB, used=799.81GiB
            System, RAID1: total=8.00MiB, used=128.00KiB
            Metadata, RAID1: total=2.00GiB, used=936.20MiB
            GlobalReserve, single: total=320.00MiB, used=0.00B

            On Fri, Mar 25, 2016 at 3:16 PM, Ivan P <chrnosphered@xxxxxxxxx> wrote:

                Hello,

                using kernel 4.4.5 and btrfs-progs 4.4.1, I ran a scrub today on my
                2x1Tb btrfs raid1 array and it finished with 36 unrecoverable errors
                [1], all blaming the treeblock 741942071296. Running "btrfs check
                --readonly" on one of the devices lists that extent as corrupted [2].

                How can I recover, how much did I really lose, and how can I prevent
                it from happening again?
                If you need me to provide more info, do tell.

                [1] http://cwillu.com:8080/188.110.141.36/1


        This message itself is normal; it just means a tree block is
        crossing a 64K stripe boundary.
        Due to a scrub limitation, scrub can't check whether such a block
        is good or bad.
        But....

                [2] http://pastebin.com/xA5zezqw

        This one is much more meaningful, showing several strange bugs.

        1. corrupt extent record: key 741942071296 168 1114112
        This is an EXTENT_ITEM (168), and according to the offset, the
        length of the extent is 1088K, which is definitely not a valid
        tree block size.
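
The arithmetic behind point 1 can be checked with a short sketch. The function and field names are hypothetical and not part of btrfs-progs; the 64K upper limit on a tree block (nodesize) is stated here as an assumption about btrfs rather than something taken from this thread.

```python
# Illustrative only: decodes the (objectid, type, offset) triple from the
# "corrupt extent record" line. Names are made up for this example.
BTRFS_EXTENT_ITEM_KEY = 168   # key type shown in the check output
MAX_NODESIZE = 64 * 1024      # assumed upper bound for a tree block, in bytes

def describe_extent_key(objectid: int, key_type: int, offset: int) -> dict:
    """Read an EXTENT_ITEM key as start address, length and end address."""
    if key_type != BTRFS_EXTENT_ITEM_KEY:
        raise ValueError(f"not an EXTENT_ITEM key: type {key_type}")
    return {
        "start": objectid,                  # logical start of the extent
        "length_bytes": offset,             # for EXTENT_ITEM, offset is the length
        "length_kib": offset // 1024,
        "end": objectid + offset,           # exclusive end address
        "fits_tree_block": offset <= MAX_NODESIZE,
    }

info = describe_extent_key(741942071296, 168, 1114112)
print(info)
```

The computed end address matches the range [741942071296, 741943185408) reported in points 3 and 4.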

        But according to [1], the kernel thinks it's a tree block, which is
        quite strange.
        Normally, such a mismatch only happens in a fs converted from ext*.

        2. Backref 741942071296 root 5 owner 71723 offset 2589392896
        num_refs 0 not found in extent tree

        num_refs 0 is also strange; a normal backref won't have a zero
        reference count.

        3. bad metadata [741942071296, 741943185408) crossing stripe boundary
        It could be a false warning that is fixed in the latest btrfsck.
        But you're using 4.4.1, so I think that's the problem.
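
To make "crossing stripe boundary" concrete, here is a minimal sketch of such a test. It is not the btrfsck implementation: the function name is hypothetical, and it assumes stripes are counted from absolute logical address 0, whereas the real check may measure from the start of the containing chunk.

```python
STRIPE_LEN = 64 * 1024  # 64K stripe length mentioned earlier in this thread

def crosses_stripe(start: int, length: int, stripe_len: int = STRIPE_LEN) -> bool:
    """Return True if the range [start, start + length) touches more than one stripe."""
    first_stripe = start // stripe_len
    last_stripe = (start + length - 1) // stripe_len
    return first_stripe != last_stripe

# The range from the check output: 1114112 bytes starting at 741942071296.
reported = crosses_stripe(741942071296, 741943185408 - 741942071296)
print(reported)
```

Any extent longer than one stripe crosses a boundary by construction, which is consistent with the 1088K length from point 1.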

        4. bad extent [741942071296, 741943185408), type mismatch with chunk
        This seems to explain the problem: a data extent appears in a
        metadata chunk.
        It seems that you're really using a converted btrfs.

        If so, just roll it back to ext*. The current btrfs-convert has a
        known bug, and the fix is still under review.

        If you want to use btrfs, use a newly created one instead of
        btrfs-convert.

        Thanks,
        Qu


                Regards,
                Soukyuu

                P.S.: please add me to CC when replying as I did not
                subscribe to the mailing list. Majordomo won't let me use
                my hotmail address and I don't want that much traffic on
                this address.

            --
            To unsubscribe from this list: send the line "unsubscribe
            linux-btrfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx
            More majordomo info at http://vger.kernel.org/majordomo-info.html


