Re: USB reset + raid6 = majority of files unreadable

Chris,

Apologies, I was halfway through replying and managed to send the
e-mail by mistake, so here is the second half.

These are NAS-specific hard discs and the SCT ERC timeout is set to 70:
SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
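
Those ERC values come from smartctl; for completeness, the drive-side
and kernel-side timeouts can be compared with something like the
following (illustrative only; /dev/sdX is a placeholder for each array
member):

    smartctl -l scterc /dev/sdX           # drive-side SCT ERC read/write timeouts
    cat /sys/block/sdX/device/timeout     # kernel-side SCSI command timer, in seconds

With the drives giving up after 7 seconds and the kernel's default
command timer at 30 seconds, the drive should report its error well
before the kernel times the command out.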

On Thu, 27 Feb 2020 at 00:39, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> > and at that point the device remove aborted with an I/O error.
>
> OK well you didn't include that so we have no idea if this I/O error
> is about the same failed device or another device. If it's another
> device it's more complicated what can happen to the array. Hence why
> timeout mismatches are important. And why it's important to have
> monitoring so you aren't running a degraded array for three days.

When I first tried this there was nothing in the log except the
checksum errors: nothing from btrfs, and nothing from the block device
driver either, to indicate that there had been any hardware errors.

I worked out what code was being run within the kernel, added some
extra messages, and got as far as working out that the error is being
detected here, in relocate_file_extent_cluster() in
fs/btrfs/relocation.c, starting around line 3336:

        if (!PageUptodate(page)) {
            btrfs_readpage(NULL, page);
            lock_page(page);
            if (!PageUptodate(page)) {
                unlock_page(page);
                put_page(page);
                btrfs_delalloc_release_metadata(BTRFS_I(inode),
                            PAGE_SIZE, true);
                btrfs_delalloc_release_extents(BTRFS_I(inode),
                                   PAGE_SIZE);
                ret = -EIO;
                /* message I added; ret is set first so the printed value is the -EIO */
                btrfs_err(fs_info,
                      "relocate_file_extent_cluster: err#%d from btrfs_readpage/PageUptodate",
                      ret);
                goto out;
            }
        }

> This sounds like a bug. The default space cache is stored in the data
> block group which for you should be raid6, with a missing device it's
> effectively raid5. But there's some kind of conversion happening
> during the balance/missing device removal, hence the clearing of the
> raid56 flag per block group, and maybe this corruption is happening
> related to that removal.

Presumably it is still a bug with RAID5: given that no hardware errors
are now being logged, btrfs presumably should not corrupt the space
cache.  I can work around it, of course, by clearing the cache, but
there is still about 65 GB of data that I cannot balance away from the
failed device so that it is properly resilient on the working devices.
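
The workaround I have in mind is roughly this, assuming the default v1
space cache is in use (device and mount point are placeholders, and
with a device missing the degraded option would be needed as well):

    mount -o degraded,clear_cache /dev/sdX /mnt   # rebuild the free-space cache on mount
    # or, with the filesystem unmounted:
    btrfs check --clear-space-cache v1 /dev/sdX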

Is there anything I can do here to narrow down this bug?  I probably
can't send you 14 TB of data, but I could run tools on this filesystem
or apply patches and post the output.


