Re: [PATCH 0/5][v2] btrfs: fix hole corruption issue with !NO_HOLES

On Tue, Jan 7, 2020 at 7:43 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> v1->v2:
> - fixed a bug in 'btrfs: use the file extent tree infrastructure' that would
>   result in 0-length files because btrfs_truncate_inode_items() was clearing the
>   file extent map when we fsync'ed multiple times.  Validated this with a
>   modified fsx and generic/521 that reproduced the problem; those modifications
>   were sent up as well.
> - dropped the RFC
>
> ----------------- Original Message -----------------------
> We've historically had this problem where you could flush a targeted section of
> an inode and end up with a hole between extents without a hole extent item.
> This of course makes fsck complain, because that is not ok for a file system
> that doesn't have NO_HOLES set.  Because this is a well-understood problem, I
> and others have been ignoring fsck failures during certain xfstests
> (generic/475 for example) because they would regularly trigger this edge case.
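
To illustrate the I/O pattern being described, here is a minimal
userspace sketch (hypothetical, not part of the series; the mount
point is a placeholder) of a targeted flush that leaves a hole at
[4K, 8K) between two extents:

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[4096] = { 0 };
	/* Placeholder path on a btrfs filesystem without NO_HOLES. */
	int fd = open("/mnt/scratch/file", O_CREAT | O_WRONLY, 0644);

	if (fd < 0)
		return 1;
	pwrite(fd, buf, sizeof(buf), 0);	/* extent at [0, 4K)   */
	pwrite(fd, buf, sizeof(buf), 8192);	/* extent at [8K, 12K) */
	/*
	 * Flush only the second range.  If power is lost before the
	 * rest of the inode's metadata is committed, the hole at
	 * [4K, 8K) can be left without a hole extent item, which is
	 * exactly what fsck complains about on !NO_HOLES.
	 */
	sync_file_range(fd, 8192, 4096,
			SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER);
	close(fd);
	return 0;
}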
>
> However, this isn't a great behavior to have; we should really be taking all
> fsck failures seriously, and we could potentially ignore legitimate fsck
> errors because we expect them to be this particular failure.
>
> In order to fix this we need to keep track of where we have valid extent items,
> and only update i_size to encompass that area.  This unfortunately means we
> need a new per-inode extent_io_tree to keep track of the valid ranges.  This is
> relatively straightforward in practice, and helpers have been added to manage
> this so that in the case of a NO_HOLES file system we simply skip this work
> altogether.
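
The helper idea, as a minimal sketch (the function, field, and tree
names here are assumptions for illustration, not necessarily the
patch's actual API): record which ranges are covered by file extent
items, and only raise disk_i_size to the end of the contiguous
covered region starting at offset 0.

/* Record that [start, start + len) is covered by a file extent item. */
static int inode_set_file_extent_range(struct btrfs_inode *inode,
				       u64 start, u64 len)
{
	/* NO_HOLES file systems don't need hole items; skip the work. */
	if (btrfs_fs_incompat(inode->root->fs_info, NO_HOLES))
		return 0;
	return set_extent_bits(&inode->file_extent_tree, start,
			       start + len - 1, EXTENT_DIRTY);
}

/* Only push disk_i_size up to the end of the fully covered area. */
static void inode_safe_disk_i_size_write(struct btrfs_inode *inode)
{
	u64 start, end, isize = i_size_read(&inode->vfs_inode);
	u64 new_size = 0;

	if (btrfs_fs_incompat(inode->root->fs_info, NO_HOLES)) {
		inode->disk_i_size = isize;
		return;
	}
	if (!find_first_extent_bit(&inode->file_extent_tree, 0, &start,
				   &end, EXTENT_DIRTY, NULL) && start == 0)
		new_size = min(end + 1, isize);
	inode->disk_i_size = new_size;
}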
>
> I've been hammering on this for a week now and I'm pretty sure it's ok, but I'd
> really like Filipe to take a look, and I still have some longer-running tests
> going on the series.  All of our boxes internally are btrfs, and the box I was
> testing on ended up with a weird RPM db corruption that was likely from an
> earlier, broken version of the patch.  However, I cannot be 100% sure that was
> the case, so I'm giving it a few more days of testing before I'm satisfied
> there's not some weird thing that RPM does that xfstests doesn't cover.
>
> This has gone through several iterations of xfstests already, including many
> loops of generic/475 for validation to make sure it was no longer failing.  So
> far so good, but for something like this, wider testing will definitely be
> necessary.  Thanks,

So, a comment that applies to the whole patchset:

On power failures we can now end up with non-prealloc extents beyond
the disk_i_size after mounting the filesystem.

Not entirely sure if it will cause any problems other than space not
being reclaimed for a long time (unless the file is truncated, or
written to at or beyond the extent's offset). Have you tested this
scenario?

I suppose the test cases from fstests that use dm's log-writes target
exercise this easily.
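
For reference, one quick way to spot such leftover extents from
userspace is FIEMAP.  A minimal sketch (hypothetical, not from the
series) that lists extents ending past i_size:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

int main(int argc, char **argv)
{
	struct fiemap *fm;
	struct stat st;
	unsigned int i, n = 64;	/* assume at most 64 extents here */
	int fd;

	if (argc != 2)
		return 1;
	fd = open(argv[1], O_RDONLY);
	if (fd < 0 || fstat(fd, &st))
		return 1;
	fm = calloc(1, sizeof(*fm) + n * sizeof(struct fiemap_extent));
	if (!fm)
		return 1;
	fm->fm_length = ~0ULL;		/* map the whole file */
	fm->fm_extent_count = n;
	if (ioctl(fd, FS_IOC_FIEMAP, fm))
		return 1;
	for (i = 0; i < fm->fm_mapped_extents; i++) {
		struct fiemap_extent *fe = &fm->fm_extents[i];

		if (fe->fe_logical + fe->fe_length > (__u64)st.st_size)
			printf("extent [%llu, %llu) ends past i_size %lld\n",
			       (unsigned long long)fe->fe_logical,
			       (unsigned long long)(fe->fe_logical +
						    fe->fe_length),
			       (long long)st.st_size);
	}
	free(fm);
	close(fd);
	return 0;
}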

Thanks

>
> Josef
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”



