On 12/31/19 7:25 AM, Qu Wenruo wrote:
On 2019/12/31 5:31 AM, Josef Bacik wrote:
We've historically had this problem where you could flush a targeted section of
an inode and end up with a hole between extents without a hole extent item.
This of course makes fsck complain because this is not ok for a file system that
doesn't have NO_HOLES set. Because this is a well understood problem I and
others have been ignoring fsck failures during certain xfstests (generic/475 for
example) because they would regularly trigger this edge case.
However, this isn't great behavior to have. We should really be taking all fsck
failures seriously, and we could potentially ignore legitimate fsck errors
because we expect them to be this particular failure.
In order to fix this we need to keep track of where we have valid extent items,
and only update i_size to encompass that area. This unfortunately means we need
a new per-inode extent_io_tree to keep track of the valid ranges. This is
relatively straightforward in practice, and helpers have been added to manage
this so that in the case of a NO_HOLES file system we just simply skip this work
altogether.
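
A rough userspace sketch of the idea described above, not the actual patch: it assumes the valid ranges are kept as a sorted array rather than an extent_io_tree, and the names valid_range and safe_disk_i_size are hypothetical. It only illustrates "update i_size no further than the area contiguously covered by extent items".

```c
#include <stddef.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)

/* Hypothetical model of one range known to have a valid extent item. */
struct valid_range {
	uint64_t start; /* inclusive byte offset */
	uint64_t end;   /* exclusive byte offset */
};

/*
 * Given ranges sorted by start offset, return the largest size that is
 * <= wanted_size and for which [0, size) is covered without a gap.
 * Anything past the first gap is ignored, so the returned size never
 * spans a hole that lacks an extent item.
 */
static uint64_t safe_disk_i_size(const struct valid_range *r, size_t n,
				 uint64_t wanted_size)
{
	uint64_t covered = 0;

	for (size_t i = 0; i < n; i++) {
		if (r[i].start > covered)
			break; /* gap before this range: stop here */
		if (r[i].end > covered)
			covered = r[i].end;
		if (covered >= wanted_size)
			return wanted_size;
	}
	return covered < wanted_size ? covered : wanted_size;
}
```

With valid ranges [0, 1MiB) and [50MiB, 51MiB) and a wanted size of 100MiB, this helper returns 1MiB, since everything after the first gap is not yet safe to expose through i_size.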
I'm not an expert on this problem, but AFAIK this is caused by mixing
buffered and direct IO, right?
Since that deadly mix is not recommended anyway, can we make things
simpler by just blocking any buffered IO while the same inode is undergoing
direct IO?
This can happen if you write 100MB and then sync_file_range() 1MB in the middle
of the file; it's not restricted to O_DIRECT. Thanks,
Josef
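
For reference, the I/O pattern described above can be sketched roughly as follows. This is an illustrative helper, not a test from xfstests: the function name write_then_sync_middle, the chunk size, and the choice of sync_file_range() flags are assumptions. It does buffered writes only, with no O_DIRECT involved.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `total` bytes with buffered IO, then flush only `sync_len` bytes
 * centred in the file. Returns 0 on success, -1 on error (errno is set
 * by the failing call).
 */
static int write_then_sync_middle(const char *path, off_t total,
				  off_t sync_len)
{
	const size_t chunk = 1 << 20; /* assumed 1MiB write size */
	off_t done = 0;
	int ret = -1;
	int fd;
	char *buf;

	buf = malloc(chunk);
	if (!buf)
		return -1;
	memset(buf, 0xab, chunk);

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0)
		goto out_free;

	while (done < total) {
		size_t want = (total - done) < (off_t)chunk ?
			      (size_t)(total - done) : chunk;
		ssize_t n = write(fd, buf, want);

		if (n < 0)
			goto out_close;
		done += n;
	}

	/* Targeted flush of the middle of the file only. */
	if (sync_file_range(fd, (total - sync_len) / 2, sync_len,
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
		goto out_close;

	ret = 0;
out_close:
	close(fd);
out_free:
	free(buf);
	return ret;
}
```

The case from the mail would correspond to total = 100MB and sync_len = 1MB. The helper only issues the writes and the targeted flush; whether fsck then reports a missing hole extent item still depends on the file system not having NO_HOLES set and on what else reaches disk.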