Re: [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 14, 2017 at 08:35:26PM +0000, fdmanana@xxxxxxxxxx wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
> 
> If we have a file with an implicit hole (NO_HOLES feature enabled) that
> has an extent following the hole, delayed writes against regions of the
> file behind the hole happened before but were not yet flushed and then
> we truncate the file to a smaller size that lies inside the hole, we
> end up persisting a wrong disk_i_size value for our inode that leads to
> data loss after umounting and mounting again the filesystem or after
> the inode is evicted and loaded again.
> 
> This happens because at inode.c:btrfs_truncate_inode_items() we end up
> setting last_size to the offset of the extent that we deleted and that
> followed the hole. We then pass that value to btrfs_ordered_update_i_size()
> which updates the inode's disk_i_size to a value smaller then the offset
> of the buffered (delayed) writes.
> 
> Example reproducer:
> 
>  $ mkfs.btrfs -f /dev/sdb
>  $ mount /dev/sdb /mnt
> 
>  $ xfs_io -f -c "pwrite -S 0x01 0K 32K" /mnt/foo
>  $ xfs_io -d -c "pwrite -S 0x02 -b 32K 64K 32K" /mnt/foo
>  $ xfs_io -c "truncate 60K" /mnt/foo
>    --> inode's disk_i_size updated to 0
> 
>  $ md5sum /mnt/foo
>  3c5ca3c3ab42f4b04d7e7eb0b0d4d806  /mnt/foo
> 
>  $ umount /dev/sdb
>  $ mount /dev/sdb /mnt
> 
>  $ md5sum /mnt/foo
>  d41d8cd98f00b204e9800998ecf8427e  /mnt/foo
>    --> Empty file, all data lost!
> 
> Cc: <stable@xxxxxxxxxxxxxxx>  # 3.14+
> Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents")
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
> ---
>  fs/btrfs/inode.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 8eb6703..a0a38ce 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4703,8 +4703,12 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>  			btrfs_abort_transaction(trans, ret);
>  	}
>  error:
> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
> +		if (last_size > new_size &&
> +		    btrfs_fs_incompat(fs_info, NO_HOLES))
> +			last_size = new_size;
>  		btrfs_ordered_update_i_size(inode, last_size, NULL);
> +	}
>

Looks good to me.

Reviewed-by: Liu Bo <bo.li.liu@xxxxxxxxxx>

We can also remove the code that sets last_size = new_size in the case of
NO_HOLES inside the while loop, as I think this patch covers that, too.

Because we've held the inode mutex and wait for the dirty pages beyond the
current i_size, in fact, I couldn't think of a case that with NO_HOLE last_size
< new_size, maybe we could use new_size directly and add a ASSERT just in case?

Thanks,

-liubo
>  	btrfs_free_path(path);
>  
> -- 
> 2.7.0.rc3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux