Re: [PATCH] Btrfs: fix race leading to fs corruption after transaction abortion

On Thu, Jul 25, 2019 at 11:27:04AM +0100, fdmanana@xxxxxxxxxx wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
> 
> When one transaction is finishing its commit, it is possible for another
> transaction to start and enter its initial commit phase as well. If the
> first ends up getting aborted, we have a small time window where the second
> transaction commit does not notice that the previous transaction aborted
> and ends up committing, writing a superblock that points to btrees that
> reference extent buffers (nodes and leafs) that were not persisted to disk.
> The consequence is that after mounting the filesystem again, we will be
> unable to load some btree nodes/leafs, because the content on disk is
> either garbage (or just zeroes) or corresponds to the old content of a
> previously COWed or deleted node/leaf, resulting in the well known error
> messages "parent transid verify failed on ...".
> The following sequence diagram illustrates how this can happen.
> 
>         CPU 1                                           CPU 2
> 
>  <at transaction N>
> 
>  btrfs_commit_transaction()
>    (...)
>    --> sets transaction state to
>        TRANS_STATE_UNBLOCKED
>    --> sets fs_info->running_transaction
>        to NULL
> 
>                                                     (...)
>                                                     btrfs_start_transaction()
>                                                       start_transaction()
>                                                         wait_current_trans()
>                                                           --> returns immediately
>                                                               because
>                                                               fs_info->running_transaction
>                                                               is NULL
>                                                         join_transaction()
>                                                           --> creates transaction N + 1
>                                                           --> sets
>                                                               fs_info->running_transaction
>                                                               to transaction N + 1
>                                                           --> adds transaction N + 1 to
>                                                               the fs_info->trans_list list
>                                                         --> returns transaction handle
>                                                             pointing to the new
>                                                             transaction N + 1
>                                                     (...)
> 
>                                                     btrfs_sync_file()
>                                                       btrfs_start_transaction()
>                                                         --> returns handle to
>                                                             transaction N + 1
>                                                       (...)
> 
>    btrfs_write_and_wait_transaction()
>      --> writeback of some extent
>          buffer fails, returns an
>          error
>    btrfs_handle_fs_error()
>      --> sets BTRFS_FS_STATE_ERROR in
>          fs_info->fs_state
>    --> jumps to label "scrub_continue"
>    cleanup_transaction()
>      btrfs_abort_transaction(N)
>        --> sets BTRFS_FS_STATE_TRANS_ABORTED
>            flag in fs_info->fs_state
>        --> sets aborted field in the
>            transaction and transaction
>            handle structures, for
>            transaction N only
>      --> removes transaction from the
>          list fs_info->trans_list
>                                                       btrfs_commit_transaction(N + 1)
>                                                         --> transaction N + 1 was not
>                                                             aborted, so it proceeds
>                                                         (...)
>                                                         --> sets the transaction's state
>                                                             to TRANS_STATE_COMMIT_START
>                                                         --> does not find the previous
>                                                             transaction (N) in the
>                                                             fs_info->trans_list, so it
>                                                             doesn't know that transaction
>                                                             was aborted, and the commit
>                                                             of transaction N + 1 proceeds
>                                                         (...)
>                                                         --> sets transaction N + 1 state
>                                                             to TRANS_STATE_UNBLOCKED
>                                                         btrfs_write_and_wait_transaction()
>                                                           --> succeeds writing all extent
>                                                               buffers created in the
>                                                               transaction N + 1
>                                                         write_all_supers()
>                                                            --> succeeds
>                                                            --> we now have a superblock on
>                                                                disk that points to trees
>                                                                that refer to at least one
>                                                                extent buffer that was
>                                                                never persisted
> 
> So fix this by updating the transaction commit path: if, after setting the
> transaction state to TRANS_STATE_COMMIT_START, we do not find any previous
> transaction in the fs_info->trans_list, check whether the flag
> BTRFS_FS_STATE_TRANS_ABORTED is set in fs_info->fs_state. If the flag is
> set, just fail the transaction commit with -EROFS, as we do in other
> places. The exact error code for the previous transaction abort was
> already logged and reported.
> 
> Fixes: 49b25e0540904b ("btrfs: enhance transaction abort infrastructure")
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>

Reviewed-by: David Sterba <dsterba@xxxxxxxx>

Queued for 5.3, thanks.
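
For readers following the thread, the check described in the quoted commit
message can be sketched as a small user-space model. This is not the patch
itself and not kernel code: every model_* type, field and function below is
hypothetical, invented only to illustrate the decision the commit path makes.
In particular, the branch for a previous transaction that is still on the list
is an assumption about the pre-existing behaviour, simplified to "return its
recorded error".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the BTRFS_FS_STATE_TRANS_ABORTED bit. */
#define MODEL_FS_STATE_TRANS_ABORTED (1u << 0)

/* Hypothetical, heavily reduced model of fs_info. */
struct model_fs_info {
    unsigned int fs_state;          /* bit flags, see above */
};

/* Hypothetical, heavily reduced model of a transaction. */
struct model_transaction {
    unsigned long transid;
    int aborted;                    /* 0, or a negative errno if aborted */
    struct model_transaction *prev; /* previous entry still on the
                                     * transaction list, NULL if none */
};

/*
 * Models the decision taken after the transaction state was set to
 * "commit start". Returns 0 if the commit may proceed, or a negative
 * errno if it must fail.
 */
static int model_commit_check(const struct model_fs_info *fs_info,
                              const struct model_transaction *cur)
{
    assert(fs_info != NULL && cur != NULL);

    if (cur->prev != NULL) {
        /*
         * Assumption: when the previous transaction is still listed,
         * its own aborted field tells us whether it failed.
         */
        return cur->prev->aborted;
    }

    /*
     * No previous transaction on the list. It may have been aborted
     * and already removed, so consult the filesystem-wide flag before
     * allowing a new superblock to be written.
     */
    if (fs_info->fs_state & MODEL_FS_STATE_TRANS_ABORTED)
        return -EROFS;

    return 0;
}
```

In this model, a transaction N + 1 whose prev pointer is NULL commits normally
while the flag is clear, and fails with -EROFS once the flag has been set by
the abort of transaction N, which is the window the diagram above describes.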


