On sun, 16 Jun 2013 13:38:42 +0300, Alex Lyakas wrote: > Hi Miao, > > On Thu, Jun 13, 2013 at 6:08 AM, Miao Xie <miaox@xxxxxxxxxxxxxx> wrote: >> On wed, 12 Jun 2013 23:11:02 +0300, Alex Lyakas wrote: >>> I reviewed the code starting from: >>> 69aef69a1bc154 Btrfs: don't wait for all the writers circularly during >>> the transaction commit >>> until >>> 2ce7935bf4cdf3 Btrfs: remove the time check in btrfs_commit_transaction() >>> >>> It looks very good. Let me check if I understand the fix correctly: >>> # When transaction starts to commit, we want to wait only for external >>> writers (those that did ATTACH/START/USERSPACE) >>> # We guarantee at this point that no new external writers will hop on >>> the committing transaction, by setting ->blocked state, so we only >>> wait for existing extwriters to detach from transaction > > I have a doubt about this point with your new code. Example: > Task1 - external writer > Task2 - transaction kthread > > Task1 Task2 > |start_transaction(TRANS_START) | > |-wait_current_trans(blocked=0, so it doesn't wait) | > |-join_transaction() | > |--lock(trans_lock) | > |--can_join_transaction() YES | > | > |-btrfs_commit_transaction() > | > |--blocked=1 > | > |--in_commit=1 > | > |--wait_event(extwriter== 0); > | > |--btrfs_flush_all_pending_stuffs() > | | > |--extwriter_counter_inc() | > |--unlock(trans_lock) | > | > | lock(trans_lock) > | > | trans_no_join=1 > > Basically, the "blocked/in_commit" check is not synchronized with > joining a transaction. After checking "blocked", the external writer > may proceed and join the transaction. Right before joining, it calls > can_join_transaction(). But this function checks in_commit flag under > fs_info->trans_lock. But btrfs_commit_transaction() sets this flag not > under trans_lock, but under commit_lock, so checking this flag is not > synchronized. > > Or maybe I am wrong, because btrfs_commit_transaction() locks and > unlocks trans_lock to check for previous transaction, so by accident > there is no problem, and above scenario cannot happen? Your analysis at the last section is right, so the right process is: Task1 Task2 |start_transaction(TRANS_START) | |-wait_current_trans(blocked=0, so it doesn't wait) | |-join_transaction() | |--lock(trans_lock) | |--can_join_transaction() YES | | |-btrfs_commit_transaction() | |--blocked=1 | |--in_commit=1 |--extwriter_counter_inc() | |--unlock(trans_lock) | | |--lock(trans_lock) | |--... | |--unlock(trans_lock) | |--... | |--wait_event(extwriter== 0); | |--btrfs_flush_all_pending_stuffs() The problem you worried can not happen. Anyway, it is not good that the "blocked/in_commit" check is not synchronized with joining a transaction. So I modified the relative code in this patchset. Miao -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
