On 20.05.19 г. 12:10 ч., Filipe Manana wrote:
> On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>>
>>
>>
>> On 20.05.19 г. 11:11 ч., Nikolay Borisov wrote:
>>> This patch removes stray smp_mb before root->log_root from join_running_log_trans
>>> as well as the unlocked check for root->log_root. log_root is only set in
>>> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
>>> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
>>> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
>>> Those 2 locking invariants ensure join_running_log_trans don't see invalid
>>> values of ->log_root.
>>>
>>> Additionally this results in around 26% improvement when deleting 500k files/dir.
>>> All values are in seconds.
>>>
>>> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
>>> 80 78 91 90
>>> 63 62 93 91
>>> 65 64 92 90
>>> 67 65 93 90
>>> 75 73 90 88
>>> 75 73 91 89
>>> 75 73 93 90
>>> 74 73 89 87
>>> 76 74 91 89
>>> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
>>> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
>>> median 75 73 91 90
>>>
>> With Patch (real) With patch (sys) Without patch (real) Without patch (sys)
>> 80 78 91 90
>> 63 62 93 91
>> 65 64 92 90
>> 67 65 93 90
>> 75 73 90 88
>> 75 73 91 89
>> 75 73 93 90
>> 74 73 89 87
>> 76 74 91 89
>> stddev 5.76146200581454 5.45690184791497 1.42400062421959 1.22474487139159
>> mean 72.2222222222222 70.5555555555556 91.4444444444444 89.3333333333333
>> median 75 73 91 90
>
> Great.
>
> How was that test done?
> Simply deleting the files with nothing else running in parallel?
Yes
>
> How does it behave if while the files are being deleted, there are
> concurrent fsyncs on other files of the same subvolume, that is, while
> the mutex is held?
>
> Because that check without holding the mutex, is likely there for
> performance reasons.
So I will repeat the test when there are concurrent fsyncs running and
also with no concurrent fsyncs running but just removing the smp_mb, I
think the performance gain should be from removing the smp_mb and not
necessarily the check. In the worst case I can leave the check intact
and just remove the smp_mb because it doesn't add any value
correctness-wise.
>
> Thanks.
>
>
>>
>>
>> Here's the perf data without being butchered.
>>
>>> Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx>
>>> ---
>>>
>>> This passed full xfstest run and the performance results were obtained with the
>>> following testcase:
>>>
>>> #!/bin/bash
>>> for i in {1..10}; do
>>> echo "Testun run : %i"
>>> ./ltp/fsstress -z -d /media/scratch/ -f creat=100 -n 500000
>>> sync
>>> time rm -rf /media/scratch/*
>>> sync
>>> done
>>>
>>> fs/btrfs/tree-log.c | 4 ----
>>> 1 file changed, 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
>>> index 6adcd8a2c5c7..61744d8af106 100644
>>> --- a/fs/btrfs/tree-log.c
>>> +++ b/fs/btrfs/tree-log.c
>>> @@ -188,10 +188,6 @@ static int join_running_log_trans(struct btrfs_root *root)
>>> {
>>> int ret = -ENOENT;
>>>
>>> - smp_mb();
>>> - if (!root->log_root)
>>> - return -ENOENT;
>>> -
>>> mutex_lock(&root->log_mutex);
>>> if (root->log_root) {
>>> ret = 0;
>>>
>
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't — you're right.”
>