Re: [PATCH] btrfs: Simplify join_running_log_trans

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 20, 2019 at 10:13 AM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>
>
>
> On 20.05.19 г. 12:10 ч., Filipe Manana wrote:
> > On Mon, May 20, 2019 at 9:23 AM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 20.05.19 г. 11:11 ч., Nikolay Borisov wrote:
> >>> This patch removes stray smp_mb before root->log_root from join_running_log_trans
> >>> as well as the unlocked check for root->log_root. log_root is only set in
> >>> btrfs_add_log_tree, called from start_log_trans under root->log_mutex.
> >>> Furthermore, log_root is freed in btrfs_free_log, called from commit_fs_root,
> >>> which in turn is called from transaction's critical section (TRANS_COMMIT_DOING).
> >>> Those 2 locking invariants ensure join_running_log_trans don't see invalid
> >>> values of ->log_root.
> >>>
> >>> Additionally this results in around 26% improvement when deleting 500k files/dir.
> >>> All values are in seconds.
> >>>
> >>>       With Patch (real)       With patch (sys)        Without patch (real)    Without patch (sys)
> >>>           80                                  78                                              91                                              90
> >>>               63                                      62                                              93                                              91
> >>>               65                                      64                                              92                                              90
> >>>               67                                      65                                              93                                              90
> >>>               75                                      73                                              90                                              88
> >>>               75                                      73                                              91                                              89
> >>>               75                                      73                                              93                                              90
> >>>               74                                      73                                              89                                              87
> >>>               76                                      74                                              91                                              89
> >>> stddev        5.76146200581454        5.45690184791497        1.42400062421959        1.22474487139159
> >>> mean  72.2222222222222        70.5555555555556        91.4444444444444        89.3333333333333
> >>> median  75                                    73                                              91                                              90
> >>>
> >>     With Patch (real)   With patch (sys)    Without patch (real)    Without patch (sys)
> >>         80                   78                     91                      90
> >>         63                   62                     93                      91
> >>         65                   64                     92                      90
> >>         67                   65                     93                      90
> >>         75                   73                     90                      88
> >>         75                   73                     91                      89
> >>         75                   73                     93                      90
> >>         74                   73                     89                      87
> >>         76                   74                     91                      89
> >> stddev  5.76146200581454    5.45690184791497    1.42400062421959    1.22474487139159
> >> mean    72.2222222222222    70.5555555555556    91.4444444444444    89.3333333333333
> >> median  75                   73                     91                      90
> >
> > Great.
> >
> > How was that test done?
> > Simply deleting the files with nothing else running in parallel?
>
> Yes
>
> >
> > How does it behave if while the files are being deleted,  there are
> > concurrent fsyncs on other files of the same subvolume, that is, while
> > the mutex is held?
> >
> > Because that check without holding the mutex, is likely there for
> > performance reasons.
>
> So I will repeat the test when there are concurrent fsyncs running and
> also with no concurrent fsyncs running but just removing the smp_mb, I
> think the performance gain should be from removing the smp_mb and not
> necessarily the check. In the worst case I can leave the check intact
> and just remove the smp_mb because it doesn't add any value
> correctness-wise.

So the barrier is likely there to make sure we see non-null log_root
after it was set, concurrently by someone calling start_log_trans().
It pairs, implicitly, with the mutex_unlock there, at
start_log_trans(), since that implies a memory barrier.

With your patch, we always end up doing a mutex_lock(), which also
implies a memory barrier.

Sorry I missed this before, but your test doesn't make any sense. How
can a memory barrier have such a big impact on a code path (unlink)
that does lot of much heavier stuff? Like btree searches/deletes,
deleting  delayed items, etc.
Your test case is even flawed because join_running_log_trans() can
never be called.

Look at its only two callers, btrfs_del_inode_ref_in_log() and
btrfs_del_dir_entries_in_log(), they both do:

if (dir->logged_trans < trans->transid)
return 0;

ret = join_running_log_trans(root);
if (ret)
return 0;

and

if (inode->logged_trans < trans->transid)
return 0;

ret = join_running_log_trans(root);
if (ret)
return 0;

Since you never fsync the files nor the directory containing them in
that test, we never end up calling join_running_log_trans().

So I don't know where you got the 26%...

Thanks.

>
> >
> > Thanks.
> >
> >
> >>
> >>
> >> Here's the perf data without being butchered.
> >>
> >>> Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx>
> >>> ---
> >>>
> >>> This passed full xfstest run and the performance results were obtained with the
> >>> following testcase:
> >>>
> >>> #!/bin/bash
> >>> for i in {1..10}; do
> >>>     echo "Testun run : %i"
> >>>     ./ltp/fsstress -z -d /media/scratch/ -f creat=100 -n 500000
> >>>     sync
> >>>     time rm -rf /media/scratch/*
> >>>     sync
> >>> done
> >>>
> >>>  fs/btrfs/tree-log.c | 4 ----
> >>>  1 file changed, 4 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> >>> index 6adcd8a2c5c7..61744d8af106 100644
> >>> --- a/fs/btrfs/tree-log.c
> >>> +++ b/fs/btrfs/tree-log.c
> >>> @@ -188,10 +188,6 @@ static int join_running_log_trans(struct btrfs_root *root)
> >>>  {
> >>>       int ret = -ENOENT;
> >>>
> >>> -     smp_mb();
> >>> -     if (!root->log_root)
> >>> -             return -ENOENT;
> >>> -
> >>>       mutex_lock(&root->log_mutex);
> >>>       if (root->log_root) {
> >>>               ret = 0;
> >>>
> >
> >
> >
> > --
> > Filipe David Manana,
> >
> > “Whether you think you can, or you think you can't — you're right.”
> >



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux