Re: [RFC PATCH 00/19] btrfs: async discard support

On Fri, Oct 11, 2019 at 10:49:20AM +0300, Nikolay Borisov wrote:
> 
> 
> On 7.10.19 г. 23:17 ч., Dennis Zhou wrote:
> > Hello,
> > 
> 
> <snip>
> 
> > 
> > With async discard, we try to emphasize discarding larger regions
> > and reusing the lba (implicit discard). The first is done by using the
> > free space cache to maintain discard state, which allows us to get
> > coalescing fairly cheaply. A background workqueue scans over an
> > LRU-kept list of the block groups and uses filters to determine what
> > to discard next, giving priority to larger discards. While reusing an
> > lba isn't explicitly attempted, it happens implicitly via
> > find_free_extent(), which, if it happens to find a dirty extent,
> > grants us reuse of the lba. Additionally, async discarding skips metadata
> 
> By 'dirty' I assume you mean not-discarded-yet-but-free extent?
> 

Yes.
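
To expand on that a bit: each free space entry remembers whether its
range has been discarded yet, so freshly freed space starts out
untrimmed ("dirty") until the async worker gets to it, and
find_free_extent() handing it back out in the meantime is what gives us
the implicit discard. Roughly sketched below (illustrative names, not
the actual patch code):

/*
 * Illustrative sketch only -- not the patch code. A free space entry
 * carries a trim state so the discard worker can tell apart regions
 * that still need a discard from ones that were already trimmed (or
 * got reallocated, i.e. implicit discard via find_free_extent()).
 */
#include <linux/types.h>

enum trim_state {
	TRIM_STATE_UNTRIMMED,	/* freed, discard still pending ("dirty") */
	TRIM_STATE_TRIMMED,	/* discard already issued */
};

struct free_space_entry {
	u64 offset;
	u64 bytes;
	enum trim_state trim_state;
};

/* Newly freed space is handed to the async worker as untrimmed. */
static inline void mark_free_space_untrimmed(struct free_space_entry *entry)
{
	entry->trim_state = TRIM_STATE_UNTRIMMED;
}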

> > block groups, as these should see fairly high turnover since btrfs is
> > a self-packing filesystem that is stingy about allocating new block
> > groups until necessary.
> > 
> > Preliminary results seem promising: when a lot of freeing is going on,
> > the discarding is delayed, allowing for reuse, which translates to less
> > discarding (in addition to the slower discarding). This has shown a
> > reduction in p90 and p99 read latencies in a test on our webservers.
> > 
> > I am currently working on tuning the rate at which it discards in the
> > background by evaluating other workloads and drives. The iops and bps
> > rate limits are fairly aggressive right now, as my basic survey of a
> > few drives showed that the trim command itself is a significant part of
> > the overhead, so optimizing for larger trims is the right thing to do.
> 
> Do you intend to share performance results alongside the workloads
> used to obtain them? Since this is, at its core, a performance
> improvement patch, that is of prime importance!
> 

I'll try to find some results to share for v2. As I'm just running this
on production machines, I don't intend to share any workloads. However,
there is an iocost workload that demonstrates the problem nicely and
may already be shared.

The real win is moving the work from transaction commit to completely
background work, effectively making discard a 2nd-class citizen. On more
heavily loaded machines, it's not great that discards block transaction
commit. The other thing is that it's very drive dependent: some drives
just have really bad discard implementations, and the win there will be
bigger than on, say, a high-end nvme drive.
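
For context, the background side boils down to something like the loop
below: take the oldest block group off the LRU, discard only the free
regions above the current size threshold (the filter step), and derive
the requeue delay from the iops/bps limits. This is a simplified sketch
with made-up helper names, not the actual patch:

#include <linux/workqueue.h>
#include <linux/list.h>
#include <linux/types.h>

struct block_group;			/* placeholder for illustration */

struct discard_ctl {
	struct delayed_work work;	/* the background worker */
	struct list_head lru;		/* block groups, oldest first */
	u64 size_threshold;		/* current minimum discard size */
	u64 bps_limit;			/* bytes-per-second cap */
	u32 iops_limit;			/* discards-per-second cap */
};

/* Illustrative helpers, definitions omitted. */
static struct block_group *pick_next_block_group(struct discard_ctl *ctl);
static u64 trim_block_group(struct block_group *bg, u64 size_threshold);
static unsigned long discard_delay(struct discard_ctl *ctl, u64 trimmed);

static void async_discard_workfn(struct work_struct *work)
{
	struct discard_ctl *ctl =
		container_of(to_delayed_work(work), struct discard_ctl, work);
	struct block_group *bg;
	u64 trimmed;

	/* Oldest block group on the LRU gets looked at first. */
	bg = pick_next_block_group(ctl);
	if (!bg)
		return;

	/*
	 * Filter step: only issue discards for free regions at or above
	 * the size threshold, so larger trims go out before smaller ones
	 * are even considered.
	 */
	trimmed = trim_block_group(bg, ctl->size_threshold);

	/*
	 * Pace ourselves: the delay before the next pass comes from the
	 * iops/bps limits, keeping discard strictly background work
	 * instead of blocking transaction commit.
	 */
	schedule_delayed_work(&ctl->work, discard_delay(ctl, trimmed));
}

The exact knobs (size threshold, iops/bps limits) are what I'm still
tuning, per the cover letter.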

> > 
> 
> <snip>
> > 
> > Thanks,
> > Dennis
> > 


