On Fri, Oct 11, 2019 at 10:49:20AM +0300, Nikolay Borisov wrote:
> 
> 
> On 7.10.19 г. 23:17 ч., Dennis Zhou wrote:
> > Hello,
> > 
> > <snip>
> > 
> > With async discard, we try to emphasize discarding larger regions
> > and reusing the lba (implicit discard). The first is done by using the
> > free space cache to maintain discard state and thus allows us to get
> > coalescing for fairly cheap. A background workqueue is used to scan over
> > an LRU kept list of the block groups. It then uses filters to determine
> > what to discard next hence giving priority to larger discards. While
> > reusing an lba isn't explicitly attempted, it happens implicitly via
> > find_free_extent() which if it happens to find a dirty extent, will
> > grant us reuse of the lba. Additionally, async discarding skips metadata
> 
> By 'dirty' I assume you mean not-discarded-yet-but-free extent?
> 

Yes.

> > block groups as these should see a fairly high turnover as btrfs is a
> > self-packing filesystem being stingy with allocating new block groups
> > until necessary.
> > 
> > Preliminary results seem promising as when a lot of freeing is going on,
> > the discarding is delayed allowing for reuse which translates to less
> > discarding (in addition to the slower discarding). This has shown a
> > reduction in p90 and p99 read latencies on a test on our webservers.
> > 
> > I am currently working on tuning the rate at which it discards in the
> > background. I am doing this by evaluating other workloads and drives.
> > The iops and bps rate limits are fairly aggressive right now as my
> > basic survey of a few drives noted that the trim command itself is a
> > significant part of the overhead. So optimizing for larger trims is the
> > right thing to do.
> 
> Do you intend on sharing performance results alongside the workloads
> used to obtain them? Since this is a performance improvement patch in
> its core that is of prime importance!
> 

I'll try and find some stuff to share for v2.
As I'm just running this on production machines, I don't intend to share
any workloads. However, there is an iocost workload that demonstrates the
problem nicely that might already be shared.

The win really is moving the work from transaction commit to completely
background work, effectively making discard a 2nd class citizen. On more
loaded machines, it's not great that discards are blocking transaction
commit. The other thing is it's very drive dependent. Some drives just
have really bad discard implementations and there will be a bigger win
there than say on some high end nvme drive.

> > 
> > <snip>
> > 
> > Thanks,
> > Dennis
> > 
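
The mechanism described above (an LRU-ordered list of block groups scanned
by a background worker, with a size filter so large regions get trimmed
first and recently-freed space lingers long enough to be reused) can be
sketched roughly as below. This is purely illustrative pseudologic in
Python, not the btrfs implementation: the names `BlockGroup`,
`DiscardScanner`, `mark_freed`, `scan_once`, and the 64 MiB threshold are
all invented for the example.

```python
from collections import OrderedDict

# Hypothetical filter threshold: only trim free regions >= 64 MiB for now;
# smaller leftovers stay queued (and may be reused before we get to them).
MIN_DISCARD_SIZE = 64 * 1024 * 1024


class BlockGroup:
    """Toy stand-in for a btrfs block group (not the real structure)."""
    def __init__(self, name, free_regions):
        self.name = name
        # (offset, length) extents that are free but not yet discarded
        self.untrimmed = list(free_regions)


class DiscardScanner:
    """Scan block groups in LRU order, discarding large regions first."""
    def __init__(self):
        self.lru = OrderedDict()  # oldest (least recently freed) first

    def mark_freed(self, bg):
        # A block group that just freed extents moves to the back of the
        # LRU. Its discard is therefore delayed, giving the allocator a
        # chance to hand the space back out (implicit discard via reuse).
        self.lru.pop(bg.name, None)
        self.lru[bg.name] = bg

    def scan_once(self, min_len=MIN_DISCARD_SIZE):
        """One background-worker pass: trim the oldest group's big regions."""
        if not self.lru:
            return []
        _, bg = self.lru.popitem(last=False)  # least recently used group
        large = [r for r in bg.untrimmed if r[1] >= min_len]
        bg.untrimmed = [r for r in bg.untrimmed if r[1] < min_len]
        if bg.untrimmed:  # small leftovers go back on the list
            self.lru[bg.name] = bg
        return large  # regions a real implementation would issue TRIM for
```

For instance, a group with a 128 MiB and a 4 MiB free region would have
only the 128 MiB region returned by `scan_once()`, with the small region
re-queued for later.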
