On 2019/5/27 10:10 PM, David Sterba wrote:
> On Mon, May 27, 2019 at 01:10:54PM +0800, Qu Wenruo wrote:
>> In convert we use trans->block_reserved >= 4096 as a threshold to commit
>> transaction, where block_reserved is the number of new tree blocks
>> allocated inside a transaction.
>>
>> The problem is, we still have a hidden bug in the delayed ref
>> implementation in btrfs-progs: when we have a large enough transaction,
>> delayed ref may fail to find certain tree blocks in the extent tree and
>> cause a transaction abort.
>>
>> This patch works around it by committing the transaction at a much
>> lower threshold.
>>
>> The old 4096 means 4096 new tree blocks. When using the default (16K)
>> nodesize, that is 64M, which can contain over 12k inlined data extents,
>> or csums for around 60G, or over 800K file extents.
>>
>> The new threshold will limit the size of new tree blocks to 2M, aligning
>> with the chunk preallocator threshold, and reducing the possibility of
>> hitting that delayed ref bug.
>>
>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
>
> Added to devel, thanks.
>

BTW, this is really just a workaround.

The ENOSPC itself should be solved by these patches:
"btrfs-progs: Fix false ENOSPC alert by tracking used space correctly"
and
"[PATCH 0/2] btrfs-progs: Metadata preallocation enhancement".

The root cause of the delayed ref "unable to find backref" bug is still
unknown, but considering how large the transaction needs to be before
hitting it, this workaround should work.

Thanks,
Qu
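
For reference, the threshold arithmetic from the commit message can be sketched as below. This is only an illustration, not the code from the patch: the helper names are hypothetical, and whether the new limit is checked in bytes or in blocks is an assumption here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from btrfs-progs): bytes of new metadata
 * covered by a threshold given as a number of tree blocks.
 */
static inline uint64_t blocks_to_bytes(uint64_t nr_blocks, uint32_t nodesize)
{
	return nr_blocks * nodesize;
}

/*
 * Hypothetical helper (not from btrfs-progs): number of whole tree
 * blocks that fit in a threshold given in bytes.
 */
static inline uint64_t bytes_to_blocks(uint64_t bytes, uint32_t nodesize)
{
	return bytes / nodesize;
}
```

With the default 16K nodesize, the old threshold of 4096 blocks comes out to 64M, while a 2M limit corresponds to 128 tree blocks (512 blocks at 4K nodesize, 32 at 64K).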
