On 3/25/20 9:51 AM, David Sterba wrote:
On Fri, Mar 13, 2020 at 05:12:15PM -0400, Josef Bacik wrote:
While debugging Zygo's delayed ref problems it was clear there were a bunch of
cases that we're running delayed refs when we don't need to be, and they result
in a lot of weird latencies.
Each patch has their individual explanations. But the gist of it is we run
delayed refs in a lot of arbitrary ways that have just accumulated throughout
the years, so clean up all of these so we can have more consistent performance.
It would be fine to remove the delayed refs being run from so many
places but I vaguely remember some patches adding them with "we have to
run delayed refs here or we will miss something and that would be a
corruption". The changelogs in patches from 3 on don't point out any
specific problems and I miss some reasoning about correctness, ideally
for each line of btrfs_run_delayed_refs removed.
As a worst case I really don't want to get to a situation where we start
getting reports that something broke because of the missing delayed
refs, followed by series of "oh yeah I forgot we need it here, add it
back".
Yeah I went through and checked each of these spots to see why we had them. A
lot of it had to do with how poorly delayed refs were run previously. You could
end up with weird ordering cases and missing our flags.
These problems are all gone now, we no longer have to run delayed refs to work
around ordering weirdness because I fixed all of those problems. Now these are
just old relics of the past that need to die. The only case where I didn't
touch them is for qgroups, likely because it still matters for the before/after
lookups there.
But everywhere else it was working around some deficiency in how we ran delayed
refs, either in the ordering issues or space related. Both those problems no
longer exist, so we can drop these workarounds.
The branch with this patchset is in for-next but I'm still not
comfortable with adding it to misc-next as I can't convince myself it's
safe, so more reviews are welcome.
Yeah I'm targeting the merge window after the upcoming one with these, there's
still a lot more testing I want to get done. I mostly threw them up because
they were no longer blowing up constantly for Zygo, and I wanted Filipe to get
an early look at them. Thanks,
Josef