As 'scanning' has another meaning in btrfs, the subject should say
something about writeback or at least write cache range scanning.
On Fri, Jan 03, 2020 at 10:38:44AM -0500, Josef Bacik wrote:
> We noticed that we were having regular CG OOM kills in cases where there
> was still enough dirty pages to avoid OOM'ing. It turned out there's
> this corner case in btrfs's handling of range_cyclic where files that
> were being redirtied were not getting fully written out because of how
> we do range_cyclic writeback.
>
> We unconditionally were setting scanned = 1; the first time we found any
> pages in the inode. This isn't actually what we want, we want it to be
> set if we've scanned the entire file. For range_cyclic we could be
> starting in the middle or towards the end of the file, so we could write
> one page and then not write any of the other dirty pages in the file
> because we set scanned = 1.
>
> Fix this by not setting scanned = 1 if we find pages. The rules for
> setting scanned should be
>
> 1) !range_cyclic. In this case we have a specified range to write out.
> 2) range_cyclic && index == 0. In this case we've started at the
> beginning and there is no need to loop around a second time.
> 3) range_cyclic && we started at index > 0 and we've reached the end of
> the file without satisfying our nr_to_write.
>
> This patch fixes both of our writepages implementations to make sure
> these rules hold true. This fixed our over zealous CG OOMs in
> production.
>
> Fixes: d1310b2e0cd9 ("Btrfs: Split the extent_map code into two parts")
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> ---
> fs/btrfs/extent_io.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 410f5a64d3a6..d634cb0c39e3 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3965,6 +3965,7 @@ int btree_write_cache_pages(struct address_space *mapping,
> if (wbc->range_cyclic) {
> index = mapping->writeback_index; /* Start from prev offset */
> end = -1;
> + scanned = (index == 0);
It's explained in the changelog but I think a comment would be good here
too. I'll add it at commit time.