On 21.11.18 at 17:10, Nikolay Borisov wrote:
> Running btrfs/124 in a loop hung up on me sporadically with the
> following call trace:
> btrfs D 0 5760 5324 0x00000000
> Call Trace:
> ? __schedule+0x243/0x800
> schedule+0x33/0x90
> btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
> ? wait_woken+0xa0/0xa0
> btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
> btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
> btrfs_relocate_chunk+0x49/0x100 [btrfs]
> btrfs_balance+0xbeb/0x1740 [btrfs]
> btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
> btrfs_ioctl+0x1691/0x3110 [btrfs]
> ? lockdep_hardirqs_on+0xed/0x180
> ? __handle_mm_fault+0x8e7/0xfb0
> ? _raw_spin_unlock+0x24/0x30
> ? __handle_mm_fault+0x8e7/0xfb0
> ? do_vfs_ioctl+0xa5/0x6e0
> ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
> do_vfs_ioctl+0xa5/0x6e0
> ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> ksys_ioctl+0x3a/0x70
> __x64_sys_ioctl+0x16/0x20
> do_syscall_64+0x60/0x1b0
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> This happens because during page writeback it's valid for
> writepage_delalloc to instantiate a delalloc range which doesn't
> belong to the page currently being written back.
>
> This case is valid because find_lock_delalloc_range returns any
> available range after the passed delalloc_start, ignoring whether
> the page under writeback is within that range.
> In turn ordered extents (OE) are always created for the returned range
> from find_lock_delalloc_range. If, however, a failure occurs while OE
> are being created then the clean up code in btrfs_cleanup_ordered_extents
> will be called.
>
> Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
> the case of such a 'foreign' range being processed and instead always
> assumes that the range the OE are created for belongs to the page. As a
> result the first page of such a foreign range is never cleaned up, since
> the current cleanup code deliberately skips it.
>
> Fix this by correctly checking whether the current page belongs to the
> range being instantiated and, if so, adjusting the range parameters
> passed for cleaning up. If it doesn't, then just clean the whole OE
> range directly.
>
> Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx>
> Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> ---
> V3:
> * Re-worded the commit for easier comprehension
> * Added RB tag from Josef
>
> V2:
> * Fix compilation failure due to missing parentheses
> * Fixed the "Fixes" tag.
> fs/btrfs/inode.c | 29 ++++++++++++++++++++---------
> 1 file changed, 20 insertions(+), 9 deletions(-)
>
Ping,
Also, this patch needs:

Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang")

and it needs to be applied to the stable releases 4.14
<snip>
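
For anyone reading along without the diff, the range adjustment described
in the commit message amounts to roughly the following. This is only a
userspace sketch under my own assumptions, not the patch itself:
adjust_cleanup_range(), struct cleanup_range and SKETCH_PAGE_SIZE are
hypothetical names made up for illustration, and the page size is assumed
to be 4096 bytes. It also assumes, as the commit message implies, that a
locked page inside the range is the first page of that range.

```c
#include <stdint.h>

/* Assumed page size for this sketch only; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical holder for the range handed to the OE cleanup step. */
struct cleanup_range {
	uint64_t offset;
	uint64_t bytes;
};

/*
 * Decide which part of [offset, offset + bytes) needs its ordered extent
 * accounting cleaned up. page_start is the file offset of the page that
 * is currently under writeback.
 */
static struct cleanup_range adjust_cleanup_range(uint64_t page_start,
						 uint64_t offset,
						 uint64_t bytes)
{
	struct cleanup_range r = { offset, bytes };
	uint64_t page_end, range_end;

	/* Nothing to adjust for an empty range. */
	if (bytes == 0)
		return r;

	page_end = page_start + SKETCH_PAGE_SIZE - 1;
	range_end = offset + bytes - 1;

	/*
	 * The page lies inside the range: skip one page at the front,
	 * on the assumption that the caller cleans that page up itself.
	 */
	if (page_start >= offset && page_end <= range_end) {
		r.offset += SKETCH_PAGE_SIZE;
		r.bytes -= SKETCH_PAGE_SIZE;
	}

	/* Otherwise the range is 'foreign' and is cleaned in full. */
	return r;
}
```

With a page at offset 8192 and a range of 16384 bytes starting at 8192, the
sketch skips the first page and returns offset 12288 with 12288 bytes. With
a page at offset 0 and the same range, the range is foreign and comes back
unchanged, which is the case the old code got wrong.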