Re: [PATCH v2] btrfs: balance dirty metadata pages in btrfs_finish_ordered_io

On 19 Dec 2018, at 5:33, ethanlien wrote:

> Martin Raiber wrote on 2018-12-17 22:00:
>
>>>>>
>>>> I had lockups with this patch as well. If you put e.g. a loop
>>>> device on top of a btrfs file, loop sets PF_LESS_THROTTLE to
>>>> avoid a feedback loop causing delays. The task balancing dirty
>>>> pages in btrfs_finish_ordered_io doesn't have the flag and causes
>>>> slow-downs. In my case it managed to cause a feedback loop where
>>>> it queued more btrfs_finish_ordered_io work and got stuck
>>>> completely.
>>>>
>>>
>>> The data writepage endio queues a work item for
>>> btrfs_finish_ordered_io() in a separate workqueue and clears the
>>> page's writeback bit, so throttling in btrfs_finish_ordered_io()
>>> should not slow down the flusher thread. One suspicious point is
>>> that while a caller is waiting for a range of ordered extents to
>>> complete, it will be blocked until
>>> balance_dirty_pages_ratelimited() makes some progress, since we
>>> finish ordered extents in btrfs_finish_ordered_io().
>>> Do you have call-stack information for the stuck processes, or are
>>> you using fsync/sync frequently? If that is the case, maybe we
>>> should pull this out and balance dirty metadata pages somewhere
>>> else.
>>
>> Yeah like,
>>
>> [875317.071433] Call Trace:
>> [875317.071438]  ? __schedule+0x306/0x7f0
>> [875317.071442]  schedule+0x32/0x80
>> [875317.071447]  btrfs_start_ordered_extent+0xed/0x120
>> [875317.071450]  ? remove_wait_queue+0x60/0x60
>> [875317.071454]  btrfs_wait_ordered_range+0xa0/0x100
>> [875317.071457]  btrfs_sync_file+0x1d6/0x400
>> [875317.071461]  ? do_fsync+0x38/0x60
>> [875317.071463]  ? btrfs_fdatawrite_range+0x50/0x50
>> [875317.071465]  do_fsync+0x38/0x60
>> [875317.071468]  __x64_sys_fsync+0x10/0x20
>> [875317.071470]  do_syscall_64+0x55/0x100
>> [875317.071473]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> so I guess the problem is that calling balance_dirty_pages causes
>> fsyncs to the same btrfs (via my unusual setup of loop+fuse)? Those
>> fsyncs deadlock because they are called indirectly from
>> btrfs_finish_ordered_io... It is an unusual setup, which is why I
>> did not post it to the mailing list initially.
>
> To me this does not look like a real deadlock. The fsync call
> involves two steps: (1) flush the dirty data pages, (2) update the
> corresponding metadata to point to the flushed data. Since step 1
> consumes dirty pages and step 2 produces more dirty pages, this
> patch leaves step 1 unchanged and blocks step 2 in
> btrfs_finish_ordered_io(), which seems reasonable for an OOM fix.
> The problem is that if other processes keep writing new data, the
> fsync call may have to wait a long time for its metadata update,
> even though its dirty data was flushed long ago.
>
> Back to the deadlock problem: what Chris found is really a deadlock,
> and it can be fixed by adding a check for the free space inode.

I think we should have a better understanding of your original OOM 
problem before we keep the balance_dirty_pages() call.  This isn't a 
great place to throttle, and while it's also not a great place to 
create a huge burst of dirty pages, I'd like to make sure we're really 
fixing the right problem against today's kernel.

-chris
