Re: [PATCH] btrfs: use tagged writepage to mitigate livelock of snapshot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1 Nov 2018, at 6:21, ethanlien wrote:

> Nikolay Borisov 於 2018-11-01 18:01 寫到:
>> On 1.11.18 г. 11:56 ч., ethanlien wrote:
>>> Nikolay Borisov 於 2018-11-01 16:59 寫到:
>>>> On 1.11.18 г. 8:49 ч., Ethan Lien wrote:
>>>>> Snapshot is expected to be fast. But if there are writers steadily
>>>>> create dirty pages in our subvolume, the snapshot may take a very 
>>>>> long
>>>>> time to complete. To fix the problem, we use tagged writepage for
>>>>> snapshot
>>>>> flusher as we do in the generic write_cache_pages(), so we can 
>>>>> ommit
>>>>> pages
>>>>> dirtied after the snapshot command.
>>>>
>>>> So the gist of this commit really is that you detect when 
>>>> filemap_flush
>>>> has been called from snapshot context and tag all pages at *that* 
>>>> time
>>>> as PAGECACHE_TAG_TOWRITE and then write them, ignoring any pages 
>>>> that
>>>> might have been dirtied beforehand. Your description kind of dances
>>>> around this idea without really saying it explicitly. Those 
>>>> semantics
>>>> make sense, however i'd like to be stated more explicitly in the 
>>>> change
>>>>  log.
>>>>
>>>> However, this is done at the expense of consistency, so we have 
>>>> faster
>>>> snapshots but depending the file which are stored in them they 
>>>> might be
>>>> broken (i.e a database) since they will be missing pages.
>>>>
>>>
>>> tag_pages_for_writeback() will tag all pages with 
>>> PAGECACHE_TAG_DIRTY.
>>> If a dirty
>>> page get missed here, it means someone has initiated the flush 
>>> before
>>> us, so we
>>> will wait that dirty page to complete in create_snapshot() ->
>>> btrfs_wait_ordered_extents().
>>
>> And what happens in the scenario where you have  someone dirtying 
>> pages,
>> you issue the snapshot ioctl, mark all currently dirtied pages as
>> TOWRITE and then you have more delalloc pages being dirtied following
>> initial call to tag_pages_for_writeback , you will miss those, no ?
>>
>
> We don't freeze the filesystem when doing snapshot, so there is no 
> guarantee
> the page dirtied right after we send the ioctl, will be included in 
> snapshot.
> If a page is dirtied while we scan the dirty pages and its offset is 
> before our
> current index, we miss it in our current snapshot implementation too.

This looks like a pretty good compromise to me between btrfs v0's don't 
flush at all on snapshot and today's full sync on snapshot.  The full 
sync is always going to be a little racey wrt concurrent writes.

-chris




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux