Re: Questions regarding logging upon fsync in btrfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 30 September 2013 22:11, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
> On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
>> On 29 September 2013 15:12, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
>> > On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha Mehta wrote:
>> >> Thank you very much for the reply. That clarifies a lot of things.
>> >>
>> >> I was trying a small test case that opens a file, writes a block of
>> >> data, calls fsync and then closes the file. If I understand correctly,
>> >> fsync would return only after all in-memory buffers have been
>> >> committed to disk. I have added few print statements in the
>> >> __extent_writepage function, and I notice that the function gets
>> >> called a bit later after fsync returns. It seems that I am not
>> >> guaranteed to see the data going to disk by the time fsync returns.
>> >>
>> >> Am I doing something wrong, or am I looking at the wrong place for
>> >> disk write? This happens both with tree logging enabled as well as
>> >> with notreelog.
>> >>
>> >
>> > So 3.1 was a long time ago and to be sure it had issues I don't think it was
>> > _that_ broken.  You are probably better off instrumenting a recent kernel, 3.11
>> > or just build btrfs-next from git.  But if I were to make a guess I'd say that
>> > __extent_writepage was how both data and metadata was written out at the time (I
>> > don't think I changed it until 3.2 or something later) so what you are likely
>> > seeing is the normal transaction commit after the fsync.  In the case of
>> > notreelog we are likely starting another transaction and you are seeing that
>> > commit (at the time the transaction kthread would start a transaction even if
>> > none had been started yet.)  Thanks,
>> >
>> > Josef
>>
>> Is there any special handling for very small file write, less than 4K? As
>> I understand there is an optimization to inline the first extent in a file if
>> it is smaller than 4K, does it affect the writeback on fsync as well? I did
>> set the max_inline mount option to 0, but even then it seems there is
>> some difference in fsync behaviour for writing first extent of less than 4K
>> size and writing 4K or more.
>>
>
> Yeah if the file is an inline extent then it will be copied into the log
> directly and the log will be written out, no going through the data write path
> at all.  Max inline == 0 should make it so we don't inline, so if it isn't
> honoring that then that may be a bug.  Thanks,
>
> Josef

I tried it on 3.12-rc2 release, and it seems there is a bug then.
Please find attached logs to confirm.
Also, probably on the older release.

Thanks,
Aastha.

-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@xxxxxxxxxxx
[ 2808.125838] [do_writepages] inum: 260, nr_to_write: 9223372036854775807, writepages fn: ffffffff8129ce40
[ 2808.125977] CPU: 0 PID: 10215 Comm: bb Not tainted 3.12.0-rc2-latest+ #4
[ 2808.125980] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[ 2808.125983]  ffff88003d6b5e60 ffff88003d6b5e28 ffffffff817a8a6f ffff88003fc0eb48
[ 2808.125987]  ffff88002e325b48 ffff88003d6b5e48 ffffffff81105c1f ffff88002e325a00
[ 2808.125990]  ffff880037b1f300 ffff88003d6b5e88 ffffffff810fb689 ffff88003fc12c00
[ 2808.125993] Call Trace:
[ 2808.126003]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[ 2808.126010]  [<ffffffff81105c1f>] do_writepages+0x4f/0x90
[ 2808.126013]  [<ffffffff810fb689>] __filemap_fdatawrite_range+0x49/0x50
[ 2808.126013]  [<ffffffff810fc41e>] filemap_fdatawrite_range+0xe/0x10
[ 2808.126013]  [<ffffffff812aafdc>] btrfs_sync_file+0xac/0x380
[ 2808.126098]  [<ffffffff817aedd2>] ? __schedule+0x692/0x730
[ 2808.126115]  [<ffffffff811752d8>] do_fsync+0x58/0x90
[ 2808.126149]  [<ffffffff81148a6d>] ? SyS_write+0x4d/0xa0
[ 2808.126165]  [<ffffffff8117565b>] SyS_fsync+0xb/0x10
[ 2808.126182]  [<ffffffff817b83a2>] system_call_fastpath+0x16/0x1b
[ 2808.127447] CPU: 0 PID: 10215 Comm: bb Not tainted 3.12.0-rc2-latest+ #4
[ 2808.127450] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[ 2808.127452]  ffffea0000802440 ffff88003d6b5ba8 ffffffff817a8a6f ffff88003d6b5de0
[ 2808.127456]  ffff88002e325870 ffff88003d6b5cb8 ffffffff812b7c9f ffff88003e2083d0
[ 2808.127459]  0000000000000000 ffff88003d6b5c48 ffffffff817aedd2 ffff88003c3707b0
[ 2808.127462] Call Trace:
[ 2808.127467]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[ 2808.127473]  [<ffffffff812b7c9f>] __extent_writepage+0x6f/0x750
[ 2808.127479]  [<ffffffff817aedd2>] ? __schedule+0x692/0x730
[ 2808.127483]  [<ffffffff810fac71>] ? find_get_pages_tag+0x121/0x160
[ 2808.127487]  [<ffffffff812b8552>] extent_write_cache_pages.isra.24.constprop.31+0x1d2/0x380
[ 2808.127492]  [<ffffffff81005877>] ? show_trace_log_lvl+0x57/0x70
[ 2808.127496]  [<ffffffff812b89d9>] extent_writepages+0x49/0x60
[ 2808.127501]  [<ffffffff8129f160>] ? btrfs_submit_direct+0x6c0/0x6c0
[ 2808.127505]  [<ffffffff8129ce63>] btrfs_writepages+0x23/0x30
[ 2808.127510]  [<ffffffff81105c3e>] do_writepages+0x6e/0x90
[ 2808.127513]  [<ffffffff810fb689>] __filemap_fdatawrite_range+0x49/0x50
[ 2808.127517]  [<ffffffff810fc41e>] filemap_fdatawrite_range+0xe/0x10
[ 2808.127520]  [<ffffffff812aafdc>] btrfs_sync_file+0xac/0x380
[ 2808.127524]  [<ffffffff817aedd2>] ? __schedule+0x692/0x730
[ 2808.127529]  [<ffffffff811752d8>] do_fsync+0x58/0x90
[ 2808.127533]  [<ffffffff81148a6d>] ? SyS_write+0x4d/0xa0
[ 2808.127538]  [<ffffffff8117565b>] SyS_fsync+0xb/0x10
[ 2808.127542]  [<ffffffff817b83a2>] system_call_fastpath+0x16/0x1b
[ 2808.127646] [btrfs_sync_file] inum: (260, 5), called write_range, start: 0, end: 9223372036854775807, ret: 0
[ 2808.127650] [btrfs_sync_file] inum: (260, 5), calling btrfs_wait_ordered_range, start: 0, end: 9223372036854775807
[ 2808.127653] [do_writepages] inum: 260, nr_to_write: 9223372036854775807, writepages fn: ffffffff8129ce40
[ 2808.127656] CPU: 0 PID: 10215 Comm: bb Not tainted 3.12.0-rc2-latest+ #4
[ 2808.127658] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[ 2808.127660]  ffff88003d6b5e20 ffff88003d6b5de8 ffffffff817a8a6f 0000000000000007
[ 2808.127663]  ffff88002e325b48 ffff88003d6b5e08 ffffffff81105c1f 0000000000000000
[ 2808.127666]  ffff88002e325a00 ffff88003d6b5e48 ffffffff810fb689 ffff880037b1f300
[ 2808.127669] Call Trace:
[ 2808.127673]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[ 2808.127678]  [<ffffffff81105c1f>] do_writepages+0x4f/0x90
[ 2808.127681]  [<ffffffff810fb689>] __filemap_fdatawrite_range+0x49/0x50
[ 2808.127684]  [<ffffffff810fc41e>] filemap_fdatawrite_range+0xe/0x10
[ 2808.127688]  [<ffffffff812b2a98>] btrfs_wait_ordered_range+0x58/0x100
[ 2808.127691]  [<ffffffff812ab0d8>] btrfs_sync_file+0x1a8/0x380
[ 2808.127696]  [<ffffffff817aedd2>] ? __schedule+0x692/0x730
[ 2808.127700]  [<ffffffff811752d8>] do_fsync+0x58/0x90
[ 2808.127703]  [<ffffffff81148a6d>] ? SyS_write+0x4d/0xa0
[ 2808.127707]  [<ffffffff8117565b>] SyS_fsync+0xb/0x10
[ 2808.127712]  [<ffffffff817b83a2>] system_call_fastpath+0x16/0x1b
[ 2808.127913] [btrfs_sync_file] inum: (260, 5), return from log_dentry, start: 0, end: 9223372036854775807, ret: 0
[ 2808.127938] [btrfs_sync_log] 1. called write_marked_extents, ret: 0
[ 2808.127955] [btrfs_sync_log] 2. called write_marked_extents, ret: 0
[ 2838.176145] [do_writepages] inum: 260, nr_to_write: 3072, writepages fn: ffffffff8129ce40
[ 2838.176153] CPU: 0 PID: 711 Comm: kworker/u2:5 Not tainted 3.12.0-rc2-latest+ #4
[ 2838.176155] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[ 2838.176163] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[ 2838.176167]  ffff88003c6bdb60 ffff88003c6bda88 ffffffff817a8a6f ffff88003fc0eb48
[ 2838.176170]  ffff88002e325b48 ffff88003c6bdaa8 ffffffff81105c1f ffff88002e325a00
[ 2838.176173]  ffff88003c6bdb60 ffff88003c6bdb08 ffffffff8116ea30 ffff88003dc257b0
[ 2838.176176] Call Trace:
[ 2838.176184]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[ 2838.176190]  [<ffffffff81105c1f>] do_writepages+0x4f/0x90
[ 2838.176194]  [<ffffffff8116ea30>] __writeback_single_inode+0xa0/0x260
[ 2838.176199]  [<ffffffff8116f741>] writeback_sb_inodes+0x2b1/0x430
[ 2838.176203]  [<ffffffff8116ea77>] ? __writeback_single_inode+0xe7/0x260
[ 2838.176207]  [<ffffffff8116f93e>] __writeback_inodes_wb+0x7e/0xc0
[ 2838.176211]  [<ffffffff8117000b>] wb_writeback+0x18b/0x2e0
[ 2838.176215]  [<ffffffff811701e3>] wb_check_old_data_flush+0x83/0xb0
[ 2838.176220]  [<ffffffff81171aac>] bdi_writeback_workfn+0x1dc/0x440
[ 2838.176225]  [<ffffffff8105f19c>] process_one_work+0x2bc/0x430
[ 2838.176229]  [<ffffffff810600c5>] worker_thread+0x215/0x320
[ 2838.176233]  [<ffffffff8105feb0>] ? manage_workers.isra.18+0x2b0/0x2b0
[ 2838.176237]  [<ffffffff810666f6>] kthread+0xb6/0xc0
[ 2838.176241]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0
[ 2838.176246]  [<ffffffff817b82fc>] ret_from_fork+0x7c/0xb0
[ 2838.176250]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0

I installed 3.12.0-rc2 and added dump_stack() at the beginning of do_writepages() and __extent_writepage(). When I run my test case (fopen, fwrite a small random string, fsync, fclose), it returns soon, and this is how my dmesg looks like:

[    5.471036] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   10.848064] EXT4-fs (sda2): re-mounted. Opts: acl,user_xattr,commit=0
[   10.859285] EXT4-fs (sdc1): re-mounted. Opts: acl,user_xattr,commit=0
[   10.861832] EXT4-fs (sdb1): re-mounted. Opts: acl,user_xattr,commit=0
[   33.763486] btrfs: device fsid 4d2767ea-85cc-443c-88bd-2d8c693f7c7c devid 1 transid 2072 /dev/sdd
[   33.764218] btrfs: max_inline at 0
[   33.764220] btrfs: disk space caching is enabled

[   61.083202] [do_writepages] inum: 258, nr_to_write: 9223372036854775807, writepages fn: ffffffff8129ce40
[   61.083204] CPU: 0 PID: 4165 Comm: bb Not tainted 3.12.0-rc2-latest+ #4
[   61.083205] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[   61.083205]  ffff8800399ade20 ffff8800399adde8 ffffffff817a8a6f 0000000000000007
[   61.083207]  ffff88002e324f30 ffff8800399ade08 ffffffff81105c1f 0000000000000000
[   61.083208]  ffff88002e324de8 ffff8800399ade48 ffffffff810fb689 ffff88003d9f2c00
[   61.083209] Call Trace:
[   61.083211]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[   61.083218]  [<ffffffff81105c1f>] do_writepages+0x4f/0x90
[   61.083220]  [<ffffffff810fb689>] __filemap_fdatawrite_range+0x49/0x50
[   61.083221]  [<ffffffff810fc41e>] filemap_fdatawrite_range+0xe/0x10
[   61.083224]  [<ffffffff812b2a98>] btrfs_wait_ordered_range+0x58/0x100
[   61.083225]  [<ffffffff812ab0d8>] btrfs_sync_file+0x1a8/0x380
[   61.083227]  [<ffffffff817aedd2>] ? __schedule+0x692/0x730
[   61.083229]  [<ffffffff811752d8>] do_fsync+0x58/0x90
[   61.083231]  [<ffffffff81148a6d>] ? SyS_write+0x4d/0xa0
[   61.083232]  [<ffffffff8117565b>] SyS_fsync+0xb/0x10
[   61.083234]  [<ffffffff817b83a2>] system_call_fastpath+0x16/0x1b               <<<<<<<<< immediately after the test program exits

[   96.144059] [do_writepages] inum: 258, nr_to_write: 3072, writepages fn: ffffffff8129ce40   <<<<<<<<< appears after ~35ms
[   96.144064] CPU: 0 PID: 711 Comm: kworker/u2:5 Not tainted 3.12.0-rc2-latest+ #4
[   96.144065] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[   96.144070] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[   96.144072]  ffff88003c6bdb60 ffff88003c6bda88 ffffffff817a8a6f ffff88003fc0eb48
[   96.144074]  ffff88002e324f30 ffff88003c6bdaa8 ffffffff81105c1f ffff88002e324de8
[   96.144075]  ffff88003c6bdb60 ffff88003c6bdb08 ffffffff8116ea30 ffff88003dc25130
[   96.144077] Call Trace:
[   96.144081]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[   96.144084]  [<ffffffff81105c1f>] do_writepages+0x4f/0x90
[   96.144087]  [<ffffffff8116ea30>] __writeback_single_inode+0xa0/0x260
[   96.144089]  [<ffffffff8116f741>] writeback_sb_inodes+0x2b1/0x430
[   96.144091]  [<ffffffff8116f93e>] __writeback_inodes_wb+0x7e/0xc0
[   96.144093]  [<ffffffff8117000b>] wb_writeback+0x18b/0x2e0
[   96.144095]  [<ffffffff811701e3>] wb_check_old_data_flush+0x83/0xb0
[   96.144097]  [<ffffffff81171aac>] bdi_writeback_workfn+0x1dc/0x440
[   96.144100]  [<ffffffff8105f19c>] process_one_work+0x2bc/0x430
[   96.144102]  [<ffffffff810600c5>] worker_thread+0x215/0x320
[   96.144103]  [<ffffffff8105feb0>] ? manage_workers.isra.18+0x2b0/0x2b0
[   96.144106]  [<ffffffff810666f6>] kthread+0xb6/0xc0
[   96.144107]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0
[   96.144110]  [<ffffffff817b82fc>] ret_from_fork+0x7c/0xb0
[   96.144111]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0
[   96.144116] CPU: 0 PID: 711 Comm: kworker/u2:5 Not tainted 3.12.0-rc2-latest+ #4
[   96.144116] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[   96.144118] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[   96.144119]  ffffea00007eb100 ffff88003c6bd808 ffffffff817a8a6f ffff88003c6bda40
[   96.144121]  ffff88002e324c58 ffff88003c6bd918 ffffffff812b7c9f ffff88003c6bd878
[   96.144122]  ffffffff817a364e ffff88003c6bd888 ffffffff00000020 ffff88003c6bd888
[   96.144123] Call Trace:
[   96.144125]  [<ffffffff817a8a6f>] dump_stack+0x46/0x58
[   96.144128]  [<ffffffff812b7c9f>] __extent_writepage+0x6f/0x750
[   96.144130]  [<ffffffff817a364e>] ? printk+0x5c/0x5e
[   96.144131]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0
[   96.144133]  [<ffffffff810fac71>] ? find_get_pages_tag+0x121/0x160
[   96.144135]  [<ffffffff812b8552>] extent_write_cache_pages.isra.24.constprop.31+0x1d2/0x380
[   96.144138]  [<ffffffff81005877>] ? show_trace_log_lvl+0x57/0x70
[   96.144139]  [<ffffffff812b89d9>] extent_writepages+0x49/0x60
[   96.144142]  [<ffffffff8129f160>] ? btrfs_submit_direct+0x6c0/0x6c0
[   96.144144]  [<ffffffff8129ce63>] btrfs_writepages+0x23/0x30
[   96.144146]  [<ffffffff81105c3e>] do_writepages+0x6e/0x90
[   96.144147]  [<ffffffff8116ea30>] __writeback_single_inode+0xa0/0x260
[   96.144149]  [<ffffffff8116f741>] writeback_sb_inodes+0x2b1/0x430
[   96.144151]  [<ffffffff8116f93e>] __writeback_inodes_wb+0x7e/0xc0
[   96.144153]  [<ffffffff8117000b>] wb_writeback+0x18b/0x2e0
[   96.144154]  [<ffffffff811701e3>] wb_check_old_data_flush+0x83/0xb0
[   96.144156]  [<ffffffff81171aac>] bdi_writeback_workfn+0x1dc/0x440
[   96.144164]  [<ffffffff8105f19c>] process_one_work+0x2bc/0x430
[   96.144166]  [<ffffffff810600c5>] worker_thread+0x215/0x320
[   96.144168]  [<ffffffff8105feb0>] ? manage_workers.isra.18+0x2b0/0x2b0
[   96.144169]  [<ffffffff810666f6>] kthread+0xb6/0xc0
[   96.144171]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0
[   96.144173]  [<ffffffff817b82fc>] ret_from_fork+0x7c/0xb0
[   96.144174]  [<ffffffff81066640>] ? flush_kthread_worker+0xa0/0xa0

This is a contiguous dmesg output.



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux