regression in 4.12 caused by a7e3b975a0f9 ("Btrfs: fix reported number of inode blocks")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Filipe, 

It seems the above commit causes  various underflows in 
some btrfs internal counters. This is evident during generic/439 running: 

The first one is in BTRFS_I(inode)->csum_bytes

[   16.710508] ------------[ cut here ]------------
[   16.710524] WARNING: CPU: 3 PID: 1080 at fs/btrfs/inode.c:9449 btrfs_destroy_inode+0x275/0x2c0
[   16.710525] Modules linked in:
[   16.710530] CPU: 3 PID: 1080 Comm: umount Not tainted 4.12.0-rc5-nbor #317
[   16.710531] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   16.710533] task: ffff8801309bd4c0 task.stack: ffff880132ff4000
[   16.710535] RIP: 0010:btrfs_destroy_inode+0x275/0x2c0
[   16.710537] RSP: 0018:ffff880132ff7d08 EFLAGS: 00010286
[   16.710540] RAX: 0000000000000000 RBX: ffff8801369ee000 RCX: 0000000000000000
[   16.710541] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880131f21aa0
[   16.710542] RBP: ffff880132ff7d38 R08: 0000000000000000 R09: 0000000000000000
[   16.710543] R10: 0000000000000001 R11: 0000000080b489d8 R12: ffff880131f21aa0
[   16.710544] R13: ffff88012ff60000 R14: ffff8801396c57a8 R15: ffff8801396c57a8
[   16.710546] FS:  00007f4ca857a840(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[   16.710548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.710550] CR2: 00007f4ca7da0170 CR3: 00000001347fe000 CR4: 00000000000006a0
[   16.710553] Call Trace:
[   16.710562]  destroy_inode+0x3b/0x60
[   16.710565]  evict+0x135/0x1c0
[   16.710567]  dispose_list+0x4d/0x80
[   16.710570]  evict_inodes+0x16d/0x180
[   16.710588]  generic_shutdown_super+0x3f/0x120
[   16.710590]  kill_anon_super+0x12/0x20
[   16.710597]  btrfs_kill_super+0x16/0xa0
[   16.710599]  deactivate_locked_super+0x4c/0x80
[   16.710602]  deactivate_super+0x45/0x60
[   16.710604]  cleanup_mnt+0x3f/0x70
[   16.710605]  __cleanup_mnt+0x12/0x20
[   16.710609]  task_work_run+0x7e/0xc0
[   16.710614]  exit_to_usermode_loop+0x6c/0x93
[   16.710618]  syscall_return_slowpath+0x9f/0xd0
[   16.710625]  entry_SYSCALL_64_fastpath+0xc0/0xc2

What's peculiar about this one is that it always underflows by a single page size. 
When the warning triggers csum_bytes is -4096. Perhaps some off-by-one which is then 
multiplied by PAGE_SIZE? I've investigated it a bit further and here are the 2 types of
stack traces that lead to freeing csum_bytes: 

==========================
Freeing 4096 bytes for ino 257
 0xffffffff813d0042 : calc_csum_metadata_size+0x62/0x100 [kernel]
 0xffffffff813d94b1 : btrfs_delalloc_release_metadata+0x1a1/0x210 [kernel]
 0xffffffff813f8d66 : btrfs_clear_bit_hook+0xd6/0x410 [kernel]
 0xffffffff814182ce : clear_state_bit+0x6e/0x180 [kernel] clear_state_bit at fs/btrfs/extent_io.c:549
 0xffffffff81418517 : __clear_extent_bit+0x137/0x370 [kernel] fs/btrfs/extent_io.c:726
 0xffffffff81418c2a : clear_extent_bit+0x2a/0x30 [kernel]
 0xffffffff8140b565 : lock_and_cleanup_extent_if_need+0x1e5/0x370 [kernel] fs/btrfs/file.c:1514
 0xffffffff8140c9b4 : __btrfs_buffered_write+0x204/0x6e0 [kernel]
 0xffffffff814104cc : btrfs_file_write_iter+0x19c/0x500 [kernel]
 0xffffffff811d19c5 : __vfs_write+0xc5/0x140 [kernel]
 0xffffffff811d39bc : vfs_write+0xcc/0x1a0 [kernel]
 0xffffffff811d3e65 : sys_pwrite64+0x85/0xa0 [kernel]
 0xffffffff8178ca85 : entry_SYSCALL_64_fastpath+0x23/0xc2 [kernel]
 0x0 (inexact)

==========================
Freeing 155648 bytes for ino 257
	 UNDERFLOW !!!
 0xffffffff813d0042 : calc_csum_metadata_size+0x62/0x100 [kernel]
 0xffffffff813d94b1 : btrfs_delalloc_release_metadata+0x1a1/0x210 [kernel]
 0xffffffff8140103d : btrfs_finish_ordered_io+0x16d/0x6a0 [kernel] fs/btrfs/inode.c:3038
 0xffffffff81401585 : finish_ordered_fn+0x15/0x20 [kernel]
 0xffffffff8142d083 : normal_work_helper+0x93/0x650 [kernel]
 0xffffffff8142d832 : btrfs_endio_write_helper+0x12/0x20 [kernel]
 0xffffffff8107259b : process_one_work+0x1cb/0x5f0 [kernel]
 0xffffffff81072a4d : worker_thread+0x4d/0x380 [kernel]
 0xffffffff81079ce4 : kthread+0x114/0x150 [kernel]
 0xffffffff8178cd1a : ret_from_fork+0x2a/0x40 [kernel]
 0x0 (inexact)

The underflow always happens from finish_ordered_io. 

The next one is: 

[   55.007633] ------------[ cut here ]------------
[   55.007652] WARNING: CPU: 1 PID: 1078 at fs/btrfs/extent-tree.c:5709 btrfs_free_block_groups+0x365/0x3f0
[   55.007654] Modules linked in:
[   55.007660] CPU: 1 PID: 1078 Comm: umount Tainted: G        W       4.12.0-rc5-nbor #318
[   55.007662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   55.007663] task: ffff8801329eb880 task.stack: ffff880130c10000
[   55.007666] RIP: 0010:btrfs_free_block_groups+0x365/0x3f0
[   55.007668] RSP: 0018:ffff880130c13d68 EFLAGS: 00010286
[   55.007672] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffff880130c13cbc
[   55.007673] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffff880136bf5000
[   55.007675] RBP: ffff880130c13d98 R08: 0000000000000000 R09: 0000000000000000
[   55.007677] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880136bf4400
[   55.007679] R13: ffff88012fe20000 R14: ffff88012fe200c0 R15: ffff88012fe20100
[   55.007681] FS:  00007fcbfc1b9840(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
[   55.007683] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.007685] CR2: 00007fcbfb9e1903 CR3: 000000012fd8f000 CR4: 00000000000006a0
[   55.007690] Call Trace:
[   55.007698]  close_ctree+0x13b/0x330
[   55.007703]  ? evict_inodes+0x16d/0x180
[   55.007710]  btrfs_put_super+0x15/0x20
[   55.007714]  generic_shutdown_super+0x6a/0x120
[   55.007718]  kill_anon_super+0x12/0x20
[   55.007721]  btrfs_kill_super+0x16/0xa0
[   55.007725]  deactivate_locked_super+0x4c/0x80
[   55.007728]  deactivate_super+0x45/0x60
[   55.007732]  cleanup_mnt+0x3f/0x70
[   55.007734]  __cleanup_mnt+0x12/0x20
[   55.007738]  task_work_run+0x7e/0xc0
[   55.007743]  exit_to_usermode_loop+0x6c/0x93
[   55.007749]  syscall_return_slowpath+0x9f/0xd0
[   55.007760]  entry_SYSCALL_64_fastpath+0xc0/0xc2
[   55.007874] ---[ end trace 06f22a84784e2e1d ]---

[   55.007876] fs_info->delalloc_block_rsv.size == 1

And this one: 

[   55.007888] WARNING: CPU: 1 PID: 1078 at fs/btrfs/extent-tree.c:5712 btrfs_free_block_groups+0x3d5/0x3f0
[   55.007890] Modules linked in:
[   55.007894] CPU: 1 PID: 1078 Comm: umount Tainted: G        W       4.12.0-rc5-nbor #318
[   55.007895] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   55.007897] task: ffff8801329eb880 task.stack: ffff880130c10000
[   55.007899] RIP: 0010:btrfs_free_block_groups+0x3d5/0x3f0
[   55.007901] RSP: 0018:ffff880130c13d68 EFLAGS: 00010206
[   55.007904] RAX: 0000000000000029 RBX: 0000000000000000 RCX: 0000000000000006
[   55.007906] RDX: 0000000000000000 RSI: 0000000000138000 RDI: ffff88013fc4c9d0
[   55.007908] RBP: ffff880130c13d98 R08: 0000000000000000 R09: 0000000000000000
[   55.007909] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880136bf4400
[   55.007911] R13: ffff88012fe20000 R14: ffff88012fe200c0 R15: ffff88012fe20100
[   55.007913] FS:  00007fcbfc1b9840(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
[   55.007915] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.007917] CR2: 00007fcbfb9e1903 CR3: 000000012fd8f000 CR4: 00000000000006a0
[   55.007920] Call Trace:
[   55.007926]  close_ctree+0x13b/0x330
[   55.007929]  ? evict_inodes+0x16d/0x180
[   55.007934]  btrfs_put_super+0x15/0x20
[   55.007937]  generic_shutdown_super+0x6a/0x120
[   55.007941]  kill_anon_super+0x12/0x20
[   55.007944]  btrfs_kill_super+0x16/0xa0
[   55.007948]  deactivate_locked_super+0x4c/0x80
[   55.007951]  deactivate_super+0x45/0x60
[   55.007954]  cleanup_mnt+0x3f/0x70
[   55.007957]  __cleanup_mnt+0x12/0x20
[   55.007959]  task_work_run+0x7e/0xc0
[   55.007964]  exit_to_usermode_loop+0x6c/0x93
[   55.007968]  syscall_return_slowpath+0x9f/0xd0
[   55.007972]  entry_SYSCALL_64_fastpath+0xc0/0xc2
[   55.008080] ---[ end trace 06f22a84784e2e1e ]---
[   55.008082] fs_info->delalloc_block_rsv.reserved == 1277952


There's also this one: 

[   16.729373] ------------[ cut here ]------------
[   16.729375] WARNING: CPU: 3 PID: 1080 at fs/btrfs/extent-tree.c:9836 btrfs_free_block_groups+0x2c5/0x3e0
[   16.729375] Modules linked in:
[   16.729378] CPU: 3 PID: 1080 Comm: umount Tainted: G        W       4.12.0-rc5-nbor #317
[   16.729379] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   16.729380] task: ffff8801309bd4c0 task.stack: ffff880132ff4000
[   16.729381] RIP: 0010:btrfs_free_block_groups+0x2c5/0x3e0
[   16.729382] RSP: 0018:ffff880132ff7d68 EFLAGS: 00010286
[   16.729384] RAX: ffff88013089e0e8 RBX: 0000000000000000 RCX: 0000000000000006
[   16.729385] RDX: 0000000000000000 RSI: ffff8801309bdfc8 RDI: ffff88013fccc9d0
[   16.729386] RBP: ffff880132ff7d98 R08: 0000000000000000 R09: 0000000000000000
[   16.729387] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88013089e800
[   16.729388] R13: ffff88012ff60000 R14: ffff88013089e0e8 R15: ffff88013089e000
[   16.729391] FS:  00007f4ca857a840(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[   16.729393] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.729394] CR2: 00007f4ca7da0170 CR3: 00000001347fe000 CR4: 00000000000006a0
[   16.729396] Call Trace:
[   16.729399]  close_ctree+0x13b/0x330
[   16.729401]  ? evict_inodes+0x16d/0x180
[   16.729404]  btrfs_put_super+0x15/0x20
[   16.729406]  generic_shutdown_super+0x6a/0x120
[   16.729408]  kill_anon_super+0x12/0x20
[   16.729410]  btrfs_kill_super+0x16/0xa0
[   16.729412]  deactivate_locked_super+0x4c/0x80
[   16.729415]  deactivate_super+0x45/0x60
[   16.729416]  cleanup_mnt+0x3f/0x70
[   16.729418]  __cleanup_mnt+0x12/0x20
[   16.729420]  task_work_run+0x7e/0xc0
[   16.729422]  exit_to_usermode_loop+0x6c/0x93
[   16.729425]  syscall_return_slowpath+0x9f/0xd0
[   16.729427]  entry_SYSCALL_64_fastpath+0xc0/0xc2
[   16.729519] BTRFS info (device vdc): space_info 1 has 7974912 free, is not full
[   16.729537] BTRFS info (device vdc): space_info total=8388608, used=417792, pinned=0, reserved=0, may_use=18446744073709547520, readonly=0

Here may_use has clearly underflowed and if you convert it to a signed value
it actually becomes -4096. 

This doesn't happen when DIO is used. Perhaps this commit should be reverted?
In the meantime I will continue investigating why this is happening. 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux