Re: BUG: KASAN: use-after-free in btrfs_init_reloc_root+0x2bf/0x330 [btrfs]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[...]
> 
>> My config results something doesn't make sense at all.
>>
>>> [49294.092645] Write of size 8 at addr ffff8885b4901440 by task kworker/u24:6/10894
>>>
>>> [49294.092657] CPU: 8 PID: 10894 Comm: kworker/u24:6 Tainted: P           OE     5.3.0-rc8-rc-kasan #2
>>> [49294.092661] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme4, BIOS P3.80 04/06/2018
>>> [49294.092726] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
>>> [49294.092730] Call Trace:
>>> [49294.092743]  dump_stack+0x71/0xa0
>>> [49294.092751]  print_address_description+0x67/0x323
>>> [49294.092817]  ? btrfs_init_reloc_root+0x2bf/0x330 [btrfs]
>>> [49294.092879]  ? btrfs_init_reloc_root+0x2bf/0x330 [btrfs]
>>> [49294.092884]  __kasan_report.cold+0x1a/0x3d
>>> [49294.092945]  ? btrfs_init_reloc_root+0x2bf/0x330 [btrfs]
>>> [49294.092951]  kasan_report+0xe/0x12
>>> [49294.093012]  btrfs_init_reloc_root+0x2bf/0x330 [btrfs]
>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> We need code contex of this,
> 
> That would be +0x2cd on my 5.2.11 build, which according to GDB is here:
> 
> (gdb)  list *(btrfs_init_reloc_root+0x2cd)
> 0x1dcbdd is in btrfs_init_reloc_root (fs/btrfs/relocation.c:1456).
> 1451		reloc_root = create_reloc_root(trans, root, root->root_key.objectid);
> 1452		if (clear_rsv)
> 1453			trans->block_rsv = rsv;
> 1454	
> 1455		ret = __add_reloc_root(reloc_root);
> 1456		BUG_ON(ret < 0);
> 1457		root->reloc_root = reloc_root;
> 1458		return 0;
> 1459	}
> 1460
> 
> But according to faddr2line, it is at relocation.c:1438.
> 
> $ ./scripts/faddr2line fs/btrfs/btrfs.ko btrfs_init_reloc_root+0x2cd
> btrfs_init_reloc_root+0x2cd/0x340:
> btrfs_init_reloc_root at /path/to/linux/fs/btrfs/relocation.c:1438
> 
> That is here:
> 
> (gdb) list relocation.c:1438
> 1433		int clear_rsv = 0;
> 1434		int ret;
> 1435	
> 1436		if (root->reloc_root) {
> 1437			reloc_root = root->reloc_root;
> 1438			reloc_root->last_trans = trans->transid;
> 1439			return 0;
> 1440		}
> 1441	
> 1442		if (!rc || !rc->create_reloc_tree ||
> 
>>> [49294.093066]  record_root_in_trans+0x2ba/0x3a0 [btrfs]
>>> [49294.093119]  btrfs_record_root_in_trans+0xd2/0x150 [btrfs]
>>> [49294.093170]  start_transaction+0x1c3/0xea0 [btrfs]
>>> [49294.093226]  btrfs_finish_ordered_io+0x811/0x1610 [btrfs]
>>> [49294.093233]  ? syscall_return_via_sysret+0xf/0x7f
>>> [49294.093238]  ? syscall_return_via_sysret+0xf/0x7f
>>> [49294.093243]  ? __switch_to_asm+0x40/0x70
>>> [49294.093248]  ? __switch_to_asm+0x34/0x70
>>> [49294.093300]  ? btrfs_unlink_subvol+0xa30/0xa30 [btrfs]
>>> [49294.093307]  ? finish_task_switch+0x1a1/0x760
>>> [49294.093312]  ? __switch_to+0x457/0xe90
>>> [49294.093317]  ? __switch_to_asm+0x34/0x70
>>> [49294.093378]  normal_work_helper+0x15a/0xb90 [btrfs]
>>> [49294.093387]  process_one_work+0x706/0x1200
>>> [49294.093394]  worker_thread+0x92/0xfb0
>>> [49294.093401]  ? __kthread_parkme+0x85/0x100
>>> [49294.093406]  ? process_one_work+0x1200/0x1200
>>> [49294.093410]  kthread+0x2ba/0x3b0
>>> [49294.093414]  ? kthread_park+0x130/0x130
>>> [49294.093420]  ret_from_fork+0x35/0x40
>>>
>>> [49294.093431] Allocated by task 11689:
>>> [49294.093441]  __kasan_kmalloc.part.0+0x3c/0xa0
>>> [49294.093493]  btrfs_read_tree_root+0x8f/0x350 [btrfs]
>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> This,
> 
> GDB points here:
> 
> (gdb) list *(btrfs_read_tree_root+0x92)
> 0xb07c2 is in btrfs_read_tree_root (./include/linux/slab.h:742).
> 737	 * @size: how many bytes of memory are required.
> 738	 * @flags: the type of memory to allocate (see kmalloc).
> 739	 */
> 740	static inline void *kzalloc(size_t size, gfp_t flags)
> 741	{
> 742		return kmalloc(size, flags | __GFP_ZERO);
> 743	}
> 744	
> 745	/**
> 746	 * kzalloc_node - allocate zeroed memory from a particular memory node.
> 
> Whereas faddr2line points at slab.h:547, here:
> 
> (gdb) list slab.h:547
> 542			index = kmalloc_index(size);
> 543	
> 544			if (!index)
> 545				return ZERO_SIZE_PTR;
> 546	
> 547			return kmem_cache_alloc_trace(
> 548					kmalloc_caches[kmalloc_type(flags)][index],
> 549					flags, size);
> 550	#endif
> 551		}
> 
> btrfs_alloc_root calls kzalloc, so it looks like this is the inlined
> result of that. It is called here:
> 
> (gdb) list disk-io.c:1434
> 1429	
> 1430		path = btrfs_alloc_path();
> 1431		if (!path)
> 1432			return ERR_PTR(-ENOMEM);
> 1433	
> 1434		root = btrfs_alloc_root(fs_info, GFP_NOFS);
> 1435		if (!root) {
> 1436			ret = -ENOMEM;
> 1437			goto alloc_fail;
> 1438		}

It's btrfs_root causing the problem. Now we have a clue.
> 
>>> [49294.093542]  btrfs_read_fs_root+0xc/0xc0 [btrfs]
>>> [49294.093601]  create_reloc_root+0x445/0x920 [btrfs]
>>> [49294.093660]  btrfs_init_reloc_root+0x1da/0x330 [btrfs]
>>> [49294.093709]  record_root_in_trans+0x2ba/0x3a0 [btrfs]
>>> [49294.093758]  btrfs_record_root_in_trans+0xd2/0x150 [btrfs]
>>> [49294.093806]  start_transaction+0x1c3/0xea0 [btrfs]
>>> [49294.093856]  __btrfs_prealloc_file_range+0x1c2/0xa50 [btrfs]
>>> [49294.093907]  btrfs_prealloc_file_range+0x10/0x20 [btrfs]
>>> [49294.093966]  prealloc_file_extent_cluster+0x24e/0x4a0 [btrfs]
>>> [49294.094025]  relocate_file_extent_cluster+0x193/0xc90 [btrfs]
>>> [49294.094083]  relocate_data_extent+0x1f2/0x460 [btrfs]
>>> [49294.094142]  relocate_block_group+0x5a5/0xf50 [btrfs]
>>> [49294.094200]  btrfs_relocate_block_group+0x38f/0x990 [btrfs]
>>> [49294.094258]  btrfs_relocate_chunk+0x5c/0x100 [btrfs]
>>> [49294.094315]  btrfs_balance+0x1292/0x2f00 [btrfs]
>>> [49294.094373]  btrfs_ioctl_balance+0x4c2/0x6a0 [btrfs]
>>> [49294.094430]  btrfs_ioctl+0xe56/0x82d0 [btrfs]
>>> [49294.094434]  do_vfs_ioctl+0x99f/0xf10
>>> [49294.094437]  ksys_ioctl+0x5e/0x90
>>> [49294.094440]  __x64_sys_ioctl+0x6f/0xb0
>>> [49294.094446]  do_syscall_64+0xa0/0x370
>>> [49294.094451]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> [49294.094457] Freed by task 11689:
>>> [49294.094464]  __kasan_slab_free+0x144/0x1f0
>>> [49294.094468]  kfree+0x95/0x2a0
>>> [49294.094516]  btrfs_drop_snapshot+0x1529/0x1c40 [btrfs]
>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> And this.
> 
> (gdb) list *(btrfs_drop_snapshot+0x154c)
> 0x7643c is in btrfs_drop_snapshot (fs/btrfs/disk-io.h:110).
> 105	}
> 106	
> 107	static inline void btrfs_put_fs_root(struct btrfs_root *root)
> 108	{
> 109		if (refcount_dec_and_test(&root->refs))
> 110			kfree(root);
> 111	}
> 112	
> 113	void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
> 114	int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
> 
> faddr2line agrees with GDB on this one.
> 
> There is one direct call to btrfs_put_fs_root in btrfs_drop_snapshot,
> here:
> 
> (gdb) list extent-tree.c:9446
> 9441		if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state)) {
> 9442			btrfs_add_dropped_root(trans, root);
> 9443		} else {
> 9444			free_extent_buffer(root->node);
> 9445			free_extent_buffer(root->commit_root);
> 9446			btrfs_put_fs_root(root);
> 9447		}
> 9448		root_dropped = true;
> 9449	out_end_trans:
> 9450		btrfs_end_transaction_throttle(trans);

Thank you very much for the detailed report!

> 
>>> [49294.094573]  clean_dirty_subvols+0x23f/0x380 [btrfs]
>>> [49294.094632]  relocate_block_group+0x95b/0xf50 [btrfs]
>>> [49294.094691]  btrfs_relocate_block_group+0x38f/0x990 [btrfs]
>>> [49294.094748]  btrfs_relocate_chunk+0x5c/0x100 [btrfs]
>>> [49294.094805]  btrfs_balance+0x1292/0x2f00 [btrfs]
>>> [49294.094863]  btrfs_ioctl_balance+0x4c2/0x6a0 [btrfs]
>>> [49294.094920]  btrfs_ioctl+0xe56/0x82d0 [btrfs]
>>> [49294.094923]  do_vfs_ioctl+0x99f/0xf10
>>> [49294.094926]  ksys_ioctl+0x5e/0x90
>>> [49294.094929]  __x64_sys_ioctl+0x6f/0xb0
>>> [49294.094934]  do_syscall_64+0xa0/0x370
>>> [49294.094939]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> [49294.094946] The buggy address belongs to the object at ffff8885b4901100
>>>                 which belongs to the cache kmalloc-2k of size 2048
>>> [49294.094953] The buggy address is located 832 bytes inside of
>>>                 2048-byte region [ffff8885b4901100, ffff8885b4901900)
>>> [49294.094957] The buggy address belongs to the page:
>>> [49294.094962] page:ffffea0016d24000 refcount:1 mapcount:0 mapping:ffff88864400e800 index:0x0 compound_mapcount: 0
>>> [49294.094968] flags: 0x2ffff0000010200(slab|head)
>>> [49294.094976] raw: 02ffff0000010200 dead000000000100 dead000000000122 ffff88864400e800
>>> [49294.094981] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
>>> [49294.094983] page dumped because: kasan: bad access detected
>>>
>>> [49294.094987] Memory state around the buggy address:
>>> [49294.094992]  ffff8885b4901300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> [49294.094997]  ffff8885b4901380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> [49294.095002] >ffff8885b4901400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> [49294.095006]                                            ^
>>> [49294.095010]  ffff8885b4901480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> [49294.095015]  ffff8885b4901500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> [49294.095018] ==================================================================
>>> [49301.893672] BTRFS info (device sdi1): 1 enospc errors during balance
>>> [49301.893675] BTRFS info (device sdi1): balance: ended with status: -28
>>>
>>>
>>> Without KASAN, I would eventually get an oops like this:
>>>
>>> [...]
>>>
>>> I only noticed this bug because I keep an eye on dmesg. In one instance,
>>> I ignored it for a few hours, then my graphics driver crashed because of
>>> memory allocation failure and/or heap corruption. Aside from that, I
>>> have seen no obvious effects.
>>>
>>> I have hit this bug at least 5 times over the last two weeks. Every
>>> time, it has been caused by a balance on various volumes (typically to
>>> balance a single block group). I was able to trigger it somewhat
>>> reliably by attempting a balance on a volume with a size of 596.18GiB
>>> and 1.68GiB of estimated free space, but that stopped working
>>> eventually.
>>
>> Is it always that particular fs?
>> Have you ever hit it on another fs?
>> Furthermore, did you hit it in v5.1?
> 
> I usually hit this on a raid0 data volume, but I got the same KASAN
> report once while attemtping to balance my raid1 system volume.
> 
> I got the data volume into a state where it would consistently trigger
> the bug (2.05GiB of free space), and did a few git bisects.
> 
> On v5.0, the balance correctly fails with ENOSPC.
> 
> As of commit d2311e69857815ae2f728b48e6730f833a617092 ("btrfs:
> relocation: Delay reloc tree deletion after merge_reloc_roots"), first
> appearing in v5.1-rc1, I get "kernel BUG at fs/btrfs/relocation.c:1413!"
> at create_reloc_root+0x6a1/0x920 when attempting a balance.

That means you have an existing reloc tree already.

> 
> As of commit 30d40577e322b670551ad7e2faa9570b6e23eb2b ("btrfs: reloc:
> Also queue orphan reloc tree for cleanup to avoid BUG_ON()"), first
> appearing in v5.2-rc3, I get the KASAN report instead of the BUG_ON.

So with that fix, we're queuing the already orphan reloc tree first, but
then we're by somehow still trying to use that tree and get the things
screwed up.

Would you like to provide the following dump on the fs where you can
always reproduce the bug?

# btrfs ins dump-tree -t root <device>

Thanks,
Qu

> 
>> I guess there is a chance my previous change of reloc tree lifespan
>> screwed up something, but it's introduced in v5.1...
>>
>> Thanks,
>> Qu
>>

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux