Re: [PATCH v2] btrfs: Don't submit any btree write bio after transaction is aborted

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2020/2/4 下午9:11, Josef Bacik wrote:
> On 2/4/20 4:03 AM, Qu Wenruo wrote:
>> [BUG]
>> There is a fuzzed image which could cause KASAN report at unmount time.
>>
>>    ==================================================================
>>    BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
>>    Read of size 8 at addr ffff888067cf6848 by task umount/1922
>>
>>    CPU: 0 PID: 1922 Comm: umount Tainted: G        W         5.0.21 #1
>>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.10.2-1ubuntu1 04/01/2014
>>    Call Trace:
>>     dump_stack+0x5b/0x8b
>>     print_address_description+0x70/0x280
>>     kasan_report+0x13a/0x19b
>>     btrfs_queue_work+0x2c1/0x390
>>     btrfs_wq_submit_bio+0x1cd/0x240
>>     btree_submit_bio_hook+0x18c/0x2a0
>>     submit_one_bio+0x1be/0x320
>>     flush_write_bio.isra.41+0x2c/0x70
>>     btree_write_cache_pages+0x3bb/0x7f0
>>     do_writepages+0x5c/0x130
>>     __writeback_single_inode+0xa3/0x9a0
>>     writeback_single_inode+0x23d/0x390
>>     write_inode_now+0x1b5/0x280
>>     iput+0x2ef/0x600
>>     close_ctree+0x341/0x750
>>     generic_shutdown_super+0x126/0x370
>>     kill_anon_super+0x31/0x50
>>     btrfs_kill_super+0x36/0x2b0
>>     deactivate_locked_super+0x80/0xc0
>>     deactivate_super+0x13c/0x150
>>     cleanup_mnt+0x9a/0x130
>>     task_work_run+0x11a/0x1b0
>>     exit_to_usermode_loop+0x107/0x130
>>     do_syscall_64+0x1e5/0x280
>>     entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> [CAUSE]
>> The fuzzed image has a completely screwd up extent tree:
>>    leaf 29421568 gen 9 total ptrs 6 free space 3587 owner
>> 18446744073709551610
>>    refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
>>            item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
>>                    extent refs 1 gen 9 flags 1
>>                    ref#0: extent data backref root 5 objectid 259
>> offset 0 count 1
>>            item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
>>                    extent refs 1 gen 9 flags 1
>>                    ref#0: extent data backref root 5 objectid 271
>> offset 0 count 1
>>            item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
>>                    extent refs 1 gen 9 flags 1
>>                    ref#0: extent data backref root 5 objectid 259
>> offset 4096 count 1
>>            item 3 key (29360128 169 0) itemoff 3803 itemsize 33
>>                    extent refs 1 gen 9 flags 2
>>                    ref#0: tree block backref root 5
>>            item 4 key (29368320 169 1) itemoff 3770 itemsize 33
>>                    extent refs 1 gen 9 flags 2
>>                    ref#0: tree block backref root 5
>>            item 5 key (29372416 169 0) itemoff 3737 itemsize 33
>>                    extent refs 1 gen 9 flags 2
>>                    ref#0: tree block backref root 5
>>
>> 1. Wrong owner
>>     The corrupted leaf has owner -6ULL, which matches
>>     BTRFS_TREE_LOG_OBJECTID.
>>
>> 2. Missing backref for extent tree itself
>>     So extent allocator can allocate tree block 29421568 as a new tree
>>     block.
>>
>> Above corruption leads to the following sequence to happen:
>> - Btrfs needs to COW extent tree
>>    Extent allocator choose to allocate eb at bytenr 29421568 (which is
>>    current extent tree root).
>>
>>    And since the owner is copied from old eb, it's -6ULL, so
>>    btrfs_init_new_buffer() will not mark the range dirty in
>>    transaction->dirty_pages, but root->dirty_log_pages.
> 
> That's not what it checks though, it checks the root that was passed in,
> so I assume this means that the root that is cow'ing the block is a log
> root?

Oh, right. That's the same. The eb is for the log tree.

> 
>>
>>    Also, since the eb is already under usage, extent root doesn't even
>>    get marked DIRTY, nor added to dirty_cowonly_list.
>>    Thus even we try to iterate dirty roots to cleanup their
>>    dirty_log_pages, this particular extent tree won't be iterated.
> 
> But what about the original root that actually allocated this log root? 
> That should still be on the dirty list, correct?  And thus be cleaned up
> properly?

I tried that solution, by going through all dirty trees, and clean their
root->dirty_log_pages (both EXTENT_NEW and EXTENT_DIRTY).

But strangely, only csum tree is dirty.
Even we have CoWed tree block for extent tree for delayed refs, it's not
in dirty cowonly list.

That's why my initial attempt to clean dirty_log_pages failed miserably.

> 
> I'm not quite understanding how we get into this situation, we have the
> above block that is pointed to by the extent root correct?

Yes, the initial image has 29421568 as extent root.

>  But the
> block itself isn't in the extent tree, and thus appears to be free,
> correct?

Yep.

> 
> So then what you are seeing is
> 
> fsync(inode)
>   log the inode, which attaches inode->root to dirty_list
>      create a log root, attach it to the root
>        allocate this bogus block, set it dirty in log_root->dirty_log_pages
> <some time later, boom>

By some reason, it's not that easy.

We abort trans as run_delayed_refs() from btrfs_commit_transaction().
At this point, we dump extent tree, which has generation 9 (8 + 1, so
it's the current running transaction), owner is -6ULL.

So yes fsync() happened and allocated the extent buffer, which is also
extent root.

> we try to write out the above block
> 
> Why isn't the log root getting cleaned up properly?  It doesn't matter
> who owned the block originally, it should still be attached to a log
> root that is attached to a real root and thus be cleaned up properly. 

Then this is the problem.

I don't see btrfs_cleanup_one_transaction() doing any log tree related
cleanup.

And since it's log tree, my previous check on dirty_cowonly_list doesn't
make much sense, as those trees aren't updated if they have log tree.

I'll check how to iterate through log roots then.

Thanks,
Qu

> Something else appears to be going screwy here.  Thanks,
> 
> Josef

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux