On Fri, Jan 23, 2015 at 02:38:09PM +0000, Holger Hoffstätte wrote:
> On Fri, 23 Jan 2015 15:01:28 +0100, Martin Steigerwald wrote:
>
> > Hi!
> >
> > Anyone seen this?
> >
> > Reported as:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=91911
>
> You might be interested in:
>
> https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?h=evict-softlockup&id=29249e14d6e3379a5c4bb098dd4beddfefbc606f
>
> and
>
> https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?h=evict-softlockup&id=e4a58b71ff981b098ac3371f4d573dc6a90006ce
>
> I'm sure everyone would love to hear how this works out for you ;-)

I merged both commits and I've been running with them since Friday.
Several softlockups since then, in unlinkat() and renameat2().

Some typical stacks:

[<ffffffff81386214>] ? free_extent_state.part.29+0x34/0xb0
[<ffffffff81386715>] ? free_extent_state+0x25/0x30
[<ffffffff81386e6a>] ? __set_extent_bit+0x3aa/0x4f0
[<ffffffff8185de02>] ? _raw_spin_unlock_irqrestore+0x32/0x70
[<ffffffff8109ec61>] ? get_parent_ip+0x11/0x50
[<ffffffff8185a2d9>] schedule+0x29/0x70
[<ffffffff81387dc0>] lock_extent_bits+0x1b0/0x200
[<ffffffff810b4df0>] ? add_wait_queue+0x60/0x60
[<ffffffff81375e99>] btrfs_evict_inode+0x139/0x550
[<ffffffff8120d708>] evict+0xb8/0x190
[<ffffffff8120dec5>] iput+0x105/0x1a0
[<ffffffff812001d9>] do_unlinkat+0x189/0x2d0
[<ffffffff811f775a>] ? SyS_newlstat+0x2a/0x40
[<ffffffff814a52ce>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff81202e26>] SyS_unlink+0x16/0x20
[<ffffffff8185e96d>] system_call_fastpath+0x1a/0x1f

Note that the above stack is _very_ typical. I've caught machines with
well over 100 processes stuck in "D" state with an identical stack trace
from "btrfs_evict_inode" to "system_call_fastpath".
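
For anyone wanting to check how many blocked tasks share that tail, a
rough sketch follows. It is not part of the original report. It assumes
the stacks have already been collected as text in the "[<addr>] sym+off/len"
format shown above (for example from /proc/<pid>/stack on kernels of this
era), and the function names are made up for illustration. Frames marked
with "?" are skipped, since the unwinder flags those as unreliable.

```python
import re
from collections import Counter

# Matches frames such as "[<ffffffff8120d708>] evict+0xb8/0x190" or
# "[<ffffffff8109ec61>] ? get_parent_ip+0x11/0x50". Group 1 is set only
# for frames carrying the "?" (unreliable) marker; group 2 is the symbol.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_symbols(trace):
    """Return the symbol names of frames not marked unreliable ('?')."""
    symbols = []
    for match in FRAME_RE.finditer(trace):
        unreliable, symbol = match.groups()
        if not unreliable:
            symbols.append(symbol)
    return symbols


def tail_from(trace, anchor="btrfs_evict_inode"):
    """Return the frames from `anchor` to the outermost frame, or None."""
    symbols = reliable_symbols(trace)
    if anchor not in symbols:
        return None
    return tuple(symbols[symbols.index(anchor):])


def count_tails(traces, anchor="btrfs_evict_inode"):
    """Count how many traces share each tail that starts at `anchor`."""
    tails = (tail_from(trace, anchor) for trace in traces)
    return Counter(tail for tail in tails if tail is not None)
```

Calling count_tails() on a list with one captured stack per "D" state
task gives a count per distinct call path, so a single dominant entry
would match the "identical stack trace" observation above.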

[<ffffffff81390100>] lock_extent_bits+0x1b0/0x200
[<ffffffff8137e0aa>] btrfs_evict_inode+0x12a/0x540
[<ffffffff81214978>] evict+0xb8/0x190
[<ffffffff81215135>] iput+0x105/0x1a0
[<ffffffff81210cb0>] __dentry_kill+0x190/0x200
[<ffffffff812112ba>] dput+0xba/0x190
[<ffffffff8120a8b0>] SyS_renameat2+0x510/0x580
[<ffffffff8120a95e>] SyS_rename+0x1e/0x20
[<ffffffff818711ad>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The above is a typical renameat2() softlockup stack.

[<ffffffff81179888>] wait_on_page_bit+0xb8/0xc0
[<ffffffff8118e584>] shrink_page_list+0x8c4/0xb20
[<ffffffff8118edcd>] shrink_inactive_list+0x19d/0x500
[<ffffffff8118fa7d>] shrink_lruvec+0x59d/0x760
[<ffffffff8118fcc3>] shrink_zone+0x83/0x1c0
[<ffffffff811903de>] do_try_to_free_pages+0x16e/0x460
[<ffffffff8119080e>] try_to_free_mem_cgroup_pages+0x9e/0x180
[<ffffffff811e393e>] mem_cgroup_reclaim+0x4e/0xe0
[<ffffffff811e48ad>] try_charge+0x15d/0x500
[<ffffffff811e729d>] mem_cgroup_try_charge+0x8d/0x1a0
[<ffffffff8117997f>] __add_to_page_cache_locked+0x8f/0x280
[<ffffffff81179b98>] add_to_page_cache_lru+0x28/0x80
[<ffffffff8117a08b>] pagecache_get_page+0xab/0x1d0
[<ffffffffc02fb5a4>] alloc_extent_buffer+0xe4/0x380 [btrfs]
[<ffffffffc02d228f>] btrfs_find_create_tree_block+0x1f/0x30 [btrfs]
[<ffffffffc02d238f>] readahead_tree_block+0x1f/0x60 [btrfs]
[<ffffffffc02ac9b0>] reada_for_balance+0x160/0x1e0 [btrfs]
[<ffffffffc02b4f57>] btrfs_search_slot+0x687/0xac0 [btrfs]
[<ffffffffc02ceddf>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
[<ffffffffc032ee25>] __btrfs_update_delayed_inode+0x65/0x210 [btrfs]
[<ffffffffc03303ea>] btrfs_commit_inode_delayed_inode+0x13a/0x150 [btrfs]
[<ffffffffc02e52ba>] btrfs_evict_inode+0x2ca/0x520 [btrfs]
[<ffffffff8120d838>] evict+0xb8/0x190
[<ffffffff8120dff5>] iput+0x105/0x1a0
[<ffffffff81209bd8>] __dentry_kill+0x1b8/0x210
[<ffffffff8120a31a>] dput+0xba/0x190
[<ffffffff812037d0>] SyS_renameat2+0x440/0x530
[<ffffffff812038fe>] SyS_rename+0x1e/0x20
[<ffffffff817a836d>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

The last one is a little older (from 3.17.4) but it's a bit more
interesting. Since mem cgroups were involved, I allocated a lot more RAM
to the cgroup and it seems to have helped reduce the frequency of this
bug occurring.

>
> -h
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
