On Thu, May 14, 2015 at 12:07 AM, Zygo Blaxell
<ce3g8jdj@xxxxxxxxxxxxxxxxxxxxx> wrote:
> [Apologies for the duplication, if any. I gave the original, longer
> version 18 hours to appear, and it doesn't seem to have shown up yet.]
>
> On Fri, May 08, 2015 at 09:21:02AM -0400, Zygo Blaxell wrote:
>> On Fri, May 08, 2015 at 11:32:07AM +0100, Filipe David Manana wrote:
>> > On Thu, May 7, 2015 at 11:20 PM, Zygo Blaxell
>> > <ce3g8jdj@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> > > This is the simplest repro recipe for this that I have found so far.
>> > > It takes only a few minutes for the rm processes to get stuck here:
>> > >
>> > > # cat /proc/28396/stack Thu May 7 18:13:05 2015
>> > >
>> > > [<ffffffff813c8a2d>] lock_extent_bits+0x1ad/0x200
>> > > [<ffffffff813b5dfa>] btrfs_evict_inode+0x17a/0x5e0
>> > > [<ffffffff8123fc68>] evict+0xb8/0x1b0
>> > > [<ffffffff81240813>] iput+0x1f3/0x260
>> > > [<ffffffff81233c68>] do_unlinkat+0x1d8/0x360
>> > > [<ffffffff812346db>] SyS_unlinkat+0x1b/0x40
>> > > [<ffffffff8190024d>] system_call_fastpath+0x16/0x1b
>> > > [<ffffffffffffffff>] 0xffffffffffffffff
>> > >
>> > >
>> > > Run these three scripts in a directory that is the top of a subvol:
>
> New versions of these scripts make the results a little more reproducible:
>
>
> #!/bin/bash
> set -x
> # Script #1: randomly create or delete snapshots
> # v2: no significant changes
> while date; do
>     if [ $[RANDOM%2] = 0 ]; then
>         btrfs sub snap . snaps-$RANDOM
>     else
>         for x in snaps-*; do
>             btrfs sub del $x
>             break
>         done
>         btrfs sub sync .
>     fi
>     sleep 1
> done
>
>
> #!/bin/bash
> # Script #2: create a bunch of files of random sizes
> # v2: create our own test file instead of using Chromium's 1.6M copyright file
> while echo -ne "\r$(date)"; do
>     [ -s tester ] || head -c 1024k /dev/urandom > tester
>     d=$[RANDOM%9]/$[RANDOM%9]/$[RANDOM%9]/$[RANDOM%9]
>     mkdir -p ${d%/*}
>     head -c $[RANDOM%1024]k tester > $d
> done
>
>
>
> #!/bin/bash
> set -x
> # Script #3: read and immediately delete all the files
> # v3: let script #2 create some more files
> while date; do
>     find *[0-9] -type f -exec sh -c 'cat >/dev/null "$@"' -- {} \; -exec rm -fv {} \;
>
>     # Allow some files to build up between runs
>     sleep 1m
>
>     # Make sure we are not reading from cache.
>     # These are not strictly necessary but they reduce
>     # the repro time by a minute or so.
>     sync
>     sysctl vm.drop_caches=1
> done
>
>
>> > Tried that for over 3 hours, on a 4.1-rc2 kernel with a few patches
>> > from the list, with several combinations of mount options (compress,
>> > autodefrag, nodatacow, etc.) and didn't hit any issue.
>> >
>> > What kernel version are you testing? Any specific combination of mount options?
>>
>> I've seen it in the field on versions from v3.15 to v4.0.1. The test I did
>> yesterday was v4.0.1.
>
> I just verified that the issue is still present in v4.1-rc3. Tested on
> bare hardware and kvm, and a mix of AMD and Intel CPUs.
>
> The issue appears immediately after the test file collection becomes too
> large to fit in the host RAM. In my test environment I used RAM sizes
> from 3GB to 16GB with a 16GB btrfs filesystem. The test ran without
> incident until the filesystem used space (reported by df) exceeded the
> RAM size, then rm hung a few seconds later.
>
> If I reboot after an rm hang and run scripts #1 and #3 (snapshots and
> rm), it hangs almost immediately as soon as subvol delete and rm are
> running at the same time.
>
> If script #3 (remove files) runs too quickly (i.e. your disks are too
> fast ;), try delaying script #3 until after #1 and #2 have accumulated
> enough data to exceed RAM size.
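
The delay suggested above could be automated roughly as follows. This is only a sketch, not part of the original repro scripts: the function names (ram_kib, used_kib, used_exceeds_ram, wait_for_data) and the 10-second polling interval are made up for illustration, and it assumes that df-reported used space is the right quantity to compare against MemTotal from /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper, not from the original thread: hold off script #3
# until the filesystem holds more data than the host has RAM.

# Print total RAM in KiB from a meminfo-format file (default: /proc/meminfo).
ram_kib() {
    awk '/^MemTotal:/ { print $2; exit }' "${1:-/proc/meminfo}"
}

# Print used space in KiB of the filesystem containing the given path,
# as reported by df (POSIX output format, 1024-byte blocks).
used_kib() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print $3 }'
}

# Succeed when used space ($1, KiB) is greater than RAM size ($2, KiB).
used_exceeds_ram() {
    [ "$1" -gt "$2" ]
}

# Poll until the filesystem under $1 (default: .) holds more data than RAM.
# The 10-second interval is an arbitrary choice.
wait_for_data() {
    until used_exceeds_ram "$(used_kib "${1:-.}")" "$(ram_kib)"; do
        sleep 10
    done
}
```

Calling wait_for_data at the top of script #3, from the subvol directory, would keep the find/rm loop from starting before scripts #1 and #2 have built up enough data.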
>
> I used default mkfs and mount options this time. For kvm tests I used
> a freshly debootstrapped Debian Jessie, and for the bare hardware tests
> I used some random Debian Wheezy systems.
>
> My kernel config file, logs, and repro scripts are available at:
>
> http://furryterror.org/~zblaxell/tmp/.ma12/
>
> This is part of the kernel log after a typical failure (the whole thing
> is available at the URL above):
>
> May 13 04:59:34 testhost kernel: [ 720.290141] rm D ffff8800ab8ebc78 0 4994 23006 0x00000000
> May 13 04:59:34 testhost kernel: [ 720.329903] ffff8800ab8ebc78 ffffffff814291b8 00000000ffffffff ffff8800aa831000
> May 13 04:59:34 testhost kernel: [ 720.330512] ffff8800ac4ed000 00000000000b0000 ffff8800ab8ec000 ffff8800acd670f0
> May 13 04:59:34 testhost kernel: [ 720.331090] ffff8800acd670d0 00000000000b0000 ffff8800aa62dae0 ffff8800ab8ebc98
> May 13 04:59:34 testhost kernel: [ 720.400161] Call Trace:
> May 13 04:59:34 testhost kernel: [ 720.401020] [<ffffffff814291b8>] ? lock_extent_bits+0x1a8/0x200
> May 13 04:59:34 testhost kernel: [ 720.462318] [<ffffffff819a8297>] schedule+0x37/0x90
> May 13 04:59:34 testhost kernel: [ 720.467741] [<ffffffff814291bd>] lock_extent_bits+0x1ad/0x200
> May 13 04:59:34 testhost kernel: [ 720.468489] [<ffffffff810dfa30>] ? wait_woken+0xc0/0xc0
> May 13 04:59:34 testhost kernel: [ 720.492572] [<ffffffff814156ea>] btrfs_evict_inode+0x19a/0x760
> May 13 04:59:34 testhost kernel: [ 720.493048] [<ffffffff8127fc88>] evict+0xb8/0x1b0
> May 13 04:59:34 testhost kernel: [ 720.494419] [<ffffffff812808fe>] iput+0x2be/0x3e0
> May 13 04:59:34 testhost kernel: [ 720.494598] [<ffffffff81272cb8>] do_unlinkat+0x208/0x330
> May 13 04:59:34 testhost kernel: [ 720.495086] [<ffffffff81265cda>] ? SyS_newfstatat+0x2a/0x40
> May 13 04:59:34 testhost kernel: [ 720.495511] [<ffffffff81566925>] ? lockdep_sys_exit_thunk+0x12/0x14
> May 13 04:59:34 testhost kernel: [ 720.495750] [<ffffffff8127359b>] SyS_unlinkat+0x1b/0x40
> May 13 04:59:34 testhost kernel: [ 720.496101] [<ffffffff819af5b2>] system_call_fastpath+0x16/0x7a
> May 13 04:59:34 testhost kernel: [ 720.628063] 1 lock held by rm/4994:
> May 13 04:59:34 testhost kernel: [ 720.628239] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff812861f4>] mnt_want_write+0x24/0x50
Thanks. After running these scripts for over an hour, I was able to
reproduce the hang. I'll see if I can figure out why it happens and fix it.
--
Filipe David Manana,
"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."