On 17/7/20 7:25 pm, Nikolay Borisov wrote:
On 17.07.20 г. 13:05 ч., Anand Jain wrote:
Martin reported the following test case which reproduces lockdep
warning on his machine.
modprobe scsi_debug dev_size_mb=1024 num_parts=2
sleep 3
mkfs.btrfs /dev/sda1
mount /dev/sda1 /mnt
cp -v /bin/ls /mnt
umount /mnt
btrfstune -S 1 /dev/sda1
mount /dev/sda1 /mnt
btrfs dev add /dev/sda2 /mnt
umount /mnt
mount /dev/sda2 /mnt
<splat>
kernel: ======================================================
kernel: WARNING: possible circular locking dependency detected
kernel: 5.8.0-rc1+ #575 Not tainted
kernel: ------------------------------------------------------
kernel: mount/1024 is trying to acquire lock:
kernel: ffff888065e0a4e0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: clone_fs_devices+0x46/0x1f0 [btrfs]
kernel: #012but task is already holding lock:
kernel: ffff8880610508d0 (&fs_info->chunk_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0xd1/0x390 [btrfs]
kernel: #012which lock already depends on the new lock.
kernel: #012the existing dependency chain (in reverse order) is:
kernel: #012-> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
kernel: __lock_acquire+0x798/0xe50
kernel: lock_acquire+0x15a/0x4d0
kernel: __mutex_lock+0x116/0xbd0
kernel: btrfs_remove_chunk+0x769/0xaa0 [btrfs]
kernel: btrfs_delete_unused_bgs+0xa2c/0xfe0 [btrfs]
kernel: cleaner_kthread+0x27c/0x2a0 [btrfs]
kernel: kthread+0x1d6/0x200
kernel: ret_from_fork+0x22/0x30
kernel: #012-> #0 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
kernel: check_prev_add+0xf5/0xf50
kernel: validate_chain+0xca7/0x1920
kernel: __lock_acquire+0x798/0xe50
kernel: lock_acquire+0x15a/0x4d0
kernel: __mutex_lock+0x116/0xbd0
kernel: clone_fs_devices+0x46/0x1f0 [btrfs]
kernel: read_one_dev+0x15f/0x930 [btrfs]
kernel: btrfs_read_chunk_tree+0x333/0x390 [btrfs]
kernel: open_ctree+0xa72/0x15a6 [btrfs]
kernel: btrfs_mount_root.cold+0xe/0xf1 [btrfs]
kernel: legacy_get_tree+0x82/0xd0
kernel: vfs_get_tree+0x4c/0x140
kernel: fc_mount+0xf/0x60
kernel: vfs_kern_mount.part.0+0x71/0x90
kernel: btrfs_mount+0x1d7/0x610 [btrfs]
kernel: legacy_get_tree+0x82/0xd0
kernel: vfs_get_tree+0x4c/0x140
kernel: do_mount+0xad1/0xe70
kernel: __x64_sys_mount+0xbe/0x100
kernel: do_syscall_64+0x56/0xa0
kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: #012other info that might help us debug this:
kernel: Possible unsafe locking scenario:
kernel: CPU0 CPU1
kernel: ---- ----
kernel: lock(&fs_info->chunk_mutex);
kernel: lock(&fs_devs->device_list_mutex);
kernel: lock(&fs_info->chunk_mutex);
kernel: lock(&fs_devs->device_list_mutex);
kernel: #012 *** DEADLOCK ***
================================================
Lockdep warning is complaining about the violation of the lock order of
device_list_mutex and chunk_mutex[1]. And, lockdep warning isn't entirely
correct, as it appears that it can't understand the different filesystem
instances. Here, chunk_mutex was held by the mounting sprout filesystem,
and device_list_mutex was held belongs to the seed filesystem as the sprout
does not want the seed device to be freed.
>>
[1]
open_ctree <== mount sprout
btrfs_read_chunk_tree()
mutex_lock(&uuid_mutex) <== global fsid lock
mutex_lock(&fs_info->chunk_mutex) <== sprout fs
read_one_dev()
open_seed_devices()
clone_fs_devices()
mutex_lock(&orig->device_list_mutex) <== seed fs_devices
open_seed_devices doesn't delete the seed device
>
There are two function stacks [1] and [2] leading to clone_fs_devices().
[2]
btrfs_init_new_device()
mutex_lock(&uuid_mutex);
btrfs_prepare_sprout()
lockdep_assert_held(&uuid_mutex);
clone_fs_devices()
and neither does prepare_sprout. So why are you mentioning them in the
context of freeing seed devices? By the way clone_fs_devices also
doesn't free the seed.
As mentioned as the sprout does not want the seed device to be freed
while it is being cloned. Quiz: Who could free the seed?
They both hold the uuid_mutex which is sufficient to protect from
freeing the seed device. That's because a seed device can not be
As such the first sentence is false, because holding the uuid_mutex in
those 2 paths has absolutely no relevance to freeing the seed devices.
Uh? Have you looked/re-looked into the related parts of code before
making this comment? Or are you just guessing?
freed while it is mounted because it is read-only and an unmounted
seed device (but registered) can be freed only by the command forget
or making it stale. Which is handled by the function
btrfs_free_stale_devices() which also needs uuid_mutex.
Both of those cases seem to be handled by btrfs_forget_devices which
indeed holds the uuid_mutex so this is correct.
I am surprised you missed to connect the dots.
Thanks Anand
So remove the unnecessary seed->device_list_mutex in clone_fs_devices.
Reported-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Signed-off-by: Anand Jain <anand.jain@xxxxxxxxxx>
Tested-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
---
The above test case is similar to fstests btrfs/161 so no new
test case will be required.
fs/btrfs/volumes.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c35603b5595a..9dc3b826be0d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -561,6 +561,8 @@ static int btrfs_free_stale_devices(const char *path,
struct btrfs_device *device, *tmp_device;
int ret = 0;
+ lockdep_assert_held(&uuid_mutex);
+
if (path)
ret = -ENOENT;
@@ -985,7 +987,6 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
if (IS_ERR(fs_devices))
return fs_devices;
- mutex_lock(&orig->device_list_mutex);
fs_devices->total_devices = orig->total_devices;
list_for_each_entry(orig_dev, &orig->devices, dev_list) {
@@ -1017,10 +1018,8 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
device->fs_devices = fs_devices;
fs_devices->num_devices++;
}
- mutex_unlock(&orig->device_list_mutex);
return fs_devices;
error:
- mutex_unlock(&orig->device_list_mutex);
free_fs_devices(fs_devices);
return ERR_PTR(ret);
}