On 6/15/20 10:17 PM, Qu Wenruo wrote:
[BUG]
When a lot of subvolumes are created, there is a user report about
transaction aborted:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -24)
WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
Call Trace:
create_pending_snapshots+0x82/0xa0 [btrfs]
btrfs_commit_transaction+0x275/0x8c0 [btrfs]
btrfs_mksubvol+0x4b9/0x500 [btrfs]
btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
btrfs_ioctl+0x11a4/0x2da0 [btrfs]
do_vfs_ioctl+0xa9/0x640
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 33f2f83f3d5250e9 ]---
BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
BTRFS info (device sda1): forced readonly
BTRFS warning (device sda1): Skipping commit of aborted transaction.
BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
[CAUSE]
The root cause is we don't have unlimited resource for anonymous block
device number.
The anonymous block device pool only contains 1<<20 devices, and is
shared across a several fses, like ceph and overlayfs.
While btrfs has support for 1<<48 subvolumes, so it's just a problem of
time to hit such limit.
[WORKAROUND]
Since it's not possible to completely solve the problem, we can only
workaround it.
Firstly, we can reduce the user of anon_dev. Data reloc tree is not visible
to users, thus it doesn't need anon_dev at all.
This patch will do extra check on root objectid, to rule out roots who
don't need anon_dev.
Although currently it's only data reloc tree and orphan roots.
Reported-by: Greed Rong <greedrong@xxxxxxxxx>
Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@xxxxxxxxxxxxxx/
Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
Thanks,
Josef