On 6/15/20 10:17 PM, Qu Wenruo wrote:
[BUG]
When a lot of subvolumes are created, there is a user report about
transaction aborted:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -24)
WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
Call Trace:
create_pending_snapshots+0x82/0xa0 [btrfs]
btrfs_commit_transaction+0x275/0x8c0 [btrfs]
btrfs_mksubvol+0x4b9/0x500 [btrfs]
btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
btrfs_ioctl+0x11a4/0x2da0 [btrfs]
do_vfs_ioctl+0xa9/0x640
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 33f2f83f3d5250e9 ]---
BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
BTRFS info (device sda1): forced readonly
BTRFS warning (device sda1): Skipping commit of aborted transaction.
BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
[CAUSE]
The root cause is we don't have unlimited resource for anonymous block
device number.
The anonymous block device pool only contains 1<<20 devices, and is
shared across a several fses, like ceph and overlayfs.
While btrfs has support for 1<<48 subvolumes, so it's just a problem of
time to hit such limit.
[WORKAROUND]
Here we can free btrfs_root::anon_dev as long as the subvolume is no
longer visible to users.
By freeing it earlier we reclaim the anon_dev quicker, hopefully to
reduce the chance of exhausting the pool.
Why isn't this happening as part of the root teardown once all the references to
it are gone? Thanks,
Josef