Re: BTRFS: Transaction aborted (error -24)

This server is used for network storage. When a new client connects, I
create a snapshot of the workspace subvolume for that client, and I
delete the snapshot when the client disconnects.
Most workspaces are PC game installations. Each contains thousands of
files and ranges in size from 1GB to 20GB.
About 200 Windows clients access this server through Samba. About 20
snapshots are created or deleted per minute.

# lsof | wc -l
47405

# sysctl fs.file-max
fs.file-max = 39579457

# sysctl fs.file-nr
fs.file-nr = 5120    0    39579457

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1547267
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 102400
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1547267
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

On Thu, Jun 11, 2020 at 9:52 PM David Sterba <dsterba@xxxxxxx> wrote:
>
> On Thu, Jun 11, 2020 at 08:37:11PM +0800, Qu Wenruo wrote:
> >
> >
> > On 2020/6/11 7:20 PM, David Sterba wrote:
> > > On Thu, Jun 11, 2020 at 06:29:34PM +0800, Greed Rong wrote:
> > >> Hi,
> > >> I have got this error several times. Are there any suggestions to avoid this?
> > >>
> > >> # dmesg
> > >> [7142286.563596] ------------[ cut here ]------------
> > >> [7142286.564499] BTRFS: Transaction aborted (error -24)
> > >
> > > EMFILE          24      /* Too many open files */
> > >
> > > you can increase the open file limit but it's strange that this happens,
> > > first time I see this.
> >
> > Not something from btrfs code, thus it must come from the VFS/MM code.
>
> Yeah, this is VFS. Creating a new root will need a new inode and dentry
> and the limits are applied.
>
> > The offending abort transaction is from btrfs_read_fs_root_no_name(),
> > which is updated to btrfs_get_fs_root() in upstream kernel.
> > Overall, it's not much different between the upstream and the 5.0.10 kernel.
> >
> > But with latest btrfs_get_fs_root(), after a quick glance, there isn't
> > any obvious location to introduce the EMFILE error.
> >
> > Any extra info about the workload to trigger the bug?
>
> I think it's from get_anon_bdev, that's called from btrfs_init_fs_root
> (in btrfs_get_fs_root):
>
> int get_anon_bdev(dev_t *p)
> {
>         int dev;
>
>         /*
>          * Many userspace utilities consider an FSID of 0 invalid.
>          * Always return at least 1 from get_anon_bdev.
>          */
>         dev = ida_alloc_range(&unnamed_dev_ida, 1, (1 << MINORBITS) - 1,
>                         GFP_ATOMIC);
>         if (dev == -ENOSPC)
>                 dev = -EMFILE;
>         if (dev < 0)
>                 return dev;
>
>         *p = MKDEV(0, dev);
>         return 0;
> }
> EXPORT_SYMBOL(get_anon_bdev);
>
> And the comment says "Return: 0 on success, -EMFILE if there are no
> anonymous bdevs left".
>
> The fs tree roots are created later than the actual command is executed,
> so all the errors are also delayed. For that reason I moved e.g. the root
> item and path allocation to the first phase. We could do the same for
> the anonymous bdev.
>
> The problem won't go away though; the question is why the IDA range
> unnamed_dev_ida is exhausted.
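
To make the exhaustion scenario concrete, here is a minimal userspace
sketch of the id range that get_anon_bdev() draws from. It is not kernel
code: the bitmap, the next_hint shortcut and the names
model_get_anon_bdev(), model_free_anon_bdev() and
creations_before_emfile() are all made up for illustration. The only
things taken from the quoted source are the range 1 .. (1 << MINORBITS) - 1
and the -EMFILE return; MINORBITS = 20 is assumed to match the kernel
headers. The sketch only shows that a create/delete loop runs out of ids
if the id is never handed back, and does not claim that this is what
btrfs actually does.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MINORBITS 20                      /* assumed to match the kernel value */
#define ANON_MIN  1                       /* 0 is never handed out */
#define ANON_MAX  ((1 << MINORBITS) - 1)  /* 1048575 */

/* Userspace stand-in for unnamed_dev_ida: one bit per minor number. */
static uint8_t anon_bitmap[(ANON_MAX + 8) / 8];

/* Every id below next_hint is known to be in use, so scans can start here. */
static int next_hint = ANON_MIN;

/* Hypothetical model of get_anon_bdev(): lowest free id, or -EMFILE. */
static int model_get_anon_bdev(int *minor)
{
    for (int id = next_hint; id <= ANON_MAX; id++) {
        if (!(anon_bitmap[id / 8] & (1u << (id % 8)))) {
            anon_bitmap[id / 8] |= (uint8_t)(1u << (id % 8));
            next_hint = id + 1;
            *minor = id;
            return 0;
        }
    }
    next_hint = ANON_MAX + 1;
    return -EMFILE;   /* range exhausted, same errno as in the dmesg output */
}

/* Hypothetical model of releasing an id when a snapshot goes away. */
static void model_free_anon_bdev(int minor)
{
    anon_bitmap[minor / 8] &= (uint8_t)~(1u << (minor % 8));
    if (minor < next_hint)
        next_hint = minor;
}

/*
 * Run up to max_cycles snapshot create/delete cycles and return how many
 * creations succeeded before the first -EMFILE. If release_on_delete is
 * zero, the id is leaked on every cycle.
 */
static long creations_before_emfile(int release_on_delete, long max_cycles)
{
    memset(anon_bitmap, 0, sizeof(anon_bitmap));
    next_hint = ANON_MIN;

    for (long n = 0; n < max_cycles; n++) {
        int minor;

        if (model_get_anon_bdev(&minor) == -EMFILE)
            return n;
        if (release_on_delete)
            model_free_anon_bdev(minor);
    }
    return max_cycles;
}
```

In this model, creations_before_emfile(0, 2000000) stops after 1048575
successful creations, while creations_before_emfile(1, 2000000) completes
all 2000000 cycles. As a rough estimate only: if about 20 snapshots were
created per minute and no id were ever released, 1048575 ids would last
about 52000 minutes, or roughly 36 days.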



