On 2019/10/24 11:44 PM, Johannes Thumshirn wrote:
> On the BeeGFS Mailing list there is a report claiming BTRFS is not usable
> with BeeGFS, as BeeGFS is using statfs output to determine the number of
> total and free inodes. BeeGFS needs the number of free inodes as it stores
> its meta-data either in extended attributes of the underlying file-system
> or directly in an inline inode. According to the BeeGFS Server Tuning
> Guide:
>
> """
> BeeGFS metadata is stored as extended attributes (EAs) on the underlying
> file system to optimal performance. One metadata file will be created for
> each file that a user creates. About extended attributes usage: BeeGFS
> Metadata files have a size of 0 bytes (i.e. no normal file contents).
>
> Access to extended attributes is possible with the getfattr tool.
>
> If the inodes of the underlying file system are sufficiently large, EAs
> can be inlined into the inode of the underlying file system. Additional
> data blocks are then not required anymore and metadata disk usage will be
> reduced. With EAs inlined into the inode, access latencies are reduced as
> seeking to an extra data block is not required anymore.
> """

Personally speaking, reporting 0 used and 0 free should be the proper way.
Users of the fs should be aware of dynamic filesystems that don't use a
fixed number of inodes.

I really think it's BeeGFS's job to change their behavior, since there are
more things to consider when faking the used/free inode counts.

>
> Provide some estimated numbers of total and free inodes in statfs by
> dividing the number of blocks by the size of an inode-item for the total
> number of possible inodes and for the number of free inodes divide the
> number of free blocks by the size of an inode-item, similar to what other
> file-systems without a fixed number of inodes do.
>
> This of is just an estimation and should not be relied upon.
>
> Without the patch applied:
> rapido1:/# df -hTi /mnt/test
> Filesystem     Type  Inodes IUsed IFree IUse% Mounted on
> /mnt/test      btrfs      0     0     0     - /mnt/test
>
> With the patch applied on an empty fs:
> rapido1:/# df -hTi /mnt/test
> Filesystem     Type  Inodes IUsed IFree IUse% Mounted on
> /dev/zram0     btrfs   1.6K     0  1.6K    0% /mnt/test
>
> With the patch applied on a dirty fs:
> rapido1:/# df -hTi /mnt/test
> Filesystem     Type  Inodes IUsed IFree IUse% Mounted on
> /dev/zram0     btrfs   1.6K  1.5K   197   88% /mnt/test
>
> Link: https://groups.google.com/forum/#!msg/fhgfs-user/IJqGS5o1UD0/8ftDdUI3AQAJ
> Signed-off-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> ---
>  fs/btrfs/super.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index b818f764c1c9..6f6f6a70eb1e 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2068,6 +2068,8 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>  	buf->f_blocks = div_u64(btrfs_super_total_bytes(disk_super), factor);
>  	buf->f_blocks >>= bits;
>  	buf->f_bfree = buf->f_blocks - (div_u64(total_used, factor) >> bits);
> +	buf->f_files = div_u64(buf->f_blocks, sizeof(struct btrfs_inode_item));

That's too optimistic. (I'd even call it beyond Elon Musk's schedule.)

We have tree block header overhead, and as the number of tree blocks
grows, the extent tree grows as well and brings its own overhead.

In the long run, users will report that f_files increases by more than
they actually used.

It will be hell to calculate such an estimate, and we will never reach a
good enough point for it.

> +	buf->f_ffree = div_u64(buf->f_bfree, sizeof(struct btrfs_inode_item));

The same applies to f_ffree: it will decrease faster than real usage.

If the distributed fs, whichever it is, uses f_ffree/f_files as an
indicator, that indicator is not reliable anyway.
And if they accept such an unreliable indicator, they had better think
twice before using it.
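
To illustrate what "being aware" could look like on the consumer side, here
is a minimal userspace sketch. It assumes the caller treats f_files == 0 as
"no fixed inode table" instead of "out of inodes". The helper names
(classify_inodes, check_path), the enum, and the min_free_pct threshold are
all hypothetical and are not taken from BeeGFS or btrfs; only statvfs(3) and
its f_files/f_ffree fields are real.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical result type, for illustration only. */
enum inode_status {
	INODES_UNLIMITED,	/* fs reports no fixed inode count */
	INODES_OK,		/* enough free inodes left */
	INODES_LOW		/* free inodes below the threshold */
};

/*
 * Classify inode availability from raw statfs-style counters.
 * total == 0 is what btrfs reports today; it is treated here as
 * "dynamic inode allocation", not as "full".
 * min_free_pct is an assumed policy knob (percent of total).
 */
static enum inode_status classify_inodes(uint64_t total, uint64_t free_inodes,
					 unsigned int min_free_pct)
{
	if (total == 0)
		return INODES_UNLIMITED;

	/*
	 * free/total >= pct/100, written without division.
	 * Realistic inode counts are far too small to overflow 64 bits.
	 */
	if (free_inodes * 100 >= total * (uint64_t)min_free_pct)
		return INODES_OK;

	return INODES_LOW;
}

/* Query a mounted path; returns 0 on success, -1 if statvfs() fails. */
static int check_path(const char *path, unsigned int min_free_pct,
		      enum inode_status *out)
{
	struct statvfs st;

	if (statvfs(path, &st) != 0)
		return -1;

	*out = classify_inodes(st.f_files, st.f_ffree, min_free_pct);
	return 0;
}
```

With a check like this, a filesystem reporting 0/0 simply falls into the
"unlimited" case and the caller has to rely on free space instead, so the
kernel does not need to fake any numbers.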

Thanks,
Qu

>
>  	/* Account global block reserve as used, it's in logical size already */
>  	spin_lock(&block_rsv->lock);
>
