Re: very slow "btrfs dev delete" 3x6Tb, 7Tb of data

On Thu, Jan 2, 2020 at 3:39 PM Leszek Dubiel <leszek@xxxxxxxxx> wrote:

>  > Almost no reads, all writes, but slow. And rather high write requests
>  > per second, almost double for sdc. And sdc is near its max
>  > utilization, so it might be near its iops limit?
>  >
>  > ~210 rareq-sz = 210KiB is the average size of the read request for
>  > sda and sdb
>  >
>  > Default mkfs and default mount options? Or other and if so what other?
>  >
>  > Many small files on this file system? Or possibly large files with a
>  > lot of fragmentation?
>
> Default mkfs and default mount options.
>
> This system could have a few million (!) of small files.
> On reiserfs it takes about 40 minutes, to do "find /".
> Rsync runs for 6 hours to backup data.

There is a mount option, max_inline=<bytes>, which the man page says
defaults to min(2048, page size).

I've never changed it, so in theory my max inline extent size should
be 2KiB. However, I have seen substantially larger inline extents than
2KiB when using a nodesize larger than 16KiB at mkfs time.

I've wondered whether it makes any difference for the "many small
files" case to do more aggressive inlining of extents.

I've seen, with the 16KiB leaf size, that small files which could be
inlined are often put into a data block group instead, taking up a
minimum 4KiB block (on x86_64 anyway). I'm not sure why, but I suspect
there just isn't enough room in that leaf to always use inline
extents, and yet there is enough room to reference them as data block
group extents. When using a larger node size, a larger percentage of
small files ended up using inline extents. I'd expect this to be
quite a bit more efficient, because it eliminates a time-expensive
seek (on HDD anyway).
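
If you want to check what you're actually getting, I think filefrag
will report it, e.g.:

    # inline files should show "inline" in the flags column;
    # small files stored in a data block group show a normal extent
    filefrag -v /path/to/some/small/file

I'm going from memory here; dumping the fs tree with
btrfs inspect-internal dump-tree would be the authoritative way to see
the inline extent items.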

Another optimization is compress=zstd:1, which is the lowest
compression setting. That'll increase the chance that a file can use
an inline extent, in particular with a larger nodesize.
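
For example, combining both in fstab (again just a sketch, with a
placeholder UUID and an illustrative max_inline value):

    UUID=<your-fs-uuid>  /data  btrfs  compress=zstd:1,max_inline=4096  0  0

zstd:1 is cheap on CPU, so on an HDD-bound workload the compression
itself shouldn't become the bottleneck.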

And still another optimization, at the expense of much more
complexity, is LVM cache with an SSD. You'd have to pick a suitable
policy for the workload, but if the iostat utilization you're seeing
is often near max in normal operation, I expect you'll see improved
performance. SSDs can handle way higher iops than HDDs. But a lot of
this optimization stuff is use-case specific; I'm not even sure what
your mean small file size is.
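
Roughly, the setup looks something like this (from memory and
untested; vg, slow_lv, the SSD device and the sizes are placeholders
for your own layout, and it only applies if the btrfs file system
sits on an LV rather than directly on the raw disks):

    # carve a cache pool out of the SSD
    lvcreate --type cache-pool -n fastcache -L 100G vg /dev/nvme0n1

    # attach it to the existing slow LV holding the file system
    lvconvert --type cache --cachepool vg/fastcache vg/slow_lv

    # writethrough is the safer mode; writeback also caches writes
    lvchange --cachemode writethrough vg/slow_lv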

> # iotop -d30
>
> Total DISK READ:        34.12 M/s | Total DISK WRITE: 40.36 M/s
> Current DISK READ:      34.12 M/s | Current DISK WRITE:      79.22 M/s
>    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
>   4596 be/4 root       34.12 M/s   37.79 M/s  0.00 % 91.77 % btrfs

Not so bad for many small file reads and writes on HDD. I've seen
this myself with a single spindle when doing small file reads and
writes.


-- 
Chris Murphy


