Re: very slow "btrfs dev delete" 3x6Tb, 7Tb of data

On Thu, Jan 2, 2020 at 11:47 AM Leszek Dubiel <leszek@xxxxxxxxx> wrote:
>
>
>  > It might be handy to give users a clue on snapshot delete, like add
>  > "use btrfs sub list -d to monitor deletion progress, or btrfs sub sync
>  > to wait for deletion to finish".
>
> After having cleaned old shapshots, and after "dev delete" has
> completed I have added new fresh empty disk
>
>
>                   btrfs dev add /dev/sda3 /
>
>
> and started to balance:
>
>
>                       btrfs balance start -dconvert=raid1 -mconvert=raid1 /
>
>
> It was slow (3-5 MB/sec), so canceled balance.

I'm not sure why it's slow and I don't think it should be this slow,
but I would say that in retrospect it would have been better to NOT
delete the device with a few bad sectors, and instead use `btrfs
replace` to do a 1:1 replacement of that particular drive.

When you delete a device, it has to rewrite everything onto the two
remaining devices. And now that you've added another device, it has to
rewrite everything onto three devices. That's two full balances. Whereas
'btrfs replace' is optimized to copy block groups as-is from the source
to the destination drive. Only on a read error will Btrfs fall back to
mirror copies on the other devices.
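
For reference, a 1:1 replacement would have looked roughly like this
(the device names here are placeholders, not your actual layout):

    btrfs replace start -r /dev/sdX3 /dev/sdY3 /
    btrfs replace status /

where -r tells it to avoid reading from the source drive unless no
good mirror copy exists, which helps when the old drive has read
errors.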


> Device     r/s     w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
> sda       4,39  142,05   0,90   3,78    3,64   12,68  45,32   8,20     9,81     5,48    0,78    209,68     27,26   0,54   7,95
> sdb       4,66  155,25   0,97   4,03    4,52   13,11  49,27   7,78     9,25     4,68    0,73    213,20     26,59   0,49   7,89
> sdc       6,35  246,61   0,38   6,94    4,35   25,11  40,67   9,24    27,09    48,00   11,92     61,02     28,82   2,65  67,02

Almost no reads, all writes, but slow. And a rather high number of write
requests per second, almost double for sdc. And sdc is near its max
utilization, so it might be near its IOPS limit?

rareq-sz of ~210 means the average read request size is about 210 KiB
for sda and sdb.

Default mkfs and default mount options? Or something else, and if so, what?
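
If in doubt, something like this will show them (the mountpoint / and
/dev/sda3 are just examples):

    findmnt -no OPTIONS /
    btrfs inspect-internal dump-super /dev/sda3 | grep -E 'sectorsize|nodesize'

The dump-super output includes the sectorsize and nodesize the
filesystem was created with.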

Many small files on this file system? Or possibly large files with a
lot of fragmentation?
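
One rough way to check (the path is just an example) is filefrag on a
few of the bigger files:

    filefrag /srv/data/some-large-file

A very high extent count relative to the file size suggests heavy
fragmentation, though note that filefrag over-counts extents on
compressed btrfs files.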


> Device     r/s     w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
> sda       2,38  186,52   0,20   3,33    1,33    4,20  35,87   2,20     9,88     3,48    0,63     86,04     18,27   0,30   5,60
> sdb       1,00  108,35   0,02   1,99    0,00    2,73   0,00   2,46     8,92     3,78    0,41     18,00     18,85   0,27   2,98
> sdc      13,47  294,68   0,21   5,32    0,00    6,92   0,00   2,29    13,33    61,56   18,28     16,00     18,48   3,13  96,45

And again, sdc is at max utilization, with ~300 write requests per
second, which if I'm not mistaken is at the high end of what a fast hard
drive can do in IOPS. That's a lot of writes per second, and the average
write request size is only 18 KiB. As a rough sanity check, ~295
requests/s at ~18 KiB each works out to roughly 5 MB/s, which matches
the wMB/s column: the drive looks seek-bound, not bandwidth-bound.

So what's going on with the workload? Is this only a balance operation
or are there concurrent writes happening from some process?




> iotop:
>
> Total DISK READ:         0.00 B/s | Total DISK WRITE:         0.00 B/s
> Current DISK READ:       0.00 B/s | Current DISK WRITE:       0.00 B/s
>    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
>      1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
>      2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
>      3 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_gp]
>      4 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_par_gp]
>      6 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 %
> [kworker/0:0H-kblockd]
>      8 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [mm_percpu_wq]
>      9 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
>     10 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_sched]
>     11 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_bh]
>     12 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
>     14 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/0]
>     15 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/1]
>     16 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/1]
>     17 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/1]
>     19 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 %
> [kworker/1:0H-kblockd]
>     20 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/2]
>     21 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/2]
>     22 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/2]
>     24 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 %
> [kworker/2:0H-kblockd]
>     25 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/3]
>     26 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/3]
>     27 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/3]
>     29 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 %
> [kworker/3:0H-kblockd]
>     30 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/4]


iotop -d 5 -o might be more revealing; all zeros doesn't really make
sense. I see balance and scrub reported in iotop.
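
Something like this, run while the balance is going (the -P just
collapses threads into processes):

    iotop -d 5 -o -P
    btrfs balance status /

The first only lists tasks that actually did I/O during each 5 second
interval, and the second shows how many chunks the balance still has
left to process.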




-- 
Chris Murphy


