On Thu, Jan 2, 2020 at 11:47 AM Leszek Dubiel <leszek@xxxxxxxxx> wrote:
>
>
> > It might be handy to give users a clue on snapshot delete, like add
> > "use btrfs sub list -d to monitor deletion progress, or btrfs sub sync
> > to wait for deletion to finish".
>
> After having cleaned old snapshots, and after "dev delete" has
> completed, I added a new, fresh, empty disk
>
>     btrfs dev add /dev/sda3 /
>
> and started to balance:
>
>     btrfs balance start -dconvert=raid1 -mconvert=raid1 /
>
> It was slow (3-5 MB/sec), so I canceled the balance.

I'm not sure why it's slow, and I don't think it should be this slow,
but I would say that in retrospect it would have been better NOT to
delete the device with a few bad sectors, and instead use 'btrfs
replace' to do a 1:1 replacement of that particular drive (rough
example further down). When you delete a device, everything has to be
rewritten onto the two remaining devices. And now that you've added
another device, everything has to be rewritten again onto three
devices. That's two full balances. 'btrfs replace', by contrast, is
optimized to copy block groups as-is from the source drive to the
destination drive; only on a read error will Btrfs fall back to the
mirror copies on the other devices.

> Device    r/s     w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
> sda      4,39  142,05   0,90   3,78    3,64   12,68  45,32   8,20     9,81     5,48    0,78    209,68     27,26   0,54   7,95
> sdb      4,66  155,25   0,97   4,03    4,52   13,11  49,27   7,78     9,25     4,68    0,73    213,20     26,59   0,49   7,89
> sdc      6,35  246,61   0,38   6,94    4,35   25,11  40,67   9,24    27,09    48,00   11,92     61,02     28,82   2,65  67,02

Almost no reads, all writes, but slow. And a rather high number of
write requests per second, almost double on sdc. And sdc is near its
max utilization, so it might be near its IOPS limit? A rareq-sz of
~210 means the average read request on sda and sdb is about 210 KiB.

Default mkfs and default mount options? Or if not, which ones? Many
small files on this file system? Or possibly large files with a lot of
fragmentation?

> Device     r/s     w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
> sda       2,38  186,52   0,20   3,33    1,33    4,20  35,87   2,20     9,88     3,48    0,63     86,04     18,27   0,30   5,60
> sdb       1,00  108,35   0,02   1,99    0,00    2,73   0,00   2,46     8,92     3,78    0,41     18,00     18,85   0,27   2,98
> sdc      13,47  294,68   0,21   5,32    0,00    6,92   0,00   2,29    13,33    61,56   18,28     16,00     18,48   3,13  96,45

And again, sdc is at max utilization, with ~300 write requests per
second, which, if I'm not mistaken, is at the high end for a fast
drive's IOPS. That's a lot of writes per second. The average write
request size is 18 KiB.

So what's going on with the workload? Is this only a balance
operation, or are there concurrent writes happening from some other
process?
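
To answer the mount option and concurrent-writer questions, something
along these lines should show it (assuming the file system is mounted
at /, adjust the path to your setup):

    findmnt -t btrfs -o TARGET,SOURCE,OPTIONS
    btrfs filesystem usage /
    btrfs balance status /

findmnt shows the mount options actually in effect, 'btrfs filesystem
usage' shows how the data and metadata block groups are currently
split between the old and new profiles, and 'btrfs balance status'
shows whether a balance is running and how far along it is.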
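
And for future reference, the 1:1 replacement mentioned above would
look roughly like this (the device names are placeholders for the
failing drive and its replacement):

    btrfs replace start -r /dev/OLD /dev/NEW /
    btrfs replace status /

The -r option tells replace to avoid reading from the source drive
whenever a good mirror copy exists elsewhere, which is what you want
when the source has bad sectors. This copies block groups onto the new
drive as-is instead of rebalancing the whole file system twice.
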
> iotop:
>
> Total DISK READ:       0.00 B/s | Total DISK WRITE:       0.00 B/s
> Current DISK READ:     0.00 B/s | Current DISK WRITE:     0.00 B/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO>    COMMAND
>     1  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  init
>     2  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [kthreadd]
>     3  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [rcu_gp]
>     4  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [rcu_par_gp]
>     6  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [kworker/0:0H-kblockd]
>     8  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [mm_percpu_wq]
>     9  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [ksoftirqd/0]
>    10  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [rcu_sched]
>    11  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [rcu_bh]
>    12  rt/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [migration/0]
>    14  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [cpuhp/0]
>    15  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [cpuhp/1]
>    16  rt/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [migration/1]
>    17  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [ksoftirqd/1]
>    19  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [kworker/1:0H-kblockd]
>    20  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [cpuhp/2]
>    21  rt/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [migration/2]
>    22  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [ksoftirqd/2]
>    24  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [kworker/2:0H-kblockd]
>    25  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [cpuhp/3]
>    26  rt/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [migration/3]
>    27  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [ksoftirqd/3]
>    29  be/0  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [kworker/3:0H-kblockd]
>    30  be/4  root      0.00 B/s    0.00 B/s  0.00 %  0.00 %  [cpuhp/4]

iotop -d 5 -o might be more revealing; all zeros doesn't really make
sense. I see balance and scrub reported in iotop.

--
Chris Murphy
