Re: very slow "btrfs dev delete" 3x6Tb, 7Tb of data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2019/12/26 上午6:35, Leszek Dubiel wrote:
> 
> Hello!
> 
> I have a server: 3 disks, 6TB each, total 17TB space, occupied with data
> 6TB.
> 
> 
> One of the disks got smart errors:
> 
>     197 Current_Pending_Sector  0x0022   100   100   000 Old_age  
> Always       -       16
>     198 Offline_Uncorrectable   0x0008   100   100   000 Old_age  
> Offline      -       2
> 
> And didn't pass tests:
> 
>     Num  Test_Description    Status                  Remaining
> LifeTime(hours)  LBA_of_first_error
>     # 1  Extended offline    Completed: read failure 90%     
> 3575         -
>     # 2  Short offline       Completed without error 00%     
> 3574         -
>     # 3  Extended offline    Completed: read failure 90%     
> 3574         -
>     # 4  Extended offline    Completed: read failure 90%     
> 3560         -
>     # 5  Extended offline    Completed: read failure 50%     
> 3559         -
> 
> I decided to remove that drive from BTRFS system:
> 
>     btrfs dev delete /dev/sda2 /
> 
> 
> At the beginning iostat showed that system is reading from /dev/sda
> 100Mb/sec, then writing to /dev/sdb + /dev/sdc 100Mb/sek. It looked
> correct, and during a few hours it moved aobut 1.5 TB of data out of the
> failing disk. But then it stuck... Iostat no longer showed reading or
> writing, "btrfs dev usage" didn't change. Couldn't stop "btrfs dev
> delete" command -- nor Ctrl+C nor "kill -9". I have rebooted system,
> started "btrfs dev delete" again.
> 
> But this is extremely slow. Disks can read/write 100-150Mb/sek. But
> "process btrfs dev delete" moved 53Gb during 6 hours, that is only about
> 2Mb/sek.
> 
> Why this is so slow? Below are some logs. Some other people also have
> had this problem...

Are you using qgroups?

You can verify it by "btrfs qgroup show -pcre <mount>" to verify.
Qgroup can hugely impact performance for balance, if using kernel older
than v5.2.

You can either disable qgroup, losing the ability to trace how many
bytes are exclusively used by each subvolume, or upgrade to v5.2 kernel
to solve it.

Thanks,
Qu

> 
> What is my problem? I am testing btrfs for many years, I have had some
> issues few years ago, recently I considered btrfs to be stable... but I
> cannot put btrfs into production when I have such problems when disk
> fails... to move 3Tb with 2mb/sec speed it would take many, many days.
> Server does almost nothing... it just deletes one drive. Cpu is not
> intensive, iostat looks lazy...
> 
> 
> 
> 
> #################################################
> 
> root@wawel:~# uname -a
> Linux wawel 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11)
> x86_64 GNU/Linux
> 
> root@wawel:~# btrfs --version
> btrfs-progs v4.20.1
> 
> root@wawel:~# dpkg -l | egrep btrfs
> ii  btrfs-progs                   4.20.1-2 amd64        Checksumming
> Copy on Write Filesystem utilities
> 
> #################################################
> 
> root@wawel:~# btrfs sub list /  | wc -l
> 21
> 
> #################################################
> 
> root@wawel:~# btrfs dev usage /
> /dev/sda2, ID: 1
>    Device size:             5.45TiB
>    Device slack:            5.45TiB
>    Data,single:             1.08TiB
>    Metadata,RAID1:         11.00GiB
>    System,RAID1:           32.00MiB
>    Unallocated:            -1.09TiB
> 
> /dev/sdb2, ID: 2
>    Device size:             5.45TiB
>    Device slack:              0.00B
>    Data,single:             3.22TiB
>    Metadata,RAID1:         87.00GiB
>    Unallocated:             2.15TiB
> 
> /dev/sdc2, ID: 3
>    Device size:             5.45TiB
>    Device slack:              0.00B
>    Data,single:             3.21TiB
>    Metadata,RAID1:         96.00GiB
>    System,RAID1:           32.00MiB
>    Unallocated:             2.15TiB
> 
> #################################################
> 
> root@wawel:~# iostat -m -d 30
> Linux 4.19.0-6-amd64 (wawel)     25.12.2019     _x86_64_    (4 CPU)
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda              29,52         2,77         0,90     158467 51287
> sdb             532,93         0,77        16,80      44046 962107
> sdc             532,03         0,59        16,79      34017 961406
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda               0,70         0,00         0,01 0          0
> sdb             624,20         0,05        14,17          1 424
> sdc             628,47         0,06        14,79          1 443
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda              17,90         0,23         0,59 7         17
> sdb             760,63         0,15        15,39          4 461
> sdc             761,13         0,09        16,75          2 502
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda              65,80         1,00         0,26 30          7
> sdb            1085,93         0,18        26,14          5 784
> sdc            1081,03         0,09        27,03          2 810
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda             101,53         1,57         0,10 47          3
> sdb             609,63         0,34        19,29         10 578
> sdc             612,67         0,33        19,73          9 591
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda             114,30         1,76         0,28 52          8
> sdb             355,90         0,39        19,60         11 587
> sdc             368,50         0,35        20,17         10 605
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda              63,70         0,55         0,54 16         16
> sdb             438,40         0,13         7,83          3 234
> sdc             463,97         0,09         8,93          2 267
> 
> root@wawel:~# iostat -m -d 30
> Linux 4.19.0-6-amd64 (wawel)     25.12.2019     _x86_64_    (4 CPU)
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda              30,18         2,51         1,01     226710 91462
> sdb             531,30         0,69        15,36      62022 1384919
> sdc             535,25         0,59        15,53      53092 1400122
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda               2,80         0,01         0,49 0         14
> sdb             154,77         0,53         3,05 15         91
> sdc             161,40         0,64         3,02 19         90
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda             124,87         0,01         3,02 0         90
> sdb             799,33         0,13        19,40          3 581
> sdc             930,63         0,23        22,21          6 666
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda               5,97         0,00         0,69 0         20
> sdb             730,10         0,05        14,62          1 438
> sdc             739,23         0,13        14,86          3 445
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda               6,53         0,01         1,39 0         41
> sdb             800,63         0,13        15,63          4 468
> sdc             801,83         0,02        16,12          0 483
> 
> Device             tps    MB_read/s    MB_wrtn/s    MB_read MB_wrtn
> sda               4,83         0,00         1,13 0         33
> sdb             756,30         0,14        15,08          4 452
> sdc             759,03         0,15        15,33          4 459
> 
> 
> 
> #################################################
> 
> root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >>
> btrfsdevusage
> root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >>
> btrfsdevusage
> root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >>
> btrfsdevusage
> root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >>
> btrfsdevusage
> 
> root@wawel:~# cat btrfsdevusage  | egrep sda -A5 | egrep single
> 2019-12-25 17:03:04       Data,single:          1188632199168
> 2019-12-25 18:31:27       Data,single:          1184337231872
> 2019-12-25 19:00:56       Data,single:          1177894780928
> 2019-12-25 20:25:46       Data,single:          1158567428096
> 2019-12-25 21:06:00       Data,single:          1147830009856
> 2019-12-25 23:01:34       Data,single:          1132797624320
> 2019-12-25 23:16:27       Data,single:          1130650140672
> 
> root@wawel:~# echo "speed is $(( ( 1188632199168  - 1130650140672 ) /
> 1024 / 1024 / ((23 - 17) * 3600)  )) MB/sec"
> speed is 2 MB/sec
> 
> #################################################
> 
> root@wawel:~# iotop -d12
> 
> Total DISK READ:         0.00 B/s | Total DISK WRITE:         0.00 B/s
> Current DISK READ:       0.00 B/s | Current DISK WRITE:     183.48 M/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
>  8138 be/4 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % btrfs dev
> delete /dev/sda2 /
>     1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
>     2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
>     3 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_gp]
>     4 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_par_gp]
>     6 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 %
> [kworker/0:0H-kblockd]
>     8 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [mm_percpu_wq]
> 
> Total DISK READ:        11.23 M/s | Total DISK WRITE:        28.77 M/s
> Current DISK READ:      11.23 M/s | Current DISK WRITE:      56.51 M/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
> 10349 be/4 root       29.26 K/s  168.89 K/s  0.00 % 75.29 %
> [kworker/u8:3-btrfs-endio-meta]
>  8138 be/4 root       11.03 M/s   23.61 M/s  0.00 % 74.65 % btrfs dev
> delete /dev/sda2 /
> 10351 be/4 root       45.21 K/s 1365.76 K/s  0.00 % 11.50 %
> [kworker/u8:8-btrfs-endio-write]
> 10350 be/4 root       63.83 K/s 1542.63 K/s  0.00 % 10.22 %
> [kworker/u8:7-btrfs-freespace-write]
>  7879 be/4 root       42.56 K/s 1186.23 K/s  0.00 %  7.07 %
> [kworker/u8:1-btrfs-freespace-write]
> 23032 be/4 root       25.27 K/s  972.12 K/s  0.00 %  5.04 %
> [kworker/u8:12-btrfs-extent-refs]
>  8035 be/4 root        0.00 B/s   25.27 K/s  0.00 %  0.00 %
> [kworker/u8:4-btrfs-endio-meta]
>  8036 be/4 root        0.00 B/s   11.97 K/s  0.00 %  0.00 %
> [kworker/u8:5-btrfs-extent-refs]
> 31253 be/4 root        0.00 B/s    5.32 K/s  0.00 %  0.00 %
> [kworker/u8:9-btrfs-extent-refs]
> 
> Total DISK READ:       674.25 K/s | Total DISK WRITE:        13.57 M/s
> Current DISK READ:     674.25 K/s | Current DISK WRITE:      39.97 M/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
>  8138 be/4 root      147.62 K/s    7.72 M/s  0.00 % 79.89 % btrfs dev
> delete /dev/sda2 /
> 10274 be/4 root      484.08 K/s    4.91 M/s  0.00 % 56.93 %
> [kworker/u8:2+btrfs-extent-refs]
> 10351 be/4 root       27.93 K/s  492.06 K/s  0.00 %  3.03 %
> [kworker/u8:8-btrfs-endio-meta]
> 10350 be/4 root        3.99 K/s  115.70 K/s  0.00 %  0.59 %
> [kworker/u8:7-btrfs-endio-write]
>  8037 be/4 root        5.32 K/s   79.79 K/s  0.00 %  0.40 %
> [kworker/u8:6-btrfs-endio-write]
> 23032 be/4 root        3.99 K/s   79.79 K/s  0.00 %  0.20 %
> [kworker/u8:12-btrfs-endio-write]
>  7879 be/4 root     1361.80 B/s  105.06 K/s  0.00 %  0.10 %
> [kworker/u8:1-btrfs-endio-write]
>   538 be/4 root        0.00 B/s 1021.35 B/s  0.00 %  0.00 % rsyslogd -n
> -iNONE [rs:main Q:Reg]
> 10349 be/4 root        0.00 B/s   86.44 K/s  0.00 %  0.00 %
> [kworker/u8:3-btrfs-endio-meta]
>     1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
> 
> 
> 
> #################################################
> 
> root@wawel:~# top -d12
> top - 23:23:01 up 1 day, 56 min,  1 user,  load average: 3,05, 3,28, 3,47
> Tasks: 137 total,   1 running, 136 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0,0 us,  4,5 sy,  0,0 ni, 92,1 id,  3,3 wa,  0,0 hi, 0,1 si, 
> 0,0 st
> MiB Mem :   7841,1 total,    797,0 free,    259,7 used,   6784,4 buff/cache
> MiB Swap:   8053,0 total,   7962,2 free,     90,8 used.   7265,2 avail Mem
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
>  8138 root      20   0    8540    104    104 D  11,6   0,0 23:25.88 btrfs
> 23032 root      20   0       0      0      0 I   4,8   0,0 0:51.20
> kworker/u8:12-btrfs-extent-refs
> 10350 root      20   0       0      0      0 I   0,9   0,0 0:01.79
> kworker/u8:7-btrfs-endio-meta
> 10349 root      20   0       0      0      0 I   0,8   0,0 0:02.12
> kworker/u8:3-btrfs-freespace-write
>   242 root       0 -20       0      0      0 I   0,2   0,0 0:59.66
> kworker/2:1H-kblockd
> 
> 
> #################################################
> 
> root@wawel:~# tail /var/log/kern.log -n50
> Dec 25 21:50:35 wawel kernel: [84258.634360] BTRFS info (device sda2):
> found 1752 extents
> Dec 25 21:50:41 wawel kernel: [84264.771079] BTRFS info (device sda2):
> relocating block group 3434964058112 flags data
> Dec 25 21:51:09 wawel kernel: [84291.864553] BTRFS info (device sda2):
> found 584 extents
> Dec 25 21:55:51 wawel kernel: [84574.383303] BTRFS info (device sda2):
> found 584 extents
> Dec 25 21:55:56 wawel kernel: [84579.359349] BTRFS info (device sda2):
> relocating block group 3431742832640 flags data
> Dec 25 21:56:17 wawel kernel: [84600.668237] BTRFS info (device sda2):
> found 635 extents
> Dec 25 22:00:36 wawel kernel: [84859.766366] BTRFS info (device sda2):
> found 635 extents
> Dec 25 22:00:47 wawel kernel: [84870.367399] BTRFS info (device sda2):
> relocating block group 3428521607168 flags data
> Dec 25 22:01:14 wawel kernel: [84897.313282] BTRFS info (device sda2):
> found 1170 extents
> Dec 25 22:05:38 wawel kernel: [85161.264397] BTRFS info (device sda2):
> found 1170 extents
> Dec 25 22:05:46 wawel kernel: [85169.216425] BTRFS info (device sda2):
> relocating block group 3425300381696 flags data
> Dec 25 22:06:20 wawel kernel: [85202.963222] BTRFS info (device sda2):
> found 1336 extents
> Dec 25 22:13:00 wawel kernel: [85603.648327] BTRFS info (device sda2):
> found 1336 extents
> Dec 25 22:13:02 wawel kernel: [85604.878785] BTRFS info (device sda2):
> relocating block group 3422079156224 flags data
> Dec 25 22:14:57 wawel kernel: [85720.394945] BTRFS info (device sda2):
> found 1771 extents
> Dec 25 22:23:03 wawel kernel: [86206.582245] BTRFS info (device sda2):
> found 1771 extents
> Dec 25 22:23:04 wawel kernel: [86207.671814] BTRFS info (device sda2):
> relocating block group 3418857930752 flags data
> Dec 25 22:23:19 wawel kernel: [86222.138761] BTRFS info (device sda2):
> found 199 extents
> Dec 25 22:26:26 wawel kernel: [86408.991487] BTRFS info (device sda2):
> found 199 extents
> Dec 25 22:26:29 wawel kernel: [86412.181922] BTRFS info (device sda2):
> relocating block group 3415636705280 flags data
> Dec 25 22:26:48 wawel kernel: [86431.726268] BTRFS info (device sda2):
> found 50 extents
> Dec 25 22:27:32 wawel kernel: [86475.800606] BTRFS info (device sda2):
> found 50 extents
> Dec 25 22:27:39 wawel kernel: [86482.222298] BTRFS info (device sda2):
> found 20 extents
> Dec 25 22:27:44 wawel kernel: [86487.168278] BTRFS info (device sda2):
> relocating block group 3412415479808 flags data
> Dec 25 22:28:27 wawel kernel: [86530.525657] BTRFS info (device sda2):
> found 6748 extents
> Dec 25 22:34:07 wawel kernel: [86870.618140] BTRFS info (device sda2):
> found 6748 extents
> Dec 25 22:34:35 wawel kernel: [86898.498949] BTRFS info (device sda2):
> relocating block group 3409194254336 flags data
> Dec 25 22:36:06 wawel kernel: [86989.589309] BTRFS info (device sda2):
> found 1175 extents
> Dec 25 22:40:12 wawel kernel: [87234.881523] perf: interrupt took too
> long (3943 > 3930), lowering kernel.perf_event_max_sample_rate to 50500
> Dec 25 22:44:44 wawel kernel: [87507.870507] BTRFS info (device sda2):
> found 1175 extents
> Dec 25 22:44:52 wawel kernel: [87515.563744] BTRFS info (device sda2):
> relocating block group 3405973028864 flags data
> Dec 25 22:45:14 wawel kernel: [87537.128352] BTRFS info (device sda2):
> found 372 extents
> Dec 25 22:46:11 wawel kernel: [87594.569709] BTRFS info (device sda2):
> found 372 extents
> Dec 25 22:46:17 wawel kernel: [87600.772681] BTRFS info (device sda2):
> relocating block group 3402751803392 flags data
> Dec 25 22:47:27 wawel kernel: [87670.663043] BTRFS info (device sda2):
> found 3483 extents
> Dec 25 23:04:57 wawel kernel: [88720.068885] BTRFS info (device sda2):
> found 3479 extents
> Dec 25 23:05:03 wawel kernel: [88726.864461] BTRFS info (device sda2):
> relocating block group 3399530577920 flags data
> Dec 25 23:06:36 wawel kernel: [88819.464525] BTRFS info (device sda2):
> found 1871 extents
> Dec 25 23:16:11 wawel kernel: [89393.997406] BTRFS info (device sda2):
> found 1868 extents
> Dec 25 23:16:13 wawel kernel: [89396.825305] BTRFS info (device sda2):
> relocating block group 3396309352448 flags data
> Dec 25 23:16:41 wawel kernel: [89424.310916] BTRFS info (device sda2):
> found 1946 extents
> Dec 25 23:22:12 wawel kernel: [89755.068457] BTRFS info (device sda2):
> found 1946 extents
> Dec 25 23:22:15 wawel kernel: [89758.601122] BTRFS info (device sda2):
> relocating block group 3393088126976 flags data
> Dec 25 23:22:35 wawel kernel: [89778.172631] BTRFS info (device sda2):
> found 647 extents
> Dec 25 23:23:06 wawel kernel: [89809.153016] BTRFS info (device sda2):
> found 647 extents
> Dec 25 23:23:07 wawel kernel: [89810.017006] BTRFS info (device sda2):
> relocating block group 3389866901504 flags data
> Dec 25 23:23:24 wawel kernel: [89827.351203] BTRFS info (device sda2):
> found 745 extents
> Dec 25 23:24:38 wawel kernel: [89901.176493] BTRFS info (device sda2):
> found 745 extents
> Dec 25 23:24:51 wawel kernel: [89914.840850] BTRFS info (device sda2):
> relocating block group 3386645676032 flags data
> Dec 25 23:25:40 wawel kernel: [89963.641062] BTRFS info (device sda2):
> found 3114 extents
> 
> 
> 
> 
> ################################################
> 
> More on this topic:
> 
> https://strugglers.net/~andy/blog/2018/06/15/another-disappointing-btrfs-experience/
> 
> 
> https://www.spinics.net/lists/linux-btrfs/msg50771.html
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1294531
> 
> 
> 
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux