On 2019/12/26 上午6:35, Leszek Dubiel wrote: > > Hello! > > I have a server: 3 disks, 6TB each, total 17TB space, occupied with data > 6TB. > > > One of the disks got smart errors: > > 197 Current_Pending_Sector 0x0022 100 100 000 Old_age > Always - 16 > 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age > Offline - 2 > > And didn't pass tests: > > Num Test_Description Status Remaining > LifeTime(hours) LBA_of_first_error > # 1 Extended offline Completed: read failure 90% > 3575 - > # 2 Short offline Completed without error 00% > 3574 - > # 3 Extended offline Completed: read failure 90% > 3574 - > # 4 Extended offline Completed: read failure 90% > 3560 - > # 5 Extended offline Completed: read failure 50% > 3559 - > > I decided to remove that drive from BTRFS system: > > btrfs dev delete /dev/sda2 / > > > At the beginning iostat showed that system is reading from /dev/sda > 100Mb/sec, then writing to /dev/sdb + /dev/sdc 100Mb/sek. It looked > correct, and during a few hours it moved aobut 1.5 TB of data out of the > failing disk. But then it stuck... Iostat no longer showed reading or > writing, "btrfs dev usage" didn't change. Couldn't stop "btrfs dev > delete" command -- nor Ctrl+C nor "kill -9". I have rebooted system, > started "btrfs dev delete" again. > > But this is extremely slow. Disks can read/write 100-150Mb/sek. But > "process btrfs dev delete" moved 53Gb during 6 hours, that is only about > 2Mb/sek. > > Why this is so slow? Below are some logs. Some other people also have > had this problem... Are you using qgroups? You can verify it by "btrfs qgroup show -pcre <mount>" to verify. Qgroup can hugely impact performance for balance, if using kernel older than v5.2. You can either disable qgroup, losing the ability to trace how many bytes are exclusively used by each subvolume, or upgrade to v5.2 kernel to solve it. Thanks, Qu > > What is my problem? I am testing btrfs for many years, I have had some > issues few years ago, recently I considered btrfs to be stable... but I > cannot put btrfs into production when I have such problems when disk > fails... to move 3Tb with 2mb/sec speed it would take many, many days. > Server does almost nothing... it just deletes one drive. Cpu is not > intensive, iostat looks lazy... > > > > > ################################################# > > root@wawel:~# uname -a > Linux wawel 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) > x86_64 GNU/Linux > > root@wawel:~# btrfs --version > btrfs-progs v4.20.1 > > root@wawel:~# dpkg -l | egrep btrfs > ii btrfs-progs 4.20.1-2 amd64 Checksumming > Copy on Write Filesystem utilities > > ################################################# > > root@wawel:~# btrfs sub list / | wc -l > 21 > > ################################################# > > root@wawel:~# btrfs dev usage / > /dev/sda2, ID: 1 > Device size: 5.45TiB > Device slack: 5.45TiB > Data,single: 1.08TiB > Metadata,RAID1: 11.00GiB > System,RAID1: 32.00MiB > Unallocated: -1.09TiB > > /dev/sdb2, ID: 2 > Device size: 5.45TiB > Device slack: 0.00B > Data,single: 3.22TiB > Metadata,RAID1: 87.00GiB > Unallocated: 2.15TiB > > /dev/sdc2, ID: 3 > Device size: 5.45TiB > Device slack: 0.00B > Data,single: 3.21TiB > Metadata,RAID1: 96.00GiB > System,RAID1: 32.00MiB > Unallocated: 2.15TiB > > ################################################# > > root@wawel:~# iostat -m -d 30 > Linux 4.19.0-6-amd64 (wawel) 25.12.2019 _x86_64_ (4 CPU) > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 29,52 2,77 0,90 158467 51287 > sdb 532,93 0,77 16,80 44046 962107 > sdc 532,03 0,59 16,79 34017 961406 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 0,70 0,00 0,01 0 0 > sdb 624,20 0,05 14,17 1 424 > sdc 628,47 0,06 14,79 1 443 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 17,90 0,23 0,59 7 17 > sdb 760,63 0,15 15,39 4 461 > sdc 761,13 0,09 16,75 2 502 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 65,80 1,00 0,26 30 7 > sdb 1085,93 0,18 26,14 5 784 > sdc 1081,03 0,09 27,03 2 810 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 101,53 1,57 0,10 47 3 > sdb 609,63 0,34 19,29 10 578 > sdc 612,67 0,33 19,73 9 591 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 114,30 1,76 0,28 52 8 > sdb 355,90 0,39 19,60 11 587 > sdc 368,50 0,35 20,17 10 605 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 63,70 0,55 0,54 16 16 > sdb 438,40 0,13 7,83 3 234 > sdc 463,97 0,09 8,93 2 267 > > root@wawel:~# iostat -m -d 30 > Linux 4.19.0-6-amd64 (wawel) 25.12.2019 _x86_64_ (4 CPU) > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 30,18 2,51 1,01 226710 91462 > sdb 531,30 0,69 15,36 62022 1384919 > sdc 535,25 0,59 15,53 53092 1400122 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 2,80 0,01 0,49 0 14 > sdb 154,77 0,53 3,05 15 91 > sdc 161,40 0,64 3,02 19 90 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 124,87 0,01 3,02 0 90 > sdb 799,33 0,13 19,40 3 581 > sdc 930,63 0,23 22,21 6 666 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 5,97 0,00 0,69 0 20 > sdb 730,10 0,05 14,62 1 438 > sdc 739,23 0,13 14,86 3 445 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 6,53 0,01 1,39 0 41 > sdb 800,63 0,13 15,63 4 468 > sdc 801,83 0,02 16,12 0 483 > > Device tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sda 4,83 0,00 1,13 0 33 > sdb 756,30 0,14 15,08 4 452 > sdc 759,03 0,15 15,33 4 459 > > > > ################################################# > > root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >> > btrfsdevusage > root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >> > btrfsdevusage > root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >> > btrfsdevusage > root@wawel:~# btrfs dev usage -b / | sed -r "s/^/$(date +'%F %T')\t/" >> > btrfsdevusage > > root@wawel:~# cat btrfsdevusage | egrep sda -A5 | egrep single > 2019-12-25 17:03:04 Data,single: 1188632199168 > 2019-12-25 18:31:27 Data,single: 1184337231872 > 2019-12-25 19:00:56 Data,single: 1177894780928 > 2019-12-25 20:25:46 Data,single: 1158567428096 > 2019-12-25 21:06:00 Data,single: 1147830009856 > 2019-12-25 23:01:34 Data,single: 1132797624320 > 2019-12-25 23:16:27 Data,single: 1130650140672 > > root@wawel:~# echo "speed is $(( ( 1188632199168 - 1130650140672 ) / > 1024 / 1024 / ((23 - 17) * 3600) )) MB/sec" > speed is 2 MB/sec > > ################################################# > > root@wawel:~# iotop -d12 > > Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s > Current DISK READ: 0.00 B/s | Current DISK WRITE: 183.48 M/s > TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND > 8138 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % btrfs dev > delete /dev/sda2 / > 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init > 2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd] > 3 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_gp] > 4 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_par_gp] > 6 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % > [kworker/0:0H-kblockd] > 8 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [mm_percpu_wq] > > Total DISK READ: 11.23 M/s | Total DISK WRITE: 28.77 M/s > Current DISK READ: 11.23 M/s | Current DISK WRITE: 56.51 M/s > TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND > 10349 be/4 root 29.26 K/s 168.89 K/s 0.00 % 75.29 % > [kworker/u8:3-btrfs-endio-meta] > 8138 be/4 root 11.03 M/s 23.61 M/s 0.00 % 74.65 % btrfs dev > delete /dev/sda2 / > 10351 be/4 root 45.21 K/s 1365.76 K/s 0.00 % 11.50 % > [kworker/u8:8-btrfs-endio-write] > 10350 be/4 root 63.83 K/s 1542.63 K/s 0.00 % 10.22 % > [kworker/u8:7-btrfs-freespace-write] > 7879 be/4 root 42.56 K/s 1186.23 K/s 0.00 % 7.07 % > [kworker/u8:1-btrfs-freespace-write] > 23032 be/4 root 25.27 K/s 972.12 K/s 0.00 % 5.04 % > [kworker/u8:12-btrfs-extent-refs] > 8035 be/4 root 0.00 B/s 25.27 K/s 0.00 % 0.00 % > [kworker/u8:4-btrfs-endio-meta] > 8036 be/4 root 0.00 B/s 11.97 K/s 0.00 % 0.00 % > [kworker/u8:5-btrfs-extent-refs] > 31253 be/4 root 0.00 B/s 5.32 K/s 0.00 % 0.00 % > [kworker/u8:9-btrfs-extent-refs] > > Total DISK READ: 674.25 K/s | Total DISK WRITE: 13.57 M/s > Current DISK READ: 674.25 K/s | Current DISK WRITE: 39.97 M/s > TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND > 8138 be/4 root 147.62 K/s 7.72 M/s 0.00 % 79.89 % btrfs dev > delete /dev/sda2 / > 10274 be/4 root 484.08 K/s 4.91 M/s 0.00 % 56.93 % > [kworker/u8:2+btrfs-extent-refs] > 10351 be/4 root 27.93 K/s 492.06 K/s 0.00 % 3.03 % > [kworker/u8:8-btrfs-endio-meta] > 10350 be/4 root 3.99 K/s 115.70 K/s 0.00 % 0.59 % > [kworker/u8:7-btrfs-endio-write] > 8037 be/4 root 5.32 K/s 79.79 K/s 0.00 % 0.40 % > [kworker/u8:6-btrfs-endio-write] > 23032 be/4 root 3.99 K/s 79.79 K/s 0.00 % 0.20 % > [kworker/u8:12-btrfs-endio-write] > 7879 be/4 root 1361.80 B/s 105.06 K/s 0.00 % 0.10 % > [kworker/u8:1-btrfs-endio-write] > 538 be/4 root 0.00 B/s 1021.35 B/s 0.00 % 0.00 % rsyslogd -n > -iNONE [rs:main Q:Reg] > 10349 be/4 root 0.00 B/s 86.44 K/s 0.00 % 0.00 % > [kworker/u8:3-btrfs-endio-meta] > 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init > > > > ################################################# > > root@wawel:~# top -d12 > top - 23:23:01 up 1 day, 56 min, 1 user, load average: 3,05, 3,28, 3,47 > Tasks: 137 total, 1 running, 136 sleeping, 0 stopped, 0 zombie > %Cpu(s): 0,0 us, 4,5 sy, 0,0 ni, 92,1 id, 3,3 wa, 0,0 hi, 0,1 si, > 0,0 st > MiB Mem : 7841,1 total, 797,0 free, 259,7 used, 6784,4 buff/cache > MiB Swap: 8053,0 total, 7962,2 free, 90,8 used. 7265,2 avail Mem > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 8138 root 20 0 8540 104 104 D 11,6 0,0 23:25.88 btrfs > 23032 root 20 0 0 0 0 I 4,8 0,0 0:51.20 > kworker/u8:12-btrfs-extent-refs > 10350 root 20 0 0 0 0 I 0,9 0,0 0:01.79 > kworker/u8:7-btrfs-endio-meta > 10349 root 20 0 0 0 0 I 0,8 0,0 0:02.12 > kworker/u8:3-btrfs-freespace-write > 242 root 0 -20 0 0 0 I 0,2 0,0 0:59.66 > kworker/2:1H-kblockd > > > ################################################# > > root@wawel:~# tail /var/log/kern.log -n50 > Dec 25 21:50:35 wawel kernel: [84258.634360] BTRFS info (device sda2): > found 1752 extents > Dec 25 21:50:41 wawel kernel: [84264.771079] BTRFS info (device sda2): > relocating block group 3434964058112 flags data > Dec 25 21:51:09 wawel kernel: [84291.864553] BTRFS info (device sda2): > found 584 extents > Dec 25 21:55:51 wawel kernel: [84574.383303] BTRFS info (device sda2): > found 584 extents > Dec 25 21:55:56 wawel kernel: [84579.359349] BTRFS info (device sda2): > relocating block group 3431742832640 flags data > Dec 25 21:56:17 wawel kernel: [84600.668237] BTRFS info (device sda2): > found 635 extents > Dec 25 22:00:36 wawel kernel: [84859.766366] BTRFS info (device sda2): > found 635 extents > Dec 25 22:00:47 wawel kernel: [84870.367399] BTRFS info (device sda2): > relocating block group 3428521607168 flags data > Dec 25 22:01:14 wawel kernel: [84897.313282] BTRFS info (device sda2): > found 1170 extents > Dec 25 22:05:38 wawel kernel: [85161.264397] BTRFS info (device sda2): > found 1170 extents > Dec 25 22:05:46 wawel kernel: [85169.216425] BTRFS info (device sda2): > relocating block group 3425300381696 flags data > Dec 25 22:06:20 wawel kernel: [85202.963222] BTRFS info (device sda2): > found 1336 extents > Dec 25 22:13:00 wawel kernel: [85603.648327] BTRFS info (device sda2): > found 1336 extents > Dec 25 22:13:02 wawel kernel: [85604.878785] BTRFS info (device sda2): > relocating block group 3422079156224 flags data > Dec 25 22:14:57 wawel kernel: [85720.394945] BTRFS info (device sda2): > found 1771 extents > Dec 25 22:23:03 wawel kernel: [86206.582245] BTRFS info (device sda2): > found 1771 extents > Dec 25 22:23:04 wawel kernel: [86207.671814] BTRFS info (device sda2): > relocating block group 3418857930752 flags data > Dec 25 22:23:19 wawel kernel: [86222.138761] BTRFS info (device sda2): > found 199 extents > Dec 25 22:26:26 wawel kernel: [86408.991487] BTRFS info (device sda2): > found 199 extents > Dec 25 22:26:29 wawel kernel: [86412.181922] BTRFS info (device sda2): > relocating block group 3415636705280 flags data > Dec 25 22:26:48 wawel kernel: [86431.726268] BTRFS info (device sda2): > found 50 extents > Dec 25 22:27:32 wawel kernel: [86475.800606] BTRFS info (device sda2): > found 50 extents > Dec 25 22:27:39 wawel kernel: [86482.222298] BTRFS info (device sda2): > found 20 extents > Dec 25 22:27:44 wawel kernel: [86487.168278] BTRFS info (device sda2): > relocating block group 3412415479808 flags data > Dec 25 22:28:27 wawel kernel: [86530.525657] BTRFS info (device sda2): > found 6748 extents > Dec 25 22:34:07 wawel kernel: [86870.618140] BTRFS info (device sda2): > found 6748 extents > Dec 25 22:34:35 wawel kernel: [86898.498949] BTRFS info (device sda2): > relocating block group 3409194254336 flags data > Dec 25 22:36:06 wawel kernel: [86989.589309] BTRFS info (device sda2): > found 1175 extents > Dec 25 22:40:12 wawel kernel: [87234.881523] perf: interrupt took too > long (3943 > 3930), lowering kernel.perf_event_max_sample_rate to 50500 > Dec 25 22:44:44 wawel kernel: [87507.870507] BTRFS info (device sda2): > found 1175 extents > Dec 25 22:44:52 wawel kernel: [87515.563744] BTRFS info (device sda2): > relocating block group 3405973028864 flags data > Dec 25 22:45:14 wawel kernel: [87537.128352] BTRFS info (device sda2): > found 372 extents > Dec 25 22:46:11 wawel kernel: [87594.569709] BTRFS info (device sda2): > found 372 extents > Dec 25 22:46:17 wawel kernel: [87600.772681] BTRFS info (device sda2): > relocating block group 3402751803392 flags data > Dec 25 22:47:27 wawel kernel: [87670.663043] BTRFS info (device sda2): > found 3483 extents > Dec 25 23:04:57 wawel kernel: [88720.068885] BTRFS info (device sda2): > found 3479 extents > Dec 25 23:05:03 wawel kernel: [88726.864461] BTRFS info (device sda2): > relocating block group 3399530577920 flags data > Dec 25 23:06:36 wawel kernel: [88819.464525] BTRFS info (device sda2): > found 1871 extents > Dec 25 23:16:11 wawel kernel: [89393.997406] BTRFS info (device sda2): > found 1868 extents > Dec 25 23:16:13 wawel kernel: [89396.825305] BTRFS info (device sda2): > relocating block group 3396309352448 flags data > Dec 25 23:16:41 wawel kernel: [89424.310916] BTRFS info (device sda2): > found 1946 extents > Dec 25 23:22:12 wawel kernel: [89755.068457] BTRFS info (device sda2): > found 1946 extents > Dec 25 23:22:15 wawel kernel: [89758.601122] BTRFS info (device sda2): > relocating block group 3393088126976 flags data > Dec 25 23:22:35 wawel kernel: [89778.172631] BTRFS info (device sda2): > found 647 extents > Dec 25 23:23:06 wawel kernel: [89809.153016] BTRFS info (device sda2): > found 647 extents > Dec 25 23:23:07 wawel kernel: [89810.017006] BTRFS info (device sda2): > relocating block group 3389866901504 flags data > Dec 25 23:23:24 wawel kernel: [89827.351203] BTRFS info (device sda2): > found 745 extents > Dec 25 23:24:38 wawel kernel: [89901.176493] BTRFS info (device sda2): > found 745 extents > Dec 25 23:24:51 wawel kernel: [89914.840850] BTRFS info (device sda2): > relocating block group 3386645676032 flags data > Dec 25 23:25:40 wawel kernel: [89963.641062] BTRFS info (device sda2): > found 3114 extents > > > > > ################################################ > > More on this topic: > > https://strugglers.net/~andy/blog/2018/06/15/another-disappointing-btrfs-experience/ > > > https://www.spinics.net/lists/linux-btrfs/msg50771.html > > https://bugzilla.redhat.com/show_bug.cgi?id=1294531 > > > > >
Attachment:
signature.asc
Description: OpenPGP digital signature
