Howdy,
I have 6 btrfs pools on my laptop on 3 different SSDs.
After a few years, one of them is now very slow to scrub
and hangs my laptop while it runs.
This started under 5.3.8, but upgrading to 5.4.8 didn't fix it.
Also, it reported 'errors during scrubbing', but I see nothing in the kernel log:
btrfs scrub start -Bd /mnt/btrfs_pool2
scrub device /dev/mapper/pool2 (id 1) done
scrub started at Thu Jan 9 01:46:45 2020 and finished after 01:29:49
total bytes scrubbed: 1.27TiB with 0 errors
WARNING: errors detected during scrubbing, corrected
real 89m49.190s
user 0m0.000s
sys 13m26.548s
89min is also longer than normal.
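For scale, here's the rough throughput that works out to (1.27 TiB scrubbed in 89m49s, i.e. 5389 seconds):

```shell
# Rough scrub throughput from the numbers above:
# 1.27 TiB = 1.27 * 1024 * 1024 MiB, divided by 89*60+49 seconds.
awk 'BEGIN { printf "%.0f MiB/s\n", 1.27 * 1024 * 1024 / (89*60 + 49) }'
```

That's around 247 MiB/s average, which by itself wouldn't be terrible for this SSD, but the hangs while it runs are the real problem.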
balance works ok:
logger: Quick Metadata and Data Balance of /mnt/btrfs_pool2 (/dev/mapper/pool2)
Done, had to relocate 0 out of 837 chunks
Done, had to relocate 0 out of 837 chunks
Done, had to relocate 0 out of 837 chunks
I re-ran a bigger balance, and it ran fine too:
btrfs balance start -musage=60 /mnt/btrfs_pool2; btrfs balance start -dusage=60 /mnt/btrfs_pool2
Jan 9 01:46:45 saruman kernel: [14530.056667] BTRFS info (device dm-3): balance: start -musage=0 -susage=0
Jan 9 01:46:45 saruman kernel: [14530.059623] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.134043] BTRFS info (device dm-3): balance: start -dusage=0
Jan 9 01:46:45 saruman kernel: [14530.135525] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.193798] BTRFS info (device dm-3): balance: start -dusage=20
Jan 9 01:46:45 saruman kernel: [14530.195642] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.240290] BTRFS info (device dm-3): scrub: started on devid 1
Jan 9 01:58:21 saruman kernel: [15226.254196] Tainted: G W OE 5.4.8-amd64-preempt-sysrq-20190816 #1
Jan 9 01:58:21 saruman kernel: [15226.254198] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 9 01:58:21 saruman kernel: [15226.254201] btrfs-transacti D 0 12403 2 0x80004000
Jan 9 01:58:21 saruman kernel: [15226.254204] Call Trace:
Jan 9 01:58:21 saruman kernel: [15226.254211] ? __schedule+0x575/0x5d0
Jan 9 01:58:21 saruman kernel: [15226.254215] ? __list_add+0x12/0x2b
Jan 9 01:58:21 saruman kernel: [15226.254218] schedule+0x7b/0xac
Jan 9 01:58:21 saruman kernel: [15226.254222] btrfs_scrub_pause+0x99/0xd3
Jan 9 01:58:21 saruman kernel: [15226.254226] ? finish_wait+0x62/0x62
Jan 9 01:58:21 saruman kernel: [15226.254231] btrfs_commit_transaction+0x307/0x82b
Jan 9 01:58:21 saruman kernel: [15226.254235] ? start_transaction+0x37b/0x3ec
Jan 9 01:58:21 saruman kernel: [15226.254239] ? schedule_timeout+0xf/0xea
Jan 9 01:58:21 saruman kernel: [15226.254243] transaction_kthread+0xdd/0x151
Jan 9 01:58:21 saruman kernel: [15226.254247] ? btrfs_cleanup_transaction+0x417/0x417
Jan 9 01:58:21 saruman kernel: [15226.254250] kthread+0xf5/0xfa
Jan 9 01:58:21 saruman kernel: [15226.254253] ? kthread_create_worker_on_cpu+0x65/0x65
Jan 9 01:58:21 saruman kernel: [15226.254256] ret_from_fork+0x35/0x40
Jan 9 01:58:21 saruman kernel: [15226.254554] INFO: task cron:3869 blocked for more than 120 seconds.
From here, lots of hung-task warnings until eventually:
Jan 9 03:16:34 saruman kernel: [19919.454109] BTRFS info (device dm-3): scrub: finished on devid 1 with status: 0
I see no error about the scrub though.
saruman:/mnt/btrfs_pool2# btrfs fi show .
Label: 'btrfs_pool2' uuid: c3ac7621-79da-4d4f-bd59-d12fe7ba3578
Total devices 1 FS bytes used 785.58GiB
devid 1 size 1.12TiB used 831.21GiB path /dev/mapper/pool2
saruman:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=817.08GiB, used=779.88GiB
System, DUP: total=64.00MiB, used=128.00KiB
Metadata, DUP: total=7.00GiB, used=5.70GiB
GlobalReserve, single: total=512.00MiB, used=64.00KiB
saruman:/mnt/btrfs_pool2# btrfs fi usage .
Overall:
Device size: 1.12TiB
Device allocated: 831.21GiB
Device unallocated: 315.79GiB
Device missing: 0.00B
Used: 791.28GiB
Free (estimated): 352.99GiB (min: 195.10GiB)
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Data,single: Size:817.08GiB, Used:779.88GiB
/dev/mapper/pool2 817.08GiB
Metadata,DUP: Size:7.00GiB, Used:5.70GiB
/dev/mapper/pool2 14.00GiB
System,DUP: Size:64.00MiB, Used:128.00KiB
/dev/mapper/pool2 128.00MiB
Unallocated:
/dev/mapper/pool2 315.79GiB
I'm going to stop the scrub for now, but clearly that's not so good.
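For the record, stopping it is just the standard btrfs-progs cancel command (state is checkpointed, so it can be picked up again later):

```shell
# Cancel the in-progress scrub on this pool; progress is saved and
# can be continued later with 'btrfs scrub resume /mnt/btrfs_pool2'.
btrfs scrub cancel /mnt/btrfs_pool2
```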
What should I try next?
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08