On Thu, May 2, 2019 at 1:02 PM Hendrik Friedel <hendrik@xxxxxxxxxxxxx> wrote:
>
> >What scheduler is being used for the drive?
> >
> ># cat /sys/block/<dev>/queue/scheduler
> [mq-deadline] none

At first I thought you might be running into this bug:
https://lwn.net/Articles/774440/

However:

[Mo Apr 29 20:44:47 2019] Not tainted 4.19.0-0.bpo.2-amd64 #1 Debian 4.19.16-1~bpo9+1

This is actually based on 4.19.16, which has the fix for that.

[Mo Apr 29 06:44:32 2019] systemd[1]: apt-daily-upgrade.timer: Adding 36min 35.299087s random time.
[Mo Apr 29 20:44:47 2019] INFO: task btrfs-transacti:10227 blocked for more than 120 seconds.

Literally nothing for hours before the blocking. And I don't see
anything off during device discovery.

Qu would know better, but usually developers ask for sysrq+w when
there are blocked tasks.
https://www.kernel.org/doc/html/v4.11/admin-guide/sysrq.html

Basically, as root, issue:

# echo 1 >/proc/sys/kernel/sysrq
# echo w > /proc/sysrq-trigger

What I do is run the first command and type out the second command but
do not press return; in another shell reproduce the hang, and then go
back to the first shell and hit return. That way it doesn't take a
minute or two to type out during the hang. The result appears in
dmesg, so stop the operation causing the hang if possible and then
'dmesg>dmesg.txt' and attach it.

Also, you'll want to reboot with 'log_buf_len=1M' on the kernel command
line first, because the sysrq+w output that gets dumped to dmesg will
otherwise fill up the kernel message buffer.

> Done (also the two smartctl outputs).

I don't see anything weird there either. The errors are a little weird
but predate the Btrfs error by a lot.

> I was tempted to ask, whether this should be fixed. On the other hand, I am not even sure anything bad happened (except, well, the system -at least the copy- seemed to hang).

It could be a bug somewhere, but the question is where. The workload is
only a copy? That seems trivial and not prone to lock contention.

You know what? Try changing the scheduler from mq-deadline to none.
Change nothing else. Now try to reproduce. Let's see if it still
happens.

Also, what are the mount options?

-- 
Chris Murphy
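
For reference, the scheduler switch and the sysrq+w trigger described in this message could be wrapped in a few shell helpers, roughly as sketched below. The function names and the SYSFS_ROOT / PROC_ROOT variables are made up for illustration (they exist only so the helpers can be dry-run against a scratch directory). The sysfs and procfs paths themselves are the ones quoted in the message. On a real system this needs root.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT, PROC_ROOT and the function names are hypothetical.
# They default to the real /sys and /proc; override them for a dry run.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the active scheduler for a device, i.e. the name shown in [brackets].
current_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$SYSFS_ROOT/block/$1/queue/scheduler"
}

# Select a scheduler (e.g. "none") for one device.
set_scheduler() {
    echo "$2" > "$SYSFS_ROOT/block/$1/queue/scheduler"
}

# Enable sysrq and ask the kernel to dump blocked tasks (sysrq+w).
# The output lands in the kernel log; save it afterwards with dmesg.
dump_blocked_tasks() {
    echo 1 > "$PROC_ROOT/sys/kernel/sysrq"
    echo w > "$PROC_ROOT/sysrq-trigger"
}
```

With the helpers loaded, something like 'set_scheduler sdX none' followed by 'current_scheduler sdX' would confirm the change before reproducing the copy, where sdX is a placeholder for the actual device. A setting made this way does not persist across reboots.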
