Re: btrfs-progs reports nonsense scrub status

On Sun, May 10, 2020 at 7:39 PM Andrew Pam <andrew@xxxxxxxxxxxxxx> wrote:
>
> On 10/5/20 6:33 am, Chris Murphy wrote:
> > 2. That a scrub started, then cancelled, then resumed, also does
> > finish (or not).
>
> OK, I have now run a scrub with multiple cancel and resumes and that
> also proceeded and finished normally as expected:
>
> $ sudo ./btrfs scrub status -d /home
> NOTE: Reading progress from status file
> UUID:             85069ce9-be06-4c92-b8c1-8a0f685e43c6
> scrub device /dev/sda (id 1) history
> Scrub resumed:    Mon May 11 06:06:37 2020
> Status:           finished
> Duration:         7:27:31
> Total to scrub:   3.67TiB
> Rate:             142.96MiB/s
> Error summary:    no errors found
> scrub device /dev/sdb (id 2) history
> Scrub resumed:    Mon May 11 06:06:37 2020
> Status:           finished
> Duration:         7:27:15
> Total to scrub:   3.67TiB
> Rate:             143.04MiB/s
> Error summary:    no errors found
>
> [54472.936094] BTRFS info (device sda): scrub: started on devid 2
> [54472.936095] BTRFS info (device sda): scrub: started on devid 1
> [55224.956293] BTRFS info (device sda): scrub: not finished on devid 1
> with status: -125
> [55226.356563] BTRFS info (device sda): scrub: not finished on devid 2
> with status: -125
> [58775.602370] BTRFS info (device sda): scrub: started on devid 1
> [58775.602372] BTRFS info (device sda): scrub: started on devid 2
> [72393.296199] BTRFS info (device sda): scrub: not finished on devid 1
> with status: -125
> [72393.296215] BTRFS info (device sda): scrub: not finished on devid 2
> with status: -125
> [77731.999603] BTRFS info (device sda): scrub: started on devid 1
> [77731.999604] BTRFS info (device sda): scrub: started on devid 2
> [87727.510382] BTRFS info (device sda): scrub: not finished on devid 1
> with status: -125
> [87727.582401] BTRFS info (device sda): scrub: not finished on devid 2
> with status: -125
> [89358.196384] BTRFS info (device sda): scrub: started on devid 1
> [89358.196386] BTRFS info (device sda): scrub: started on devid 2
> [89830.639654] BTRFS info (device sda): scrub: not finished on devid 2
> with status: -125
> [89830.856232] BTRFS info (device sda): scrub: not finished on devid 1
> with status: -125
> [94486.300097] BTRFS info (device sda): scrub: started on devid 2
> [94486.300098] BTRFS info (device sda): scrub: started on devid 1
> [96223.185459] BTRFS info (device sda): scrub: not finished on devid 1
> with status: -125
> [96223.227246] BTRFS info (device sda): scrub: not finished on devid 2
> with status: -125
> [97810.489388] BTRFS info (device sda): scrub: started on devid 1
> [97810.540625] BTRFS info (device sda): scrub: started on devid 2
> [98068.987932] BTRFS info (device sda): scrub: finished on devid 2 with
> status: 0
> [98085.771626] BTRFS info (device sda): scrub: finished on devid 1 with
> status: 0
>
> So by elimination it's starting to look like suspend-to-RAM might be
> part of the problem.  That's what I'll test next.
>

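For reference, the repeated "status: -125" lines in that log are negative kernel error codes: on Linux, 125 is ECANCELED, which is what you'd expect each time a scrub is cancelled. A quick way to confirm the mapping (assuming python3 is available; the moreutils `errno` tool works too):

```shell
# Look up the errno name for 125 (the kernel logs it negated as -125).
python3 -c 'import errno; print(errno.errorcode[125])'
# ECANCELED
```
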
Power management is difficult. (I'm actually working on a git bisect
right now: an older laptop won't wake from suspend, a 5.7 regression.)

Whether all the devices wake up correctly isn't always an easy
question to answer. They might all have power, but did they really
come back up in the correct state? Then again, you're reporting that
iotop independently shows a transfer rate consistent with data
actually coming off the drives.

I also wonder whether the socket that Graham mentions could get into
some kind of stuck or confused state across a sleep/wake cycle. My
case (NVMe) may not be the best comparison, because that's just PCIe.
In your case these are real drives, so SCSI, the block layer, and
possibly libata and other things are involved.
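If you want to rule that out, one thing worth checking right after a resume is whether the kernel still considers the drives healthy. A sketch, assuming sda/sdb as in your logs (device names and log patterns may differ on your system):

```shell
# After waking from suspend, check each drive's SCSI device state
# (should be "running") and confirm a raw read still works.
for dev in sda sdb; do
    cat /sys/block/$dev/device/state
    sudo dd if=/dev/$dev of=/dev/null bs=1M count=1 status=none \
        && echo "$dev: read OK"
done

# Also scan dmesg for ATA link resets around the resume:
dmesg | grep -iE 'ata[0-9]+|hard reset|link' | tail
```

If the state file reads anything other than "running", or dmesg shows link resets after the wake, that would point at the sleep/wake path rather than btrfs.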



--
Chris Murphy


