On 06/06/2019 15:26, Graham Cobb wrote:
> However, after a few cancel/resume cycles, the scrub terminates. No
> errors are reported but one of the resumes will just immediately
> terminate claiming the scrub is done. It isn't. Nowhere near.

I believe I have found the problem. It is a bug in the scrub command.

When a scrub completes or is cancelled, the utility updates the saved
statistics for reporting using btrfs scrub status. These statistics
include the last_physical value returned from the ioctl, which is then
used by the resume code to specify the start for the next run.

Most statistics (such as bytes scrubbed, error counts, etc.) are
maintained by adding the values from the current run to the saved
values. However, the last_physical value should not be added: it should
replace the saved value. The current code incorrectly adds it to the
saved value, meaning that large amounts of the filesystem are missed out
on the next run.

I have created a patch, which I will send in a separate message. As I
have not submitted patches to this list before, I will send it as a
PATCH RFC and would welcome comments.

Graham
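
To illustrate the difference between accumulating and replacing, here is
a minimal sketch in Python. It is only a model of the behaviour described
above, not the actual btrfs-progs code (which is C). The ScrubStats class
and the update_buggy/update_fixed helpers are hypothetical names made up
for this example, and the numbers are arbitrary units. It assumes, as
described above, that the last_physical value returned for a run is an
absolute position rather than an offset from where that run started.

```python
from dataclasses import dataclass


@dataclass
class ScrubStats:
    """Hypothetical stand-in for the saved scrub statistics."""
    data_bytes_scrubbed: int = 0
    csum_errors: int = 0
    last_physical: int = 0  # assumed absolute position, used as resume start


def update_buggy(saved: ScrubStats, run: ScrubStats) -> ScrubStats:
    """Models the bug: every field, including last_physical, is summed."""
    return ScrubStats(
        data_bytes_scrubbed=saved.data_bytes_scrubbed + run.data_bytes_scrubbed,
        csum_errors=saved.csum_errors + run.csum_errors,
        last_physical=saved.last_physical + run.last_physical,
    )


def update_fixed(saved: ScrubStats, run: ScrubStats) -> ScrubStats:
    """Models the fix: counters are summed, last_physical is replaced."""
    return ScrubStats(
        data_bytes_scrubbed=saved.data_bytes_scrubbed + run.data_bytes_scrubbed,
        csum_errors=saved.csum_errors + run.csum_errors,
        last_physical=run.last_physical,
    )


# Run 1 starts at 0 and is cancelled at position 100.
run1 = ScrubStats(data_bytes_scrubbed=100, csum_errors=0, last_physical=100)
# Run 2 resumes at 100 and is cancelled at position 150.
run2 = ScrubStats(data_bytes_scrubbed=50, csum_errors=1, last_physical=150)

buggy = update_buggy(update_buggy(ScrubStats(), run1), run2)
fixed = update_fixed(update_fixed(ScrubStats(), run1), run2)

# The next resume starts from the saved last_physical.
print("buggy resume start:", buggy.last_physical)
print("fixed resume start:", fixed.last_physical)
```

In this model the first run hides the problem, because adding to a saved
value of zero gives the same result as replacing it. Only after the
second cancel does the summed value overshoot, so the next resume would
start beyond the point actually reached and skip the region in between.
That would be consistent with the failure only showing up after a few
cancel/resume cycles.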
