On Tue, Jun 25, 2019 at 09:17:43PM +0100, Graham Cobb wrote:
> On 18/06/2019 09:08, Graham R. Cobb wrote:
> > When a scrub completes or is cancelled, statistics are updated for reporting
> > in a later btrfs scrub status command and for resuming the scrub. Most
> > statistics (such as bytes scrubbed) are additive so scrub adds the statistics
> > from the current run to the saved statistics.
> >
> > However, the last_physical statistic is not additive. The value from the
> > current run should replace the saved value. The current code incorrectly
> > adds the last_physical from the current run to the previous saved value.
> >
> > This bug causes the resume point to be incorrectly recorded, so large areas
> > of the disk are skipped when the scrub resumes. As an example, assume a disk
> > had 1000000 bytes and scrub was cancelled and resumed each time 10% (100000
> > bytes) had been scrubbed.
> >
> > Run | Start byte | bytes scrubbed | kernel last_physical | saved last_physical
> >   1 |          0 |         100000 |               100000 |              100000
> >   2 |     100000 |         100000 |               200000 |              300000
> >   3 |     300000 |         100000 |               400000 |              700000
> >   4 |     700000 |         100000 |               800000 |             1500000
> >   5 |    1500000 |              0 | immediately completes| completed
> >
> > In this example, only 40% of the disk is actually scrubbed.
> >
> > This patch changes the saved/displayed last_physical to track the last
> > reported value from the kernel.
> >
> > Signed-off-by: Graham R. Cobb <g.btrfs@xxxxxxxxxxx>
>
> Ping? This fix is important for anyone who interrupts and resumes scrubs
> -- which will happen more and more as filesystems get bigger. It is a
> small fix and would be good to get out to distros.

Sorry for the delay. The analysis and fix are correct, patch added to
devel. Thanks.
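
For anyone who wants to check the arithmetic in the quoted example, it can be reproduced with a small simulation. This is only a sketch in Python, not the btrfs-progs code: the name simulate(), the constants, and the assumption that every run is cancelled after exactly 100000 bytes are illustrative and come from the example in the patch description, not from the actual source.

```python
# Hypothetical model of the saved last_physical bookkeeping described above.
# Nothing here is taken from btrfs-progs; names and constants are illustrative.

DISK_SIZE = 1_000_000   # bytes on the example disk
CHUNK = 100_000         # bytes scrubbed before each cancel (10% of the disk)


def simulate(additive):
    """Cancel and resume a scrub until the saved resume point passes the end.

    additive=True  models the bug: the kernel's last_physical is added to
                   the previously saved value.
    additive=False models the fix: the kernel's last_physical replaces it.

    Returns (rows, total), where each row is
    (start byte, bytes scrubbed, kernel last_physical, saved last_physical).
    """
    saved = 0
    total = 0
    rows = []
    while saved < DISK_SIZE:
        start = saved                          # resume from the saved value
        kernel_last = min(start + CHUNK, DISK_SIZE)
        scrubbed = kernel_last - start
        if additive:
            saved = saved + kernel_last        # buggy accumulation
        else:
            saved = kernel_last                # value from this run replaces it
        rows.append((start, scrubbed, kernel_last, saved))
        total += scrubbed
    return rows, total


if __name__ == "__main__":
    for label, additive in (("additive (bug)", True), ("replace (fix)", False)):
        rows, total = simulate(additive)
        print(label)
        for run, row in enumerate(rows, start=1):
            print(run, *row, sep=" | ")
        print("bytes scrubbed in total:", total)
```

With additive=True the loop stops after four runs because the saved value has already moved past the end of the disk, which corresponds to run 5 in the table completing immediately. With additive=False the resume point advances by exactly the amount scrubbed in each run.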
