On 09/01/2020 11:05, Holger Hoffstätte wrote:
> On 1/9/20 11:52 AM, Graham Cobb wrote:
>> On 09/01/2020 10:34, Holger Hoffstätte wrote:
>>> On 1/9/20 11:03 AM, Sebastian Döring wrote:
>>>> Maybe I'm doing it entirely wrong, but I can't seem to get 'btrfs
>>>> scrub resume' to work properly. During a running scrub the resume
>>>> information (like data_bytes_scrubbed:1081454592) gets written to a
>>>> file in /var/lib/btrfs, but as soon as the scrub is cancelled all
>>>> relevant fields are zeroed. 'btrfs scrub resume' then seems to
>>>> re-start from the very beginning.
>>>>
>>>> This is on linux-5.5-rc5 and btrfs-progs 5.4, but I've been seeing
>>>> this for a while now.
>>>>
>>>> Is this intended/expected behavior? Am I using the btrfs-progs wrong?
>>>> How can I interrupt and resume a scrub?
>>>
>>> Using 5.4.9+ (all of btrfs-5.5) and btrfs-progs 5.4 I just tried and
>>> it still works for me (and always has):
>>>
>>> $btrfs scrub start /mnt/backup
>>> scrub started on /mnt/backup, fsid d163af2f-6e03-4972-bfd6-30c68b6ed312
>>> (pid=25633)
>>>
>>> $btrfs scrub cancel /mnt/backup
>>> scrub cancelled
>>>
>>> $btrfs scrub resume /mnt/backup
>>> scrub resumed on /mnt/backup, fsid d163af2f-6e03-4972-bfd6-30c68b6ed312
>>> (pid=25704)
>>>
>>> ..and it keeps munching away as expected.
>>
>> Can you check that the resume has really started from where the scrub
>> was cancelled? What I (and, I think, Sebastian) are seeing is that the
>> resume "works" but actually restarts from the beginning.
>>
>> For example, something like:
>>
>> btrfs scrub start /mnt/backup
>> sleep 300
>> btrfs scrub status -R /mnt/backup
>> btrfs scrub cancel /mnt/backup
>> btrfs scrub resume /mnt/backup
>> sleep 100
>> btrfs scrub status -R /mnt/backup
>>
>> and check the last_physical in the second status is higher than the one
>> in the first status.
>>
>
> Well, yes. Reduced the wait times a bit and:
>
> $cat test-scrub
> #!/bin/sh
> btrfs scrub start /mnt/backup
> sleep 30
> btrfs scrub status -R /mnt/backup
> btrfs scrub cancel /mnt/backup
> btrfs scrub resume /mnt/backup
> sleep 10
> btrfs scrub status -R /mnt/backup
>
> $./test-scrub
> scrub started on /mnt/backup, fsid d163af2f-6e03-4972-bfd6-30c68b6ed312
> (pid=26390)
> UUID: d163af2f-6e03-4972-bfd6-30c68b6ed312
> Scrub started: Thu Jan 9 12:02:18 2020
> Status: running
> Duration: 0:00:25
> data_extents_scrubbed: 65419
> tree_extents_scrubbed: 28
> data_bytes_scrubbed: 4117274624
> tree_bytes_scrubbed: 458752
> read_errors: 0
> csum_errors: 0
> verify_errors: 0
> no_csum: 0
> csum_discards: 0
> super_errors: 0
> malloc_errors: 0
> uncorrectable_errors: 0
> unverified_errors: 0
> corrected_errors: 0
> last_physical: 3591372800
> ^^^^^^^^^^^^^^^^^^^^^^^^^
> scrub cancelled
> scrub resumed on /mnt/backup, fsid d163af2f-6e03-4972-bfd6-30c68b6ed312
> (pid=26399)
> UUID: d163af2f-6e03-4972-bfd6-30c68b6ed312
> Scrub resumed: Thu Jan 9 12:02:49 2020
> Status: running
> Duration: 0:00:36
> data_extents_scrubbed: 12648
> tree_extents_scrubbed: 28
> data_bytes_scrubbed: 823394304
> tree_bytes_scrubbed: 458752
> read_errors: 0
> csum_errors: 0
> verify_errors: 0
> no_csum: 0
> csum_discards: 0
> super_errors: 0
> malloc_errors: 0
> uncorrectable_errors: 0
> unverified_errors: 0
> corrected_errors: 0
> last_physical: 923205632
> ^^^^^^^^^^^^^^^^^^^^^^^^
>
> Not sure what I'm doing wrong ;)

So, you ARE seeing the same problem we are (as 923205632 is <
3591372800)! The resume started from scratch, which is not what it is
supposed to do. You should have seen 4514578432 (3591372800+923205632)
as last_physical in the second status.
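
If it helps anyone else reproduce this, the comparison can be automated
instead of eyeballing the two numbers. This is only a rough sketch: the
helper names get_last_physical and check_resume are made up for
illustration, and it assumes 'btrfs scrub status -R' prints a single
'last_physical:' line, as in the output quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of btrfs-progs.

# Read 'btrfs scrub status -R' output on stdin and print the value
# of the last_physical field (assumes exactly one such line).
get_last_physical() {
    awk '$1 == "last_physical:" { print $2 }'
}

# $1 = last_physical taken before the cancel
# $2 = last_physical taken after the resume
# Prints "resumed" if the scrub moved past the cancel point,
# "restarted" otherwise.
check_resume() {
    if [ "$2" -gt "$1" ]; then
        echo "resumed"
    else
        echo "restarted"
    fi
}

# Intended use (needs root and a mounted btrfs filesystem):
#   before=$(btrfs scrub status -R /mnt/backup | get_last_physical)
#   btrfs scrub cancel /mnt/backup
#   btrfs scrub resume /mnt/backup
#   sleep 10
#   after=$(btrfs scrub status -R /mnt/backup | get_last_physical)
#   check_resume "$before" "$after"
```

With the two values from the run above, check_resume 3591372800 923205632
prints "restarted".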
