OK, I have bisected the problem with scrub resume being broken by the
scrub ioctl ABI being changed.
The bad commit is:
Fail
06fe39ab15a6a47d4979460fcc17d33b1d72ccf9 is the first bad commit
commit 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9
Author: Filipe Manana <fdmanana@xxxxxxxx>
Date: Fri Dec 14 19:50:17 2018 +0000
Btrfs: do not overwrite scrub error with fault error in scrub ioctl
If scrub returned an error and then the copy_to_user() call did not
succeed, we would overwrite the error returned by scrub with -EFAULT.
Fix that by calling copy_to_user() only if btrfs_scrub_dev() returned
success.
Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
Reviewed-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: David Sterba <dsterba@xxxxxxxx>
fs/btrfs/ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
bisect run success
It is important that scrub always returns the stats, even when it
returns an error. This is critical for cancel, as that is how
cancel/resume works, but it should also apply in case of other errors so
that the user can see how much of the scrub was done before the fatal error.
I am not sure in which kernel release this commit appeared but as this
breaks the "scrub resume" command completely, I think the fix for this
needs to be backported and may want to be considered by distro kernel
maintainers.
I will reply later with the simple reproducer program I created for the
bisection in case it is useful for testing.