Re: block corruption

-- 
Rann Bar-On
Senior Lecturer
Dept of Mathematics
Duke University

Pronouns: he/him/his

On Sun, 2019-09-01 at 14:09 -0600, Chris Murphy wrote:
> On Sun, Sep 1, 2019 at 11:39 AM Rann Bar-On <rann@xxxxxxxxxxxxx>
> wrote:
> > On Sun, 2019-09-01 at 07:48 +0800, Qu Wenruo wrote:
> > > On 2019/9/1 上午7:39, Rann Bar-On wrote:
> > > > On Sat, 2019-08-31 at 17:04 -0600, Chris Murphy wrote:
> > > > > > On Sat, Aug 31, 2019 at 2:26 PM Rann Bar-On <rann@xxxxxxxxxxxxx> wrote:
> > > > > > > I just downgraded to kernel 4.19, and the supposed corruption
> > > > > > > vanished. This may be related to
> > > > > > > 
> > > > > > > https://www.spinics.net/lists/linux-btrfs/msg91046.html
> > > > > > > 
> > > > > > > If I can provide information that would help fix this issue, I'd
> > > > > > > be glad to, but I cannot upgrade back to kernel 5.2, as I can't
> > > > > > > risk this system.
> > > > > 
> > > > > > 5.2 has stricter checks for corruption, exposing the rare case
> > > > > > where metadata in a leaf is corrupt but the checksum was properly
> > > > > > computed.
> > > 
> > > Exactly.
> > > 
> > > Although in your case, it's some older kernel doing something bad.
> > > 
> > > The same problem has been reported once before: some older kernel
> > > doesn't set the mode member properly.
> > > > > > > Btrfs v3.17
> > > > > 
> > > > > > This is old. I suggest finding a newer version of btrfs-progs,
> > > > > > ideally the latest stable version, which is 5.2.1. Definitely
> > > > > > don't use --repair with this version. It's safe to use
> > > > > > check --readonly (which is the default) with this version but
> > > > > > probably not that helpful to devs.
> > > > > 
> > > > 
> > > > Not really sure why that said 3.17:
> > > > 
> > > > $ btrfs --version
> > > > btrfs-progs v5.2.1
> > > > 
> > > > Anyway, running btrfs --repair on the file system didn't do anything
> > > > to fix the above errors.
> > > 
> > > That's what we need to enhance next.
> > > 
> > > > I removed at least one of the corrupt files (the only one that was
> > > > mode 0) using kernel 4.19.
> > > > 
> > > > Happy to help further if I can. What would you suggest as far as
> > > > fixing this or reporting it usefully? If you believe 5.2 isn't
> > > > causing the corruption, but rather, just exposing it, I'm more than
> > > > happy to experiment with it.
> > > 
> > > Deleting the offending inodes would be enough to fix the alert.
> > > 
> > 
> > I deleted the file using the older kernel. I rebooted into the new
> > kernel, and things seem good for now.
> > 
> > Note: The newer one wouldn't let me access the file to delete it, nor
> > did any btrfs repair tool do anything at all. This is a big problem IMO!
> 
> The current behavior is an improvement over propagating corruption and
> never detecting it because the leaf is assumed to be correct only
> because the checksum matches. The next step is figuring out ways to
> work around such rare detected corruptions, hopefully automatically
> and while online.
> 
> I don't consider it user responsibility to have to do this, but I'm
> vaguely curious if it's possible to delete the offending file in a
> snapshot, then delete the original subvolume. i.e.
> 
> 1. snapshot the subvolume containing the file (default rw snapshot)
> 2. delete the bad file(s) in the snapshot
> 3. delete the original subvolume (snapshot's parent)
> 
> I'm curious if either 2 or 3 are permitted.
> 
> 

Wish I could help, but I already deleted the file. If there's something
I can do to move this forward, I'd be glad to.
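
For anyone who wants to try it, the three steps Chris lists above might
look roughly like the sketch below. It is untested against this kind of
corruption, the paths (/mnt/pool/data, /mnt/pool/data.fixed,
path/to/bad-file) are made up for illustration, and as Chris says it is
not clear that the kernel will permit step 2 or 3. By default it only
prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the snapshot-then-delete workaround discussed in this thread.
# All paths are hypothetical placeholders; adjust them to your layout.
SUBVOL="${SUBVOL:-/mnt/pool/data}"        # subvolume holding the bad file
SNAP="${SNAP:-/mnt/pool/data.fixed}"      # where the rw snapshot will go
BAD_FILE="${BAD_FILE:-path/to/bad-file}"  # path relative to the subvolume
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_subvolume() {
    # 1. snapshot the subvolume (snapshots are read-write by default)
    run btrfs subvolume snapshot "$SUBVOL" "$SNAP" || return 1
    # 2. delete the bad file inside the snapshot (may be refused)
    run rm -- "$SNAP/$BAD_FILE" || return 1
    # 3. delete the original subvolume (may be refused)
    run btrfs subvolume delete "$SUBVOL"
}

replace_subvolume
```

Each step stops the sequence if the previous one fails, so the original
subvolume is only deleted after the bad file is gone from the snapshot.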
