-- Rann Bar-On Senior Lecturer Dept of Mathematics Duke University Pronouns: he/him/his On Sun, 2019-09-01 at 14:09 -0600, Chris Murphy wrote: > On Sun, Sep 1, 2019 at 11:39 AM Rann Bar-On <rann@xxxxxxxxxxxxx> > wrote: > > On Sun, 2019-09-01 at 07:48 +0800, Qu Wenruo wrote: > > > On 2019/9/1 上午7:39, Rann Bar-On wrote: > > > > On Sat, 2019-08-31 at 17:04 -0600, Chris Murphy wrote: > > > > > On Sat, Aug 31, 2019 at 2:26 PM Rann Bar-On < > > > > > rann@xxxxxxxxxxxxx> > > > > > wrote: > > > > > > I just downgraded to kernel 4.19, and the supposed > > > > > > corruption > > > > > > vanished. > > > > > > This may be related to > > > > > > > > > > > > https://www.spinics.net/lists/linux-btrfs/msg91046.html > > > > > > > > > > > > If I can provide information that would help fix this > > > > > > issue, > > > > > > I'd be > > > > > > glad to, but I cannot upgrade back to kernel 5.2, as I > > > > > > can't > > > > > > risk > > > > > > this > > > > > > system. > > > > > > > > > > 5.2 has more strict checks for corruption, exposing the rare > > > > > case > > > > > where metadata in a leaf is corrupt but the checksum was > > > > > properly > > > > > computed. > > > > > > Exactly. > > > > > > Although for your case, it's some older kernel doing something > > > bad. > > > > > > It's also reported once for the same problem, some older kernel > > > doesn't > > > set the mode member properly. > > > > > > > Btrfs v3.17 > > > > > > > > > > This is old. I suggest finding a newer version of btrfs- > > > > > progs, > > > > > ideally > > > > > latest stable version is 5.2.1. Definitely don't use --repair > > > > > with > > > > > this version. It's safe to use check --readonly (which is the > > > > > default) > > > > > with this version but probably not that helpful to devs. > > > > > > > > > > > > > Not really sure why that said 3.17: > > > > > > > > $ btrfs --version > > > > btrfs-progs v5.2.1 > > > > > > > > Anyway, running btrfs --repair on the file system didn't do > > > > anything to > > > > fix the above errors. > > > > > > That's what we need to enhance next. > > > > > > > I removed at least one of the corrupt files (the only one that > > > > was > > > > mode > > > > 0) using kernel 4.19. > > > > > > > > Happy to help further if I can. What would you suggest as far > > > > as > > > > fixing > > > > this or reporting it usefully? If you believe 5.2 isn't causing > > > > the > > > > corruption, but rather, just exposing it, I'm more than happy > > > > to > > > > experiment with it. > > > > > > Deleting the offending inodes would be enough to fix the alert. > > > > > > > I deleted the file using the older kernel. I rebooted into the new > > kernel, and things seem good for now. > > > > Note: The newer one wouldn't let me access the file to delete it, > > nor > > did any btrfs repair tool do anything at all. This is a big problem > > IMO! > > The current behavior is an improvement over propagating corruption > and > never detecting it because the leaf is assumed to be correct only > because the checksum matches. The next step is figuring out ways to > work around such rare detected corruptions, hopefully automatically > and while online. > > I don't consider it user responsibility to have to do this, but I'm > vaguely curious if it's possible to delete the offending file in a > snapshot, then delete the original subvolume. i.e. > > 1. > snapshot the subvolume containing the file (default rw snapshot) > 2. > delete the bad file(s) in the snapshot > 3. > delete the original subvolume (snapshot's parent) > > I'm curious if either 2 or 3 are permitted. > > Wish I could help, but I already deleted the file. If there's something I can do to move this forward, I'd be glad to.
