On 10/10/2018 07:44 PM, Chris Murphy wrote:
> On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte
> <holger@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> On 10/10/18 17:44, Larkin Lowrey wrote:
>> (..)
>>>
>>> About once a week, or so, I'm running into the above situation where
>>> FS seems to deadlock. All IO to the FS blocks, there is no IO
>>> activity at all. I have to hard reboot the system to recover. There
>>> are no error indications except for the following which occurs well
>>> before the FS freezes up:
>>>
>>> BTRFS warning (device dm-3): block group 78691883286528 has wrong amount
>>> of free space
>>> BTRFS warning (device dm-3): failed to load free space cache for block
>>> group 78691883286528, rebuilding it now
>>>
>>> Do I have any options other than nuking the FS and starting over?
>>
>>
>> Unmount cleanly & mount again with -o space_cache=v2.
>
> I'm pretty sure you have to umount, and then clear the space_cache
> with 'btrfs check --clear-space-cache=v1' and then do a one time mount
> with -o space_cache=v2.

The --clear-space-cache=v1 step is optional, but recommended if you do
not like to keep accumulated cruft around.

The v2 mount (rw mount!!!) does not remove the v1 cache. If you just
mount with v2, the v1 data stays where it is and is simply no longer
used.

> But anyway, to me that seems premature because we don't even know
> what's causing the problem.
>
> a. Freezing means there's a kernel bug. Hands down.
> b. Is it freezing on the rebuild? Or something else?
> c. I think the devs would like to see the output from btrfs-progs
> v4.17.1, 'btrfs check --mode=lowmem' and see if it finds anything, in
> particular something not related to free space cache.
>
> Rebuilding either version of space cache requires successfully reading
> (and parsing) the extent tree.
>

--
Hans van Kranenburg
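
For reference, the sequence discussed above could look roughly like the sketch below. The device and mount point are placeholders (assumptions, not values from this thread), and the `run` and `switch_to_v2` helpers are made up for illustration. The script only prints the commands unless RUN=1 is set, and it writes the option as `--clear-space-cache v1`, the spelling used in the btrfs-check man page.

```shell
#!/bin/sh
# Sketch: switch a btrfs filesystem from the v1 free space cache to v2.
# DEV and MNT are placeholders (assumptions); replace them with your own.
# By default the commands are only printed. Set RUN=1 (as root) to execute.

DEV="${DEV:-/dev/mapper/example}"
MNT="${MNT:-/mnt/example}"

# Hypothetical helper: execute the command when RUN=1, otherwise print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

switch_to_v2() {
    # 1. Unmount cleanly.
    run umount "$MNT"
    # 2. Optional: drop the old v1 cache so it does not linger unused.
    run btrfs check --clear-space-cache v1 "$DEV"
    # 3. Mount read-write once with v2.
    run mount -o space_cache=v2 "$DEV" "$MNT"
}

switch_to_v2
```

Step 2 can be left out. The v1 data then simply remains on disk without being used.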
