btrfs won't mount any more

Hi,

I have wrecked another btrfs file system, probably for good this time.

It's an 80 GB filesystem from 2015, in my secondary notebook, on an
encrypted SSD. The btrfs holds the root filesystem and the rest of the
system as well.

I have a cronjob that makes snapshots of the system directories daily,
and of /home every ten minutes. A second cronjob cleans up old snapshots
so that the number of snapshots present stays roughly between 400 and
600. This is the key feature that made me choose btrfs in the first
place.
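
A minimal sketch of the kind of retention rule described above. The
actual cleanup cronjob is not shown in this message, so the function
name, the two thresholds and the "delete oldest first" policy are
assumptions made for illustration only:

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, max_keep=600, target=400):
    """Pick the oldest snapshots to remove once max_keep is exceeded.

    snapshots: list of (name, created) tuples, created being a datetime.
    The thresholds mirror the "between 400 and 600" range from the
    message; pruning down to `target` in one go is an assumption.
    """
    if len(snapshots) <= max_keep:
        return []
    ordered = sorted(snapshots, key=lambda s: s[1])  # oldest first
    excess = len(ordered) - target
    # Only names are returned; deleting them (for example with
    # "btrfs subvolume delete") is left to the caller.
    return [name for name, _ in ordered[:excess]]


if __name__ == "__main__":
    # Hypothetical data: one snapshot every ten minutes.
    start = datetime(2017, 4, 1)
    demo = [("snap-%04d" % i, start + timedelta(minutes=10 * i))
            for i in range(650)]
    print(len(snapshots_to_delete(demo)), "snapshots would be removed")
```

Returning a list instead of deleting directly keeps the policy easy to
dry-run before it touches the filesystem.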

Last week (I was on kernel 4.10.8 with Debian unstable), I was forced to
promote the secondary laptop to the primary one, which resulted in
serious work being done on it for the first time. Over time, the
filesystem filled up without me noticing and was finally 100% full.
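
A fill level like this could be caught earlier by a small check run
from cron. The following is only a sketch: the function names and the
85 % threshold are made up for illustration, and on btrfs the numbers
reported through the generic interface used here can differ from what
"btrfs filesystem usage" shows, because space is allocated in chunks.

```python
import shutil


def usage_percent(total, used):
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 100.0 * used / total if total else 0.0


def check_fill(path="/", warn_at=85.0):
    """Return (percent_used, warn) for the filesystem containing path.

    warn_at is an arbitrary example threshold, not a btrfs recommendation.
    """
    usage = shutil.disk_usage(path)
    pct = usage_percent(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    pct, warn = check_fill("/")
    print("%.1f%% used%s" % (pct, " - WARNING" if warn else ""))
```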

I then cleaned up about four gigs by deleting a couple of redundant ISO
images and some snapshots that were not due for regular deletion yet. I
then started a btrfs balance / -d50, unfortunately without stopping the
snapshot-making cronjob. This resulted in the notebook becoming
unusable for extended periods of time, without even being able to log
in. After running for some 30 hours, the notebook ran out of battery
(don't ask, stupid me).

After rebooting, the btrfs balance resumed immediately after mounting
the root fs. System unusable again. After a day, I finally had a root
shell and was able to issue a btrfs balance cancel /. Unfortunately, the
system didn't care about that command and happily continued to balance.
After another 30 hours or so, I lost patience and reset the system.
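
For reference, the balance-related commands involved here can be
written out as argument lists. This is a sketch under assumptions: the
helper names are hypothetical, "-d50" in the text above is read as the
data usage filter -dusage=50, and the device path in the example is a
placeholder. The skip_balance mount option asks btrfs not to resume an
interrupted balance at mount time. Nothing is executed unless dry_run
is turned off.

```python
import subprocess


def balance_start_cmd(mountpoint, data_usage=50):
    # Balance only data chunks that are at most data_usage percent full.
    return ["btrfs", "balance", "start", "-dusage=%d" % data_usage, mountpoint]


def balance_status_cmd(mountpoint):
    return ["btrfs", "balance", "status", mountpoint]


def balance_cancel_cmd(mountpoint):
    return ["btrfs", "balance", "cancel", mountpoint]


def mount_skip_balance_cmd(device, mountpoint):
    # skip_balance: do not resume an interrupted balance on mount.
    return ["mount", "-o", "skip_balance", device, mountpoint]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # /dev/mapper/example is a placeholder, not the device from this report.
    run(mount_skip_balance_cmd("/dev/mapper/example", "/mnt"))
    run(balance_status_cmd("/mnt"))
```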

To be able to keep control of the system and to monitor operations
remotely, I installed a fresh copy of Debian unstable with the same
4.10.8 kernel on a USB stick and booted the notebook from the stick. I
brought up the system and tried to mount the btrfs. The mount process
quickly went up to 100% CPU usage and stayed that way until I went to
bed last night. This morning, the machine had dropped off the network
(couldn't ping the default gateway any more even though the network
looked fine), and spewed kernel oopses of about 80 lines (too long to
scroll back even) every few seconds.

I will try to tweak kernel.printk tonight so that I get my console back
and see whether the oopses are also in journal, dmesg or syslog so that
I can copy and paste them. I also have a reasonably current backup of
the filesystem, so nuking it from orbit is an option; I would, however,
hate to lose my snapshots.
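
As background for the kernel.printk tweak: the sysctl holds four
integers, and a kernel message is shown on the console only if its
level number is lower than the first one (console_loglevel). A small
sketch for inspecting the current setting follows; the function names
are made up for illustration.

```python
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Parse the four whitespace-separated values of kernel.printk."""
    values = [int(v) for v in text.split()]
    if len(values) != len(PRINTK_FIELDS):
        raise ValueError("expected 4 values, got %d" % len(values))
    return dict(zip(PRINTK_FIELDS, values))


def reaches_console(message_level, console_loglevel):
    """Messages with a level number below console_loglevel are printed."""
    return message_level < console_loglevel


if __name__ == "__main__":
    with open("/proc/sys/kernel/printk") as fh:
        print(parse_printk(fh.read()))
```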

Is it worthwhile to save information about the borked filesystem, or
does the btrfs community just not care about a heavily snapshotted
two-year-old filesystem?

I would like to hear comments and opinions about what has happened here
and how to avoid things like that in the future. Do more recently
created btrfs filesystems have safeguards against damage that may occur
when a filesystem fills up?

Greetings
Marc


-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421