On Friday, 22 November 2019, 02:36:56 CET, Chris Murphy wrote:
> On Thu, Nov 21, 2019 at 3:39 PM Marc Joliet <marcec@xxxxxx> wrote:
> > On a side note, I am also really annoyed by the lockups caused by qgroups.
> > On my Gentoo systems (which use btrbk) I have it disabled for that
> > reason, but I left it on on my openSUSE laptop (a Dell XPS 13 9360),
> > which locks up for about 15-30 minutes while cleaning up snapshots a few
> > times a week (usually after reboots or after "zypper dup").
>
> 15 seconds is not at all acceptable on a desktop system, 15 minutes is
> atrocious. A computer that appears to hang for 15 seconds, it is
> completely reasonable for ordinary users to consider has totally
> faceplanted, will not recover, and to force power off. The
> distribution really needs to do something about that kind of negative
> user experience.

Sadly, I can't say if it's better without snapshotting /home, because I
hadn't accumulated many / snapshots at that point in time. It might have
gotten worse even with only / being snapshotted. But like I said, I'll
experiment with configuring snapper before blaming SUSE. I believe the
installation even recommends against snapshotting /home, but hey, I wanted
to do it anyway :-) .

But to be precise, it's not locked up continuously during snapshot
deletion. Occasionally I'll be able to operate my desktop for a few
seconds, and if I leave top running in a GUI terminal (in my case konsole),
I'll see it updating (almost) the entire time. My guess (emphasis on
*guess*) is that the qgroups update is holding some lock that is preventing
other I/O from finishing, thus locking up any application that wants to
write to disk and isn't doing so concurrently (maybe Plasma is blocking on
fsync() at the time?).

> And by the way, I've recently done some unprivileged compilations of
> webkitgtk, with default options that cause n cores +2 to be used,
> eating all available RAM and swap, and quickly totally hanging the
> system while swap thrashing and basically acting like a fork bomb. I'm
> using Btrfs for the rootfs as well as user home for this compile, and
> have done hundreds of forced power offs during these events and have
> seen exactly zero corruptions or Btrfs complaints. So at least there's
> that, however unscientific a sample that is.

My experience has also been that forced reboots don't cause any damage,
even though I only rarely have to do them [0]. I mean, with COW it should
be expected to be safe.

[0] I have two main situations where this happens: The first is RCU stalls
that cause my desktop to hang (this happens occasionally during bootup,
somewhere between the boot loader and the login screen), and that recently
started affecting my home server as well. The second only affects my home
server (a used small business server), namely a wonky e1000e NIC; I only
recently learned that these are sometimes buggy and are known for causing
servers to crash. The workaround is apparently to turn off TSO and GSO, and
sometimes also GRO, but I've been able to get away with only the first two
without experiencing any more crashes thus far. Interestingly enough, the
RCU stalls happened shortly after I did that.

Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
