On Sun, Aug 11, 2019 at 11:43 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>
>
> On 2019/8/12 10:27 AM, Chris Murphy wrote:
> > I'm not sure this is a bug, but I'm also not sure if the behavior is expected.
> >
> > Test system as follows:
> >
> > Intel i7-2820QM, 4/8 cores
> > 8 GiB RAM, 8 GiB swap on SSD plain partition
> > Samsung SSD 840 EVO 250GB
> > kernel 5.3.0-0.rc3.git0.1.fc31.x86_64+debug, but same behavior seen on 5.2.6
> >
> > Test involves using a desktop, GNOME shell, while building webkitgtk.
> > This uses all available RAM, and eventually all available swap.
> >
> > While the build fails on ext4 as well as on Btrfs, the difference on
> > Btrfs is many btrfs processes taking up quite a lot of cpu resources.
> > And iotop shows many processes with unexpectedly high read IO. I don't
> > have enough data collected to be certain, but it does seem on Btrfs
> > the oom killer is substantially delayed. Realistically, by the time
> > the system is in this state, practically speaking it's lost.
> >
> > Screenshot shows iotop and top state information for this system, at
> > the time sysrq+t is taken.
> >
> > Full 'journalctl -k' output is rather excessive, 13MB uncompressed,
> > 714K zstd compressed
> > https://drive.google.com/open?id=1bYYedsj1O4pii51MUy-7cWhnWGXb67XE
> >
> > from last sysrq+t
> > https://drive.google.com/open?id=1vhnIki9lpiWK8T5Qsl81_RToQ8CFdnfU
> >
> > last screenshot, matching above sysrq+t
> > https://drive.google.com/open?id=12jpQeskPsvHmfvDjWSPOwIWSz09JIUlk
>
> This shows it's btrfs endio workqueue, which do the data verification
> against csum tree.
>
> So you see the point, ext* just doesn't support data csum.

But 10-17% CPU, times 8 processes? Even during scrub at maximum SSD
read there isn't such a load doing csum computations.

Get a load of this screenshot:
https://drive.google.com/file/d/1IDboR1fzP4onu_tzyZxsx7M5cT_RJ7Iz/view

That doesn't even make sense. How is it possible Btrfs is using 100%
CPU times 10 processes? There aren't even that many cores. And then
Firefox is using 800% CPU? Another 8 cores that don't exist. And then
look at iotop, which is reporting 28G/s reads? This is an ordinary SATA
SSD that can't do more than maybe 600M/s reads.

Something is very weird and misreporting. But again, only on Btrfs. It
doesn't happen with ext4, even though the system hang user experience
is the same and not worse on Btrfs. Just the system statistics seem
much crazier on Btrfs.

The other time I've seen this behavior? Running Firefox through gdb
with certain kinds of crashes, that have nothing to do with swap.

-- 
Chris Murphy
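
For reference, the plausibility argument above can be written out as arithmetic. This is only a rough sketch: the 8 logical CPUs come from the test system description, while the SATA III line rate (6 Gbit/s with 8b/10b encoding) and the reading of iotop's "28G/s" as roughly 28e9 bytes per second are assumptions made for illustration, and the function names are hypothetical.

```python
# Sanity check of the reported figures against hardware ceilings.
# Assumptions (not from the original message): SATA III at 6 Gbit/s
# with 8b/10b encoding, and "28G/s" treated as about 28e9 bytes/s.

LOGICAL_CPUS = 8                       # i7-2820QM: 4 cores, 8 threads
SATA3_LINE_RATE_BITS = 6_000_000_000   # assumed SATA III link speed


def max_cpu_percent(logical_cpus: int) -> int:
    """Upper bound on summed per-process %CPU (100% per logical CPU)."""
    return logical_cpus * 100


def sata_payload_bytes_per_sec(line_rate_bits: int) -> int:
    """Payload ceiling: 8b/10b carries 8 data bits per 10 line bits."""
    payload_bits = line_rate_bits * 8 // 10
    return payload_bits // 8


# Figures as described in the message: ten btrfs workers at about 100%
# each, plus Firefox at 800%, and iotop showing 28G/s of reads.
reported_cpu = 10 * 100 + 800
reported_read_bytes = 28_000_000_000

cpu_ceiling = max_cpu_percent(LOGICAL_CPUS)
read_ceiling = sata_payload_bytes_per_sec(SATA3_LINE_RATE_BITS)

print(f"CPU: reported {reported_cpu}% vs ceiling {cpu_ceiling}%")
print(f"Read: reported {reported_read_bytes / 1e9:.0f} GB/s "
      f"vs ceiling {read_ceiling / 1e6:.0f} MB/s "
      f"({reported_read_bytes / read_ceiling:.1f}x over)")
```

Both reported values exceed what the hardware can deliver, which is consistent with an accounting or reporting problem rather than real work being done.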
