On Fri, 14 Feb 2020 12:30:27 +0100
Marc Lehmann <schmorp@xxxxxxxxxx> wrote:

> I've upgraded a machine to linux 5.4.15 that runs a small netnews

You don't seem to mention which version you upgraded from. If a full bisect
is impractical, this is the (very distant) next best thing you can do. Was it
from 5.4.14, or from 3.4? :)

It would also be nice if you could double-check that returning to that
previous version right now makes the issue go away, and that it's not a
coincidence of something else having changed on the FS or OS (such as other
package upgrades besides the kernel).

> system. It normally pulls news with about 20MB/s. After upgrading (it
> seems) that this process is now CPU bound, and I get only about 10MB/s
> throughput. Otherwise, everything seems fine - no obvious bugs, and no
> obvious performance problems.
>
> "CPU-bound" specifically means that the disk(s) seem pretty idle (it is a
> 6x10TB raid5), I can do a lot of I/O without slowing down the transfer,
> but there is always a single kworker which is constantly at 100% cpu (i.e.
> one core) in top:
>
>  8963 root 20 0 0 0 0 R 2 100.0 0.0 2:04 [kworker/u8:15+flush-btrfs-3]
>
> When I cat /proc/8963/task/8963/stack regularly, I get either no output or
> (most often) this single line:
>
>    [<0>] tree_search_offset.isra.0+0x16a/0x1d0 [btrfs]
>
> It is possible that this is _not_ new behaviour with 5.4, but I often use
> top, and I can't remember having a kworker stuck at 100% cpu for days.
> (The fs is about a year old and had no issues so far, the last scrub is about
> a week old).
>
> Another symptom is that Dirty in /proc/meminfo is typically at 7-8GB,
> which is more or less the value of /proc/sys/vm/dirty_ratio, Writeback is
> usually 0 or has small values, and running sync often takes 30m or more.
>
> The 100% cpu is definitely caused by the news transfer - pausing it and
> waiting a while makes it effectively disappear and everything goes back to
> normal.
>
> The news process effectively does this in multiple parallel loops:
>
>    openat(AT_FDCWD, "/store/04267/26623~", O_WRONLY|O_CREAT|O_EXCL, 0600...
>    write(75, "Path: ask005.abavia.com!"..., 656453...
>    close(75) = 0
>    renameat2(AT_FDCWD, "/store/04267/26623~", AT_FDCWD, "/store/04267/26623", 0 ...
>
> The file layout is one layer of subdirectories with 100000 files inside
> each, which has posed absolutely no problems with ext4/xfs in the past,
> and also btrfs didn't seem to mind.
>
> My question is, would this be expected behaviour? If yes, is it something
> that can be influenced/improved on my side?
>
> I can investigate and do some experiments, but I cannot easily update
> kernels/do reboots on this system.
>

-- 
With respect,
Roman
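
P.S. Since reboots on the affected machine are hard, the write pattern from
the quoted strace could perhaps be replayed on a scratch filesystem under
both kernels. Below is a minimal sketch of that pattern, not the actual news
software: the function names, the default file count and the single-threaded
loop are assumptions for illustration (the original runs several such loops
in parallel), and Python's os.rename() issues a plain rename rather than
renameat2(). Only the subdirectory name and the 656453-byte size are taken
from the trace.

```python
import os
import sys
import tempfile


def store_article(directory, name, payload):
    """Create `name~` exclusively, write payload, then rename to `name`."""
    tmp_path = os.path.join(directory, name + "~")
    final_path = os.path.join(directory, name)
    # Same flags and mode as the openat() call in the quoted trace.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(payload)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
    finally:
        os.close(fd)
    # No fsync here, matching the trace; the data stays dirty in page cache.
    os.rename(tmp_path, final_path)
    return final_path


def run(root, count=100, size=656453):
    """Write `count` files of `size` bytes into one subdirectory of root."""
    payload = b"x" * size
    directory = os.path.join(root, "04267")
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        store_article(directory, str(i), payload)
    return directory


if __name__ == "__main__":
    # Pass a directory on the filesystem under test; otherwise use a
    # throwaway temporary directory.
    if len(sys.argv) > 1:
        run(sys.argv[1])
    else:
        with tempfile.TemporaryDirectory() as scratch:
            run(scratch, count=10)
```

Watching top and Dirty in /proc/meminfo while this runs with a large count
might show whether the flush kworker behaves the same way on each kernel.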
