On Tue, Apr 7, 2020 at 2:11 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Apr 7, 2020 at 1:46 PM kjansen387 <kjansen387@xxxxxxxxx> wrote:
> >
> > Hello,
> >
> > I'm using btrfs on fedora 31 running 5.5.10-200.fc31.x86_64 .
> >
> > I've moved my workload from md raid5 to btrfs raid1.
> > # btrfs filesystem show
> > Label: none  uuid: 8ce9e167-57ea-4cf8-8678-3049ba028c12
> >         Total devices 5 FS bytes used 3.73TiB
> >         devid    1 size 3.64TiB used 2.53TiB path /dev/sde
> >         devid    2 size 3.64TiB used 2.53TiB path /dev/sdf
> >         devid    3 size 1.82TiB used 902.00GiB path /dev/sdb
> >         devid    4 size 1.82TiB used 902.03GiB path /dev/sdc
> >         devid    5 size 1.82TiB used 904.03GiB path /dev/sdd
> >
> > # btrfs fi df /export
> > Data, RAID1: total=3.85TiB, used=3.72TiB
> > System, RAID1: total=32.00MiB, used=608.00KiB
> > Metadata, RAID1: total=6.00GiB, used=5.16GiB
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> > After moving to btrfs, I'm seeing freezes of ~10 seconds very, very
> > often (multiple times per minute). Mariadb, vim, influxdb, etc.
> >
> > See attachment for a stacktrace of vim, and the dmesg output of
> > 'echo w > /proc/sysrq-trigger' also including other hanging processes.
> >
> > What's going on.. ? Hope someone can help.
>
> How busy are the databases? What are the mount options for this volume?
>
> I think there is some kind of write contention possible if there's
> heavy fsync writes, since the tree log is per subvolume? (Maybe a
> developer can describe this correctly if I haven't.) A possible work
> around is putting each database in its own subvolume.

You could, of course, start with just the busiest database and see if
it makes any difference. By default on Fedora, only /home and / are on
separate subvolumes.

The general idea would be to use a subvolume instead of a directory, e.g.

##shutdown the database
mv /var/lib/mysql /var/lib/mysqlold
btrfs sub create /var/lib/mysql
restorecon -Rv /var/lib/mysql
cp -r --reflink /var/lib/mysqlold/* /var/lib/mysql/
##resume database operation, and clean up mysqlold whenever

Another thing that might help, depending on the workload, is
space_cache=v2. You can safely switch to v2 via:

mount -o remount,clear_cache,space_cache=v2 /

This is a one time command. It will set a feature flag for space cache
v2, and it'll be used automatically on subsequent mounts. It may be
busy for a minute or so while the cache is rebuilt.

You might test these separately so you have an idea of their relative
effects on the workload.

-- 
Chris Murphy
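
For anyone wanting to repeat the directory-to-subvolume steps for several
databases, here is a minimal sketch of a helper that only prints the
commands for a given directory so they can be reviewed before running
them by hand. The function name plan_subvol_migration is hypothetical.
Two things differ from the commands above and are assumptions on my part:
it uses "cp -a --reflink=always" with a trailing "/." so that ownership,
modes and dotfiles are carried over, and it copies the owner and mode of
the old directory onto the new subvolume with chown/chmod --reference
(GNU coreutils). It also assumes the path contains no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands that would turn a
# directory into a btrfs subvolume. Stop the service using it first.
plan_subvol_migration() {
    dir=${1%/}          # e.g. /var/lib/mysql, trailing slash stripped
    old="${dir}old"     # same naming scheme as mysqlold

    if [ -z "$dir" ]; then
        echo "usage: plan_subvol_migration /path/to/dir" >&2
        return 1
    fi

    # Unquoted heredoc delimiter, so $dir and $old are expanded here.
    cat <<EOF
mv $dir $old
btrfs subvolume create $dir
chown --reference=$old $dir
chmod --reference=$old $dir
restorecon -Rv $dir
cp -a --reflink=always $old/. $dir/
EOF
}

# Example: show the plan for the MariaDB data directory.
plan_subvol_migration /var/lib/mysql
```

Printing rather than executing keeps the sketch safe to try on a live
system; nothing is moved until the printed lines are pasted into a root
shell with the database stopped.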
