On Sun, Jan 5, 2020 at 2:58 PM Christian Wimmer <telefonchris@xxxxxxxxxx> wrote:
>
> > On 5. Jan 2020, at 18:13, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Sun, Jan 5, 2020 at 1:36 PM Christian Wimmer <telefonchris@xxxxxxxxxx> wrote:
> >>
> >>
> >>> On 5. Jan 2020, at 17:30, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> On Sun, Jan 5, 2020 at 12:48 PM Christian Wimmer
> >>> <telefonchris@xxxxxxxxxx> wrote:
> >>>>
> >>>> # fdisk -l
> >>>> Disk /dev/sda: 256 GiB, 274877906944 bytes, 536870912 sectors
> >>>> Disk model: Suse 15.1-0 SSD
> >>>> Units: sectors of 1 * 512 = 512 bytes
> >>>> Sector size (logical/physical): 512 bytes / 4096 bytes
> >>>> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> >>>> Disklabel type: gpt
> >>>> Disk identifier: 186C0CD6-F3B8-471C-B2AF-AE3D325EC215
> >>>>
> >>>> Device         Start       End   Sectors  Size Type
> >>>> /dev/sda1       2048     18431     16384    8M BIOS boot
> >>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
> >>>> /dev/sda3  532674560 536870878   4196319    2G Linux swap
> >>>
> >>>
> >>>> btrfs insp dump-s /dev/sda2
> >>>>
> >>>> Here I have only btrfs-progs version 4.19.1:
> >>>>
> >>>> linux-ze6w:~ # btrfs version
> >>>> btrfs-progs v4.19.1
> >>>> linux-ze6w:~ # btrfs insp dump-s /dev/sda2
> >>>> superblock: bytenr=65536, device=/dev/sda2
> >>>> ---------------------------------------------------------
> >>>> csum_type               0 (crc32c)
> >>>> csum_size               4
> >>>> csum                    0x6d9388e2 [match]
> >>>> bytenr                  65536
> >>>> flags                   0x1
> >>>>                         ( WRITTEN )
> >>>> magic                   _BHRfS_M [match]
> >>>> fsid                    affdbdfa-7b54-4888-b6e9-951da79540a3
> >>>> metadata_uuid           affdbdfa-7b54-4888-b6e9-951da79540a3
> >>>> label
> >>>> generation              799183
> >>>> root                    724205568
> >>>> sys_array_size          97
> >>>> chunk_root_generation   797617
> >>>> root_level              1
> >>>> chunk_root              158835163136
> >>>> chunk_root_level        0
> >>>> log_root                0
> >>>> log_root_transid        0
> >>>> log_root_level          0
> >>>> total_bytes             272719937536
> >>>> bytes_used              106188886016
> >>>> sectorsize              4096
> >>>> nodesize                16384
> >>>> leafsize (deprecated)   16384
> >>>> stripesize              4096
> >>>> root_dir                6
> >>>> num_devices             1
> >>>> compat_flags            0x0
> >>>> compat_ro_flags         0x0
> >>>> incompat_flags          0x163
> >>>>                         ( MIXED_BACKREF |
> >>>>                           DEFAULT_SUBVOL |
> >>>>                           BIG_METADATA |
> >>>>                           EXTENDED_IREF |
> >>>>                           SKINNY_METADATA )
> >>>> cache_generation        799183
> >>>> uuid_tree_generation    557352
> >>>> dev_item.uuid           8968cd08-0c45-4aff-ab64-65f979b21694
> >>>> dev_item.fsid           affdbdfa-7b54-4888-b6e9-951da79540a3 [match]
> >>>> dev_item.type           0
> >>>> dev_item.total_bytes    272719937536
> >>>> dev_item.bytes_used     129973092352
> >>>> dev_item.io_align       4096
> >>>> dev_item.io_width       4096
> >>>> dev_item.sector_size    4096
> >>>> dev_item.devid          1
> >>>> dev_item.dev_group      0
> >>>> dev_item.seek_speed     0
> >>>> dev_item.bandwidth      0
> >>>> dev_item.generation     0
> >>>
> >>> Partition map says
> >>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
> >>>
> >>> Btrfs super says
> >>>> total_bytes             272719937536
> >>>
> >>> 272719937536/512=532656128
> >>>
> >>> Kernel FITRIM want is want=532656128
> >>>
> >>> OK so the problem is the Btrfs super isn't set to the size of the
> >>> partition. The usual way this happens is user error: the partition is
> >>> resized (shrunk) without resizing the file system first. This file
> >>> system is still at risk of having problems even if you disable
> >>> fstrim.timer. You need to shrink the file system to the same size as
> >>> the partition.
> >>>
> >>
> >> Could this be a problem with the Parallels virtual machine, which maybe sometimes tries to get more space on the hosting file system?
> >> One solution would be to have a fixed-size disk file instead of a growing one.
> >
> > I don't see how it's related. Parallels has no ability I'm aware of to
> > change the GPT partition map or the Btrfs super block - as in, rewrite
> > it out with a modification, correctly including all checksums being
> > valid.
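The size mismatch computed above can be double-checked with a few lines of Python; this is just a sketch of the arithmetic, using only the figures from the quoted fdisk and dump-super output:

```python
# Figures from the quoted `fdisk -l` and `btrfs insp dump-s` output.
SECTOR = 512
part_start, part_end = 18432, 419448831   # /dev/sda2 (end LBA is inclusive)
super_total_bytes = 272719937536          # total_bytes from the Btrfs super

part_sectors = part_end - part_start + 1  # 419430400 sectors = 200 GiB
fs_sectors = super_total_bytes // SECTOR  # 532656128, the FITRIM "want"

# The superblock claims roughly 54 GiB more than the partition holds:
excess_gib = (fs_sectors - part_sectors) * SECTOR / 2**30
print(part_sectors, fs_sectors, round(excess_gib, 1))  # 419430400 532656128 54.0
```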
> > This /dev/sda has somehow been mangled on purpose.
> >
> > Again, from the GPT
> >>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
> >>>> /dev/sda3  532674560 536870878   4196319    2G Linux swap
> >
> > The end LBA for sda2 is 419448831, but the start LBA for sda3 is
> > 532674560. There's a ~54G gap in there, as if something was removed.
> > I'm not sure why a software installer would produce this kind of
> > layout on purpose, because it has no purpose.
>
> Ok, understood. Very strange. Maybe we should forget about this particular problem.
> Should I repair it somehow? And if yes, how?

> >>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem

Delete this partition, then recreate a new one with the same start LBA,
18432, and a new end LBA that covers the actual fs size:
18432 + (272719937536/512) - 1 = 532674559
(end LBAs are inclusive, and 532674560 is already the first sector of
sda3). Write it and reboot the VM.

You could instead resize Btrfs to match the partition, but that might
piss off the kernel if Btrfs thinks it needs to move block groups from a
location outside the partition. So I would just resize the partition.

And then you need to do a scrub and a btrfs check on this volume to see
if it's damaged.

I don't know, but I suspect it's possible that this malformed root
might have resulted in significant instability of the system at some
point, and that in its last states of confusion, as it face planted, it
wrote out very spurious data, causing your broken Btrfs file system. I
can't prove that.

> >
> >>
> >>>
> >>>> linux-ze6w:~ # systemctl status fstrim.timer
> >>>> ● fstrim.timer - Discard unused blocks once a week
> >>>>    Loaded: loaded (/usr/lib/systemd/system/fstrim.timer; enabled; vendor preset: enabled)
> >>>>    Active: active (waiting) since Sun 2020-01-05 15:24:59 -03; 1h 19min ago
> >>>>   Trigger: Mon 2020-01-06 00:00:00 -03; 7h left
> >>>>      Docs: man:fstrim
> >>>>
> >>>> Jan 05 15:24:59 linux-ze6w systemd[1]: Started Discard unused blocks once a week.
> >>>>
> >>>> linux-ze6w:~ # systemctl status fstrim.service
> >>>> ● fstrim.service - Discard unused blocks on filesystems from /etc/fstab
> >>>>    Loaded: loaded (/usr/lib/systemd/system/fstrim.service; static; vendor preset: disabled)
> >>>>    Active: inactive (dead)
> >>>>      Docs: man:fstrim(8)
> >>>> linux-ze6w:~ #
> >>>
> >>> OK so it's not set to run. Why do you have FITRIM being called?
> >>
> >> No idea.
> >
> > Well, you're going to have to find it. I can't do that for you.
>
> Ok, I will have a look. Can I simply deactivate the service?

fstrim.service is a one-shot. The usual way it gets run once per week is
via fstrim.timer - and the status output above shows that timer is
enabled and active, with its next trigger 7 hours away, so disabling
fstrim.timer will stop the scheduled weekly runs. But if the FITRIM
calls you're seeing happen outside that schedule, something else is
running fstrim, and you'll have to find it in order to deactivate it.

This 12T file system is a single "device" backed by a 12T file on the
Promise drive? And it's a Parallels-formatted VM file? I guess I would
have used raw instead of a Parallels format; that way you can inspect
things from outside the VM. But that's perhaps a minor point.

Check that this 12T (virtual) physical block device inside the guest has
the exact size you expect, that the partition start and end LBAs are
correct, and that the partition's size matches the Btrfs super's
total_bytes and dev_item.total_bytes - those two should be the same for
a single-device Btrfs file system. Something still doesn't pass the
smell test, so it's not at all clear this is an fstrim bug rather than
some other file system vs. device resize problem.

-- 
Chris Murphy
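The repartitioning arithmetic suggested earlier in the thread can be sanity-checked with a short Python sketch, using only the numbers quoted above; note that GPT end LBAs are inclusive, so the recreated sda2 must end one sector below the start of sda3:

```python
# Recomputing the proposed new /dev/sda2 geometry from this thread's numbers.
SECTOR = 512
fs_bytes = 272719937536    # total_bytes from the Btrfs superblock
start_lba = 18432          # existing /dev/sda2 start, kept unchanged
sda3_start = 532674560     # first sector of /dev/sda3 (swap), from the GPT

fs_sectors = fs_bytes // SECTOR        # 532656128 sectors needed by the fs
# End LBAs are inclusive, so the last sector is start + count - 1:
end_lba = start_lba + fs_sectors - 1   # 532674559
assert end_lba < sda3_start            # must not overlap the swap partition
print(fs_sectors, end_lba)             # 532656128 532674559
```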
