Hi folks,

I had an interesting scenario that I thought I'd share in case anyone wants to investigate before I blow away this filesystem...

Timeline:

- Running Linux 5.2.14, I pushed this system to OOM; the OOM killer ran and killed some userspace tasks. At this point many of the remaining tasks were stuck in uninterruptible sleep. Not really worried, I turned the machine off and on again to get everything back to normal. I guess now that everything had already gone horribly wrong by this point...

- Upon reboot, the system booted OK, but btrfs was throwing zillions of checksum errors. After some time the filesystem was remounted read-only and I lost the ability to interact with the system at all, so I powered it off.

- Now the filesystem is unmountable.

I've attached the logs (gzipped) that were captured before. I think they cover from syslog starting on the original boot up to the OOM (but possibly not right afterwards, since things were hanging), plus the boot logs from the first reboot up to (shortly before) the filesystem going read-only. Appended is what I get now when attempting to access the filesystem on a rescue system.

Let me know if you need any more info.

Cheers,
Nick

# mount -o ro /dev/mapper/fucked /mnt/fucked
[ 340.787239] Btrfs loaded, crc32c=crc32c-intel
[ 340.788390] BTRFS: device label alastor-root devid 1 transid 2616190 /dev/dm-0
[ 347.054205] BTRFS info (device dm-0): disk space caching is enabled
[ 347.054207] BTRFS info (device dm-0): has skinny extents
[ 347.155561] BTRFS info (device dm-0): enabling ssd optimizations
[ 347.334218] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.334414] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.453104] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.453318] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.456581] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.456843] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.461251] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.461638] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.462755] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.462957] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[ 347.511704] BTRFS error (device dm-0): error loading props for ino 721 (root 1): -5
[ 347.551471] BTRFS: error (device dm-0) in __btrfs_prealloc_file_range:10310: errno=-5 IO failure
[ 347.551514] WARNING: CPU: 3 PID: 1143 at fs/btrfs/extent-tree.c:4277 btrfs_free_reserved_data_space_noquota+0xd0/0xe0 [btrfs]
[ 347.551515] Modules linked in: btrfs libcrc32c xor raid6_pq dm_crypt algif_skcipher af_alg dm_mod ext4 crc32c_generic mbcache jbd2 fscrypto ccm 8021q garp mrp stp llc joydev mousedev rmi_smbus rmi_core arc4 iwlmvm mac80211 intel_rapl ofpart uvcvideo x86_pkg_temp_thermal cmdlinepart intel_powerclamp btusb intel_spi_platform coretemp intel_spi iwlwifi btrtl mei_wdt snd_hda_codec_realtek spi_nor snd_hda_codec_generic snd_hda_codec_hdmi mtd btbcm snd_hda_intel btintel kvm_intel iTCO_wdt snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 iTCO_vendor_support bluetooth crct10dif_pclmul thinkpad_acpi videobuf2_common ghash_clmulni_intel tpm_tis videodev tpm_tis_core intel_cstate cfg80211 pcspkr snd_hda_core tpm snd_hwdep intel_uncore snd_pcm nvram psmouse snd_timer input_leds ecdh_generic media
[ 347.551557] intel_rapl_perf mei_me crc16 snd ac battery rng_core rfkill mei rtsx_pci_ms lpc_ich intel_pch_thermal soundcore memstick evdev mac_hid wmi_bmof i2c_i801 pcc_cpufreq ip_tables x_tables overlay squashfs loop isofs sd_mod uas usb_storage i915 kvmgt vfio_mdev mdev vfio_iommu_type1 vfio ahci kvm libahci crc32_pclmul crc32c_intel rtsx_pci_sdmmc irqbypass i2c_algo_bit serio_raw mmc_core atkbd libata drm_kms_helper libps2 aesni_intel syscopyarea sysfillrect aes_x86_64 sysimgblt crypto_simd fb_sys_fops cryptd ehci_pci xhci_pci glue_helper ehci_hcd scsi_mod rtsx_pci drm e1000e xhci_hcd intel_gtt agpgart wmi i8042 serio
[ 347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1
[ 347.551596] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS N10ET42W (1.21 ) 02/26/2016
[ 347.551610] RIP: 0010:btrfs_free_reserved_data_space_noquota+0xd0/0xe0 [btrfs]
[ 347.551612] Code: 6c 55 1b c1 48 8b 7b 08 48 83 c3 18 45 31 c9 4d 89 e8 4c 89 f1 4c 89 fa 4c 89 e6 e8 ca c3 af c0 48 8b 03 48 85 c0 75 dc eb 98 <0f> 0b 31 db eb 89 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41
[ 347.551613] RSP: 0018:ffffaafd41fef758 EFLAGS: 00010287
[ 347.551614] RAX: 0000000000000000 RBX: fffffffffffc0000 RCX: 0000000000040000
[ 347.551615] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9087abcf5600
[ 347.551616] RBP: ffff9087abcf5600 R08: 0000000000000369 R09: 0000000000000004
[ 347.551617] R10: ffff9087a02c40d8 R11: ffffffff82861eed R12: ffff90881aa2a000
[ 347.551618] R13: 0000000000040000 R14: 0000000000040000 R15: ffff9087b0299ad0
[ 347.551620] FS: 00007fa67625b780(0000) GS:ffff908825cc0000(0000) knlGS:0000000000000000
[ 347.551621] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 347.551622] CR2: 00007f1d33580458 CR3: 00000001aedcc006 CR4: 00000000003606e0
[ 347.551623] Call Trace:
[ 347.551638] btrfs_free_reserved_data_space+0x4b/0x70 [btrfs]
[ 347.551656] __btrfs_prealloc_file_range+0x388/0x450 [btrfs]
[ 347.551670] cache_save_setup+0x1dd/0x3a0 [btrfs]
[ 347.551685] btrfs_setup_space_cache+0x97/0xc0 [btrfs]
[ 347.551700] commit_cowonly_roots+0xde/0x2b0 [btrfs]
[ 347.551718] ? btrfs_qgroup_account_extents+0xbb/0x1d0 [btrfs]
[ 347.551734] btrfs_commit_transaction+0x2ac/0x890 [btrfs]
[ 347.551752] btrfs_recover_log_trees+0x38a/0x420 [btrfs]
[ 347.551771] ? replay_one_dir_item+0x170/0x170 [btrfs]
[ 347.551786] open_ctree+0x1a21/0x1b60 [btrfs]
[ 347.551798] btrfs_mount_root+0x656/0x720 [btrfs]
[ 347.551802] ? bitmap_find_next_zero_area_off+0x3d/0x90
[ 347.551804] ? cpumask_next+0x16/0x20
[ 347.551807] ? pcpu_alloc+0x1cb/0x640
[ 347.551810] mount_fs+0x3b/0x167
[ 347.551813] vfs_kern_mount.part.11+0x54/0x110
[ 347.551825] btrfs_mount+0x16f/0x860 [btrfs]
[ 347.551830] ? path_lookupat.isra.13+0xa6/0x230
[ 347.551832] ? legitimize_path.isra.9+0x2d/0x60
[ 347.551834] ? bitmap_find_next_zero_area_off+0x3d/0x90
[ 347.551836] ? pcpu_alloc_area+0xe2/0x130
[ 347.551838] ? pcpu_next_unpop+0x37/0x50
[ 347.551840] ? cpumask_next+0x16/0x20
[ 347.551842] ? pcpu_alloc+0x1cb/0x640
[ 347.551844] ? mount_fs+0x3b/0x167
[ 347.551845] mount_fs+0x3b/0x167
[ 347.551848] vfs_kern_mount.part.11+0x54/0x110
[ 347.551850] do_mount+0x1fb/0xc10
[ 347.551852] ? _copy_from_user+0x37/0x60
[ 347.551854] ? memdup_user+0x4b/0x70
[ 347.551855] ksys_mount+0xba/0xd0
[ 347.551857] __x64_sys_mount+0x21/0x30
[ 347.551860] do_syscall_64+0x4e/0x100
[ 347.551862] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 347.551864] RIP: 0033:0x7fa6763e568e
[ 347.551866] Code: 48 8b 0d d5 17 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a2 17 0c 00 f7 d8 64 89 01 48
[ 347.551867] RSP: 002b:00007ffc92d01298 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[ 347.551868] RAX: ffffffffffffffda RBX: 00005561f6fe4400 RCX: 00007fa6763e568e
[ 347.551869] RDX: 00005561f6fec000 RSI: 00005561f6fe5300 RDI: 00005561f6fe4610
[ 347.551870] RBP: 00007fa67650b1e4 R08: 0000000000000000 R09: 0000000000000000
[ 347.551871] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 347.551872] R13: 0000000000000001 R14: 00005561f6fe4610 R15: 00005561f6fec000
[ 347.551874] ---[ end trace 010db75a59ca54bb ]---
[ 347.556498] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
[ 347.556501] BTRFS: error (device dm-0) in cleanup_transaction:1846: errno=-5 IO failure
[ 347.557941] BTRFS error (device dm-0): pending csums is 262144
[ 347.557946] BTRFS: error (device dm-0) in btrfs_replay_log:2277: errno=-5 IO failure (Failed to recover log tree)
[ 347.790510] BTRFS error (device dm-0): open_ctree failed

# btrfs check --readonly /dev/mapper/fucked
Opening filesystem to check...
Checking filesystem on /dev/mapper/fucked
UUID: 412a90ce-0a07-4072-9219-44bd98eb1be4
[1/7] checking root items
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
Ignoring transid failure
leaf parent key incorrect 554858348544
ERROR: failed to repair root items: Operation not permitted

Attachment:
alastor-log-merged.log.gz
Description: GNU Zip compressed data
