Dear list, at customer site i can't mount a given btrfs device in rw mode. this is production data and i do have a backup and managed to mount the filesystem in ro mode. I did copy out relevant stuff. Having said this, if btrfs --repair can't heal the situation, i could reformat the filesystem and start all over. But i would prefere to save the time and take the heeling as a proof of "production ready" status of btrfs-progs. Here are the details: kernel: 5.2.2 (Ubuntu 18.04.3) btrfs-progs: 5.2.1 HBA: DELL Perc # storcli /c0/v0 # 0/0 RAID5 Optl RW Yes RWBD - OFF 7.274 TB SSD-Data #btrfs fi show /dev/sdX #Label: 'Data-Ssd' uuid: <my uuid> # Total devices 1 FS bytes used 7.12TiB # devid 1 size 7.27TiB used 7.27TiB path /dev/<mydev> What happend: Customer filled up the filesystem (lots of snapshots in a couple of subvolumes). System was working with kernel 4.15 and btrfs-progs 4.15. I updated kernel and btrfs-progs with the assumption more mainlined/actual tools could do a better job. Since they have seen lots of fixups. 1) As a first step, i did run # btrfs check --mode lowmem --progress /dev/<mydev> got extend mismatches and wrong extend CRC's 2) As a second step i did try to mount in recovery mode # mount -t btrfs -o defaults, recovery, skip_balance /dev/<mydev> /mnt I included skip_balance, since there might be an unfinished balance run. But this didn't work out. 3) As a third step, got it mounted with ro mode # mount -t btrfs -o ro /dev/<mydev> /mnt And filed data received via usage: # btrfs fi usage /mnt # Overall: # Device size: 7.27TiB # Device allocated: 7.27TiB # Device unallocated: 1.00MiB # Device missing: 0.00B # Used: 7.13TiB # Free (estimated): 134.13GiB (min: 134.13GiB) # Data ratio: 1.00 # Metadata ratio: 2.00 # Global reserve: 512.00MiB (used: 0.00B) # # Data,single: Size:7.23TiB, Used:7.10TiB # /dev/<mydev> 7.23TiB # # Metadata,DUP: Size:21.50GiB, Used:14.31GiB # /dev/<mydev> 43.00GiB # # System,DUP: Size:8.00MiB, Used:864.00KiB # /dev/<mydev> 16.00MiB # Unallocated: # /dev/<mydev> 1.00MiB Obviously, totally filled up. At that time i copied out all relevant data - you never know ... Finished! Then tried to unmout, but that got to nowhere. Leads to a reboot . 4) As a forth step, i tried to repair it # btrfs check --mode lowmem --progress --repair /dev/<mydev> # enabling repair mode # WARNING: low-memory mode repair support is only partial # Opening filesystem to check... # Checking filesystem on /dev/<mydev> # UUID: <my UUID> # [1/7] checking root items (0:00:33 elapsed, 20853512 items checked) # Fixed 0 roots. # ERROR: extent[1988733435904, 134217728] referencer count mismatch (root: 261, owner: 286, offset: 5905580032) wanted: # 28, have: 34 # ERROR: fail to allocate new chunk No space left on device # Try to exclude all metadata blcoks and extents, it may be slow # Delete backref in extent [1988733435904 134217728]07:16 elapsed, 40435 items checked) # ERROR: extent[1988733435904, 134217728] referencer count mismatch (root: 261, owner: 286, offset: 5905580032) wanted: 27, have: 34 # Delete backref in extent [1988733435904 134217728] # ERROR: extent[1988733435904, 134217728] referencer count mismatch (root: 261, owner: 286, offset: 5905580032) wanted: 26, have: 34 # ERROR: commit_root already set when starting transaction # ERROR: fail to start transaction: Invalid argument # ERROR: extent[2017321811968, 134217728] referencer count mismatch (root: 261, owner: 287, offset: 2281701376) wanted: 3215, have: 3319 # ERROR: commit_root already set when starting transaction # ERROR: fail to start transaction Invalid argument This ends with a core-dump. Last not least my question: I'm not experienced enough to solve this issue myself and need your help. Is it worth the time and effort to solve this issue? Developers might be interested while having a real live testbed? Do you need any further info that will help to solve the issue? Best regards Ralf
