On Thursday, 29 December 2016 22:31:29 CET Duncan wrote:
> Jan Koester posted on Thu, 29 Dec 2016 20:05:35 +0100 as excerpted:
>
> > Hi,
> >
> > I have a problem with my filesystem. My system crashed, I did a hard
> > reset, and afterwards the filesystem was corrupted. I have already
> > tried to repair it without success, as you can see in the log. It
> > seems that a single corrupted block brings the whole filesystem down.
> >
> > Does anybody have an idea what happened to my filesystem?
> >
> > dmesg when opening a file:
> > [29450.404327] WARNING: CPU: 5 PID: 16161 at
> > /build/linux-lIgGMF/linux-4.8.11/fs/btrfs/extent-tree.c:6945
> > __btrfs_free_extent.isra.71+0x8e2/0xd60 [btrfs]
>
> First a disclaimer. I'm a btrfs user and list regular, not a dev. As
> such I don't really read call traces much beyond checking the kernel
> version, and don't do code. It's likely that you will get a more
> authoritative reply from someone who does, and it should take
> precedence, but in the mean time, I can try to deal with the
> preliminaries.
>
> Kernel 4.8.11, good. But you run btrfs check below, and we don't have
> the version of your btrfs-progs userspace. Please report that too.
>
> > btrfs output:
> > root@dibsi:/home/jan# btrfs check
> > /dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32
>
> Note that btrfs check is read-only by default. It will report what it
> thinks are errors, but won't attempt to fix them unless you add
> various options (such as --repair) to tell it to do so. This is by
> design and is very important, as attempting to repair problems that it
> doesn't properly understand could make the problems worse instead of
> better. So even tho the above command will only report what it sees
> as problems, not attempt to fix them, you did the right thing by
> running check without --repair first, and posting the results here for
> an expert to look at and tell you whether to try --repair, or what
> else to try instead.
>
> > Checking filesystem on
> > /dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32
> > UUID: 73d4dc77-6ff3-412f-9b0a-0d11458faf32
> > checking extents
> > parent transid verify failed on 2280458502144 wanted 861168 found 860380
> > parent transid verify failed on 2280458502144 wanted 861168 found 860380
> > checksum verify failed on 2280458502144 found FC3DF84D wanted 2164EB93
> > checksum verify failed on 2280458502144 found FC3DF84D wanted 2164EB93
> > bytenr mismatch, want=2280458502144, have=15938383240448
> > [...]
>
> Some other information that we normally ask for includes the output
> from a few other btrfs commands.
>
> It's unclear from your report if the filesystem will mount at all.
> The subject says mount failed, but then it mentions any file on the
> filesystem, which seems to imply that you could mount, but that any
> file you attempted to actually access after mounting crashes the
> system with the trace you posted, so I'm not sure if you can actually
> mount the filesystem at all.
>
> If you can't mount the filesystem, at least try to post the output
> from...
>
> btrfs filesystem show
>
> If you can mount the filesystem, then the much more detailed...
>
> btrfs filesystem usage
>
> ... if your btrfs-progs is new enough, or...
>
> btrfs filesystem df
>
> ... if btrfs-progs is too old to have the usage command.
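
For the record, and so there is no ambiguity about what has already been
run: the check above was the plain read-only invocation, and --repair has
not been attempted. Roughly the following, where the UUID is the one from
the check output and btrfs --version is simply an example of how to read
off the progs version I give below:

    # what was run: read-only check, reports problems but changes nothing
    btrfs --version
    btrfs check /dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32

    # NOT run yet, shown only for contrast: this one modifies the filesystem
    btrfs check --repair /dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32
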
> Also, if it's not clear from the output of the commands above (usage
> by itself, or show plus df, should answer most of the below, but show
> alone only provides some of the information), tell us a bit more about
> the filesystem in question:
>
> Single device (like traditional filesystems) or multiple device? If
> multiple device, what raid levels if you know them, or did you just go
> with the defaults. If single device, again, defaults, or did you
> specify single or dup, particularly for metadata.
>
> Also, how big was the filesystem and how close to full? And was it on
> ssd, spinning rust, or on top of something virtual (like a VM image
> existing as a file on the host, or lvm, or mdraid, etc)?
>
> Meanwhile, if you can mount, the first thing I'd try is btrfs scrub
> (unless you were running btrfs raid56 mode, which makes things far
> more complex as it's not stable yet and isn't recommended except for
> testing with data you can afford to lose). Often, a scrub can fix
> much of the damage of a crash if you were running raid1 mode
> (multi-device metadata default), raid10, or dup (single device
> metadata default, except on ssd), as those have a second checksummed
> copy that will often be correct, which scrub can use to fix the bad
> copy. It will detect but be unable to fix damage in single mode
> (default for data) or raid0 mode, as those don't have a second copy
> available to fix the first.
>
> Because the default for single device btrfs is dup metadata, single
> data, in that case the scrub should fix most or all of the metadata,
> allowing you to access small files (roughly anything under a couple
> KiB) and larger files that weren't themselves damaged, but you may
> still have damage in some files of any significant size.
>
> But scrub can only run if you can mount the filesystem. If you
> can't, then you have to try other things in order to get it mountable
> first. Many of these other things tend to be much more complex and
> risky, so if you can mount at all, try scrub first, and see how much
> it helps. Here I'm dual-device raid1 for nearly all my btrfs, and
> (assuming I can mount the affected filesystem, which I usually can) I
> now run scrub first thing after a crash, as a preventative measure,
> even without knowing if the filesystem was damaged or not.
>
> If the filesystem won't mount, then the recommendation is /likely/ to
> be trying the usebackuproot mount option (which replaced the older
> recovery mount option, but you're using a new enough kernel for
> usebackuproot), which will try some older tree roots if the newest one
> is damaged. You may have to use that option with readonly, which of
> course will prevent running scrub or the like while mounted, but may
> help you get access to the data, at least to freshen up your backups.
> However, usebackuproot will by definition sacrifice the last seconds
> of writes before the crash, and while I'd probably try this option on
> my own system without asking, I'm not comfortable recommending it to
> others, so I'd suggest waiting for one of the higher experts to
> confirm before trying it yourself.
>
> Beyond usebackuproot, you get into more risky attempts to repair that
> may instead do further damage if they don't work. This is where btrfs
> check --repair lives, along with some other check options, btrfs
> rescue, etc. Unless specifically told otherwise by an expert after
> they look at the filesystem info, these are risky enough that if at
> all possible, you want to freshen your backups before you try them.
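
To make sure I follow the order you suggest, the non-destructive steps
would look roughly like this; /mnt and /dev/sdd are taken from my setup
below, nothing here has been run yet, and since (as the usage output
below shows) this is RAID6, I take it the raid56 caveat about scrub
applies:

    # if the filesystem mounts: verify checksums and let scrub repair from
    # redundant copies where they exist (runs in the background; status
    # shows progress)
    btrfs scrub start /mnt
    btrfs scrub status /mnt

    # if it will not mount: read-only mount that falls back to an older
    # tree root, at the cost of the last seconds of writes before the crash
    mount -o ro,usebackuproot /dev/sdd /mnt
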
> That's where btrfs restore comes in, as it lets you attempt restoring
> files from an unmountable filesystem, while not actually writing to
> that filesystem, thus not risking doing further damage in the process.
> Of course that means you have to have some place to put the files it's
> going to restore. In simple mode you just run btrfs restore with
> commandline parameters telling it what device to try to restore from
> and where to put the restored files (and some options telling it
> whether to try restoring metadata like file ownership, permissions,
> dates, etc), and it just works.
>
> However, should btrfs restore's simple mode fail, there are more
> complex advanced modes to try, still without risking further damage to
> the filesystem in question, but that gets complex enough it needs its
> own post... if you come to that. There's a page on the wiki with some
> instructions, but they may not be current and it's a complex enough
> operation that most people need help beyond what's on the wiki (and in
> the btrfs-restore manpage), anyway. But here's the link so you can
> take a look at what the general operation looks like:
>
> https://btrfs.wiki.kernel.org/index.php/Restore
>
> Meanwhile, it's a bit late now, but in general, btrfs is considered
> still in heavy development, stabilizing but not yet fully stable and
> mature. As such, while any sysadmin worth the label will tell you
> that you are defining any data you don't have backups for as not worth
> the time, trouble and resources to do those backups, basically
> defining it as throw-away data because it's /not/ worth backing up or
> by definition you'd /have/ those backups, even for normal stable and
> mature filesystems, with btrfs still stabilizing, backups are even
> /more/ strongly recommended. So is keeping them current within the
> window of data you're willing to lose if you lose the primary copy,
> and keeping those backups practically usable (not over a slow net link
> that'll take over a week to download in order to restore, for
> instance, one real case that was posted). If you're doing that then
> losing a filesystem isn't going to be a big stress and you can afford
> to skip the real complex and risky stuff (unless you're simply doing
> it to learn how) and just restore from backup, as it will be simpler.
> If not, then you should really reexamine whether btrfs is the right
> filesystem choice for you, because it /isn't/ yet fully stable and
> mature, and chances are you'd be better off with a more stable and
> mature filesystem where not having updated at-hand backups is less of
> a risk (altho as I said any sysadmin worth the name will tell you not
> having backups is literally defining the data as throw-away value,
> because in the real world, "things happen", and there's too many of
> those things possible in the real world to behave otherwise).

Hi,

I'm using kernel 4.8.0-2 and btrfs-progs 4.9. I can now mount the
filesystem, but it crashes when I try to access a file on it.
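
If it comes to btrfs restore, my reading of the simple invocation you
describe above is roughly the following; /mnt/recovery is only a
placeholder destination on a different disk, the flags are my reading of
the btrfs-restore manpage, and none of this has been run yet:

    # dry run first to see what would be restored, then the real run,
    # restoring ownership/permissions/timestamps and skipping over errors
    btrfs restore -D -v /dev/sdd /mnt/recovery
    btrfs restore -m -i -v /dev/sdd /mnt/recovery

Here is the information you asked for:
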
btrfs fi show
Label: none  uuid: 73d4dc77-6ff3-412f-9b0a-0d11458faf32
        Total devices 5 FS bytes used 1.17TiB
        devid    1 size 931.51GiB used 420.78GiB path /dev/sdd
        devid    2 size 931.51GiB used 420.78GiB path /dev/sdf
        devid    3 size 931.51GiB used 420.78GiB path /dev/sde
        devid    4 size 931.51GiB used 420.78GiB path /dev/sda
        devid    5 size 931.51GiB used 420.78GiB path /dev/sdc

sudo btrfs filesystem usage /mnt
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
Overall:
    Device size:                   4.55TiB
    Device allocated:                0.00B
    Device unallocated:            4.55TiB
    Device missing:                  0.00B
    Used:                            0.00B
    Free (estimated):                0.00B      (min: 8.00EiB)
    Data ratio:                       0.00
    Metadata ratio:                   0.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID6: Size:1.22TiB, Used:0.00B
   /dev/sda      418.00GiB
   /dev/sdc      418.00GiB
   /dev/sdd      418.00GiB
   /dev/sde      418.00GiB
   /dev/sdf      418.00GiB

Metadata,RAID6: Size:8.25GiB, Used:44.00KiB
   /dev/sda        2.75GiB
   /dev/sdc        2.75GiB
   /dev/sdd        2.75GiB
   /dev/sde        2.75GiB
   /dev/sdf        2.75GiB

System,RAID6: Size:96.00MiB, Used:0.00B
   /dev/sda       32.00MiB
   /dev/sdc       32.00MiB
   /dev/sdd       32.00MiB
   /dev/sde       32.00MiB
   /dev/sdf       32.00MiB

Unallocated:
   /dev/sda      510.73GiB
   /dev/sdc      510.73GiB
   /dev/sdd      510.73GiB
   /dev/sde      510.73GiB
   /dev/sdf      510.73GiB

sudo btrfs filesystem df /mnt
Data, RAID6: total=1.22TiB, used=0.00B
System, RAID6: total=96.00MiB, used=0.00B
Metadata, RAID6: total=8.25GiB, used=80.00KiB
GlobalReserve, single: total=512.00MiB, used=8.00KiB
