In my raid6 setup, a disk was soft-failing on me. I pulled the disk,
inserted a new one, mounted degraded, then did btrfs-replace while
running some RW jobs on the FS.
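For reference, the sequence was roughly the following (device names and the devid here are illustrative, not the exact ones):
$ mount -o degraded /dev/sda1 /mnt
$ btrfs replace start 2 /dev/sdc /mnt    # 2 = devid of the pulled disk, /dev/sdc = new disk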
My jobs were taking too long. It seems like raid6 btrfs-replace without
the source disk present is not very fast, presumably because every block
has to be reconstructed from parity. So I unmounted the FS, reinserted
the soft-failing disk, remounted normally (non-degraded), and restarted
the (now much faster) btrfs-replace.
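The restart was roughly something like this (again, illustrative names):
$ umount /mnt
$ # reinsert the soft-failing disk, here /dev/sdb
$ mount /dev/sda1 /mnt
$ btrfs replace cancel /mnt              # clear the suspended degraded replace, if still pending
$ btrfs replace start /dev/sdb /dev/sdc /mnt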
I checked on the status some time later and there were hundreds, if not
thousands, of "transid verify failure" messages in dmesg. Additionally,
the btrfs-replace operation had failed outright.
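For what it's worth, I was checking with something like:
$ btrfs replace status /mnt
$ dmesg | grep -i transid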
After removing the soft-failing disk and mounting degraded again, it
seemed that some files in the FS were corrupted, and in some cases
accessing certain files would cause the kernel to loop indefinitely
while eating up memory.
Nothing was corrupted before I mounted the soft-failing disk in
non-degraded mode. This leads me to believe that btrfs doesn't
intelligently handle a previously degraded array being remounted
normally with the stale device present: it apparently doesn't notice
that the reinserted disk's metadata generations are older than the rest
of the array, which would explain the transid verify failures. Can
anyone confirm this?
I would highly recommend some sort of fast failure or DWIM behavior for
this, e.g.:
$ mount -o degraded /dev/sda1 /mnt
$ touch /mnt/newfile.txt
$ umount /mnt
$ # plug the other device back in, e.g. /dev/sdb
$ mount /dev/sda1 /mnt
STDERR: Ignoring /dev/sdb1, FS has changed since mounting degraded
without it.
Alternatively:
STDERR: Cannot mount file system with device /dev/sdb1, FS has changed
since mounting degraded without it.
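The information needed for such a check seems to already be on disk; as
far as I understand, the stale device's superblock generation would lag
behind the others, which can be compared by hand (illustrative device
names again):
$ btrfs inspect-internal dump-super /dev/sda1 | grep '^generation'
$ btrfs inspect-internal dump-super /dev/sdb1 | grep '^generation'
If the generations disagree, the mount could refuse or ignore the stale
device instead of silently mixing old and new metadata.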
Rian