On 12/01/2014 06:47 AM, Oliver wrote:
Hi All,
on a testing machine I installed four HDDs and they are configured as
RAID6. For a test I removed one of the drives (/dev/sdk) while the
volume was mounted and data was written to it. This worked well, as far
as I can see. Some I/O errors were written to /var/log/syslog, but the
volume kept working. Unfortunately the command "btrfs fi sh" did not
show any missing drives. So I remounted the volume in degraded mode:
"mount -t btrfs /dev/sdx1 -o remount,rw,degraded,noatime /mnt". This way
the drive in question was reported as missing. Then I plugged in the HDD
again (it is of course /dev/sdk again) and started a balance in the hope
that this would restore the RAID6: "btrfs filesystem balance start /mnt".
Now the volume looks like this:
Since the volume was already mounted and in use, remounting it as degraded
was probably not a good idea (or even really necessary).
The wiki, in discussing add/remove and failed drives, goes to great
lengths (big red box) to warn about the current instability of the RAID5/6
format.
I am guessing here, but I _think_ you should do the following...
(0) Back up your data. [okay, this is a test system that you deliberately
perturbed, but still... 8-) ]
Option 1:
(reasonable, but undocumented :: either balance or scrub _ought_ to look
at the disk sectors and trigger some re-copying from the good parts.)
The disk is in the array (again); you may just need to rebalance or
scrub the array to get the data on that drive back in harmony with the
state of the array overall.
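Roughly, and assuming /mnt is still the mount point as in your own
commands, that would be something like this (just a sketch, not tested
here):

  # rewrite all block groups, redistributing data/parity across devices
  btrfs balance start /mnt

  # or: read and verify everything, repairing bad copies from good ones
  btrfs scrub start -B /mnt
  btrfs scrub status /mnt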
Option 2:
(unlikely :: add and remove are about making the geometry larger/smaller
and, as stated, a RAID6 cannot have fewer than 4 drives by definition, so
there is no three-drive geometry for a RAID6.)
re-unplug the device, then use btrfs device delete /dev/sdk /mnt
then re-plug-in the device and use btrfs device add /dev/sdk /mnt
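Spelled out as commands (a sketch; assuming the filesystem actually lives
on the first partition, /dev/sdk1, as your other commands suggest):

  # with the drive unplugged, drop it from the array
  # ("missing" can be given instead of a path for an absent device)
  btrfs device delete missing /mnt

  # plug the drive back in, add it back, then rebalance onto it
  btrfs device add /dev/sdk1 /mnt
  btrfs balance start /mnt

As noted above, the delete step should simply refuse on a four-device
RAID6, since the filesystem cannot drop below four devices.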
Option 3:
(reasonable, but undocumented :: replace by device id -- 4 in your
example case -- instead of system path. This would, I should think, skip
the check of /dev/sdk1's separate status)
btrfs replace start -f 4 /dev/sdk1 /mnt
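If you want to double-check the device id before running that, something
like the following should do it (a sketch, assuming /mnt and devid 4 as in
your example):

  # the devid column of this output is the numeric id to pass to replace
  btrfs filesystem show /mnt

  # replace the old copy (referenced by devid) with the re-attached partition
  btrfs replace start -f 4 /dev/sdk1 /mnt
  btrfs replace status /mnt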
Option 3a:
(need to get /dev/sdk1 back out of the list of active devices for /mnt so
the system won't see /dev/sdk1 as "mounted" (e.g. held by a subsystem))
unplug device.
mount -o remount,degraded etc...
plug in device.
btrfs replace start -f 4 /dev/sdk1 /mnt
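Spelled out with the same options you used for your earlier remount (a
sketch; adjust to taste):

  # with /dev/sdk physically unplugged:
  mount -o remount,rw,degraded,noatime /mnt

  # plug the drive back in, then replace the missing devid 4 with the
  # re-attached partition
  btrfs replace start -f 4 /dev/sdk1 /mnt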
Option 4:
(most likely, most time consuming)
Unplug /dev/sdk. Plug it into another computer and zero a decent chunk
of partition 1 (see the sketch after this option).
Plug it back into the original computer and
do the replace operation as in Option 3.
This is the option most likely to be correct if a simple rebalance or scrub
doesn't work, as you will be presenting the system with three attached
drives, one "missing" drive that will not match any necessary
signatures, and a "new, blank" drive in its place.
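For the zeroing step, something like one of these should do (run on the
other machine, where the drive will very likely show up under a different
device name, so double-check before running either; both are destructive):

  # light-weight: wipe the filesystem signatures from the partition
  wipefs -a /dev/sdXn

  # or brute force: zero the first chunk of the partition
  dd if=/dev/zero of=/dev/sdXn bs=1M count=256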
===
In all cases, you may need to unmount, remount, or remount degraded
somewhere in there, particularly because you have already done so at
least once.