Re: Unexpected raid1 behaviour

>> The fact is, the only cases where this is really an issue are
>> if you've either got intermittently bad hardware, or are
>> dealing with external

> Well, RAID1+ is all about failing hardware.

>> storage devices. For the majority of people who are using
>> multi-device setups, the common case is internally connected
>> fixed storage devices with properly working hardware, and for
>> that use case, it works perfectly fine.

> If you're talking about "RAID"-0 or storage pools (volume
> management), that is true. But if you imply that RAID1+ "works
> perfectly fine as long as the hardware works fine", this is
> fundamentally wrong.

I really agree with this; the argument about "properly working
hardware" is utterly ridiculous. I'll add to this: apparently I
am not the first one to discover the "anomalies" in the "RAID"
profiles, but I may have been the first to document some of
them, e.g. the famous issues with the 'raid1' profile. How did I
discover them? Well, I had used Btrfs in single-device mode for
a bit and wanted to try multi-device, and the docs seemed
"strange", so I ran tests before putting it to use.

The tests were simple: on a spare PC with a bunch of old disks,
create two block devices (partitions), put them in 'raid1'
(first natively, then by adding a new member to an existing
single-device filesystem), and then 'remove' one, or simply
unplug it (actually 'echo 1 > /sys/block/.../device/delete').
I wanted to check exactly what happened: resync times, speed,
behaviour and speed when degraded, just ordinary operational
tasks.
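
For reference, this is roughly the sequence I mean (a sketch
only; the device names /dev/sdb1 and /dev/sdc1 and the mount
point /mnt are placeholders, and these commands destroy data):

    # Native two-device 'raid1' for data and metadata.
    mkfs.btrfs -m raid1 -d raid1 /dev/sdb1 /dev/sdc1
    mount /dev/sdb1 /mnt

    # Alternative path: start single-device, then convert.
    mkfs.btrfs /dev/sdb1
    mount /dev/sdb1 /mnt
    btrfs device add /dev/sdc1 /mnt
    btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt

    # Planned removal of a member.
    btrfs device remove /dev/sdc1 /mnt

    # Surprise removal, as if the disk had died (note: the
    # whole disk sdc here, not the partition).
    echo 1 > /sys/block/sdc/device/delete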

Well, I found significant problems after less than one hour. I
can't imagine anyone with some experience of hw or sw RAID
(especially hw RAID, as hw RAID firmware is often fantastically
buggy, especially as to RAID operations) who wouldn't have done
the same tests before operational use, and wouldn't have found
the same issues straight away. The only conclusion I could draw
is that whoever designed the "RAID" profiles had zero
operational system administration experience.

> If the hardware needs to work properly for the RAID to work
> properly, no one would need the RAID in the first place.

It is not just that: some maintenance operations are needed
even if the hardware works properly, for example preventive
maintenance, replacing drives that are getting too old,
expanding capacity, and periodically testing the hardware.
Systems engineers don't just say "it works, let's assume it
continues to work properly, why worry".
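
Concretely, the sort of routine operations I mean (again just a
sketch; the mount point /mnt, the devid 2 and the device names
are hypothetical):

    btrfs scrub start /mnt                 # periodic integrity check
    btrfs replace start 2 /dev/sdd1 /mnt   # preventively swap out devid 2
    btrfs device add /dev/sde1 /mnt        # expand capacity
    btrfs balance start /mnt               # respread chunks over members
    btrfs device usage /mnt                # check per-device allocation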

My impression is that multi-device and "chunks" were designed
one way by someone, and someone else did not understand the
intent, confused them with "RAID", and based the 'raid' profiles
on that confusion. For example, the 'raid10' profile seems the
least confused to me, and I think that's because the "RAID"
aspect is kept more distinct from the "multi-device" aspect. But
perhaps I am an optimist...

To simplify a longer discussion: to have "RAID" one needs an
explicit design concept of "stripe", which in Btrfs would need
to be quite different from that of "set of member devices" and
"chunks", so that, for example, adding/removing to a "stripe" is
not quite the same thing as adding/removing members to a volume.
One also needs a distinction between online and offline members,
not just added and removed ones, and well-defined state-machine
transitions among all those states (e.g. in response to hardware
problems), like in MD RAID. But the importance of such
distinctions may not be apparent to everybody.
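
To illustrate what I mean by explicit member states, this is
how MD exposes them (device and array names are of course
hypothetical):

    mdadm --fail /dev/md0 /dev/sdc1      # active -> faulty; array runs degraded
    mdadm --remove /dev/md0 /dev/sdc1    # faulty -> removed from the array
    mdadm --add /dev/md0 /dev/sdc1      # re-added as spare, then resynced
    cat /sys/block/md0/md/dev-sdc1/state # e.g. "in_sync", "faulty", "spare"

Every member is always in a well-defined state, and the
transitions between those states are explicit.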

But I have read comments in which "block device" (a data
container on some medium), "block device inode" (a descriptor
for that) and "block device name" (a path to a "block device
inode") were hopelessly confused, so I don't hold out a lot of
hope. :-(
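
The distinction is easy to demonstrate, by the way (assuming a
disk the kernel enumerated as sda):

    # Many names may point at the same underlying device, which
    # is identified by its major:minor numbers.
    stat -c '%N major:minor=%t:%T' /dev/sda
    ls -l /dev/disk/by-id/ | grep -w sda   # alternative names, same device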