On Fri, May 02, 2014 at 01:21:50PM -0600, Chris Murphy wrote:
>
> On May 2, 2014, at 2:23 AM, Duncan <1i5t5.duncan@xxxxxxx> wrote:
> >
> > Something tells me btrfs replace (not device replace, simply replace)
> > should be moved to btrfs device replace…
>
> The syntax for "btrfs device" is different though; replace is like
> balance: btrfs balance start and btrfs replace start. And you can also
> get a status on it. We don't (yet) have options to stop, start, resume,
> which could maybe come in handy for long rebuilds and a reboot is
> required (?) although maybe that just gets handled automatically: set
> it to pause, then unmount, then reboot, then mount and resume.
>
> > Well, I'd say two copies if it's only two devices in the raid1... would
> > be true raid1. But if it's say four devices in the raid1, as is
> > certainly possible with btrfs raid1, that if it's not mirrored 4-way
> > across all devices, it's not true raid1, but rather some sort of hybrid
> > raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if
> > arranged that way, or some form that doesn't nicely fall into a well
> > defined raid level categorization.
>
> Well, md raid1 is always n-way. So if you use -n 3 and specify three
> devices, you'll get 3-way mirroring (3 mirrors). But I don't know any
> hardware raid that works this way. They all seem to be raid 1 is
> strictly two devices. At 4 devices it's raid10, and only in pairs.
>
> Btrfs raid1 with 3+ devices is unique as far as I can tell. It is
> something like raid1 (2 copies) + linear/concat. But that allocation is
> round robin. I don't read code but based on how a 3 disk raid1 volume
> grows VDI files as it's filled it looks like 1GB chunks are copied like
> this
>
> Disk1   Disk2   Disk3
> 134     124     235
> 679     578     689
>
> So 1 through 9 each represent a 1GB chunk. Disk 1 and 2 each have a
> chunk 1; disk 2 and 3 each have a chunk 2, and so on. Total of 9GB of
> data taking up 18GB of space, 6GB on each drive. You can't do this with
> any other raid1 as far as I know. You do definitely run out of space on
> one disk first though because of uneven metadata to data chunk
> allocation.

   The algorithm is that when the chunk allocator is asked for a
block group (in pairs of chunks for RAID-1), it picks the number of
chunks it needs, from different devices, in order of the device with
the most free space. So, with disks of size 8, 4, 4, you get:

Disk 1: 12345678
Disk 2: 1357
Disk 3: 2468

and with 8, 8, 4, you get:

Disk 1: 1234568A
Disk 2: 1234579A
Disk 3: 6789

   Hugo.

> Anyway I think we're off the rails with raid1 nomenclature as soon as
> we have 3 devices. It's probably better to call it replication, with an
> assumed default of 2 replicates unless otherwise specified.
>
> There's definitely a benefit to a 3 device volume with 2 replicates,
> efficiency wise. As soon as we go to four disks 2 replicates it makes
> more sense to do raid10, although I haven't tested odd device raid10
> setups so I'm not sure what happens.
>
>
> Chris Murphy
>

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
          --- Prisoner unknown: Return to Zenda. ---
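
For anyone who wants to play with the allocation rule described in the reply above ("most free space first, chunks on different devices"), here is a minimal Python sketch. It is only a toy model, not the btrfs allocator: the function name allocate_raid1 is made up for illustration, device sizes are whole numbers of chunks, data and metadata chunks are not distinguished, and ties in free space go to the lower-numbered device, which is an assumption. With sizes 8, 4, 4 it reproduces the layout listed above; with 8, 8, 4 the per-disk totals come out the same, but the order of the last few chunks can differ depending on how ties are broken.

```python
def allocate_raid1(sizes, copies=2):
    """Toy model of RAID-1 chunk allocation.

    sizes  -- capacity of each device, in whole chunks
    copies -- number of devices each chunk is written to (2 for RAID-1)
    Returns one list per device holding the chunk numbers stored on it.
    """
    free = list(sizes)
    layout = [[] for _ in sizes]
    chunk = 0
    while True:
        # Devices with room left, most free space first.
        # Assumption: ties are broken by lower device index.
        candidates = sorted(
            (i for i, f in enumerate(free) if f > 0),
            key=lambda i: (-free[i], i),
        )
        if len(candidates) < copies:
            break  # not enough distinct devices for another chunk pair
        chunk += 1
        for i in candidates[:copies]:
            free[i] -= 1
            layout[i].append(chunk)
    return layout


if __name__ == "__main__":
    for n, chunks in enumerate(allocate_raid1([8, 4, 4]), start=1):
        print(f"Disk {n}: {''.join(str(c) for c in chunks)}")
    # Disk 1: 12345678
    # Disk 2: 1357
    # Disk 3: 2468
```

Calling allocate_raid1([6, 6, 6]) models the equal-size three-disk case from the quoted text: nine chunks, six on each disk, though which disk pairs hold which chunk again depends on the tie-break.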
