Re: [PATCH v2 2/3] btrfs: add read_mirror_policy parameter devid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019-01-22 13:43, Anand Jain wrote:
On 01/21/2019 07:56 PM, Steven Davies wrote:
On 2018-05-16 11:03, Anand Jain wrote:

+                break;
+            }

I noticed that it's possible to pass this option multiple times at mount, which sets multiple devices as read mirrors. While that doesn't do anything harmful, the code below will only use the first device. It may be worth at least mentioning this in documentation.

 There were few feedback if read_mirror_policy should be a mount
 option or a sysfs or a property. IMO property is better as it would
 be persistent. In sysfs and mount-option, the user or a config file
 has to remember. Will fix.

I agree. Would/could this be set at mkfs time? It would then be similar to how mdadm's --write-mostly is set up.

+
             ret = -EINVAL;
             goto out;
         case Opt_err:
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 64dba5c4cf33..3a9c3804ff17 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5220,6 +5220,16 @@ static int find_live_mirror(struct
btrfs_fs_info *fs_info,
         num_stripes = map->num_stripes;

     switch(fs_info->read_mirror_policy) {
+    case BTRFS_READ_MIRROR_BY_DEV:
+        preferred_mirror = first;
+        if (test_bit(BTRFS_DEV_STATE_READ_MIRROR,
+                 &map->stripes[preferred_mirror].dev->dev_state))
+            break;
+        if (test_bit(BTRFS_DEV_STATE_READ_MIRROR,
+                 &map->stripes[++preferred_mirror].dev->dev_state)) <--[*]
+            break;
+        preferred_mirror = first;

Why set preferred_mirror again? The only effect of re-setting it will be to use the lowest devid (1) rather than the highest (2). Is there any difference? Either way it should never happen, because the above code traps it (except when the BTRFS_DEV_STATE_READ_MIRROR flag has been changed as mentioned above).

 Code at [*] above does ++preferred_mirror. So the following
   preferred_mirror = first;
 is not redundant.

Yes, but preferred_mirror will be first + num_stripes; meaning that without that line the last stripe is preferred and with it the first stripe is preferred. It only changes the fallback case from reading from devid 2 to devid 1, which barely matters. Perhaps it would be even better to fall back to pid % num_stripes in this case anyway?

From Goffredo's comment[1] on Timofey's similar effect patch, if it becomes possible to have more mirrors in a RAID1/RAID10 fs then this code will need to be updated to test dev_state for more than two devices. Would it be sensible to implement this as a for loop straight away?

 Right. A loop is better, will add.

Thinking more about this, if it becomes possible to have more than two devices in a RAID1 or part-RAID10 then do we also need to consider that there may be more than one READ_MIRROR drive? e.g. slow+fast+fast drives in a RAID1 with this approach would result in all chunks being read from the first drive with READ_MIRROR set if both chunks are on the fast drives. This is where Timofey's queue length patch in [1] would become useful - in any case, that is an existing problem and probably deserving of an entirely new patch.

[1]: https://patchwork.kernel.org/patch/10678531/#22315779


 Thanks,
Anand.

Steve



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux