Re: [BUG][PATCH] btrfs: a mixed profile DUP and RAID1C3/RAID1C4 prevent to alloc a new chunk

On 5/31/20 2:51 AM, Qu Wenruo wrote:


On 2020/5/31 2:53 AM, Goffredo Baroncelli wrote:

Hi All,

after the thread "Question: how understand the raid profile of a btrfs
filesystem" [*] I was working to clean up the function
btrfs_reduce_alloc_profile(), when I found that it was incompatible with
a mixed profile of DUP and RAID1C3/RAID1C4.

This is a very uncommon situation; to be honest, it is very unlikely that it
will happen at all.

However, if the filesystem has mixed profiles of DUP and RAID1C3/RAID1C4 (for
the same type of chunk, of course; i.e. if you have metadata RAID1C3 and data
DUP there is no problem, because the chunk types are different), the function
btrfs_reduce_alloc_profile() returns both profiles, and subsequent calls
to alloc_profile_is_valid() report the result as invalid.

The problem is how the function btrfs_reduce_alloc_profile "reduces" the
profiles.

[...]
static u64 btrfs_reduce_alloc_profile(struct btrfs_fs_info *fs_info, u64 flags)
[...]
         allowed &= flags;

         if (allowed & BTRFS_BLOCK_GROUP_RAID6)
                 allowed = BTRFS_BLOCK_GROUP_RAID6;
         else if (allowed & BTRFS_BLOCK_GROUP_RAID5)
                 allowed = BTRFS_BLOCK_GROUP_RAID5;
         else if (allowed & BTRFS_BLOCK_GROUP_RAID10)
                 allowed = BTRFS_BLOCK_GROUP_RAID10;
         else if (allowed & BTRFS_BLOCK_GROUP_RAID1)
                 allowed = BTRFS_BLOCK_GROUP_RAID1;
         else if (allowed & BTRFS_BLOCK_GROUP_RAID0)
                 allowed = BTRFS_BLOCK_GROUP_RAID0;

         flags &= ~BTRFS_BLOCK_GROUP_PROFILE_MASK;

[...]

"allowed" contains all the profiles allowed by the disks.
"flags" contains the existing profiles.

If "flags" contains both DUP and RAID1C3, no reduction is performed, because
the if-chain shown above has no branch for DUP, RAID1C3 or RAID1C4, and both
profiles are returned.

If a full conversion from DUP to RAID1C3 is performed, there is no problem.
But if only a partial conversion from DUP to RAID1C3 is performed, then it is
no longer possible to allocate a new chunk.
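
To make the failure mode concrete, here is a minimal userspace sketch of the
reduction chain quoted above. It is not the kernel code: the BG_* names,
reduce_profile(), profile_is_reduced() and demo() are hypothetical, the bit
values are assumed to mirror the kernel's block group flags, and the validity
check is a simplification (valid means at most one profile bit is left).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits; assumed to mirror the kernel's block group flags. */
#define BG_METADATA (1ULL << 2)
#define BG_RAID0    (1ULL << 3)
#define BG_RAID1    (1ULL << 4)
#define BG_DUP      (1ULL << 5)
#define BG_RAID10   (1ULL << 6)
#define BG_RAID5    (1ULL << 7)
#define BG_RAID6    (1ULL << 8)
#define BG_RAID1C3  (1ULL << 9)
#define BG_RAID1C4  (1ULL << 10)
#define BG_PROFILE_MASK (BG_RAID0 | BG_RAID1 | BG_DUP | BG_RAID10 | \
                         BG_RAID5 | BG_RAID6 | BG_RAID1C3 | BG_RAID1C4)

/* Hypothetical copy of the if-chain from the excerpt above. */
uint64_t reduce_profile(uint64_t allowed, uint64_t flags)
{
        allowed &= flags;

        if (allowed & BG_RAID6)
                allowed = BG_RAID6;
        else if (allowed & BG_RAID5)
                allowed = BG_RAID5;
        else if (allowed & BG_RAID10)
                allowed = BG_RAID10;
        else if (allowed & BG_RAID1)
                allowed = BG_RAID1;
        else if (allowed & BG_RAID0)
                allowed = BG_RAID0;
        /* No branch for DUP, RAID1C3 or RAID1C4: they pass through as-is. */

        flags &= ~BG_PROFILE_MASK;
        return flags | allowed;
}

/* Simplified validity check: at most one profile bit may remain set. */
int profile_is_reduced(uint64_t flags)
{
        uint64_t p = flags & BG_PROFILE_MASK;

        return (p & (p - 1)) == 0;
}

/* Compare a mix the chain handles with the DUP + RAID1C3 mix. */
void demo(void)
{
        uint64_t handled = reduce_profile(BG_PROFILE_MASK,
                                          BG_METADATA | BG_RAID0 | BG_RAID1);
        uint64_t stuck = reduce_profile(BG_PROFILE_MASK,
                                        BG_METADATA | BG_DUP | BG_RAID1C3);

        assert(handled == (BG_METADATA | BG_RAID1));  /* reduced to RAID1 */
        assert(profile_is_reduced(handled));
        assert(stuck == (BG_METADATA | BG_DUP | BG_RAID1C3)); /* unchanged */
        assert(!profile_is_reduced(stuck));
}
```

Calling demo() from a main() should pass silently under these assumptions: a
RAID0 + RAID1 mix is collapsed to RAID1, while a DUP + RAID1C3 mix keeps two
profile bits and so fails the single-profile check.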

In my tests the filesystem was never corrupted, only forced to RO.
However, I was unable to exit from this state without my patch.

This in fact exposed the long-existing bug that btrfs has no on-disk
indicator for the target chunk type, thus we need to be "creative" to
handle chunk profiles.

Fully agree; this patch is... a patch to correct a bug. Switching to
a persistent field is a lot more complicated.


I'm wondering if we could add a new persistent item in the chunk tree or the
super block to solve the problem.

I suggest adding a new object in the trees. I think that the superblock should
be reserved for info which allows detecting / identifying the filesystem (i.e.
FSID - Disk UUID, Label, basic data for the boot) and nothing more.

Moreover, with an object stored in the tree, it would be possible to
think of an extensible structure (an object has a size)
to allow future expansion.


Any idea on this, David?

Thanks,
Qu


[*] https://lore.kernel.org/linux-btrfs/517dac49-5f57-2754-2134-92d716e50064@xxxxxxxx/



--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


