Re: [PATCH 0/6] btrfs: metadata uuid fixes and enhancements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/12/13 4:03 PM, Nikolay Borisov wrote:


On 12.12.19 г. 13:01 ч., damenly.su@xxxxxxxxx wrote:
From: Su Yue <Damenly_Su@xxxxxxx>

This patchset fixes one reproducible bug and add two split-brain
cases ignored.

The origin code thinks the final state of successful synced device is
always has INCOMPAT_METADATA_UUID feature. However, a device without
the feature flag can be the one pull into disk. This is what handled
in the patchset. Test images are added in btrfs-progs part.

Patch[1] fixes a bug about wrong fsid copy.
Patch[2] is for the later patches.
Patch[3-5] add the forgotten cases.
Patch[6] just does simple code movement for grace.

The set passes xfstests-dev without regressions.

Su Yue (6):
   btrfs: metadata_uuid: fix failed assertion due to unsuccessful device
     scan
   btrfs: metadata_uuid: move split-brain handling from fs_id() to new
     function
   btrfs: split-brain case for scanned changing device with
     INCOMPAT_METADATA_UUID
   btrfs: split-brain case for scanned changed device without
     INCOMPAT_METADATA_UUID
   btrfs: copy fsid and metadata_uuid for pulled disk without
     INCOMPAT_METADATA_UUID
   btrfs: metadata_uuid: move partly logic into find_fsid_inprogress()

  fs/btrfs/volumes.c | 193 +++++++++++++++++++++++++++++----------------
  1 file changed, 125 insertions(+), 68 deletions(-)



I'm currently on holiday but the fsid change feature has a design
document here:
https://github.com/btrfs/btrfs-dev-docs/blob/master/fsid-change.txt

it lists all the cases I have handled. If you think there are other
please first describe them in prose following the parlance set out in
the document to ease reasoning.


Thanks. I am going to do it.
The following is just a rough version for people to understand.
The pull request version will be more official like expressions in
btrfs-dev-docs.


> 4. Failure during transaction x + 1. When such a failure happens the
> filesystem in question will be partitioned in two sets P and Q. Where P
> would have the C flag and a new value for fsid written to it as well
as the
> old FSID value written in the ‘metadata_uuid’ field. In contrast Q would
> have just the old fsid and the IP flag, also the superblock’s generation
> number of disks in P will be higher than those in Q. Again two cases
needs
> to be handled:

I won't argue the Q has the old fsid and IP flag. There is another state
of P.


0) There are two devices with fsid A, without metadata_uuid. E.g
       d1[A, 0, 0]
       d2[A, 0, 0]
       (The first element is the FSID, the 2nd is METADATA_UUID, the 3rd is
        the incompat flag bits)

1) After running "btrfstune -M B":
       d1[B, A, METADATA_UUID]
       d2[B, A, METADATA_UUID]

2) During "btrfstune -M a",
2.1) After first btrfs_commit_transaction() of set_metadata_uuid() finished:

       d1[B, A, METADATA_UUID | FSID_CHANGING_V2]
       d2[B, A, METADATA_UUID | FSID_CHANGING_V2]

2.2) Then run into the branch  btrfstune.c: line 141.

	    if (new_uuid && uuid_changed && memcmp(disk_super->metadata_uuid,
                                               new_fsid,
BTRFS_FSID_SIZE) == 0) {
                /*
                 * Changing fsid to be the same as metadata uuid, so just
                 * disable the flag
                 */
                memcpy(disk_super->fsid, &new_fsid, BTRFS_FSID_SIZE);
                incompat_flags &= ~BTRFS_FEATURE_INCOMPAT_METADATA_UUID;
                btrfs_set_super_incompat_flags(disk_super, incompat_flags);
                memset(disk_super->metadata_uuid, 0, BTRFS_FSID_SIZE);

     Now @disk_super is [A, 0, FSID_CHANGING_V2], without METADATA_UUID
flag.

2.3) Then we go to the final btrfs_commit_transaction() of
set_metadata_uuid().
     But powerloss happened, and only d1 get synced:

      d1[A, 0, 0]                                 ---> P
      d2[B, A, METADATA_UUID | FSID_CHANGING_V2]  ---> Q



     So there are 2 cases you forgot to consider.
     - P is scanned first, then Q.
     - Q is scanned first, then P.






[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux