Re: [PATCH v4 2/5] btrfs: use BIOs instead of buffer_heads from superblock writeout

On Thu, Feb 06, 2020 at 08:20:16AM +0000, Johannes Thumshirn wrote:
> >> @@ -3497,9 +3506,23 @@ static int write_dev_supers(struct btrfs_device *device,
> >>   		op_flags = REQ_SYNC | REQ_META | REQ_PRIO;
> >>   		if (i == 0 && !btrfs_test_opt(device->fs_info, NOBARRIER))
> >>   			op_flags |= REQ_FUA;
> > 
> > Question on the existing code:  why is it safe to not use FUA for the
> > subsequent superblocks?
> > 
> >> +
> >> +		/*
> >> +		 * Directly use BIOs here instead of relying on the page-cache
> >> +		 * to do I/O, so we don't lose the ability to do integrity
> >> +		 * checking.
> >> +		 */
> >> +		bio = bio_alloc(gfp_mask, 1);
> >> +		bio_set_dev(bio, device->bdev);
> >> +		bio->bi_iter.bi_sector = bytenr >> SECTOR_SHIFT;
> >> +		bio->bi_private = device;
> >> +		bio->bi_end_io = btrfs_end_super_write;
> >> +		bio_add_page(bio, page, BTRFS_SUPER_INFO_SIZE,
> >> +			     offset_in_page(bytenr));
> > 
> > Missing return value check.  But given that it is a single page and
> > can't error out please switch to __bio_add_page here.
> 
> Good question, I guess it's safer to always FUA the SBs

That is a performance optimization IIRC: only the primary superblock is
written with FUA. The backup superblocks are not, as this would add 2
more flushes, which are considered expensive.
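
For illustration, the flag selection in the quoted hunk can be modelled
in userspace roughly as below. This is only a sketch: the MODEL_REQ_*
values are stand-ins and not the kernel's REQ_* definitions, and
sb_write_flags() is a hypothetical helper that does not exist in btrfs.

```c
#include <stdbool.h>

/* Stand-in flag bits for this userspace model; NOT the kernel's values. */
enum {
	MODEL_REQ_SYNC = 1u << 0,
	MODEL_REQ_META = 1u << 1,
	MODEL_REQ_PRIO = 1u << 2,
	MODEL_REQ_FUA  = 1u << 3,
};

/*
 * Hypothetical helper: pick the write flags for superblock copy 'mirror'
 * (0 is the primary). 'nobarrier' models the NOBARRIER mount option.
 */
static unsigned int sb_write_flags(int mirror, bool nobarrier)
{
	unsigned int op_flags = MODEL_REQ_SYNC | MODEL_REQ_META | MODEL_REQ_PRIO;

	/* Only the primary copy is forced to stable storage. */
	if (mirror == 0 && !nobarrier)
		op_flags |= MODEL_REQ_FUA;

	return op_flags;
}
```

With this model, sb_write_flags(0, false) carries the FUA bit, while
mirrors 1 and 2 (and every mirror under nobarrier) do not.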

The trade-off is optimistic because the backup superblocks are almost
never needed. In the common power-fail situation the primary will be
there or not, atomically; the non-FUA writes of the secondary
superblocks will perhaps be delayed a bit. The scenario where the
primary sb is unexpectedly damaged would have to happen in the short
window between the primary FUA and the backup writes reaching the
device, so that the current version of the sb is not available.
Something like this:

  write primary sb
1 FUA

  write backup copy 1
  other writes
  write backup copy 2
  other writes
2 FUA (or equivalent, flushing the copies to the device)

The window is between 1 and 2, and if some divine force kills the
primary sb, the backup copies are not permanently stored yet. That makes
recovery of the last transaction tricky, but there are still the backup
superblocks with the previous intact version.

With FUA after each backup, the window would be shortened, with only 2
blocks written, allowing access to the latest transaction, or possibly
the previous one too, depending on where exactly the write sequence is
interrupted.
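
The two policies can be compared with a toy userspace model. This is a
sketch under stated assumptions, not how any real device behaves: the
device has a volatile cache, a FUA write lands on stable media
immediately, a non-FUA write stays in the cache, and a power failure
before point 2 loses everything still cached (the worst case). All names
(model_dev, model_write, newest_backup_after_crash) are hypothetical.

```c
#include <stdbool.h>

#define NR_SB_COPIES 3	/* primary + 2 backups, as in the timeline above */

/* Toy device: per superblock copy, what is on media and what is cached. */
struct model_dev {
	unsigned long durable[NR_SB_COPIES];	/* generation on stable media */
	unsigned long cached[NR_SB_COPIES];	/* generation in volatile cache */
};

/* A FUA write reaches the media; a plain write only reaches the cache. */
static void model_write(struct model_dev *dev, int copy,
			unsigned long gen, bool fua)
{
	dev->cached[copy] = gen;
	if (fua)
		dev->durable[copy] = gen;
}

/* Worst case on power failure: whatever was only in the cache is gone. */
static void model_power_fail(struct model_dev *dev)
{
	for (int i = 0; i < NR_SB_COPIES; i++)
		dev->cached[i] = dev->durable[i];
}

/*
 * Start with all copies at prev_gen, issue the first 'writes_before_crash'
 * superblock writes of new_gen (primary first, always FUA), then lose power
 * before the final flush. Assume the primary is destroyed and return the
 * newest generation that survives in the backup copies.
 */
static unsigned long newest_backup_after_crash(unsigned long prev_gen,
					       unsigned long new_gen,
					       bool fua_backups,
					       int writes_before_crash)
{
	struct model_dev dev;

	for (int i = 0; i < NR_SB_COPIES; i++) {
		dev.durable[i] = prev_gen;
		dev.cached[i] = prev_gen;
	}

	for (int i = 0; i < NR_SB_COPIES && i < writes_before_crash; i++)
		model_write(&dev, i, new_gen, i == 0 || fua_backups);

	model_power_fail(&dev);

	/* Copy 0 is the (destroyed) primary, so only look at the backups. */
	unsigned long newest = dev.durable[1];
	for (int i = 2; i < NR_SB_COPIES; i++)
		if (dev.durable[i] > newest)
			newest = dev.durable[i];

	return newest;
}
```

In this model, with fua_backups == false a crash anywhere between 1 and
2 leaves the backups at prev_gen. With fua_backups == true, the backups
already hold new_gen once the second write has been issued, and only a
crash right after the primary write leaves them at prev_gen.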

The above describes a possible scenario, but I consider it quite rare to
hit in practice. It also depends on the device, which should not just
skip writes or FUAs. So the performance optimization is IMO justified.


