On 11.03.19 at 14:35, Qu Wenruo wrote:
>
>
> On 2019/3/11 8:26 PM, Nikolay Borisov wrote:
>>
>>
>> On 11.03.19 at 3:17, Qu Wenruo wrote:
>>>
>>>
>>> On 2019/3/11 7:09 AM, Chris Murphy wrote:
>>>> In the case where superblock 0 at 65536 is valid but stale (older than
>>>> the others):
>>>
>>> Then this means either the fs is fuzzed, or the FUA implementation of
>>> the disk is completely screwed up.
>>>
>>> Btrfs kernel submit super blocks as the following sequence:
>>> 1) wait all metadata write
>>> 2) flush
>>> 3) FUA the primary superblock
>>
>> SATA devices generally do not have FUA support. For example my evo 850
>> ssds do not support it nor does my evo 860 PRO. IMO not having
>> functioning FUA seems to be the norm rather than an exception.
>
> Kernel block layer will translate FUA to write + flush.

Where exactly does this happen?

> So in that case we will do:
>
> 1) wait all metadata write
> 2) flush
> 3) write first sb, flush
> 4) write backup sb
>
> For FUA -> write + flush, it's less atomic than native FUA, but it
> should be good enough for pseudo-atomic.
>
> Thanks,
> Qu
>
>>
>>
>>> 4) write the backup superblocks
>>>
>>> If backup is newer than primary, then the FUA write doesn't reach disk
>>> before normal write.
>>> This means any fs could be corrupted on that disk, not only btrfs.
>>>
>>>>
>>>> 1. btrfs check doesn't complain, the stale super is used for the check
>>>> 2. when mounting, super 0 is used, no complaints at mount time, fairly
>>>> quickly the newer supers are overwritten
>>>
>>> The reason why kernel doesn't search backup roots is to avoid stale btrfs.
>>> For case like mkfs.btrfs -> do btrfs write -> mkfs.xfs -> try mount as
>>> btrfs again, this would cause problems.
>>>
>>> So IMHO always use the primary superblock is the designed behavior.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Is this expected? In particular, in lieu of `btrfs rescue super`
>>>> behavior which considers super 0 a bad super, and offers to fix it
>>>> from the newer ones, and when I answer y, it replaces super 0 with
>>>> newer information from the other supers.
>>>>
>>>> I think the `btrfs rescue` behavior is correct. I would expect that
>>>> all the supers are read at mount time, and if there's discrepancy that
>>>> either there's code to suspiciously sanity check the latest roots in
>>>> the newest super, or it flat out fails to mount. Mounting based on
>>>> stale super data seems risky doesn't it?
>>>>
>>>
>
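
To make the stale-primary case concrete, here is a minimal sketch of the
check Chris describes: compare the generation of every valid superblock
copy and flag the primary when a backup is newer. SuperCopy, newest_valid
and primary_is_stale are hypothetical names used only for illustration.
This is neither the kernel mount path nor the btrfs-progs code, and it
assumes the caller has already read each copy and verified its magic and
checksum.

```python
from dataclasses import dataclass
from typing import List, Optional

# Byte offsets of the three btrfs superblock copies (64 KiB, 64 MiB, 256 GiB).
SUPER_OFFSETS = (65536, 64 * 1024 * 1024, 256 * 1024 ** 3)


@dataclass
class SuperCopy:
    """Hypothetical summary of one superblock copy (not a kernel structure)."""
    offset: int       # where the copy lives on the device
    generation: int   # transaction generation recorded in the copy
    valid: bool       # assumption: magic and checksum were checked by the caller


def newest_valid(copies: List[SuperCopy]) -> Optional[SuperCopy]:
    """Return the valid copy with the highest generation, or None."""
    valid = [c for c in copies if c.valid]
    return max(valid, key=lambda c: c.generation) if valid else None


def primary_is_stale(copies: List[SuperCopy]) -> bool:
    """True when the primary copy is valid but older than some valid backup."""
    primary = next((c for c in copies if c.offset == SUPER_OFFSETS[0]), None)
    newest = newest_valid(copies)
    if primary is None or not primary.valid or newest is None:
        return False  # a missing or invalid primary is a different failure
    return primary.generation < newest.generation
```

Whether a mount should refuse, warn, or fall back when primary_is_stale()
returns True is exactly the policy question in this thread; the sketch
only detects the discrepancy.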
