[snip]
>> In this case, it depends on when and how we mark the device as
>> resilvering.
>> If we record the generation at which the write error happened, then we
>> just initiate a scrub for generations greater than that generation.
>
> If we record all the degraded transactions then yes. Not just the last
> failed transaction.

The last successful generation won't be updated until the scrub
succeeds.

>
>> In the list, some people mentioned that LVM/mdraid record the
>> generation when some device(s) get a write error or go missing, and
>> do a self-cure.
>>
>>>
>>> I have been scratching on a fix for this [3] for some time now. Thanks
>>> for the participation. In my understanding we are missing across-tree
>>> parent transid verification at the lowest possible granularity, OR
>>
>> Maybe the newly added first_key and level check could help detect such
>> a mismatch?
>>
>>> the other approach is to modify Liubo's approach to provide a list of
>>> degraded chunks, but without a journal disk.
>>
>> Currently, DEV_ITEM::generation is seldom used (only for the seed
>> sprout case).
>> Maybe we could reuse that member to record the last transaction
>> successfully written to that device and do the above-proposed
>> LVM/mdraid style self-cure?
>
> Recording just the last successful transaction won't help. OR it's
> overkill to fix a write hole.
>
> Transactions: 10 11 [12] [13] [14] <---- write hole ----> [19] [20]
>
> In the above example the disk disappeared at transaction 11, and when
> it reappeared at transaction 19 there were new writes as well as the
> resilver writes,

Then the last good generation will be 11; we will commit the current
transaction as soon as we find a device has disappeared, and we won't
update the last good generation until the scrub finishes (see the
sketch at the bottom of this mail).

> so we were able to write 12 13 14 and 19 20, and then
> the disk disappears again, leaving a write hole.

Only if the auto-scrub finishes within those transactions will the
device have its generation updated; otherwise it will stay at
generation 11.

> Now the next time the
> disk reappears, the last transaction indicates 20 on both disks,
> but leaves a write hole in one of the disks.

That will only happen if the auto-scrub finishes in transaction 20;
otherwise its last successful generation will stay at 11.

> But if you are planning to
> record and start at transaction [14] then it's overkill, because
> transactions [19] and [20] are already on the disk.

Yes, my approach is overkill. But it's still much better than scrubbing
all block groups (my original plan).

Thanks,
Qu

>
> Thanks, Anand
>
>
>> Thanks,
>> Qu
>>
>>> [3] https://patchwork.kernel.org/patch/10403311/
>>>
>>> Further, as we do self-adapting chunk allocation in RAID1, it needs a
>>> balance-convert to fix. IMO at some point we have to provide degraded
>>> raid1 chunk allocation and also modify the scrub to be chunk granular.
>>>
>>> Thanks, Anand
>>>
>>>> Any idea on this?
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> Unlock: btrfs_fs_info::chunk_mutex
>>>>> Unlock: btrfs_fs_devices::device_list_mutex
>>>>>
>>>>> -----------------------------------------------------------------------
>>>>>
>>>>> Thanks, Anand
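To make the scheme above easier to follow, here is a minimal
user-space sketch of the per-device "last successful generation" state
machine, replaying the exact timeline from Anand's example. This is
not btrfs code; every name in it (struct dev_state, commit_transaction(),
scrub_device(), the degraded flag) is hypothetical. In the actual
proposal the marker would be persisted in DEV_ITEM::generation, as
suggested earlier in the thread.

/*
 * Hypothetical sketch of the "last successful generation" self-cure
 * idea -- NOT actual btrfs kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

typedef unsigned long long u64;

struct dev_state {
	u64 last_good_gen;	/* last transaction fully on this device */
	bool present;		/* device currently reachable */
	bool degraded;		/* missed >= 1 commit; needs a scrub */
};

static void commit_transaction(struct dev_state *dev, u64 gen)
{
	if (!dev->present) {
		/* Device missed this commit; freeze the marker. */
		dev->degraded = true;
		printf("txn %llu: committed without the device "
		       "(last good stays %llu)\n", gen, dev->last_good_gen);
		return;
	}
	if (dev->degraded) {
		/* Even though this commit reached the device, don't
		 * advance the marker until a scrub closes the hole. */
		printf("txn %llu: committed, device still degraded "
		       "(last good stays %llu)\n", gen, dev->last_good_gen);
	} else {
		dev->last_good_gen = gen;
		printf("txn %llu: committed on all devices "
		       "(last good now %llu)\n", gen, gen);
	}
}

/* Resilver: rewrite everything newer than last_good_gen. Only a
 * scrub that runs to completion advances the marker. */
static void scrub_device(struct dev_state *dev, u64 current_gen,
			 bool completes)
{
	printf("scrub: rewriting extents with generation > %llu\n",
	       dev->last_good_gen);
	if (completes) {
		dev->last_good_gen = current_gen;
		dev->degraded = false;
		printf("scrub done: last good generation now %llu\n",
		       current_gen);
	} else {
		printf("scrub interrupted: last good generation stays "
		       "%llu\n", dev->last_good_gen);
	}
}

/* Replay the timeline: 10 11 [12] [13] [14] <hole> [19] [20] */
int main(void)
{
	struct dev_state dev = { .present = true };

	commit_transaction(&dev, 10);
	commit_transaction(&dev, 11);

	dev.present = false;			/* disk disappears */
	for (u64 g = 12; g <= 14; g++)
		commit_transaction(&dev, g);

	dev.present = true;			/* disk reappears at 19 */
	commit_transaction(&dev, 19);
	commit_transaction(&dev, 20);
	scrub_device(&dev, 20, false);		/* drops out mid-scrub */

	scrub_device(&dev, 20, true);		/* next reappearance */
	return 0;
}

Running it shows the marker frozen at 11 across the second
disappearance, so the follow-up scrub covers 12 through 20: overkill
for [19] and [20], which are already on the disk, exactly as you point
out, but it cannot leave a write hole behind.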
