On 2019/2/16 1:19 AM, David Sterba wrote:
> On Fri, Feb 15, 2019 at 09:18:03PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2019/2/15 9:10 PM, Nikolay Borisov wrote:
>>>
>>>
>>> On 15.02.19 at 12:50, Qu Wenruo wrote:
>>>> Patchset can be fetched from github:
>>>> https://github.com/adam900710/linux/tree/write_time_tree_checker
>>>> Which is based on v5.0-rc1 tag.
>>>> Also there is no conflict rebasing the patchset to misc-next.
>>>>
>>>> This patchset has the following 3 features:
>>>> - Tree block validation output enhancement
>>>>   * Output validation failure timing (write time or read time)
>>>>   * Always output tree block level/key mismatch error message
>>>>   This part is already submitted and reviewed.
>>>>
>>>> - Write time tree block validation check
>>>>   To catch memory corruption either from hardware or kernel.
>>>>   Example output would be:
>>>>
>>>>   BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
>>>>   BTRFS error (device dm-3): write time tree block corruption detected
>>> This is not good. Those two error messages should be collapsed into
>>> one. Otherwise it's hard to actually match them up.
>>
>> That shouldn't be a problem, since the error won't happen so frequently
>> there is no other error message that could interrupt these 2 lines.
>>
>>> Better output will
>>> be "Corrupt leaf detected during writing: root=..." and eliminate "write
>>> time tree block corruption detected" line. Is that feasible?
>>
>> Feasible, currently tree checker only gets called in 3 locations:
>> 1) read time full checker
>> 2) mark dirty time basic checker
>> 3) write time full checker
>>
>> And they all have different internal bools to indicate the timing, so
>> it's possible to output the timing.
>>
>> But that needs to pass the internal bool down a long long way, for all
>> the output helpers to accept an extra string.
>> I'm not a big fan of that, and prefer a timing neutral tree checker.
>
> I'd rather not merge the error messages, as we'll possibly add more
> sanity checks to various functions so there could be a list of problems
> and there's one final note about when it happened (read time/write
> time).
>
> Matching the lines together is desirable though, so if the block number
> could be part of all messages, I hope this makes it usable for analysis.

This looks much better.
I'll change the timing line to show extra info to match them.

Thanks,
Qu

>
> Reading btree_readpage_end_io_hook, the message should be under the err:
> label, as there are 3 other possible messages printed (bad block start,
> fsid and level).
>
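
For illustration, the layout discussed in this thread (timing-neutral checks that print the block number, plus one final timing line printed by the caller under a shared err: label) could be sketched in plain userspace C roughly as below. This is only a sketch under assumptions: every toy_* name, the check_timing enum and the exact message wording are made up here and are not the kernel code, and keys are reduced to a single integer.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the three call sites listed in the thread. */
enum check_timing { CHECK_READ, CHECK_MARK_DIRTY, CHECK_WRITE };

static const char *timing_name(enum check_timing t)
{
	switch (t) {
	case CHECK_READ:       return "read time";
	case CHECK_MARK_DIRTY: return "mark dirty time";
	case CHECK_WRITE:      return "write time";
	}
	return "unknown";
}

/* Toy tree block: just enough fields to trigger two different checks. */
struct toy_block {
	uint64_t bytenr;        /* printed as block=... in every message */
	int level;
	int expected_level;
	uint64_t prev_key;      /* simplified: one integer instead of a full key */
	uint64_t cur_key;
};

/*
 * Timing-neutral checker: it does not know who called it, but each
 * message carries the block number so it can be matched up later.
 * Returns 0 if the block looks valid, -1 otherwise.
 */
static int toy_check_block(const struct toy_block *b, char *log, size_t len)
{
	if (b->level != b->expected_level) {
		snprintf(log, len,
			 "corrupt block: block=%llu, level mismatch, have %d expect %d\n",
			 (unsigned long long)b->bytenr, b->level, b->expected_level);
		return -1;
	}
	if (b->prev_key >= b->cur_key) {
		snprintf(log, len,
			 "corrupt leaf: block=%llu, bad key order, prev %llu current %llu\n",
			 (unsigned long long)b->bytenr,
			 (unsigned long long)b->prev_key,
			 (unsigned long long)b->cur_key);
		return -1;
	}
	return 0;
}

/*
 * Caller: only this level knows the timing. Every failure path jumps to
 * err:, so the final timing line follows whichever detail message was
 * printed. The caller must pass a buffer with len > 0.
 */
static int toy_validate(const struct toy_block *b, enum check_timing t,
			char *log, size_t len)
{
	size_t used;
	int ret;

	log[0] = '\0';
	ret = toy_check_block(b, log, len);
	if (ret < 0)
		goto err;
	return 0;

err:
	used = strlen(log);
	snprintf(log + used, len - used,
		 "block=%llu %s tree block corruption detected\n",
		 (unsigned long long)b->bytenr, timing_name(t));
	return ret;
}
```

With this shape the checker itself never takes a timing argument, so nothing has to be passed down through the output helpers; the block number repeated in the final line is what ties it to the detail line above it.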
