On 2018-07-04 17:46, Qu Wenruo wrote:
>
>
> On 2018-07-04 15:08, Nikolay Borisov wrote:
>>
>>
>> On 3.07.2018 12:10, Qu Wenruo wrote:
>>> If a crafted btrfs has missing block group items, it could cause
>>> unexpected behavior and breaks our expectation on 1:1
>>> chunk<->block group mapping.
>>>
>>> Although we added block group -> chunk mapping check, we still need
>>> chunk -> block group mapping check.
>>>
>>> This patch will do extra check to ensure each chunk has its
>>> corresponding block group.
>>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847
>>> Reported-by: Xu Wen <wen.xu@xxxxxxxxxx>
>>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
>>> ---
>>>  fs/btrfs/extent-tree.c | 52 +++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 51 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index 82b446f014b9..746095034ca2 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -10038,6 +10038,56 @@ static int check_exist_chunk(struct btrfs_fs_info *fs_info, u64 start, u64 len,
>>>  	return ret;
>>>  }
>>>
>>> +/*
>>> + * Iterate all chunks and verify each of them has corresponding block group
>>> + */
>>> +static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
>>> +{
>>> +	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
>>> +	struct extent_map *em;
>>> +	struct btrfs_block_group_cache *bg;
>>> +	u64 start = 0;
>>> +	int ret = 0;
>>> +
>>> +	while (1) {
>>> +		read_lock(&map_tree->map_tree.lock);
>>> +		em = lookup_extent_mapping(&map_tree->map_tree, start,
>>> +					   (u64)-1 - start);
>>
>> The len parameter of lookup_extent_mapping eventually ends up in
>> range_end, meaning it will just return -1. Why not use just -1 for len?
>> Looking at the rest of the code this seems to be the convention, but
>> then there are several places where 1 is passed as well. Hm, in any
>> case a single number is simpler than an expression.
>
> I'd still like to be accurate here: since the parameter is @len, we
> should follow its naming.
> Although we have range_end() to correct careless callers, it still
> doesn't sound good to just pass -1 as @len.
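
For reference, the overflow handling mentioned above lives in range_end()
in fs/btrfs/extent_map.c; paraphrased from memory it is roughly the
following, which is why both (u64)-1 and (u64)-1 - start end up clamped
to the same effective search end:

/*
 * Paraphrased sketch of range_end(): clamp start + len to U64_MAX on
 * overflow, so an over-large @len is still handled safely.
 */
static u64 range_end(u64 start, u64 len)
{
	if (start + len < start)
		return (u64)-1;
	return start + len;
}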
>
>>
>>> +		read_unlock(&map_tree->map_tree.lock);
>>> +		if (!em)
>>> +			break;
>>> +
>>> +		bg = btrfs_lookup_block_group(fs_info, em->start);
>>> +		if (!bg) {
>>> +			btrfs_err_rl(fs_info,
>>> +	"chunk start=%llu len=%llu doesn't have corresponding block group",
>>> +				     em->start, em->len);
>>> +			ret = -ENOENT;
>>> +			free_extent_map(em);
>>> +			break;
>>> +		}
>>> +		if (bg->key.objectid != em->start ||
>>> +		    bg->key.offset != em->len ||
>>> +		    (bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK) !=
>>> +		    (em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
>>> +			btrfs_err_rl(fs_info,
>>> +"chunk start=%llu len=%llu flags=0x%llx doesn't match with block group start=%llu len=%llu flags=0x%llx",
>>> +				     em->start, em->len,
>>> +				     em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK,
>>> +				     bg->key.objectid, bg->key.offset,
>>> +				     bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
>>> +			ret = -EUCLEAN;
>>> +			free_extent_map(em);
>>> +			btrfs_put_block_group(bg);
>>> +			break;
>>> +		}
>>> +		start = em->start + em->len;
>>> +		free_extent_map(em);
>>> +		btrfs_put_block_group(bg);
>>> +	}
>>> +	return ret;
>>> +}
>>> +
>>>  int btrfs_read_block_groups(struct btrfs_fs_info *info)
>>>  {
>>>  	struct btrfs_path *path;
>>> @@ -10227,7 +10277,7 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
>>>
>>>  	btrfs_add_raid_kobjects(info);
>>>  	init_global_block_rsv(info);
>>> -	ret = 0;
>>> +	ret = check_chunk_block_group_mappings(info);
>>
>> Rather than doing that, can we just get the count of chunks? Then, if
>> we have as many chunks as block groups have been read in, and we know
>> the BG -> chunk mapping check has passed, we can assume that chunks
>> also map to BGs without going through all chunks.
>
> Nope, just as the checks in that function show, we must ensure not only
> that the number of bgs/chunks matches, but that *each* chunk has a
> block group with the same size, length and type flags.

Thanks to Gu's comment, in find_first_block_group() we have already done
an extra check to ensure every block group we're adding has a
corresponding chunk, so just comparing the chunk and block group counts
should be enough to detect missing block groups. (A rough sketch of the
counting idea is at the end of this mail.)

I'll try this method in the next version to avoid the unnecessary block
group lookups.

Thanks,
Qu

>
> If we have a block group that doesn't match its size/length, it's quite
> possible that the corrupted block group overlaps with other block
> groups, causing undefined behavior.
> The same goes for type flags.
>
> This means the only reliable check is the one used in this and the
> previous patch.
> (Check that bg -> chunk matches, and then check that chunk -> bg
> matches, using size + len + type flags as the criteria.)
>
> Thanks,
> Qu
>
>>
>>>  error:
>>>  	btrfs_free_path(path);
>>>  	return ret;
>>>
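
For reference, a minimal sketch of the counting approach mentioned above,
assuming the bg -> chunk check in find_first_block_group() has already
validated start/len/type for every block group that was read in. The
helper name check_chunk_block_group_count() and the bg_count parameter
are only illustrative, not the final patch:

/*
 * Illustrative sketch only: count chunks in the mapping tree and compare
 * with the number of block groups btrfs_read_block_groups() added.
 * Relies on the earlier bg -> chunk check, so a matching count implies
 * no chunk is missing its block group, without calling
 * btrfs_lookup_block_group() for every chunk.
 */
static int check_chunk_block_group_count(struct btrfs_fs_info *fs_info,
					 u64 bg_count)
{
	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
	struct extent_map *em;
	u64 chunk_count = 0;
	u64 start = 0;

	while (1) {
		read_lock(&map_tree->map_tree.lock);
		em = lookup_extent_mapping(&map_tree->map_tree, start,
					   (u64)-1 - start);
		read_unlock(&map_tree->map_tree.lock);
		if (!em)
			break;
		chunk_count++;
		start = em->start + em->len;
		free_extent_map(em);
	}
	if (chunk_count != bg_count) {
		btrfs_err(fs_info,
	"chunk count %llu doesn't match block group count %llu",
			  chunk_count, bg_count);
		return -EUCLEAN;
	}
	return 0;
}

The caller would have to count block groups while reading them in
btrfs_read_block_groups() and pass that number in; how exactly that
counter is kept is an implementation detail for the next version.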
