Re: [PATCH] Btrfs: fix bad extent logging

Hello Josef,

> A user sent me a btrfs-image of a file system that was panicking on mount during
> the log recovery.  I had originally thought these problems were from a bug in
> the free space cache code, but that was just a symptom of the problem.  The
> problem is if your application does something like this
> 
> [prealloc][prealloc][prealloc]
> 
> the internal extent maps will merge those all together into one extent map, even
> though on disk they are 3 separate extents.  So if you go to write into one of
> these ranges the extent map will be right since we use the physical extent when
> doing the write, but when we log the extents they will use the wrong sizes for
> the remainder prealloc space.  If this doesn't happen to trip up the free space
> cache (which it won't in a lot of cases) then you will get bogus entries in your
> extent tree which will screw stuff up later.  The data and such will still work,
> but everything else is broken.  This patch fixes this by not allowing extents
> that are on the modified list to be merged.  This has the side effect that we
> are no longer adding everything to the modified list all the time, which means
> we now have to call btrfs_drop_extents every time we log an extent into the
> tree.  So this allows me to drop all this speciality code I was using to get
> around calling btrfs_drop_extents.  With this patch the testcase I've created no
> longer creates a bogus file system after replaying the log.  Thanks,
> 
> Signed-off-by: Josef Bacik <jbacik@xxxxxxxxxxxx>
>  
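
To check that I read the description correctly, here is a rough userspace sketch of the rule "do not merge extent maps that are on the modified list". Everything in it is an assumption made for illustration: struct toy_em, its fields, and the toy_* helpers are hypothetical and are not the kernel's struct extent_map or the code in this patch. It also assumes that two maps are otherwise mergeable only when they are contiguous both logically and physically.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an extent map; NOT the kernel's struct extent_map. */
struct toy_em {
	uint64_t start;        /* logical offset in the file */
	uint64_t len;          /* length in bytes */
	uint64_t block_start;  /* physical start on disk */
	bool modified;         /* queued on the modified list, i.e. still to be logged */
};

/*
 * Two neighbouring maps may be merged only if neither is waiting to be
 * logged and they are contiguous both logically and physically.
 */
static bool toy_mergeable(const struct toy_em *prev, const struct toy_em *next)
{
	if (prev->modified || next->modified)
		return false;
	return prev->start + prev->len == next->start &&
	       prev->block_start + prev->len == next->block_start;
}

/* Fold next into prev when allowed; returns true if a merge happened. */
static bool toy_try_merge(struct toy_em *prev, const struct toy_em *next)
{
	if (!toy_mergeable(prev, next))
		return false;
	prev->len += next->len;
	return true;
}
```

With this rule a map that is still on the modified list keeps its own start and length, so whatever is logged for it matches the single on-disk extent instead of the merged range.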

<snip>
> 			while (1) {
> 				write_lock(&em_tree->lock);
> -				err = add_extent_mapping(em_tree, hole_em);
> -				if (!err)
> -					list_move(&hole_em->list,
> -						  &em_tree->modified_extents);
> +				err = add_extent_mapping(em_tree, hole_em, 1);
> 				write_unlock(&em_tree->lock);
> 				if (err != -EEXIST)
> 					break;
> @@ -5989,7 +5977,8 @@ static int merge_extent_mapping(struct extent_map_tree *em_tree,
> 		em->block_start += start_diff;
> 		em->block_len -= start_diff;
> 	}
> -	return add_extent_mapping(em_tree, em);
> +	printk(KERN_ERR "merging here for %Lu\n", em->orig_start);

	How about using something like pr_debug() here instead?
	When I tested btrfs-next, I found that this message was printed far too often.
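
	Something along these lines is what I have in mind. This is only a
	userspace sketch so that it can be compiled on its own: the pr_debug()
	macro below is a stand-in that imitates the kernel one (silent unless
	DEBUG is defined), and toy_note_merge() is a hypothetical helper, not a
	function from the patch.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's pr_debug(): the message is emitted
 * only when DEBUG is defined, but the arguments are still type-checked.
 */
#ifdef DEBUG
#define pr_debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define pr_debug(...) do { if (0) fprintf(stderr, __VA_ARGS__); } while (0)
#endif

/*
 * Hypothetical helper standing in for the spot in merge_extent_mapping()
 * where the patch calls printk(KERN_ERR ...). In the kernel the line
 * would read roughly:
 *
 *	pr_debug("merging here for %llu\n", em->orig_start);
 */
static uint64_t toy_note_merge(uint64_t orig_start)
{
	pr_debug("merging here for %llu\n", (unsigned long long)orig_start);
	return orig_start;
}
```

	In the kernel, pr_debug() output stays off unless DEBUG is defined for
	the file or dynamic debug enables it, so normal runs would no longer
	fill the log with this message.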


Thanks,
Wang
	
> +	return add_extent_mapping(em_tree, em, 0);
> }
> 
> static noinline int uncompress_inline(struct btrfs_path *path,
> @@ -6283,7 +6272,7 @@ insert:
> 
> 	err = 0;
> 	write_lock(&em_tree->lock);
> -	ret = add_extent_mapping(em_tree, em);
> +	ret = add_extent_mapping(em_tree, em, 0);
> 	/* it is possible that someone inserted the extent into the tree
> 	 * while we had the lock dropped.  It is also possible that
> 	 * an overlapping map exists in the tree
> @@ -6706,10 +6695,7 @@ static struct extent_map *create_pinned_em(struct inode *inode, u64 start,
> 		btrfs_drop_extent_cache(inode, em->start,
> 				em->start + em->len - 1, 0);
> 		write_lock(&em_tree->lock);
> -		ret = add_extent_mapping(em_tree, em);
> -		if (!ret)
> -			list_move(&em->list,
> -				  &em_tree->modified_extents);
> +		ret = add_extent_mapping(em_tree, em, 1);
> 		write_unlock(&em_tree->lock);
> 	} while (ret == -EEXIST);

<snip>
