On Sat, Jun 04, 2016 at 06:26:39PM +0800, Qu Wenruo wrote:
>
>
> On 06/03/2016 10:27 PM, Josef Bacik wrote:
> >On 06/01/2016 09:12 PM, Qu Wenruo wrote:
> >>
> >>
> >>At 06/02/2016 06:08 AM, Mark Fasheh wrote:
> >>>On Fri, Apr 01, 2016 at 02:35:00PM +0800, Qu Wenruo wrote:
> >>>>Core implement for inband de-duplication.
> >>>>It reuse the async_cow_start() facility to do the calculate dedupe
> >>>>hash.
> >>>>And use dedupe hash to do inband de-duplication at extent level.
> >>>>
> >>>>The work flow is as below:
> >>>>1) Run delalloc range for an inode
> >>>>2) Calculate hash for the delalloc range at the unit of dedupe_bs
> >>>>3) For hash match(duplicated) case, just increase source extent ref
> >>>> and insert file extent.
> >>>> For hash mismatch case, go through the normal cow_file_range()
> >>>> fallback, and add hash into dedupe_tree.
> >>>> Compress for hash miss case is not supported yet.
> >>>>
> >>>>Current implement restore all dedupe hash in memory rb-tree, with LRU
> >>>>behavior to control the limit.
> >>>>
> >>>>Signed-off-by: Wang Xiaoguang <wangxg.fnst@xxxxxxxxxxxxxx>
> >>>>Signed-off-by: Qu Wenruo <quwenruo@xxxxxxxxxxxxxx>
> >>>>---
> >>>> fs/btrfs/extent-tree.c | 18 ++++
> >>>> fs/btrfs/inode.c | 235
> >>>>++++++++++++++++++++++++++++++++++++++++++-------
> >>>> fs/btrfs/relocation.c | 16 ++++
> >>>> 3 files changed, 236 insertions(+), 33 deletions(-)
> >>>>
> >>>>diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >>>>index 53e1297..dabd721 100644
> >>>>--- a/fs/btrfs/extent-tree.c
> >>>>+++ b/fs/btrfs/extent-tree.c
> >>>>@@ -37,6 +37,7 @@
> >>>> #include "math.h"
> >>>> #include "sysfs.h"
> >>>> #include "qgroup.h"
> >>>>+#include "dedupe.h"
> >>>>
> >>>> #undef SCRAMBLE_DELAYED_REFS
> >>>>
> >>>>@@ -2399,6 +2400,8 @@ static int run_one_delayed_ref(struct
> >>>>btrfs_trans_handle *trans,
> >>>>
> >>>> if (btrfs_delayed_ref_is_head(node)) {
> >>>> struct btrfs_delayed_ref_head *head;
> >>>>+ struct btrfs_fs_info *fs_info = root->fs_info;
> >>>>+
> >>>> /*
> >>>> * we've hit the end of the chain and we were supposed
> >>>> * to insert this extent into the tree. But, it got
> >>>>@@ -2413,6 +2416,15 @@ static int run_one_delayed_ref(struct
> >>>>btrfs_trans_handle *trans,
> >>>> btrfs_pin_extent(root, node->bytenr,
> >>>> node->num_bytes, 1);
> >>>> if (head->is_data) {
> >>>>+ /*
> >>>>+ * If insert_reserved is given, it means
> >>>>+ * a new extent is revered, then deleted
> >>>>+ * in one tran, and inc/dec get merged to 0.
> >>>>+ *
> >>>>+ * In this case, we need to remove its dedup
> >>>>+ * hash.
> >>>>+ */
> >>>>+ btrfs_dedupe_del(trans, fs_info, node->bytenr);
> >>>> ret = btrfs_del_csums(trans, root,
> >>>> node->bytenr,
> >>>> node->num_bytes);
> >>>>@@ -6713,6 +6725,12 @@ static int __btrfs_free_extent(struct
> >>>>btrfs_trans_handle *trans,
> >>>> btrfs_release_path(path);
> >>>>
> >>>> if (is_data) {
> >>>>+ ret = btrfs_dedupe_del(trans, info, bytenr);
> >>>>+ if (ret < 0) {
> >>>>+ btrfs_abort_transaction(trans, extent_root,
> >>>>+ ret);
> >>>
> >>>I don't see why an error here should lead to a readonly fs.
> >>> --Mark
> >>>
> >>
> >>Because such deletion error can lead to corruption.
> >>
> >>For example, extent A is already in hash pool.
> >>And when freeing extent A, we need to delete its hash, of course.
> >>
> >>But if such deletion fails, which means the hash is still in the pool,
> >>even the extent A no longer exists in extent tree.
> >
> >Except if we're in in-memory mode only it doesn't matter, so don't abort
> >if we're in in-memory mode. Thanks,
> >
> >Josef
> >
>
> If we can't ensure a hash is delete along with the extent, we will
> screw up the whole fs, as new write can points to non-exist extent.
>
> Although you're right with in-memory mode here, we won't abort
> trans, as inmem_del_hash() won't return error code. It will always
> return 0.
Until a third party comes along and changes it to return an error code and
neither you or I are there to remind them to fix this check (or have simply
forgotten).
> So still, no need to change anyway.
Personally I'd call this 'defensive coding' and do a check for in-memory
only before our abort_trans(). This would have no effect on our running
code but avoids the problem I stated above. Alternatively, you could
clearly comment the exception. I don't like leaving it as-is for the reason
I stated above.
Thanks,
--Mark
--
Mark Fasheh
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html