-------- Original Message --------
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
From: Zygo Blaxell <zblaxell@xxxxxxxxxxxxxxx>
To: Qu Wenruo <quwenruo@xxxxxxxxxxxxxx>
Date: 2014-12-11 05:57
On Thu, Dec 04, 2014 at 02:56:55PM +0800, Qu Wenruo wrote:
The main memory consumer in btrfsck is the extent records, which
we can't free until we have read them all in and checked them, so even
if we mmap/unmap, it can only help with
the extent_buffer (which is already freed when unused, according to its refs).
I'm thinking aloud here, but is it *really* necessary to read everything
into memory?
Totally agree that we should only read what we need.
But some backrefs and ref counts can only be determined after a full
scan, especially in the leaf/node corruption
case.
Maybe a multiple-pass algorithm might be possible, e.g. one
to find free space by eliminating any areas that are occupied by extents,
then other passes to rebuild the metadata in the free space. Or, one
pass to verify the connectivity of references and collect dangling refs,
then a second pass which fixes only the dangling refs.
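
A minimal sketch of the second variant (collect dangling refs in one pass, fix only those in a second pass). The data model here is an assumption made for illustration, not btrfsck code: items are plain integer ids, a ref is a (source id, target id) pair, and `scan_ids`, `scan_refs` and `fix` are hypothetical callbacks that stream from disk. Note that the set of known ids still has to fit in memory, so this only helps if ids are much smaller than full records.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical model: (id of the referencing item, id of the referenced item).
Ref = Tuple[int, int]


def find_dangling(scan_ids: Callable[[], Iterable[int]],
                  scan_refs: Callable[[], Iterable[Ref]]) -> List[Ref]:
    """Pass 1: stream all ids, then stream all refs, keeping only bad refs."""
    known = set(scan_ids())  # only ids are held in memory, not full records
    return [ref for ref in scan_refs() if ref[1] not in known]


def repair(dangling: Iterable[Ref], fix: Callable[[Ref], None]) -> int:
    """Pass 2: touch only the refs that pass 1 flagged. Returns the count."""
    count = 0
    for ref in dangling:
        fix(ref)
        count += 1
    return count


if __name__ == "__main__":
    ids = [1, 2, 3]
    refs = [(1, 2), (2, 3), (3, 9)]  # item 9 does not exist
    bad = find_dangling(lambda: iter(ids), lambda: iter(refs))
    repair(bad, lambda ref: print("would fix", ref))
```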
I have a similar idea, though not a multi-pass method: instead, a
per-sector scan plus tree search for the other data.
E.g. in the extent tree check, record only the extents of one block
group at a time and check them.
After the check, remove the good extents/block groups and move on to the
next block group.
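
A rough sketch of that per-block-group loop, under stated assumptions: `ExtentRecord`, `load_extents` and the health test (stored ref count equals the number of backrefs found) are hypothetical stand-ins, not btrfsck structures. In a real check, finding the backrefs would need the tree search mentioned above, which is where the extra IO comes from.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ExtentRecord:
    """Hypothetical, simplified extent record."""
    start: int
    length: int
    refs_in_item: int  # ref count stored in the extent item
    refs_found: int    # backrefs actually found for this extent


def is_healthy(rec: ExtentRecord) -> bool:
    # Stand-in check: stored count must match what was found.
    return rec.refs_in_item == rec.refs_found


def check_by_block_group(
    block_groups: Iterable[int],
    load_extents: Callable[[int], Iterable[ExtentRecord]],
) -> List[ExtentRecord]:
    """Check one block group at a time and keep only the suspect records."""
    suspects: List[ExtentRecord] = []
    for bg in block_groups:
        records = list(load_extents(bg))  # only this group's extents
        suspects.extend(r for r in records if not is_healthy(r))
        # Good records are dropped here, before the next group is loaded.
    return suspects
```

Peak memory is then roughly one block group's records plus whatever is still suspect, instead of every extent record in the filesystem.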
For the fs tree, treat all keys with the same objectid (ino) as a group,
read only one group at a time, and remove
the records already known to be healthy. (Records whose info is not fully
gathered, and bad records, will still stay in memory.)
But I don't think this method can really save much memory, though...
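
The objectid grouping described above could look roughly like this. It relies on btrfs keys being sorted by (objectid, type, offset), so all keys of one inode are adjacent in a tree walk. The tuple layout, the string type names and the `group_is_healthy` callback are assumptions for illustration only.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical simplified key: (objectid, type name, offset).
Key = Tuple[int, str, int]


def check_fs_tree(
    keys: Iterable[Key],
    group_is_healthy: Callable[[List[Key]], bool],
) -> Dict[int, List[Key]]:
    """Walk keys in tree order, one inode group at a time.

    Healthy groups are dropped immediately; groups that are bad or not
    yet fully understood are kept, keyed by objectid.
    """
    pending: Dict[int, List[Key]] = {}
    for ino, group in groupby(keys, key=itemgetter(0)):
        items = list(group)  # only this inode's keys are materialized
        if not group_is_healthy(items):
            pending[ino] = items
    return pending


def has_inode_item(items: List[Key]) -> bool:
    # Stand-in health test: the group must contain an INODE_ITEM.
    return any(key_type == "INODE_ITEM" for _, key_type, _ in items)
```

How much this saves depends on how many groups end up in `pending`, which matches the doubt above: records that cannot be resolved locally still accumulate.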
Usually sequential reads are significantly faster than swapping--even
if swapping on solid-state media. It could be that reading 260GB of
metadata sequentially two or three times is still faster than thrashing
through random lookups in 20GB of swap on a 4GB machine.
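
To make that comparison concrete, here is a back-of-the-envelope sketch. The throughput and IOPS values are parameters the reader has to supply, and the figures in the example call are assumptions, not measurements of any real device. It computes how long N sequential passes take and how many random swap-ins would take the same time.

```python
def sequential_scan_seconds(metadata_bytes: int, passes: int,
                            seq_bytes_per_sec: float) -> float:
    """Time to read all metadata sequentially `passes` times."""
    return passes * metadata_bytes / seq_bytes_per_sec


def break_even_faults(metadata_bytes: int, passes: int,
                      seq_bytes_per_sec: float, random_iops: float) -> float:
    """Number of random swap-ins that cost as much time as the scans.

    If the single-pass approach is expected to page-fault more often
    than this, the multi-pass sequential approach should win.
    """
    seconds = sequential_scan_seconds(metadata_bytes, passes,
                                      seq_bytes_per_sec)
    return seconds * random_iops


if __name__ == "__main__":
    # Assumed figures: 260 GB of metadata, 3 passes, 130 MB/s sequential
    # reads, 100 random IOPS for swap. Replace with measured values.
    print(sequential_scan_seconds(260 * 10**9, 3, 130 * 10**6))
    print(break_even_faults(260 * 10**9, 3, 130 * 10**6, 100))
```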
Definitely, but if we want to reduce memory usage, more disk IO,
especially random disk IO, is almost
unavoidable, so it becomes a tradeoff, which may make the already slow
fsck even slower....
Thanks,
Qu