On 2020-05-10 12:55, Qu Wenruo wrote:
On 2020/5/10 6:55 PM, Steven Davies wrote:
On 2020-05-10 02:20, Qu Wenruo wrote:
Yes, I'm now stuck with a btrfs_extent_inline_ref of type
BTRFS_SHARED_DATA_REF_KEY which I understand is a direct backref to a
metadata block[1],
Yep, SHARED_DATA_REF is the direct backref type for data (it shows the
direct parent).
But there is also an indirect one (it just tells you how to search),
EXTENT_DATA_REF, and in most cases EXTENT_DATA_REF is more common.
but I don't understand how to search for that block
itself. I got lucky with the rest of the code and have found all
EXTENT_ITEM_KEYs for a file. The python library makes looking through
the EXTENT_DATA_REF_KEYs easy but not the shared data refs.
For EXTENT_DATA_REF, it contains rootid, objectid (inode number), offset
(not file offset, but a calculated one), and count.
That's pretty simple, since it contains the rootid and inode number.
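
A minimal sketch of the four fields just listed, written as a plain
Python dataclass rather than any python-btrfs type (ExtentDataRef and
data_ref_offset are made-up names for illustration). It assumes the
"calculated" offset is the file extent item's key offset minus that
item's offset into the extent, in unsigned 64-bit arithmetic, which is
how the kernel appears to compute it; treat that as an assumption to
verify rather than a specification.

```python
from dataclasses import dataclass

U64_MASK = (1 << 64) - 1


@dataclass(frozen=True)
class ExtentDataRef:
    """Illustrative stand-in for the EXTENT_DATA_REF payload."""
    root: int      # rootid of the tree that references the extent
    objectid: int  # inode number
    offset: int    # calculated offset, not the plain file offset
    count: int     # number of references described by this item


def data_ref_offset(file_offset: int, extent_offset: int) -> int:
    """Assumed calculation: key offset of the file extent item minus
    the item's offset into the extent, wrapped to 64 bits."""
    return (file_offset - extent_offset) & U64_MASK


# Hypothetical example: a file extent item at file offset 8192 that
# starts 4096 bytes into the on-disk extent.
ref = ExtentDataRef(root=5, objectid=257,
                    offset=data_ref_offset(8192, 4096), count=1)
```

With rootid and inode number in hand, the referencing file can be looked
up directly, which is why this case is the simple one.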
For SHARED_DATA_REF, you need to search for the parent bytenr in the
extent tree.
The parent's own backref can be SHARED_BLOCK_REF (direct meta ref) or
TREE_BLOCK_REF (indirect meta ref).
For TREE_BLOCK_REF, although it contains the owner, you can't stop
there; you still have to do a search to build a full path towards that
root node.
Then check each node on that path to see whether it is also shared by
other trees.
For SHARED_BLOCK_REF, you need to go to its parent again, until you
build the full path to the root node.
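
The upward walk described above can be sketched on a toy in-memory
model. Nothing here is python-btrfs API: the backrefs dict, the
SHARED/TREE tags and resolve_roots are hypothetical names, and the
bytenrs are invented. The sketch only follows direct parent refs until
it reaches blocks that carry an owner; it deliberately skips the extra
step for TREE_BLOCK_REF (searching down from that root and checking
each node on the path for sharing), so it is a simplification and not a
full backref walker.

```python
from collections import deque

SHARED = "shared"  # stands in for SHARED_DATA_REF / SHARED_BLOCK_REF
TREE = "tree"      # stands in for TREE_BLOCK_REF (carries an owner root)


def resolve_roots(start_bytenr, backrefs):
    """Follow direct parent refs upward and collect owner root ids.

    backrefs maps a bytenr to a list of (kind, value) pairs, where value
    is a parent bytenr for SHARED and a root id for TREE.
    """
    roots = set()
    seen = set()
    queue = deque([start_bytenr])
    while queue:
        bytenr = queue.popleft()
        if bytenr in seen:
            continue  # a block can be reachable via several parents
        seen.add(bytenr)
        for kind, value in backrefs.get(bytenr, ()):
            if kind == SHARED:
                queue.append(value)  # go to the parent and look again
            elif kind == TREE:
                # Simplification: a real walker cannot stop here, it
                # must still search from this root and check sharing.
                roots.add(value)
            else:
                raise ValueError(f"unknown backref kind: {kind!r}")
    return roots


# Invented layout: one data extent whose leaf is shared by two parents.
toy_backrefs = {
    1000: [(SHARED, 2000)],                  # data extent -> leaf
    2000: [(SHARED, 3000), (SHARED, 3100)],  # leaf -> two parent nodes
    3000: [(TREE, 5)],                       # node owned by root 5
    3100: [(TREE, 257)],                     # node owned by root 257
}
print(sorted(resolve_roots(1000, toy_backrefs)))
```

Even in this reduced form, every hop is another extent tree lookup,
which hints at why the real thing is costly.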
Now you can see why the backref code used in balance and qgroup is
complex.
I can. I get the feeling that this is now way beyond my abilities and I
can see why it will be very slow to run in practice - especially through
the Python abstraction. Perhaps if knorrie adds backref walking helpers
to python-btrfs it might become more feasible.
--
Steven Davies