Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wednesday, 15 December, 2010, Chris Mason wrote:
> Excerpts from Li Zefan's message of 2010-12-14 22:33:33 -0500:
> > > Suppose to have the following sequence keys  [objectid, type, offset]:
> > > 
> > > [...]
> > > 1)    [300, BTRFS_EXTENT_DATA_KEY, xx]
> > > 2)    [300, BTRFS_INODE_ITEM_KEY, xx]
> > > 3)    [300, BTRFS_XATTR_ITEM_KEY, xx]
> > > 4)    [301, BTRFS_EXTENT_DATA_KEY, xx]
> > > 5)    [301, BTRFS_INODE_ITEM_KEY, xx]
> > > 7)    [30200, BTRFS_EXTENT_DATA_KEY, xx]
> > > 8)    [30200, BTRFS_INODE_ITEM_KEY, xx]
> > > 9)    [30200, BTRFS_XATTR_ITEM_KEY, xx]
> > > [...]
> > > 
> > > 
> > > Suppose that the buffer is filled between the item 2 and 3. We should 
restart 
> > > the search, but how set the min_* key ? Try the following hypothesis
> > > 
> > > h1) objectid++, type = 0 -> In the next search the key 3 would be 
skipped
> > > h2) objectid asis, type ++, -> in the next search the key 4 would be 
skipped
> > > h3) objectid asis, type = 0 -> in the next search the key 1,2,3 would be 
> > 
> > h4) objectid asis, type asis, offset++ -> we should get the correct 
result.
> 
> This is the right answer ;).  The problem is that even though our key has
> 3 distinct parts, and the API makes it look like you have very fine
> grained control over those three parts, you have to remember to reset
> them as you iterate between objectids.  It isn't a obvious as it should
> be.
> 
> The current API is a very raw export of how we do the searches in the
> kernel too.  You can do pretty much anything with it, but we pay with
> complexity.

Hi Chris,

I am a bit confused about your answer.

The actual API is a bit confused (or almost not "obvious"). An application in 
order to work properly has to make some adjustment to the min_* fields AND 
filter the results (because if we tweak with the min_* field, not useful data 
is returned). Moreover this  means that we move between user-space<->kernel-
space a lot of unused data (un-necessary context switch).

On the basis of your answer, it seems that this is ok (please don't consider 
only the case of listing the subvolumes which is a very simple cases). And 
nothing have to do.

Instead I suggest to (see my email [PATCH] BTRFS_IOC_TREE_SEARCH: store and 
use the last key found):
- leave the min_* and max_* fields to act only as filter
- add three more fields start_* as start point for the search 
- make some small modification to the kernel code to track in the start_* 
fields the last key found

pro:
- we pass to userspace only useful data
- we simplify a lot the userspace application, because they don't have to 
update the min_* fields (they will work in a obvious way :-) )
- we can use the ordered property of a btree structure to perform efficient 
data lookup (if we reset to zero the min_* fields we lookup un-necessary data)
- we have the same functionality of the old API

cons:
- we need a modification ( :-) )

May be that I missed some point, but I don't see any advantage to continue to 
support the actual API. Of course, that doesn't means that we can remove the 
old API ignoring the backward compatibility. 
But I think that there are sufficient pros to develop a new API

Please be patient: my english is very bad; I am not trying to blame anybody; I 
want only a "perfect fs" (TM) :-)

> 
> -chris
> 
> 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@xxxxxxxxx>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux