Re: [RFC PATCH v2 1/3] Btrfs: add the Probabilistic Skiplist

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/11/2012 08:37 AM, David Sterba wrote:
> Hi, a few thoughts and comments below:
> 
> On Tue, Jan 10, 2012 at 03:31:34PM +0800, Liu Bo wrote:
>> c) The level limit may need to be adjusted.
>>    I know it is a magic number, but now for simplicity we just keep it at 16,
>>    and then each skiplist is able to contain (2^32-1)/3 nodes at most.
> 
> (2^32-1)/3 = 1,431,655,765 that's a lot, I wonder what an average member
> count of a skiplist would be and whether eg. maxlevel = 12 is not enough
> (5,592,405 members).
> 


hmm, sorry, I found I've made a mistake here,
let me correct it here (changelog will also be updated later):

As I set the probability to 1/4, the members linked on N+1 level list will be 1/4 of
those linked on N level list.

And what's more, in skiplist a node can be linked on multi levels, eg. a node with N+1 level will
also be linked on N level list.

So before the node count reaches to 4^(maxlevel - 1), the skiplist can maintain O(lgn),
and after that, it will be no more O(lgn) although we can still insert nodes into the skiplist.

That's the difference.


> or you can set the maxlevel during skiplist creation, or predefine a
> "small" skiplist with compile-time-set level to whatever < 16.
>
> this can be tuned later of course.
> 

Yes, I do set the maxlevel to 16 at the creation of a skiplist.

Here 4^(16 - 1) is 2^30, I don't think this is enough for some severe workloads which build large
amount of fragments.

Maybe we should make the maxlevel self-update.


>> --- /dev/null
>> +++ b/fs/btrfs/skiplist.c
>> @@ -0,0 +1,98 @@
>> +inline int sl_fill_node(struct sl_node *node, int level, gfp_t mask)
> 
> I suggest to pick the full prefix "skiplist_" instead of just "sl_",
> it'll be IMHO more readable and googlable. (Out of curiosity I grepped
> for the sl_ prefix and it's used by drivers/net/slip/slip.c).
>

I did hesitate for a while between "skiplist_" and "sl_"...
and I just wanna make it be similar to "rb_".

Anyway, I'm ok with "skiplist_".

>> +{
>> +	struct sl_node **p;
>> +	struct sl_node **q;
>> +	int num;
>> +
>> +	BUG_ON(level > MAXLEVEL);
>> +
>> +	num = level + 1;
>> +	p = kmalloc(sizeof(*p) * num, mask);
>> +	BUG_ON(!p);
> 
> you can drop the BUG_ON
> 
>> +	if (!p)
>         ^^^^^^^
> 
>> +		return -ENOMEM;
>> +	q = kmalloc(sizeof(*q) * num, mask);
>> +	BUG_ON(!q);
>         ^^^^^^^^^^
> 

ok, just in case.

>> +	if (!q) {
>> +		kfree(p);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	node->next = p;
>> +	node->prev = q;
>> +	node->level = level;
>> +	return 0;
>> +}
>> +
>> diff --git a/fs/btrfs/skiplist.h b/fs/btrfs/skiplist.h
>> new file mode 100644
>> index 0000000..3e414b5
>> --- /dev/null
>> +++ b/fs/btrfs/skiplist.h
>> +
>> +#define MAXLEVEL 16
>> +/* double p = 0.25; */
>> +
>> +struct sl_node {
>> +	struct sl_node **next;
>> +	struct sl_node **prev;
>> +	unsigned int level;
>> +	unsigned int head:1;
> 
> the bitfield will use another sizeof(int) bytes, but the level is at
> most 16, you can reduce it's size eg to unsigned short.
> on the other hand, the structure has to start at address aligned to
> sizeof(void*) and the bytes after 'head' up to next sizeof(void*)
> boundary will be left unusable anyway. then, 'head' could be a full int
> or bool so the compiler is not restricted and forced to keep state of
> the single bit. if access to these items is exptected to be frequent,
> the diffenence could be mesurable.
> 

I see.  Thanks a lot for your advice!

thanks,
liubo

>> +};
> 
> 
> david
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux