On Tue, Jan 10, 2012 at 03:31:34PM +0800, Liu Bo wrote:
> +static inline int generate_node_level(u64 randseed)
> +{
> + int level = 0;
> +
> + while (randseed && !(randseed & 3)) {
> + randseed >>= 2;
> + level++;
> + }
> +
> + return (level > MAXLEVEL ? MAXLEVEL : level);
> +}
This is counting number of trailing zeros * 2 (except when randseed ==
0), there's a gcc builtin for it __builtin_ctzll and you can turn it in
a loopless inlinable function:
static inline int generate_node_level(u64 randseed)
{
return randseed == 0 ? 0 : __builtin_ctzll(randseed) >> 1
}
the builtin should be safe on all arches without the need of libgcc
support, there seem to be handcoded asm statements for each arch.
microbenchmarkg of builtin vs while-counter showed 2.3x speedup:
builtin: 1.866529 ns/loop
while: 4.265664 ns/loop
(1<<32 loops, on a generic intel x86_64 box)
and if MAXLEVEL is <= 16, then you can generate just 4 random bytes and
compute the level in the same way without any loss.
david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html