On 11/20/2013 10:56 AM, Andrea Mazzoleni wrote:
> Hi,
>
> Yep. At present, to multiply by 2^-1 I'm using in C:
>
> static inline uint64_t d2_64(uint64_t v)
> {
> 	uint64_t mask = v & 0x0101010101010101U;
> 	mask = (mask << 8) - mask;
> 	v = (v >> 1) & 0x7f7f7f7f7f7f7f7fU;
> 	v ^= mask & 0x8e8e8e8e8e8e8e8eU;
> 	return v;
> }
>
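The quoted routine can be checked against a plain byte-at-a-time version. What follows is only a sketch: it assumes the field polynomial is 0x11d (inferred from the 0x8e constant, which is 0x11d shifted right by one), and gf_mul2(), gf_div2() and d2_64_selfcheck() are hypothetical names that do not appear in the original code. The routine itself is copied here, with comments added, so that the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* The quoted routine, reproduced so this check is self-contained. */
static inline uint64_t d2_64(uint64_t v)
{
	uint64_t mask = v & 0x0101010101010101ULL; /* low bit of each byte */
	mask = (mask << 8) - mask;                  /* 0x01 -> 0xff in each byte */
	v = (v >> 1) & 0x7f7f7f7f7f7f7f7fULL;       /* shift right within bytes */
	v ^= mask & 0x8e8e8e8e8e8e8e8eULL;          /* reduce bytes that were odd */
	return v;
}

/* Hypothetical helper: multiply one byte by 2, assuming polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
	return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/* Hypothetical helper: multiply one byte by 2^-1 (0x8e == 0x11d >> 1). */
static uint8_t gf_div2(uint8_t a)
{
	return (uint8_t)((a >> 1) ^ ((a & 1) ? 0x8e : 0));
}

/* Compare d2_64() with the scalar helpers, using different bytes per lane. */
static void d2_64_selfcheck(void)
{
	for (unsigned a = 0; a < 256; a++) {
		uint64_t v = 0;
		for (int lane = 0; lane < 8; lane++)
			v |= (uint64_t)((a + 37u * lane) & 0xff) << (8 * lane);

		uint64_t r = d2_64(v);
		for (int lane = 0; lane < 8; lane++) {
			uint8_t in = (uint8_t)(v >> (8 * lane));
			uint8_t out = (uint8_t)(r >> (8 * lane));
			assert(out == gf_div2(in)); /* matches the scalar version */
			assert(gf_mul2(out) == in); /* and really is the inverse of *2 */
		}
	}
}
```

Calling d2_64_selfcheck() from a test program exercises every byte value in every lane. The per-lane offset is there so that a carry or borrow leaking between neighbouring bytes would show up as a mismatch.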
> and for SSE2:
>
> asm volatile("movdqa %xmm2,%xmm4");
> asm volatile("pxor %xmm5,%xmm5");
> asm volatile("psllw $7,%xmm4");
> asm volatile("psrlw $1,%xmm2");
> asm volatile("pcmpgtb %xmm4,%xmm5");
> asm volatile("pand %xmm6,%xmm2"); /* xmm6 == 7f7f7f7f7f7f... */
> asm volatile("pand %xmm3,%xmm5"); /* xmm3 == 8e8e8e8e8e... */
> asm volatile("pxor %xmm5,%xmm2");
>
> where xmm2 is the input/output
>
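For comparison, the same instruction sequence can be written with SSE2 intrinsics. This is only a sketch, not the code under discussion: d2_128() and d2_128_selfcheck() are hypothetical names, the two constants are built inline rather than preloaded into xmm6/xmm3, and a scalar fallback is included so the block still compiles where SSE2 is unavailable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical wrapper: multiply 16 GF(2^8) bytes by 2^-1 in place. */
static void d2_128(uint8_t buf[16])
{
#if defined(__SSE2__)
	__m128i v = _mm_loadu_si128((const __m128i *)buf);
	/* psllw $7: bit 0 of every byte lands in that byte's sign bit */
	__m128i t = _mm_slli_epi16(v, 7);
	/* pcmpgtb against zero: 0xff in each byte whose sign bit is set */
	__m128i m = _mm_cmpgt_epi8(_mm_setzero_si128(), t);
	/* psrlw $1 then pand 0x7f: per-byte shift right by one */
	v = _mm_srli_epi16(v, 1);
	v = _mm_and_si128(v, _mm_set1_epi8(0x7f));
	/* pand 0x8e then pxor: reduce the bytes that were odd */
	m = _mm_and_si128(m, _mm_set1_epi8((char)0x8e));
	v = _mm_xor_si128(v, m);
	_mm_storeu_si128((__m128i *)buf, v);
#else
	/* Scalar fallback with the same result, for builds without SSE2. */
	for (int i = 0; i < 16; i++)
		buf[i] = (uint8_t)((buf[i] >> 1) ^ ((buf[i] & 1) ? 0x8e : 0));
#endif
}

/* Compare d2_128() with the one-byte formula for varied inputs. */
static void d2_128_selfcheck(void)
{
	for (unsigned a = 0; a < 256; a++) {
		uint8_t in[16], out[16];
		for (int i = 0; i < 16; i++)
			in[i] = (uint8_t)(a + 37u * i);
		memcpy(out, in, sizeof(out));
		d2_128(out);
		for (int i = 0; i < 16; i++)
			assert(out[i] ==
			       (uint8_t)((in[i] >> 1) ^ ((in[i] & 1) ? 0x8e : 0)));
	}
}
```

The 16-bit shift by 7 drags neighbouring bits across byte boundaries, but only the sign bit of each byte is examined by the compare, so the stray bits do no harm.
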
Now, that doesn't sound like something that can get neatly meshed into
the Cauchy matrix scheme, I assume. It is somewhat nice to have a
scheme which is arbitrarily expandable without having to fall back to
dual parity during the restripe operation. It probably also reduces the
amount of code necessary.
-hpa