On Wednesday, 9 November 2011 at 23:03 +0100, Eric Dumazet wrote:
> BTW, on my bnx2x adapter, even small UDP frames use more than PAGE_SIZE
> bytes :
>
>    skb->truesize=4352 len=26 (payload only)
>
> I wonder if we shouldn't increase SK_MEM_QUANTUM a bit to avoid
> ping/pong...
>
> -#define SK_MEM_QUANTUM ((int)PAGE_SIZE)
> +#define SK_MEM_QUANTUM ((int)PAGE_SIZE * 2)
>

The following patch also helps a lot, even with only two CPUs (one handling
device interrupts, one running the application thread).

[PATCH net-next] bnx2x: reduce skb truesize by ~50%

bnx2x uses the following formula to compute its rx_buf_sz:

	dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8

The core network stack then adds NET_SKB_PAD and
SKB_DATA_ALIGN(sizeof(struct skb_shared_info)).

The final allocated size for the skb head on x86_64 (L1_CACHE_BYTES = 64,
MTU=1500) is 2112 bytes, which SLUB/SLAB round up to 4096 bytes.

Since skb truesize is then bigger than SK_MEM_QUANTUM, we get a lot of
false sharing because of mem_reclaim in the UDP stack.

One possible way to halve truesize is to lower the need by 64 bytes
(2112 -> 2048 bytes).

This way, skb->truesize is lower than SK_MEM_QUANTUM and we get better
performance.

(760,000 pps on a single-threaded rx UDP benchmark, instead of 720,000 pps)

Signed-off-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
CC: Eilon Greenstein <eilong@xxxxxxxxxxxx>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index aec7212..ebbdc55 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1185,9 +1185,14 @@ struct bnx2x {
 #define ETH_MAX_PACKET_SIZE		1500
 #define ETH_MAX_JUMBO_PACKET_SIZE	9600
 
-	/* Max supported alignment is 256 (8 shift) */
-#define BNX2X_RX_ALIGN_SHIFT		((L1_CACHE_SHIFT < 8) ? \
-					 L1_CACHE_SHIFT : 8)
+/* Max supported alignment is 256 (8 shift)
+ * It should ideally be min(L1_CACHE_SHIFT, 8)
+ * Choosing 5 (32 bytes) permits to get skb heads of 2048 bytes
+ * instead of 4096 bytes.
+ * With SLUB/SLAB allocators, data will be cache line aligned anyway.
+ */
+#define BNX2X_RX_ALIGN_SHIFT 5
+
 	/* FW use 2 Cache lines Alignment for start packet and size  */
 #define BNX2X_FW_RX_ALIGN		(2 << BNX2X_RX_ALIGN_SHIFT)
 #define BNX2X_PXP_DRAM_ALIGN		(BNX2X_RX_ALIGN_SHIFT - 5)