On Sat, Apr 28, 2012 at 2:46 PM, Yuchung Cheng <ycheng@xxxxxxxxxx> wrote:
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -202,6 +202,20 @@ tcp_ecn - INTEGER
>       not support ECN, behavior is like with ECN disabled.
>       Default: 2
>
> +tcp_early_retrans - INTEGER
> +     Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold

I get a trailing-whitespace error in the documentation when I apply
this patch:

> git am /tmp/er2.patch
Applying: RFC tcp: early retransmit
.../.git/rebase-apply/patch:20: trailing whitespace.
        Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold
warning: 1 line adds whitespace errors.

Also, to keep the sysctls alphabetical, tcp_early_retrans should
probably go before tcp_ecn.

> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -365,12 +365,13 @@ struct tcp_sock {
>
>       u32     frto_highmark;  /* snd_nxt when RTO occurred */
>       u16     advmss;         /* Advertised MSS */
> -     u8      frto_counter;   /* Number of new acks after RTO */
> -     u8      nonagle     : 4,/* Disable Nagle algorithm? */
> +     u16     nonagle     : 4,/* Disable Nagle algorithm? */
>               thin_lto    : 1,/* Use linear timeouts for thin streams */
>               thin_dupack : 1,/* Fast retransmit on first dupack */
>               repair      : 1,
> -             unused      : 1;
> +             do_early_retrans: 1;/* Enable RFC5827 early-retransmit */
> +
> +     u8      frto_counter;   /* Number of new acks after RTO */
>       u8      repair_queue;

To keep the change minimal and reduce the risk of mysterious
performance regressions from cache effects, I'd suggest keeping the
frto_counter and nonagle u8 bytes as u8 bytes in their current
location, and adding a new u8 for the two ER bits. That takes the same
amount of space as the scheme in the patch, just with less shuffling.

> @@ -987,6 +989,7 @@ static void tcp_update_reordering(struct sock *sk, const int metric,
>                          tp->undo_marker ? tp->undo_retrans : 0);
>  #endif
>               tcp_disable_fack(tp);
> +             tcp_disable_early_retrans(tp);
>       }
>  }

I think we should stick with the behavior where we disable early
retransmit any time tcp_update_reordering() is called with a non-zero
reordering metric. This is what we've tested and measured, and my
sense is that we could risk a significant number of spurious ER
firings if we instead relax this so that only reordering > 3 causes us
to disable ER. I know the delayed ER should help avoid spurious ER
firings when there is a small degree of reordering, but my guess is
that max(RTT/4, 2ms) is perhaps not big enough if we're allowing
delayed ER for connections that have already witnessed small degrees
of reordering.

So until we have more experimental data, I'd recommend sticking with:

        if (metric > 0)
                tcp_disable_early_retrans(tp);

Otherwise, looks good to me.

neal
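
For illustration only, here is a minimal standalone C sketch of the
policy recommended above: any non-zero reordering metric turns ER off
for the connection, and delayed ER waits max(RTT/4, 2ms). This is not
kernel code. The struct er_state and the er_*() helpers are
hypothetical names made up for the example, and the assumption that
RTT is tracked in microseconds is mine, not the patch's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant per-connection state.
 * This is NOT struct tcp_sock; it only models what the sketch needs. */
struct er_state {
	bool     do_early_retrans; /* ER still allowed on this connection */
	uint32_t reordering;       /* largest reordering degree seen so far */
};

/* Hypothetical analogue of tcp_disable_early_retrans(). */
static inline void er_disable(struct er_state *s)
{
	s->do_early_retrans = false;
}

/* Conservative policy: a non-zero metric disables ER, rather than
 * waiting until the reordering degree exceeds 3. */
static inline void er_update_reordering(struct er_state *s, uint32_t metric)
{
	if (metric > s->reordering)
		s->reordering = metric;
	if (metric > 0)
		er_disable(s);
}

/* Delayed-ER wait of max(RTT/4, 2ms), with RTT assumed to be in
 * microseconds for this sketch. */
static inline uint32_t er_delay_us(uint32_t rtt_us)
{
	uint32_t quarter = rtt_us / 4;

	return quarter > 2000 ? quarter : 2000;
}

/* Small usage example exercising both helpers. */
static inline void er_example(void)
{
	struct er_state s = { true, 0 };

	er_update_reordering(&s, 0);   /* no reordering: ER stays enabled */
	assert(s.do_early_retrans);

	er_update_reordering(&s, 1);   /* smallest reordering: ER disabled */
	assert(!s.do_early_retrans);

	assert(er_delay_us(100000) == 25000); /* 100ms RTT: wait RTT/4 */
	assert(er_delay_us(4000) == 2000);    /* 4ms RTT: the 2ms floor wins */
}
```

With the 2ms floor, any connection with an RTT below 8ms gets the same
fixed 2ms wait, which is part of why the delay alone may not be enough
protection on paths that have already shown reordering.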