Re: linux-3.0.x regression with ipv4 routes having mtu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

On Thu, Dec 15, 2011 at 02:49:57PM +0100, Steffen Klassert wrote:
> On Wed, Dec 14, 2011 at 12:50:10PM -0500, David Miller wrote:
> > From: Timo Teräs <timo.teras@xxxxxx>
> > Date: Wed, 14 Dec 2011 17:54:16 +0200
> > 
> > > So something is does not get updated here. This used to work though.
> > > The current production boxes where I know this work is 2.6.38.8.
> > 
> > We store the PMTU externally in inetpeer entries, Eric Dumazet
> > fixed recently but that fix hasn't been submitted to -stable
> > yet.
> > 
> 
> I think we still have at least two problems with pmtu handling.
> One is that "ip route flush cache" flushes the routing cache
> but not the cached metrics on the inetpeer.

"ip route flush cache" does not even flush the routing cache,
it just markes the routing cache as invalid by changing the
rt_genid which make the old routes invisible. And since we don't
have rt_check_expire() anymore, we have to wait until we have
collected gc_thresh (1024) useless routes before rt_garbage_collect()
starts to remove some of them (btw. is this intentional).

I think we need a trigger in rt_cache_invalidate() that expires
all cached pmtu values similar to the routing cache entries.
For a moment I thought we could just reset the __rt_peer_genid
value back to zero to mark all cached pmtu values as expired,
but that's apparently not save to do. So I came to no conclusion
how to fix this today. Any ideas?

> 
> Another problem might be that we ignore the user configured
> fib_metrics in rt_init_metrics() if we have a cached metric
> on the inetpeer.
> 

I think this could be fixed with something like the patch
below. With this it should be possible to overwrite cached
values from userspace. It has not much testing, so it's
just for review at the monent.

---------
Subject: [PATCH] route: Initialize with the fib_metrics in the non default case

We initialize the routing metrics with the cached values in
rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics. So
initialize the routing metrics with the fib_metrics if they
are different from dst_default_metrics.

Signed-off-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
---
 net/ipv4/route.c |   28 ++++++++++++++++++++--------
 1 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f30112f..26a6249 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1841,6 +1841,22 @@ static unsigned int ipv4_mtu(const struct dst_entry *dst)
 	return mtu;
 }
 
+static void __rt_init_metrics(struct rtable *rt, struct fib_info *fi,
+			      struct inet_peer *peer)
+{
+	if (peer && fi->fib_metrics == (u32 *) dst_default_metrics) {
+		dst_init_metrics(&rt->dst, peer->metrics, false);
+		return;
+	}
+
+	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
+		rt->fi = fi;
+		atomic_inc(&fi->fib_clntref);
+	}
+
+	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
+}
+
 static void rt_init_metrics(struct rtable *rt, const struct flowi4 *fl4,
 			    struct fib_info *fi)
 {
@@ -1859,7 +1875,8 @@ static void rt_init_metrics(struct rtable *rt, const struct flowi4 *fl4,
 		if (inet_metrics_new(peer))
 			memcpy(peer->metrics, fi->fib_metrics,
 			       sizeof(u32) * RTAX_MAX);
-		dst_init_metrics(&rt->dst, peer->metrics, false);
+
+		__rt_init_metrics(rt, fi, peer);
 
 		check_peer_pmtu(&rt->dst, peer);
 		if (peer->redirect_genid != redirect_genid)
@@ -1869,13 +1886,8 @@ static void rt_init_metrics(struct rtable *rt, const struct flowi4 *fl4,
 			rt->rt_gateway = peer->redirect_learned.a4;
 			rt->rt_flags |= RTCF_REDIRECTED;
 		}
-	} else {
-		if (fi->fib_metrics != (u32 *) dst_default_metrics) {
-			rt->fi = fi;
-			atomic_inc(&fi->fib_clntref);
-		}
-		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
-	}
+	} else
+		__rt_init_metrics(rt, fi, NULL);
 }
 
 static void rt_set_nexthop(struct rtable *rt, const struct flowi4 *fl4,
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Linux Kernel Discussion]     [Ethernet Bridging]     [Linux Wireless Networking]     [Linux Bluetooth Networking]     [Linux Networking Users]     [VLAN]     [Git]     [IETF Annouce]     [Linux Assembly]     [Security]     [Bugtraq]     [Photo]     [Singles Social Networking]     [Yosemite Information]     [MIPS Linux]     [ARM Linux Kernel]     [ARM Linux]     [Linux Virtualization]     [Linux Security]     [Linux IDE]     [Linux RAID]     [Linux SCSI]     [Free Dating]

Add to Google Powered by Linux