[PATCH 12/13] NFS: Handle replication on a timeout error

nfs4_handle_exception and nfs4_async_handle_error now handle ETIMEDOUT
errors by replacing the transport with a replicated server.

The RPC layer tries to handle timeouts by itself in most cases. It
should be made aware of the presence of replicated servers so that it
can return timeout failures sooner and let replication take over. Right
now this is a hack: call_timeout() fails every task with -ETIMEDOUT on
its first timeout.

Signed-off-by: Malahal Naineni <malahal@xxxxxxxxxx>
---
 fs/nfs/nfs4proc.c |   14 ++++++++++++++
 net/sunrpc/clnt.c |   12 ++++++++++++
 2 files changed, 26 insertions(+), 0 deletions(-)
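
Note on the TODO in call_timeout(): the unconditional "if (1)" could
later become a per-client check. The following is only a userspace
sketch of that decision, not kernel code. The struct names, the
has_replicas flag and model_call_timeout() are hypothetical stand-ins
and do not exist in sunrpc; how the RPC client would actually learn
about replicas is still open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rpc_clnt (assumption, not kernel API). */
struct model_clnt {
	bool has_replicas;	/* assumed flag: mount has replica servers */
};

/* Hypothetical stand-in for struct rpc_task (assumption, not kernel API). */
struct model_task {
	const struct model_clnt *client;
	int status;		/* 0 while in flight, negative errno on failure */
	unsigned int retries;	/* retransmissions attempted so far */
};

enum timeout_action { TIMEOUT_RETRY, TIMEOUT_EXIT };

/*
 * Models the choice made at the top of call_timeout(): fail fast with
 * -ETIMEDOUT only when a replica is available to take over, otherwise
 * keep the existing behaviour and retry the call.
 */
static enum timeout_action model_call_timeout(struct model_task *task)
{
	assert(task != NULL && task->client != NULL);

	if (task->client->has_replicas) {
		task->status = -ETIMEDOUT;
		return TIMEOUT_EXIT;
	}

	task->retries++;
	return TIMEOUT_RETRY;
}
```

With a check like this, clients without replicas would keep the normal
retry path instead of failing on the first timeout.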

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 775adb3..2198b13 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -265,6 +265,9 @@ static int nfs4_handle_exception(struct nfs_server *server, int errorcode, struc
 	switch(errorcode) {
 		case 0:
 			return 0;
+		case -ETIMEDOUT:
+			nfs4_schedule_replication_recovery(server);
+			goto wait_on_recovery;
 		case -NFS4ERR_ADMIN_REVOKED:
 		case -NFS4ERR_BAD_STATEID:
 		case -NFS4ERR_OPENMODE:
@@ -3716,6 +3719,16 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server,
 	if (task->tk_status >= 0)
 		return 0;
 	switch(task->tk_status) {
+		case -ETIMEDOUT:
+			printk(KERN_ERR "%s ERROR: %d calling replicate recovery\n",
+				__func__, task->tk_status);
+			rpc_sleep_on(&clp->cl_rpcwaitq, task, NULL);
+			nfs4_schedule_replication_recovery(server);
+			if (test_bit(NFS4CLNT_MANAGER_RUNNING,
+				     &clp->cl_state) == 0)
+				rpc_wake_up_queued_task(&clp->cl_rpcwaitq,
+							task);
+			goto restart_call;
 		case -NFS4ERR_ADMIN_REVOKED:
 		case -NFS4ERR_BAD_STATEID:
 		case -NFS4ERR_OPENMODE:
@@ -3762,6 +3775,7 @@ wait_on_recovery:
 	rpc_sleep_on(&clp->cl_rpcwaitq, task, NULL);
 	if (test_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) == 0)
 		rpc_wake_up_queued_task(&clp->cl_rpcwaitq, task);
+restart_call:
 	task->tk_status = 0;
 	return -EAGAIN;
 }
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index e9e8097..ed15b44 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1830,6 +1830,18 @@ call_timeout(struct rpc_task *task)
 {
 	struct rpc_clnt	*clnt = task->tk_client;
 
+	/*
+	 * TODO: If replicated server is present, propagate timeout
+	 * failures as soon as possible to upper layers.  We just
+	 * assume that replicated server is present in this RFC patch.
+	 * RPC client should be made aware of replication later.
+	 */
+	if (1) {
+
+		rpc_exit(task, -ETIMEDOUT);
+		return;
+	}
+
 	if (xprt_adjust_timeout(task->tk_rqstp) == 0) {
 		dprintk("RPC: %5u call_timeout (minor)\n", task->tk_pid);
 		goto retry;
-- 
1.7.8.3
