[PATCH RT] futex: Fix bug on when a requeued RT task times out

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Requeue with timeout causes a bug with PREEMPT_RT_FULL.

The bug comes from a timed out condition.

	TASK 1				TASK 2
	------				------
	<timed out>


	if (current->pi_blocked_on) { 
	} else {
	    current->pi_blocked_on = PI_WAKE_INPROGRESS;
	    spin_lock(hb->lock); <-- blocked!

					plist_for_each_entry_safe(this) {

The BUG_ON() actually has a check for PI_WAKE_INPROGRESS, but the
problem is that, after TASK 1 sets PI_WAKE_INPROGRESS, it then tries to
grab the hb->lock, which it fails to do so. As the hb->lock is a mutex,
it will block and set the "pi_blocked_on" to the hb->lock.

When TASK 2 goes to requeue it, the check for PI_WAKE_INPROGESS fails
because the task1's pi_blocked_on is no longer set to that, but instead,
set to the hb->lock.

The fix:

When calling rt_mutex_start_proxy_lock() a check is made to see
if the proxy tasks pi_blocked_on is set. If so, exit out early.
Otherwise set it to a new flag PI_REQUEUE_INPROGRESS, which notifies
the proxy task that it is being requeued, and will handle things

Cc: stable-rt@xxxxxxxxxxxxxxx
Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>

diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 991bc7f..9850dc0 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -75,7 +75,8 @@ static void fixup_rt_mutex_waiters(struct rt_mutex *lock)
 static int rt_mutex_real_waiter(struct rt_mutex_waiter *waiter)
-	return waiter && waiter != PI_WAKEUP_INPROGRESS;
+	return waiter && waiter != PI_WAKEUP_INPROGRESS &&
@@ -1353,6 +1354,35 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 		return 1;
+	/*
+	 * In PREEMPT_RT there's an added race.
+	 * If the task, that we are about to requeue, times out,
+	 * it can set the PI_WAKEUP_INPROGRESS. This tells the requeue
+	 * to skip this task. But right after the task sets
+	 * its pi_blocked_on to PI_WAKEUP_INPROGRESS it can then
+	 * block on the spin_lock(&hb->lock), which in RT is an rtmutex.
+	 * This will replace the PI_WAKEUP_INPROGRESS with the actual
+	 * lock that it blocks on. We *must not* place this task
+	 * on this proxy lock in that case.
+	 *
+	 * To prevent this race, we first take the task's pi_lock
+	 * and check if it has updated its pi_blocked_on. If it has,
+	 * we assume that it woke up and we return -EAGAIN.
+	 * Otherwise, we set the task's pi_blocked_on to
+	 * PI_REQUEUE_INPROGRESS, so that if the task is waking up
+	 * it will know that we are in the process of requeuing it.
+	 */
+	raw_spin_lock(&task->pi_lock);
+	if (task->pi_blocked_on) {
+		raw_spin_unlock(&task->pi_lock);
+		raw_spin_unlock(&lock->wait_lock);
+		return -EAGAIN;
+	}
+	task->pi_blocked_on = PI_REQUEUE_INPROGRESS;
+	raw_spin_unlock(&task->pi_lock);
 	ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
 	if (ret && !rt_mutex_owner(lock)) {
diff --git a/kernel/rtmutex_common.h b/kernel/rtmutex_common.h
index a688a29..6ec3dc1 100644
--- a/kernel/rtmutex_common.h
+++ b/kernel/rtmutex_common.h
@@ -105,6 +105,7 @@ static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
  * PI-futex support (proxy locking functions, etc.):
 #define PI_WAKEUP_INPROGRESS	((struct rt_mutex_waiter *) 1)
+#define PI_REQUEUE_INPROGRESS	((struct rt_mutex_waiter *) 2)
 extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
 extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,

To unsubscribe from this list: send the line "unsubscribe stable-rt" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at

[Corosync Project]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]    [Yosemite Photos]    [Free Online Dating]     [Linux Kernel]     [Linux SCSI]     [XFree86]

Add to Google Powered by Linux