On 06/06/12 14:12, Mike Christie wrote:
> On 06/06/2012 08:43 AM, Mike Christie wrote:
>> On 06/06/2012 07:25 AM, Bart Van Assche wrote:
>>> On 06/05/12 22:08, Mike Christie wrote:
>>>
>>>> On 06/05/2012 12:14 PM, Bart Van Assche wrote:
>>>>> Avoid that the code for requeueing SCSI requests triggers a
>>>>> crash by making sure that that code isn't scheduled anymore
>>>>> after a device has been removed.
>>>>>
>>>>> Also, source code inspection of __scsi_remove_device() revealed
>>>>> a race condition in this function: no new SCSI requests must be
>>>>> accepted for a SCSI device after device removal started.
>>>>>
>>>>> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
>>>>> Cc: Mike Christie <michaelc@xxxxxxxxxxx>
>>>>> Cc: James Bottomley <JBottomley@xxxxxxxxxxxxx>
>>>>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>>>>> Cc: Joe Lawrence <jdl1291@xxxxxxxxx>
>>>>> Cc: Jun'ichi Nomura <j-nomura@xxxxxxxxxxxxx>
>>>>> Cc: <stable@xxxxxxxxxx>
>>>>> ---
>>>>> drivers/scsi/scsi_lib.c | 7 ++++---
>>>>> drivers/scsi/scsi_sysfs.c | 11 +++++++++--
>>>>> 2 files changed, 13 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>>>> index 082c1e5..b722a8b 100644
>>>>> --- a/drivers/scsi/scsi_lib.c
>>>>> +++ b/drivers/scsi/scsi_lib.c
>>>>> @@ -158,10 +158,11 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, int unbusy)
>>>>>  	 * that are already in the queue.
>>>>>  	 */
>>>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>>>> -	blk_requeue_request(q, cmd->request);
>>>>> +	if (!blk_queue_dead(q)) {
>>>>> +		blk_requeue_request(q, cmd->request);
>>>>> +		kblockd_schedule_work(q, &device->requeue_work);
>>>>> +	}
>>>>>  	spin_unlock_irqrestore(q->queue_lock, flags);
>>>>> -
>>>>> -	kblockd_schedule_work(q, &device->requeue_work);
>>>>
>>>> If we do not have the part of the patch above, but have your other
>>>> patches and the code below, will we be ok?
>>>
>>>
>>> I'm not sure. Without the above part the request could get killed after
>>> the blk_requeue_request() call finished but before the requeue_work is
>>> scheduled, e.g. because the request timer fired or due to a
>>> blk_abort_queue() call.
>>>
>>
>> You are right.
>>
>> What if we moved the requeue work struct to the request queue, then have
>> blk_cleanup_queue or blk_drain_queue call cancel_work_sync before the
>> queue is freed. That way that code could make sure the queue and work is
>> flushed and drained, and it can make sure it is flushed and drained
>> before freeing the queue?
>
> Or, in scsi_requeue_run_queue could we just add a check for the
> scsi_device being in the SDEV_DEL state. That combined with your cancel
> call in __scsi_remove_device would prevent us from running a cleaned up
> queue, right?

I'm not sure. If a requeued request times out before blk_cleanup_queue()
is invoked then it's possible that the requeue_work is started after the
struct scsi_device has already been deleted.
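
To illustrate the ordering that the scsi_lib.c hunk relies on, here is a
minimal userspace sketch. It is not kernel code, and every name in it
(model_queue, model_queue_insert, model_remove) is hypothetical: a pthread
mutex stands in for q->queue_lock, a boolean for blk_queue_dead(q), and a
counter for pending requeue_work. It only shows that when the dead test and
the scheduling happen under the same lock, removal cannot slip in between
them.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a request queue (assumption). */
struct model_queue {
	pthread_mutex_t lock;	/* models q->queue_lock */
	bool dead;		/* models blk_queue_dead(q) */
	int requeued;		/* requests put back on the queue */
	int work_scheduled;	/* models pending requeue_work */
};

/*
 * Models the patched requeue path: the dead test, the requeue and the
 * work scheduling all happen inside one critical section.
 * Returns true if the request was requeued.
 */
static bool model_queue_insert(struct model_queue *q)
{
	bool queued = false;

	pthread_mutex_lock(&q->lock);
	if (!q->dead) {
		q->requeued++;
		q->work_scheduled++;
		queued = true;
	}
	pthread_mutex_unlock(&q->lock);
	return queued;
}

/*
 * Models device removal: mark the queue dead and drop pending work
 * under the same lock, so no work can be scheduled afterwards.
 */
static void model_remove(struct model_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->dead = true;
	q->work_scheduled = 0;
	pthread_mutex_unlock(&q->lock);
}
```

In this model, a model_queue_insert() call that follows model_remove()
neither requeues nor schedules anything, which is the property a state
check inside the work function alone would not give.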
Bart.