On Thu, 2012-05-17 at 12:26 -0400, Jörn Engel wrote:
> As the comment explains, we have a race between dropping the refcount to
> zero and actually removing the session from lookup tables. Plug this by
> checking the refcount on lookup.
>
> Signed-off-by: Joern Engel <joern@xxxxxxxxx>
> ---
> drivers/scsi/qla2xxx/tcm_qla2xxx.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
Hi Joern!
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 09a70aa..b1bfa08 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -1300,6 +1300,20 @@ static struct qla_tgt_sess *tcm_qla2xxx_find_sess_by_s_id(
> return NULL;
> }
>
> + /*
> + * We drop the hardware_lock between dropping the reference count to
> + * zero and removing the session from the btree. Therefore it
> + * is possible to successfully lookup a session after it has been
> + * scheduled for removal, but before actual removal. Result is pretty
> + * good chance of dereferencing freed memory in qlt_issue_task_mgmt.
> + *
> + * Proper fix would be to atomically drop the refcount and remove the
> + * session from all data structures that would allow lookup. Until
> + * then this tasteless hack shall suffice.
> + */
> + if (atomic_read(&nacl->qla_tgt_sess->se_sess->sess_kref.refcount) == 0)
> + return NULL;
> +
> return nacl->qla_tgt_sess;
> }
>
Mmmmm, I'd still like to try to avoid (another) atomic op in the fast
path w/ hardware_lock held if at all possible..
Also thinking more about the case in your comment, I'm starting to
wonder if tcm_qla2xxx_free_session() should be obtaining hardware_lock +
clearing the s_id and loop_id lookup entries earlier than currently
done..
That is, target_wait_for_sess_cmds() is called before clearing the
nodeacl pointers from s_id + loop_id in tcm_qla2xxx_free_session(),
which means that we still perform se_node_acl lookups in the case where
the individual tcm_qla2xxx endpoint was not also being shutdown as
well..
So that said, I think your work-around is OK to address the current BUG,
but the s_id + loop_id clearing order in tcm_qla2xxx_free_session()
would be having an effect here as well..
--nab
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
[SCSI Target Devel]
[Linux SCSI Target Infrastructure]
[Kernel Newbies]
[Share Photos]
[IDE]
[Security]
[Git]
[Netfilter]
[Bugtraq]
[Photos]
[Yosemite]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Linux ATA RAID]
[Linux IIO]
[Samba]
[Video 4 Linux]
[Device Mapper]
[Linux Resources]