Re: 'Device not ready' issue on mpt2sas since 3.1.10

On Sun, 2012-07-22 at 10:31 -0700, Tejun Heo wrote:
> Hello,
> 
> On Sat, Jul 21, 2012 at 02:15:56PM +0200, Matthias Prager wrote:
> > Now I'm not sure this isn't taping over another bug. Which leads me to
> > my question: What is the correct behavior?
> > 
> > #1 Issuing a separate spin-up command (START UNIT?) prior to sending i/o
> > by setting allow_restart=1 for sata disks on sas controllers
> > 
> > or
> > 
> > #2 Teaching the sas drivers they do not need spin-up commands and can
> > simply start issuing i/o to sata disks
> 
> I haven't consulted SAT but it seems like a bug in SAS driver or
> firmware.  If it's a driver bug, we better fix it there.  If a
> firmware bug, working around those is one of major roles of drivers,
> so I think setting allow_restart is fine.

Actually, I don't think so.  SAT-2 section 8.12.2 does say:

        if the device is in the stopped state as the result of
        processing a START STOP UNIT command (see 9.11), then the SATL
        shall terminate the TEST UNIT READY command with CHECK CONDITION
        status with the sense key set to NOT READY and the additional
        sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
        REQUIRED;

START STOP UNIT (with START=0) translates to STANDBY IMMEDIATE, and
that's what hdparm -y issues.  We don't see this in drivers/ata because
its TEST UNIT READY emulation always returns success.

So it looks like the mpt2sas SAT is doing the correct thing, and the
only reason we don't see this problem with normal SATA devices is a bug
in the libata-scsi SAT.

However, the kernel log

Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj] Device not ready
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Sense Key : Not Ready [current]
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Add. Sense: Logical unit not ready, initializing command required
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj] CDB: Write(10): 2a 00 57 54 52 3f 00 00 08 00

indicates we got NOT READY in response to a non-TUR command, so I suspect what's
happening is that sending the TUR causes the SAT to remember the standby
state and respond NOT READY to all subsequent commands.  However, if we
just send an ordinary command, not a TUR, it quietly wakes the drive and
we don't see any problems.
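
For anyone reading the log above, here is a rough sketch of how the
failing CDB and the sense data decode.  It is only an illustration: the
helper name decode_rw10 and the two small lookup tables are made up for
this example, and they list only the values that appear in the log
(sense key 02h, ASC/ASCQ 04h/02h, opcodes 28h and 2Ah).

```python
import struct

# Only the values seen in the log are listed; these tables are not complete.
SENSE_KEYS = {0x02: "NOT READY"}
ASC_ASCQ = {(0x04, 0x02): "LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED"}
OPCODES_10 = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(cdb))
    # Layout: opcode, flags, 4-byte big-endian LBA, group number,
    # 2-byte big-endian transfer length (in blocks), control byte.
    opcode, flags, lba, group, blocks, control = struct.unpack(">BBIBHB", cdb)
    return {
        "command": OPCODES_10.get(opcode, "unknown (0x%02x)" % opcode),
        "lba": lba,
        "blocks": blocks,
    }


if __name__ == "__main__":
    # The CDB from the kernel log: a plain 8-block write, not a TUR.
    print(decode_rw10("2a 00 57 54 52 3f 00 00 08 00"))
    print(SENSE_KEYS[0x02], "/", ASC_ASCQ[(0x04, 0x02)])
```

So the command that failed is an ordinary 8-block WRITE(10), which is
why the NOT READY cannot simply be the TUR response described in SAT-2
8.12.2.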

There is support in SAT for this behaviour because there's a note on the
START STOP UNIT command saying

        After returning GOOD status for a START STOP UNIT command with
        the START bit set to zero, the SATL shall consider the ATA
        device to be in the Stopped power state (see SBC-2)

which in SCSI terms would mean returning NOT READY to any subsequent
commands.
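
To make the suspected sequence concrete, here is a toy model of it.
This is purely a sketch of the hypothesis above, not anything taken
from the mpt2sas driver or firmware: the class HypotheticalSatl, its
method names and the "latched" flag are all invented for illustration.
It assumes the SATL only latches the Stopped state once it has answered
a TUR with NOT READY, and that media access before that point wakes the
drive silently.

```python
GOOD = "GOOD"
NOT_READY = "CHECK CONDITION, NOT READY (04h/02h)"


class HypotheticalSatl:
    """Toy model of the suspected behaviour; an assumption, not firmware fact."""

    def __init__(self):
        self.drive_standby = False  # power state of the ATA drive itself
        self.latched_stopped = False  # SATL's remembered "Stopped" state

    def standby_immediate(self):
        # What hdparm -y ends up issuing: the drive spins down.
        self.drive_standby = True

    def test_unit_ready(self):
        # Assumed: a TUR against a spun-down drive makes the SATL latch
        # the Stopped state and report NOT READY from then on.
        if self.drive_standby or self.latched_stopped:
            self.latched_stopped = True
            return NOT_READY
        return GOOD

    def write10(self):
        if self.latched_stopped:
            return NOT_READY  # the case seen in the kernel log
        self.drive_standby = False  # media access quietly wakes the drive
        return GOOD

    def start_unit(self):
        # START STOP UNIT with START=1 clears the Stopped state.
        self.drive_standby = False
        self.latched_stopped = False
        return GOOD


if __name__ == "__main__":
    satl = HypotheticalSatl()
    satl.standby_immediate()
    print("TUR     ->", satl.test_unit_ready())
    print("WRITE   ->", satl.write10())
    print("START   ->", satl.start_unit())
    print("WRITE   ->", satl.write10())
```

In this model a write sent straight after the spin-down succeeds, while
the same write sent after a TUR fails until a START UNIT is issued,
which is the pattern that setting allow_restart=1 would work around.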

Can someone verify this is indeed what the mpt2sas HBA is doing?

James

