On 17.07.2012 22:01, Tejun Heo wrote:
> On Tue, Jul 17, 2012 at 09:39:41PM +0200, Matthias Prager wrote:
>> I could not however reproduce the issue on any other device than a LSI
>> SAS controller (using SATA disks) - on a regular ICH10 using AHCI and a
>> SATA drive I don't see these i/o errors. But since I'm experiencing
>> these issues on two different systems (both with lsi controllers while
>> running vmware-guests on them) and Robert sees them on his
>> (non-virtualized) system with the same lsi controller (9211-8i), I'm
>> inclined to make the following assumptions:
>> Either it is an issue which is limited to this controller and possibly
>> sata disks hanging off it or it is a more general issue with sas
>> controllers and sata disks (again it could well affect sas disks too).
>> Lacking other controllers or sas disks I can't be sure.
>
> So, nothing in the libata stack generates NOT_READY - "initializing
> command required". I suppose it's LSI firmware / driver translating
> TUR to CHECK_POWER_MODE and generating NOT_READY. I don't know what
> SAT says about this but this can't be correct. An ATA device in
> standby mode is ready to process any commands. It should be able to
> come back to full operation on demand as necessary and that's why it
> can be transparently enabled from device side. Eric?
>
While reading the linux-scsi mailing list I stumbled upon
'[Bug 16070] Fail to issue Start/Stop Unit'
<http://marc.info/?l=linux-scsi&m=134278835822649&w=2>
(bugtracker: <https://bugzilla.kernel.org/show_bug.cgi?id=16070>)
which led me to try enabling the 'allow_restart' flag for my disks.
With this workaround a vanilla kernel 3.4.5 does not exhibit the i/o
errors on sleeping sata disks hanging off sas controllers.
I'm currently running one of my systems with a
'echo 1 | tee /sys/block/sd?/device/scsi_disk/*/allow_restart >/dev/null'
line added to the init scripts. This way I can use the untouched kernel
sources and still get around the i/o errors. But I reckon this is no
solution.
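For an init script, the one-liner above could be wrapped in a small shell
function. This is only a sketch under stated assumptions: the function name
'set_allow_restart' and its optional sysfs-root argument are made up for
illustration (the argument only exists so the loop can be tried against a
scratch directory), and it globs 'sd*' instead of 'sd?' on the assumption
that disks named beyond sdz should be covered as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of any kernel or distro tooling):
# write 1 to the allow_restart attribute of every SCSI disk found
# under the given sysfs root and print how many attributes were set.
set_allow_restart() {
    root="${1:-/sys}"   # assumption: sysfs is mounted at /sys by default
    count=0
    for attr in "$root"/block/sd*/device/scsi_disk/*/allow_restart; do
        # Skip the unexpanded glob pattern and attributes we cannot write.
        [ -w "$attr" ] || continue
        if echo 1 > "$attr"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Calling 'set_allow_restart' without arguments from an init script (as root)
would act on /sys; the printed count makes it easy to log how many disks
were touched.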
I'm no expert on scsi/sas/ata internals, so please take the following
thoughts with a grain of salt:
As far as I can see (and Tejun confirmed that - I think) Tejun's commit
85ef06d1d252f6a2e73b678591ab71caad4667bb somehow exposes a bug, which
lies deeper in the sas/ata code. The 'sas_slave_configure()' function in
'drivers/scsi/libsas/sas_scsi_host.c' sets the 'allow_restart' flag for
sas disks hanging off sas controllers. But if it encounters a sata disk
it calls 'ata_sas_slave_configure()' in 'drivers/ata/libata-scsi.c'
instead and returns without enabling the 'allow_restart' flag. A simple
fix would be to set allow_restart=1 after having called
'ata_sas_slave_configure()' but before returning (in
'sas_slave_configure()').
Now I'm not sure this isn't papering over another bug. Which leads me to
my question: What is the correct behavior?
#1 Issuing a separate spin-up command (START UNIT?) prior to sending i/o
by setting allow_restart=1 for sata disks on sas controllers
or
#2 Teaching the sas drivers they do not need spin-up commands and can
simply start issuing i/o to sata disks
--
Matthias