Re: mpt2sas: BUG? corrupted nvram after using linkrate in sysfs




On Tue, Mar 20, 2012 at 03:20:48PM +0530, Nandigama, Nagalakshmi wrote:
> Hi,
> I verified on SAS2008 B1 that the max/min link rate value can be changed.
> I set the max link speed in the NVRAM to 1.5 Gbps and rebooted. After the reboot I read the current and NVRAM values of the max link speed for the phy that I had set. Both showed 1.5 Gbps.
> 
> I didn't face any issue. 
> 
> Are you facing problem in this scenario only?
> 

BTW, I am on V9.0 of the firmware -- there was an email that
said this was fixed in 11.0, but I don't see it mentioned in
the errata unless I missed it.  Also flashing already
deployed systems isn't always easy.

So I did some more experiments, and the problem seems to be
both intermittent and dependent on a particular drive
configuration.  I start with a brand new system (i.e.
never been run) and put 4 SAS drives in, each with 2 ports
(so I use all 8 phys of the controller).  These drives can
run at 6G and work fine.  Now I can power down, pull 2
drives, start again and it is still good.  I can change the
drive setup many times and all is fine.  Now if I go back to
the original setup (4 SAS drives) and change the link rate
to 6G then everything works.  Now, if I power down and pull
a drive or 2, then some phys will come up at 6G, some will
indicate Unknown, but accessing the drives won't work
correctly.  One drive *might* work, but the other usually
will give an error with "Read Capacity Fail" or something
different.  If I use lsiutil to reset the Pg1 configuration
to the defaults and reboot, then the drives will work again
at 6G. 

In the failing case, each drive is dual-ported, so I'd expect
both phys to come up at 6G.  What happens instead is that one
phy comes up at 6G and the other at Unknown, and nothing works
right (i.e. we get drive errors as seen in Linux).

Upon first boot, our Pg1 current config looks like this.
0014 : a8000100
0018 : 00000071
001c : 00000000

The dynamic port group is 0, the autoconfig is enabled for
each of the 8 ports. (just showing 1 here).
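
For anyone trying to read the dump above, here is a rough sketch of
how I would decode one phy entry.  The field order (Port, PortFlags,
PhyFlags, MaxMinLinkRate, then ControllerPhyDeviceInfo), the
auto-config bit, and the link rate codes are assumptions taken from
my reading of mpi2_cnfg.h; I also assume lsiutil prints each dword as
a 32-bit value whose low byte is the lowest offset.  The struct and
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout of one phy entry in SAS IO Unit Page 1. */
struct phy_entry {
    uint8_t  port;          /* port (group) number */
    uint8_t  port_flags;    /* bit 0: auto port config (assumed) */
    uint8_t  phy_flags;
    uint8_t  max_min_rate;  /* high nibble = max, low nibble = min */
    uint32_t dev_info;      /* ControllerPhyDeviceInfo */
};

#define PORT_FLAG_AUTO_CONFIG 0x01u  /* assumed bit position */

/* Split the first two dwords of an entry into their fields. */
struct phy_entry decode_phy(uint32_t dw0, uint32_t dw1)
{
    struct phy_entry e;

    e.port         = (uint8_t)(dw0 & 0xff);
    e.port_flags   = (uint8_t)((dw0 >> 8) & 0xff);
    e.phy_flags    = (uint8_t)((dw0 >> 16) & 0xff);
    e.max_min_rate = (uint8_t)((dw0 >> 24) & 0xff);
    e.dev_info     = dw1;
    return e;
}

/* Assumed rate codes: 0x8 = 1.5, 0x9 = 3.0, 0xA = 6.0 Gbps. */
const char *rate_name(unsigned code)
{
    switch (code) {
    case 0x8: return "1.5 Gbps";
    case 0x9: return "3.0 Gbps";
    case 0xA: return "6.0 Gbps";
    default:  return "unknown";
    }
}

/* Print one decoded entry in human-readable form. */
void print_phy(const struct phy_entry *e)
{
    assert(e != NULL);
    printf("port %u, autoconfig %s, max %s, min %s, devinfo 0x%08x\n",
           (unsigned)e->port,
           (e->port_flags & PORT_FLAG_AUTO_CONFIG) ? "on" : "off",
           rate_name(e->max_min_rate >> 4),
           rate_name(e->max_min_rate & 0x0f),
           (unsigned)e->dev_info);
}
```

With dw0 = 0xa8000100 this gives Port 0, PortFlags 0x01, PhyFlags
0x00 and MaxMinLinkRate 0xa8, which under the assumed coding is
autoconfig on, max 6.0 and min 1.5.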

After we attach drives and set the link rate, the port group
gets set to some value in the range 0-7.  Then when the data
is written back to the flash during the NVRAM update, the
dynamic port group is an ordinal from 0 to 7 instead of
port group 0.  It appears to us that this is the only
difference, but I can't verify that other than by observing
that it works again when I restore defaults.
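
To make the suspicion concrete, this is a minimal sketch of the kind
of check one could run on the phy entries before they are persisted:
if an entry has autoconfig set but a non-zero port group, put the
port group back to 0.  This is only an illustration of the hypothesis
above and has not been tried on hardware; the struct, the function
name and the flag bit are assumptions, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_FLAG_AUTO_CONFIG 0x01u  /* assumed bit position */

/* Hypothetical, reduced view of a phy entry: only the two fields
 * that matter for this check. */
struct phy_cfg {
    uint8_t port;
    uint8_t port_flags;
};

/* Reset the port group to 0 on every autoconfig entry that carries a
 * non-zero port group.  Returns the number of entries changed. */
size_t reset_auto_ports(struct phy_cfg *phys, size_t n)
{
    size_t changed = 0;

    assert(phys != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((phys[i].port_flags & PORT_FLAG_AUTO_CONFIG) &&
            phys[i].port != 0) {
            phys[i].port = 0;
            changed++;
        }
    }
    return changed;
}
```

A non-zero return value would at least confirm that the page being
written differs from the defaults in the way described.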

In another test, I disabled the writes to NVRAM in the
driver.  In this case, I can still make it run and fail, but
at least it works again after a reboot.
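
Roughly, the experiment amounts to the following: always issue the
"write current" action, and only issue the "write NVRAM" action when
persistence is wanted.  This is a standalone sketch and not the
driver change itself; the action values are what I believe
MPI2_CONFIG_ACTION_PAGE_WRITE_CURRENT and
MPI2_CONFIG_ACTION_PAGE_WRITE_NVRAM are defined as, and the function
name and the persist switch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the MPI2 config action codes. */
#define ACTION_WRITE_CURRENT 0x02u
#define ACTION_WRITE_NVRAM   0x04u

/* Fill 'out' with the config actions to issue for a page 1 update.
 * With persist == 0 only the current (volatile) page is written, so
 * a bad setting does not survive a reboot.  Returns the count. */
size_t pg1_write_actions(int persist, uint8_t out[2])
{
    size_t n = 0;

    assert(out != NULL);
    out[n++] = ACTION_WRITE_CURRENT;
    if (persist)
        out[n++] = ACTION_WRITE_NVRAM;
    return n;
}
```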

Thanks,
Ayman








> 
> 
> -----Original Message-----
> From: linux-scsi-owner@xxxxxxxxxxxxxxx [mailto:linux-scsi-owner@xxxxxxxxxxxxxxx] On Behalf Of Ayman El-Khashab
> Sent: Sunday, March 18, 2012 11:59 PM
> To: linux-scsi@xxxxxxxxxxxxxxx
> Subject: mpt2sas: BUG? corrupted nvram after using linkrate in sysfs
> 
> We have the SAS2008 on our board but we don't have the NVRAM
> populated.  We do have a flash on the board that holds the
> firmware and so on.  What we found (by experimentation) is
> that the firmware detects or knows there isn't an NVRAM and
> stores some of the NVRAM settings somewhere in the flash
> (we've dumped the flash and confirmed there are things in
> it).  
> 
> We've seen a problem where we use the max/min linkrate in
> sysfs -- it looks like the linkrates get set correctly in
> page1, we also copy the port/port flags/phy flags per some
> errata.  The problem shows up when it updates pg1.  At that
> point, the following code is executed ... I guess this would
> be ok, except what we see happen is that whatever gets
> written to the NVRAM fails to make anything work and since
> it is in the NVRAM, even rebooting doesn't help.  We have
> figured out that using restore defaults in lsiutil fixes the
> problem, but we would rather figure out why the linkrate
> doesn't work.  
> 
>  mpt2sas_config_set_sas_iounit_pg1(struct MPT2SAS_ADAPTER *ioc,
>      Mpi2ConfigReply_t *mpi_reply, Mpi2SasIOUnitPage1_t *config_page, u16
> ...
> MPI2_CONFIG_ACTION_PAGE_WRITE_NVRAM;
>          r = _config_request(ioc, &mpi_request, mpi_reply,
>             MPT2_CONFIG_PAGE_DEFAULT_TIMEOUT, config_page, sz);
> 
> 
> What it looks like is that Pg1 has the autoconfig set, but
> rather than having "0" in the dynamic port group, it now has
> many phy numbers, so it seems that on the next reboot the
> dynamic setup doesn't work -- however, I don't know that for
> sure since I don't really know what the f/w does with it.
> 
> Thanks
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 