
SCSI errors on loss of connectivity to SAN



Hi,

I have some questions about the expected behavior when connectivity to an
iSCSI target is lost while I/O is being issued to the target device.

I'm using linux-iscsi-4.0.1.11 on RHEL AS release 4, accessing a LeftHand
Networks SAN. I log in to a volume and run:

# dd if=/dev/zero of=/dev/sdb

where /dev/sdb is the iSCSI volume device.

Next, I unplug the cable to the SAN.

As expected, I see messages from the initiator indicating loss of
connection and session recovery attempts:

Feb 16 10:35:46 cr04-39 dhclient: bound to 10.40.5.39 -- renewal in 294 seconds.
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: 10 second timeout expired rx 754868941, ping 754873940, now 754878941
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: iscsi_send_scsi_cmnd failed to send scsi cmnd data (131072 bytes)
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Session dropped
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17415 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17416 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17417 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17418 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17419 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17420 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17421 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17422 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17423 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 17424 with return code = 0x20000
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Failing command cdb 0x2a task 4294967295 with return code = 0x20000
Feb 16 10:36:37 cr04-39 last message repeated 21 times
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host4: 10 second timeout expired rx 754868944, ping 754873943, now 754878944
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host4: iscsi_send_scsi_cmnd failed to send scsi cmnd data (131072 bytes)
Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host4: Session dropped
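
For readers decoding the messages above, here is a minimal sketch of how
the "return code" value breaks down. It assumes the usual Linux SCSI
midlayer layout of the 32-bit result (status, message, host, and driver
bytes, from low to high). The function and table names are made up for
illustration; the host-byte names are a subset of those in the kernel's
SCSI headers. Under that layout, 0x20000 carries host byte 0x02
(DID_BUS_BUSY), and CDB opcode 0x2a is WRITE(10).

```python
# Subset of Linux SCSI host-byte codes (names as in the kernel SCSI headers).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}

# Two common SCSI opcodes; 0x2a is the one seen in the log.
SCSI_OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result into its four one-byte fields.

    Assumed layout: bits 0-7 status, 8-15 message, 16-23 host, 24-31 driver.
    """
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    fields = decode_scsi_result(0x20000)
    print(fields, HOST_BYTE_NAMES.get(fields["host"], "unknown"))
    print(SCSI_OPCODES.get(0x2A, "unknown"))
```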


After 2 minutes, SCSI operations apparently time out and dd fails with an
I/O error:


Feb 16 10:38:36 cr04-39 kernel: iscsi-sfnet:host2: Login I/O error, failed to receive a PDU
Feb 16 10:38:36 cr04-39 kernel: iscsi-sfnet:host2: Waiting 1 seconds before next login attempt
Feb 16 10:38:37 cr04-39 kernel: SCSI error : <3 0 0 0> return code = 0x20000
Feb 16 10:38:37 cr04-39 kernel: end_request: I/O error, dev sdc, sector 16576
Feb 16 10:38:37 cr04-39 kernel: Buffer I/O error on device sdc, logical block 2072
Feb 16 10:38:37 cr04-39 kernel: lost page write due to I/O error on sdc
Feb 16 10:38:37 cr04-39 kernel: SCSI error : <3 0 0 0> return code = 0x20000
Feb 16 10:38:37 cr04-39 kernel: end_request: I/O error, dev sdc, sector 16832
Feb 16 10:38:37 cr04-39 kernel: Buffer I/O error on device sdc, logical block 2104
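
The 2-minute figure can be checked directly from the syslog timestamps.
The sketch below is only an illustration: the helper names are
hypothetical, and the sector-to-block conversion assumes 512-byte sectors
and a 4096-byte buffer block size, which is consistent with the numbers
logged above but is not stated anywhere in the log itself.

```python
from datetime import datetime

SYSLOG_TS_FORMAT = "%b %d %H:%M:%S"  # e.g. "Feb 16 10:36:37" (no year in syslog)


def syslog_time(line):
    """Parse the 15-character timestamp at the start of a syslog line."""
    return datetime.strptime(line[:15], SYSLOG_TS_FORMAT)


def seconds_between(first_line, second_line):
    """Elapsed seconds between two syslog lines from the same day and year."""
    return (syslog_time(second_line) - syslog_time(first_line)).total_seconds()


def sector_to_block(sector, sector_size=512, block_size=4096):
    """Map a sector number to a logical block (sizes are assumptions)."""
    return sector * sector_size // block_size


if __name__ == "__main__":
    dropped = "Feb 16 10:36:37 cr04-39 kernel: iscsi-sfnet:host3: Session dropped"
    failed = "Feb 16 10:38:37 cr04-39 kernel: SCSI error : <3 0 0 0> return code = 0x20000"
    print(seconds_between(dropped, failed))  # 120.0 seconds
    print(sector_to_block(16576), sector_to_block(16832))  # 2072 2104
```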

These SCSI errors (and application I/O failures) don't happen when I run
the same test on hosts running a 2.4 kernel with the 3.6.3 initiator.

My questions:

1. Are the SCSI errors after 2 minutes expected?

2. Is the 2-minute interval adjustable?

As a SAN vendor, we are interested because we would like to be able to
tell our customers how long a SAN volume can be unavailable before
application IOs fail. If this interval is configurable, we would like to
recommend our customers set it long enough to survive a SAN power cycle
(which takes around 4 minutes in our case). Other host OS/initiator
combinations (e.g. W2k with MS-iSCSI-2.0) provide such an adjustable setting.
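
One way to measure the interval from the application side, independent of
the initiator logs, is to issue small synchronous writes and time how long
it takes for the first one to fail. The following is a rough sketch under
stated assumptions, not a replacement for the dd test: all function names
are hypothetical, and the writer overwrites the first block of whatever
path it is given, so it should only be pointed at a scratch volume.

```python
import os
import time


def time_until_write_error(write_once, interval=1.0, limit=600.0,
                           clock=time.monotonic, sleep=time.sleep):
    """Call write_once() repeatedly until it raises OSError.

    Returns the seconds elapsed since the start when the first error is
    seen, or None if no error occurs within `limit` seconds.
    """
    start = clock()
    while clock() - start < limit:
        try:
            write_once()
        except OSError:
            return clock() - start
        sleep(interval)
    return None


def make_sync_writer(path, block_size=4096):
    """Return a function that rewrites the first block of `path` with O_SYNC.

    WARNING: destroys the data in that block. With O_SYNC each write blocks
    until the kernel completes or fails it, so the time spent blocked while
    the connection is down is included in the measurement.
    """
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    block = b"\0" * block_size

    def write_once():
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, block)

    return write_once


# Example (commented out on purpose, since it overwrites the device):
# elapsed = time_until_write_error(make_sync_writer("/dev/sdb"))
# print("first write error after", elapsed, "seconds")
```

Start the loop, unplug the cable, and the returned value approximates how
long the volume was unavailable before an application-visible error.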

I think a similar issue may have come up on this mailing list a while
ago, but I don't think these particular questions were answered.

Thanks,
Bob Bawn
LeftHand Networks
303-217-9015



_______________________________________________
linux-iscsi-users mailing list
linux-iscsi-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/linux-iscsi-users

