Re: [RFC] How to fix an async scan - rmmod race?
On 04/11/12 22:28, Mike Christie wrote:
> On 04/11/2012 02:47 PM, Bart Van Assche wrote:
>> disadvantage is that this approach will only work fine if the LLD stops
>> I/O completion notifications before invoking scsi_remove_host(). Several
>
> I don't think you would want to do that, because you have IO from the
> sd_shutdown path that you do want to execute. After the remove/shutdown
> callouts have been run then you do not want new IO to be sent to the LLD.
>
> So scsi_remove_host sets the host state to cancel initially. It then
> calls scsi_forget_host which will loop over devices and remove them.
> That could cause IO to be sent by functions like sd_shutdown.

So that means that with an operational transport layer it's wrong for a
SCSI LLD to stop processing SCSI commands before scsi_remove_host() has
finished? It looks like several SCSI LLD authors are not aware of this.
I have found several examples of high-profile SCSI LLDs in the kernel
tree that cause newly submitted SCSI commands to fail during kernel
module removal even before scsi_remove_host() gets invoked.

> After the ULD code is run __scsi_remove_device will set the state to
> SDEV_DEL and scsi_remove_host will then set the state to SHOST_DEL. So
> that would prevent new IO from getting queued.
>
> But then is there a race that you were hitting?

scsi_remove_host() can get invoked after the SCSI core has submitted a
request to the LLD via queuecommand() but before the LLD has received
the I/O completion notification that will be generated once that request
finishes. I see three alternatives to handle this:
- The LLD stops I/O completion notifications before invoking
  scsi_remove_host() (which is not correct because it prevents
  sd_shutdown() from sending SCSI commands to the device).
- The SCSI core keeps the LLD around long enough until it is sure that
  no new I/O completion notifications will arrive.
- The SCSI LLD stops I/O completion notifications after having invoked
  scsi_remove_host() and kills all pending SCSI commands before
  continuing with LLD-specific host removal tasks.

As far as I can see the SCSI core doesn't yet provide a function that
would allow an LLD to kill all pending requests. Maybe blk_abort_queue()
could be helpful here.

Bart.
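
To make the ordering in the third alternative concrete, here is a minimal user-space sketch in C. It is only a model under stated assumptions: every `model_*` name is hypothetical and none of them is a kernel API, the host states merely mirror the cancel/del sequence described above, and the shutdown-path command is assumed to complete synchronously. It shows that commands still pending when the host is removed get killed by the LLD afterwards, while the command issued from the shutdown path is still allowed to complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_host_state { MODEL_HOST_RUNNING, MODEL_HOST_CANCEL, MODEL_HOST_DEL };
enum model_cmd_result { MODEL_CMD_PENDING, MODEL_CMD_OK, MODEL_CMD_KILLED };

#define MODEL_MAX_CMDS 8

struct model_host {
    enum model_host_state state;
    bool completions_enabled;               /* LLD still delivers completions? */
    enum model_cmd_result cmds[MODEL_MAX_CMDS];
    size_t ncmds;
};

void model_host_init(struct model_host *h)
{
    h->state = MODEL_HOST_RUNNING;
    h->completions_enabled = true;
    h->ncmds = 0;
}

/* Models queuecommand(): accepted until the host reaches the DEL state. */
int model_queuecommand(struct model_host *h)
{
    if (h->state == MODEL_HOST_DEL || h->ncmds == MODEL_MAX_CMDS)
        return -1;
    h->cmds[h->ncmds] = MODEL_CMD_PENDING;
    return (int)h->ncmds++;
}

/* Models an I/O completion notification; ignored once the LLD stopped them. */
bool model_complete(struct model_host *h, int slot)
{
    if (!h->completions_enabled || slot < 0 || (size_t)slot >= h->ncmds ||
        h->cmds[slot] != MODEL_CMD_PENDING)
        return false;
    h->cmds[slot] = MODEL_CMD_OK;
    return true;
}

/* Models host removal: CANCEL, one shutdown-path command, then DEL. */
void model_remove_host(struct model_host *h)
{
    int flush;

    h->state = MODEL_HOST_CANCEL;
    flush = model_queuecommand(h);          /* e.g. I/O sent from a shutdown path */
    model_complete(h, flush);               /* assumed to complete synchronously */
    h->state = MODEL_HOST_DEL;
}

/* Third alternative: stop completions after removal, then kill what is left. */
size_t model_lld_kill_pending(struct model_host *h)
{
    size_t i, killed = 0;

    h->completions_enabled = false;
    for (i = 0; i < h->ncmds; i++) {
        if (h->cmds[i] == MODEL_CMD_PENDING) {
            h->cmds[i] = MODEL_CMD_KILLED;
            killed++;
        }
    }
    return killed;
}

/* Two commands are in flight when removal starts; returns how many were killed. */
size_t model_demo_removal(void)
{
    struct model_host h;
    size_t killed;
    int a;

    model_host_init(&h);
    a = model_queuecommand(&h);
    model_queuecommand(&h);
    model_remove_host(&h);
    killed = model_lld_kill_pending(&h);
    assert(model_queuecommand(&h) == -1);   /* no new I/O after DEL */
    assert(!model_complete(&h, a));         /* late completion is ignored */
    return killed;
}
```

In this model `model_demo_removal()` returns 2: the two commands queued before removal are killed, and the shutdown-path command has already completed. Moving `completions_enabled = false` ahead of `model_remove_host()` would leave the shutdown-path command pending, which corresponds to the objection against the first alternative.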