linux-brain/drivers/scsi
Brian King c03ecc192c scsi: ibmvfc: Set default timeout to avoid crash during migration
[ Upstream commit 764907293edc1af7ac857389af9dc858944f53dc ]

While testing live partition mobility, we have observed occasional crashes
of the Linux partition. During live migration, in configurations that
combine large amounts of memory, slow network links, and workloads that
modify memory heavily, the partition can end up being suspended for 30
seconds or longer. This resulted in the following scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against a race between
command completion and the timeout, but that doesn't work here, since the
SCSI_MLQUEUE_HOST_BUSY path doesn't check that bit at all.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.
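
As a rough illustration of why the bit does not help here, the sequence in
the trace can be modelled in a few lines of Python. This is a toy model, not
kernel code: ToyCommand and every function name below are hypothetical
stand-ins for the paths named in the trace, and the requeues list only
records which path put the request back on the queue.

```python
class ToyCommand:
    """Toy stand-in for a SCSI command (not the kernel's struct scsi_cmnd)."""

    def __init__(self):
        self.complete_bit = False  # models SCMD_STATE_COMPLETE
        self.requeues = []         # which paths put the command back on the queue

    def test_and_set_complete(self):
        """Set the bit and return its previous value, like test_and_set_bit()."""
        previous = self.complete_bit
        self.complete_bit = True
        return previous


def times_out(cmd):
    """Timeout path: claim the command; True means an abort was scheduled."""
    return not cmd.test_and_set_complete()


def complete(cmd):
    """Completion path: True only if the timeout path has not claimed the command."""
    return not cmd.test_and_set_complete()


def host_busy_requeue(cmd):
    """SCSI_MLQUEUE_HOST_BUSY path: requeues without looking at the bit."""
    cmd.requeues.append("host_busy")


def abort_handler(cmd):
    """Abort handler: the abort 'succeeds', so the command is inserted again."""
    cmd.requeues.append("eh_abort")


cmd = ToyCommand()
abort_scheduled = times_out(cmd)   # CPU 1: timer already expired after the suspend
completed = complete(cmd)          # for contrast: a completion here would back off
host_busy_requeue(cmd)             # CPU 0: send failed with H_CLOSED, request requeued
if abort_scheduled:
    abort_handler(cmd)             # CPU 1: second insertion of the same request
print(cmd.requeues)
```

In this model the completion path is turned away by the bit, but the host-busy
requeue is not, so the same request is recorded as queued twice.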

This patch simply increases the default I/O timeout to avoid this race
condition. The new value is also the timeout that nearly all IBM SAN storage
products recommend as the default.
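
The timing argument can be sketched as follows. The model assumes that a
request timer armed just before the suspend has effectively run out once the
partition has been suspended for at least the timeout period.
timer_expired_on_resume is a hypothetical helper, the 30 second figure is
taken from the trace above, and the 120 second figure is an assumption used
only for illustration, since this message does not state the new value.

```python
def timer_expired_on_resume(timeout_s, suspended_s):
    """True if a timer armed just before the suspend has run out on resume."""
    return suspended_s >= timeout_s


OLD_TIMEOUT_S = 30    # consistent with the "> 30 seconds" in the trace above
NEW_TIMEOUT_S = 120   # assumed for illustration; the message gives no number

# Columns: suspend duration, expired with old timeout, expired with new timeout
for suspended_s in (5, 35, 90):
    print(suspended_s,
          timer_expired_on_resume(OLD_TIMEOUT_S, suspended_s),
          timer_expired_on_resume(NEW_TIMEOUT_S, suspended_s))
```

A longer timeout does not remove the unchecked requeue path; it only makes it
much less likely that a suspend outlasts the timer.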

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07 15:35:48 +01:00
aacraid scsi: aacraid: Fix error handling paths in aac_probe_one() 2020-10-01 13:17:56 +02:00
aic7xxx scsi: aic7xxx: Adjust indentation in ahc_find_syncrate 2020-02-24 08:36:38 +01:00
aic94xx scsi: aic94xx: Remove unnecessary null check 2019-07-30 12:12:59 -04:00
arcmsr treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
arm scsi: eesox: Fix different dev_id between request_irq() and free_irq() 2020-08-19 08:16:09 +02:00
be2iscsi scsi: be2iscsi: Revert "Fix a theoretical leak in beiscsi_create_eqs()" 2020-12-16 10:56:58 +01:00
bfa scsi: bfa: Fix error return in bfad_pci_init() 2020-10-29 09:57:57 +01:00
bnx2fc SCSI fixes on 20191004 2019-10-05 12:53:27 -07:00
bnx2i scsi: bnx2i: Requires MMU 2020-12-30 11:50:53 +01:00
csiostor scsi: csiostor: Fix wrong return value in csio_hw_prep_fw() 2020-10-29 09:57:37 +01:00
cxgbi scsi: cxgb4i: Fix TLS dependency 2021-01-06 14:48:38 +01:00
cxlflash scsi: cxlflash: Fix error return code in cxlflash_probe() 2020-10-01 13:18:02 +02:00
device_handler scsi: scsi_dh_alua: Avoid crash during alua_bus_detach() 2020-11-18 19:20:23 +01:00
dpt treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
esas2r scsi: esas2r: unlock on error in esas2r_nvram_read_direct() 2020-01-23 08:22:58 +01:00
fcoe scsi: fcoe: Memory leak fix in fcoe_sysfs_fcf_del() 2020-09-03 11:26:47 +02:00
fnic scsi: fnic: Fix memleak in vnic_dev_init_devcmd2 2021-02-07 15:35:48 +01:00
hisi_sas scsi: hisi_sas: Do not reset phy timer to wait for stray phy up 2020-06-24 17:50:15 +02:00
ibmvscsi scsi: ibmvfc: Set default timeout to avoid crash during migration 2021-02-07 15:35:48 +01:00
ibmvscsi_tgt scsi: ibmvscsi_tgt: Mark expected switch fall-throughs 2019-07-30 15:59:53 -04:00
isci scsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander() 2019-06-20 15:37:02 -04:00
libfc scsi: libfc: Avoid invoking response handler twice if ep is already completed 2021-02-07 15:35:48 +01:00
libsas scsi: libsas: Fix error path in sas_notify_lldd_dev_found() 2020-09-23 12:40:40 +02:00
lpfc scsi: lpfc: Make lpfc_defer_acc_rsp static 2021-01-23 15:57:55 +01:00
megaraid scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression 2021-01-27 11:47:47 +01:00
mpt3sas scsi: mpt3sas: Increase IOCInit request timeout to 30s 2020-12-30 11:50:57 +01:00
mvsas SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
pcmcia SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
pm8001 scsi: pm80xx: Fix error return in pm8001_pci_probe() 2020-12-30 11:51:21 +01:00
qedf scsi: qedf: Return SUCCESS if stale rport is encountered 2020-10-29 09:58:07 +01:00
qedi scsi: qedi: Correct max length of CHAP secret 2021-01-27 11:47:43 +01:00
qla2xxx scsi: qla2xxx: Fix crash during driver load on big endian machines 2020-12-30 11:51:43 +01:00
qla4xxx scsi: qla4xxx: Fix an error handling path in 'qla4xxx_get_host_stats()' 2020-10-29 09:57:36 +01:00
smartpqi scsi: smartpqi: Avoid crashing kernel for controller issues 2020-10-29 09:58:09 +01:00
snic scsi: snic: no need to check return value of debugfs_create functions 2019-01-29 00:40:54 -05:00
sym53c8xx_2 scsi: sym53c8xx_2: remove redundant assignment to retv 2019-08-12 21:58:07 -04:00
ufs scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback 2021-01-27 11:47:43 +01:00
.gitignore
3w-9xxx.c scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent() 2019-02-25 21:37:25 -05:00
3w-9xxx.h
3w-sas.c SCSI fixes on 20190302 2019-03-02 11:39:54 -08:00
3w-sas.h
3w-xxxx.c scsi: 3w-xxxx: fix indentation issue, add missing tab 2018-12-19 21:54:07 -05:00
3w-xxxx.h scsi: 3w-xxx: fully convert to the generic DMA API 2018-10-17 21:58:51 -04:00
53c700.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
53c700.h
53c700.scr
53c700_d.h_shipped
BusLogic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 388 2019-06-05 17:37:11 +02:00
BusLogic.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 388 2019-06-05 17:37:11 +02:00
FlashPoint.c
Kconfig scsi: sr: remove references to BLK_DEV_SR_VENDOR, leave it enabled 2020-07-22 09:32:57 +02:00
Makefile scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver 2019-07-17 22:39:27 +09:00
NCR5380.c scsi: NCR5380: Add disconnect_mask module parameter 2020-01-04 19:18:16 +01:00
NCR5380.h Revert "scsi: ncr5380: Increase register polling limit" 2019-06-20 15:37:02 -04:00
a100u2w.c cross-tree: phase out dma_zalloc_coherent() 2019-01-08 07:58:37 -05:00
a100u2w.h
a2091.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
a2091.h
a3000.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
a3000.h
a4000t.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
advansys.c SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
aha152x.c SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
aha152x.h
aha1542.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
aha1542.h
aha1740.c scsi: flip the default on use_clustering 2018-12-18 23:13:12 -05:00
aha1740.h
am53c974.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
atari_scsi.c scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE 2020-01-04 19:18:10 +01:00
atp870u.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
atp870u.h
bvme6000_scsi.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
ch.c scsi: ch: Make it possible to open a ch device multiple times again 2019-10-09 23:39:35 -04:00
constants.c
dc395x.c scsi: remove the use_clustering flag 2018-12-18 23:19:21 -05:00
dc395x.h
dmx3191d.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
dpt_i2o.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dpti.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
esp_scsi.c SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
esp_scsi.h scsi: esp: use sg helper to iterate over scatterlist 2019-06-20 15:21:33 -04:00
fdomain.c scsi: fdomain: use BSTAT_{MSG|CMD|IO} in fdomain_work() 2019-07-30 12:17:28 -04:00
fdomain.h scsi: fdomain: Add register definitions 2019-06-18 19:46:22 -04:00
fdomain_isa.c scsi: fdomain_isa: use CFG1_IRQ_MASK 2019-07-30 12:18:24 -04:00
fdomain_pci.c scsi: fdomain: Resurrect driver - PCI support 2019-06-18 19:46:18 -04:00
g_NCR5380.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
gdth.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 148 2019-05-30 11:25:18 -07:00
gdth.h scsi: gdth: remove ISA and EISA support 2019-01-08 21:58:35 -05:00
gdth_ioctl.h scsi: gdth: remove dead code under #ifdef GDTH_IOCTL_PROC 2019-01-08 21:58:35 -05:00
gdth_proc.c scsi: gdth: use generic DMA API 2019-01-08 21:58:35 -05:00
gdth_proc.h scsi: gdth: remove gdth_{alloc,free}_ioctl 2019-01-08 21:57:42 -05:00
gvp11.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
gvp11.h
hosts.c SCSI fixes on 20190720 2019-07-20 10:04:58 -07:00
hpsa.c scsi: hpsa: Fix memory leak in hpsa_init_one() 2020-11-18 19:20:22 +01:00
hpsa.h scsi: hpsa: correct device resets 2019-06-18 19:46:18 -04:00
hpsa_cmd.h SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
hptiop.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 285 2019-06-05 17:36:37 +02:00
hptiop.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 285 2019-06-05 17:36:37 +02:00
imm.c SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
imm.h
initio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18 2019-05-21 11:28:46 +02:00
initio.h
ipr.c scsi: ipr: Fix softlockup when rescanning devices in petitboot 2020-04-01 11:01:54 +02:00
ipr.h scsi: ipr: Fix softlockup when rescanning devices in petitboot 2020-04-01 11:01:54 +02:00
ips.c scsi: remove the use_clustering flag 2018-12-18 23:19:21 -05:00
ips.h scsi: ips: properly handle 64-bit DMA 2018-11-06 21:31:28 -05:00
iscsi_boot_sysfs.c scsi: iscsi: Fix reference count leak in iscsi_boot_create_kobj 2020-06-24 17:50:37 +02:00
iscsi_tcp.c scsi: iscsi: Don't destroy session if there are outstanding connections 2020-02-24 08:36:50 +01:00
iscsi_tcp.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
jazz_esp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
lasi700.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
libiscsi.c scsi: libiscsi: Fix NOP race condition 2020-12-02 08:49:49 +01:00
libiscsi_tcp.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
mac53c94.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mac53c94.h
mac_esp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mac_scsi.c scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE 2020-01-04 19:18:10 +01:00
megaraid.c scsi: megaraid: disable device when probe failed after enabled device 2019-09-23 23:09:42 -04:00
megaraid.h
mesh.c scsi: mesh: Fix panic after host or bus reset 2020-08-19 08:16:15 +02:00
mesh.h
mvme16x_scsi.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mvme147.c scsi: flip the default on use_clustering 2018-12-18 23:13:12 -05:00
mvme147.h
mvumi.c scsi: mvumi: Fix error return in mvumi_io_attach() 2020-10-29 09:58:04 +01:00
mvumi.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 167 2019-05-30 11:26:39 -07:00
myrb.c SCSI misc on 20181224 2018-12-28 14:48:06 -08:00
myrb.h scsi: myrb: Add Mylex RAID controller (block interface) 2018-10-17 21:06:49 -04:00
myrs.c scsi: myrs: Fix uninitialized variable 2019-05-20 10:56:43 -04:00
myrs.h scsi: myrs: Add Mylex RAID controller (SCSI interface) 2018-10-17 21:07:54 -04:00
ncr53c8xx.c scsi: ncr53c8xx: Mark expected switch fall-through 2019-08-07 21:53:23 -04:00
ncr53c8xx.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
nsp32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 2019-05-24 17:39:02 +02:00
nsp32.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 2019-05-24 17:39:02 +02:00
nsp32_debug.c
nsp32_io.h
pmcraid.c scsi: pmcraid: Fix a typo - pcmraid --> pmcraid 2019-08-12 21:57:13 -04:00
pmcraid.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
ppa.c scsi: ppa: use sg helper to iterate over scatterlist 2019-06-20 15:21:33 -04:00
ppa.h
ps3rom.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 164 2019-05-30 11:26:38 -07:00
qla1280.c qla1280: remove SGI SN2 support 2019-08-16 11:33:56 -07:00
qla1280.h qla1280: remove SGI SN2 support 2019-08-16 11:33:56 -07:00
qlogicfas.c scsi: remove the use_clustering flag 2018-12-18 23:19:21 -05:00
qlogicfas408.c scsi: qlogicfas408: clean up a couple of indentation issues 2019-03-19 17:11:37 -04:00
qlogicfas408.h
qlogicpti.c scsi: qlogicpti: Mark expected switch fall-throughs 2019-08-07 21:32:53 -04:00
qlogicpti.h scsi: qlogicpti: Use of_node_name_eq for node name comparisons 2019-02-13 22:07:03 -05:00
raid_class.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437 2019-06-05 17:37:17 +02:00
script_asm.pl treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
scsi.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
scsi.h
scsi_common.c
scsi_debug.c scsi: scsi_debug: Add check for sdebug_max_queue during module init 2020-08-19 08:16:11 +02:00
scsi_debugfs.c scsi: scsi_debugfs: Use for_each_set_bit to simplify code 2019-07-30 12:42:55 -04:00
scsi_debugfs.h scsi: core: add SPDX tags to scsi midlayer files missing licensing information 2019-05-21 06:16:21 -04:00
scsi_devinfo.c scsi: dh: Add Fujitsu device to devinfo and dh lists 2020-07-29 10:18:27 +02:00
scsi_dh.c scsi: dh: Add Fujitsu device to devinfo and dh lists 2020-07-29 10:18:27 +02:00
scsi_error.c scsi: core: save/restore command resid for error handling 2019-10-03 21:43:04 -04:00
scsi_ioctl.c scsi: core: add SPDX tags to scsi midlayer files missing licensing information 2019-05-21 06:16:21 -04:00
scsi_lib.c scsi: core: Fix VPD LUN ID designator priorities 2020-12-30 11:51:08 +01:00
scsi_lib_dma.c
scsi_logging.c scsi: core: Reduce memory required for SCSI logging 2019-08-07 21:47:29 -04:00
scsi_logging.h
scsi_netlink.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
scsi_pm.c scsi: pm: Balance pm_only counter of request queue during system resume 2020-06-07 13:18:50 +02:00
scsi_priv.h scsi: sd: Rely on the driver core for asynchronous probing 2019-06-18 19:46:17 -04:00
scsi_proc.c drivers: Add generic helper to match any device 2019-07-30 13:07:42 +02:00
scsi_sas_internal.h
scsi_scan.c scsi: core: Don't start concurrent async scan on same host 2020-11-10 12:37:30 +01:00
scsi_sysctl.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
scsi_sysfs.c scsi: core: try to get module before removing device 2019-10-17 21:57:09 -04:00
scsi_trace.c scsi: core: scsi_trace: Use get_unaligned_be*() 2020-01-23 08:22:59 +01:00
scsi_transport_api.h
scsi_transport_fc.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
scsi_transport_iscsi.c scsi: iscsi: Do not put host in iscsi_set_flashnode_param() 2020-09-03 11:26:47 +02:00
scsi_transport_sas.c scsi: scsi_transport_sas: Fix memory leak when removing devices 2020-01-23 08:22:58 +01:00
scsi_transport_spi.c scsi: scsi_transport_spi: Set RQF_PM for domain validation commands 2021-01-12 20:16:09 +01:00
scsi_transport_srp.c scsi: scsi_transport_srp: Don't block target in failfast state 2021-02-07 15:35:48 +01:00
scsicam.c
sd.c scsi: sd: Suppress spurious errors when WRITE SAME is being disabled 2021-01-27 11:47:43 +01:00
sd.h scsi: implement REQ_OP_ZONE_RESET_ALL 2019-08-04 21:41:29 -06:00
sd_dif.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 410 2019-06-05 17:37:14 +02:00
sd_zbc.c scsi: sd_zbc: Fix sd_zbc_complete() 2019-11-05 23:17:53 -05:00
sense_codes.h
ses.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
sg.c scsi: sg: add sg_remove_request in sg_write 2020-05-20 08:20:07 +02:00
sgiwd93.c scsi: remove the use_clustering flag 2018-12-18 23:19:21 -05:00
sim710.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
sni_53c710.c scsi: sni_53c710: fix compilation error 2019-10-09 23:35:42 -04:00
sr.c scsi: sr: Fix sr_probe() missing deallocate of device minor 2020-06-24 17:50:19 +02:00
sr.h
sr_ioctl.c
sr_vendor.c scsi: sr: remove references to BLK_DEV_SR_VENDOR, leave it enabled 2020-07-22 09:32:57 +02:00
st.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
st.h
st_options.h
stex.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
storvsc_drv.c scsi: storvsc: Correctly set number of hardware queues for IDE disk 2020-01-23 08:22:38 +01:00
sun3_scsi.c scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE 2020-01-04 19:18:10 +01:00
sun3_scsi_vme.c
sun3x_esp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
sun_esp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
virtio_scsi.c scsi: virtio_scsi: unplug LUNs when events missed 2019-09-10 22:10:17 -04:00
vmw_pvscsi.c SCSI sg on 20190709 2019-07-11 15:17:41 -07:00
vmw_pvscsi.h
wd33c93.c scsi: wd33c93: Mark expected switch fall-through 2019-08-07 21:35:59 -04:00
wd33c93.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 2019-05-24 17:39:02 +02:00
wd719x.c SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
wd719x.h scsi: wd719x: use per-command private data 2018-11-15 14:27:08 -05:00
xen-scsifront.c scsi: xen-scsifront: remove DISABLE_CLUSTERING 2018-12-18 23:13:12 -05:00
zalon.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zorro7xx.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zorro_esp.c scsi: zorro_esp: Limit DMA transfers to 65536 bytes (except on Fastlane) 2020-01-04 19:17:37 +01:00