linux-brain/drivers/net/wireless/ath/ath10k
Miaoqing Pan e37674e1a4 ath10k: fix wmi mgmt tx queue full due to race condition
[ Upstream commit b55379e343a3472c35f4a1245906db5158cab453 ]

Failed to transmit wmi management frames:

[84977.840894] ath10k_snoc a000000.wifi: wmi mgmt tx queue is full
[84977.840913] ath10k_snoc a000000.wifi: failed to transmit packet, dropping: -28
[84977.840924] ath10k_snoc a000000.wifi: failed to submit frame: -28
[84977.840932] ath10k_snoc a000000.wifi: failed to transmit frame: -28

This issue is caused by a race condition between skb_dequeue and
__skb_queue_tail. The two paths protect 'wmi_mgmt_tx_queue' with
different locks (ar->data_lock vs. list->lock), so in effect the queue
is not protected at all. When ath10k_mgmt_over_wmi_tx_work() and
ath10k_mac_tx_wmi_mgmt() run concurrently on different CPUs, a rare
corner case can occur when the queue length is 1:

  CPUx (skb_dequeue)                        CPUy (__skb_queue_tail)
                                            next=list
                                            prev=list
  struct sk_buff *skb = skb_peek(list);     WRITE_ONCE(newsk->next, next);
  WRITE_ONCE(list->qlen, list->qlen - 1);   WRITE_ONCE(newsk->prev, prev);
  next       = skb->next;                   WRITE_ONCE(next->prev, newsk);
  prev       = skb->prev;                   WRITE_ONCE(prev->next, newsk);
  skb->next  = skb->prev = NULL;            list->qlen++;
  WRITE_ONCE(next->prev, prev);
  WRITE_ONCE(prev->next, next);

If the instruction 'next = skb->next' is executed before
'WRITE_ONCE(prev->next, newsk)', newsk will be lost: CPUx gets the
old 'next' pointer, but the length is still incremented by one. The
final result is that the queue length reaches its maximum value while
the queue is actually empty.
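
The interleaving described above can be replayed deterministically in a
single thread. The following is only a userspace sketch, not kernel code:
'struct node', 'struct queue' and 'replay_lost_enqueue' are hypothetical
names, and the model assumes that the tail insert takes 'prev' from the
list head's 'prev' pointer, which with one queued buffer is that buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a circular, doubly linked skb list. */
struct node {
	struct node *next;
	struct node *prev;
};

struct queue {
	struct node head;	/* sentinel, plays the role of 'list' */
	unsigned int qlen;
};

static int queue_is_empty(const struct queue *q)
{
	return q->head.next == &q->head;
}

/*
 * Replay one bad interleaving of an unlocked dequeue (CPUx) and an
 * unlocked tail insert (CPUy) on a queue that holds exactly one buffer.
 * Returns the final qlen.
 */
static unsigned int replay_lost_enqueue(struct queue *q, struct node *skb,
					struct node *newsk)
{
	struct node *list;

	assert(q && skb && newsk);
	list = &q->head;

	/* Start state: the queue holds exactly one buffer, 'skb'. */
	skb->next = skb->prev = list;
	list->next = list->prev = skb;
	q->qlen = 1;

	/* CPUy: the tail insert starts and links newsk in, except for
	 * the final prev->next store. */
	struct node *y_next = list;
	struct node *y_prev = list->prev;	/* == skb in this model */
	newsk->next = y_next;
	newsk->prev = y_prev;
	y_next->prev = newsk;

	/* CPUx: the dequeue runs before CPUy publishes prev->next. */
	struct node *x_skb = list->next;	/* skb_peek() */
	q->qlen = q->qlen - 1;
	struct node *x_next = x_skb->next;	/* stale: list, not newsk */
	struct node *x_prev = x_skb->prev;
	x_skb->next = x_skb->prev = NULL;
	x_next->prev = x_prev;	/* overwrites CPUy's list->prev store */
	x_prev->next = x_next;

	/* CPUy: finishes on a buffer that is already unlinked. */
	y_prev->next = newsk;
	q->qlen++;

	return q->qlen;
}
```

After the replay the list head points back at itself, so the queue is
empty and 'newsk' is unreachable, yet 'qlen' is 1. Each repetition leaks
one more count, which is how the length can end up at the limit.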

So remove ar->data_lock, and use 'skb_queue_tail' instead of
'__skb_queue_tail' to prevent the potential race condition. Also switch
to skb_queue_len_lockless, in case a few SKBs are queued simultaneously.
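
The shape of the fix, with both paths taking the queue's own lock and the
length read without it, can be sketched in userspace as follows. This is an
illustration under stated assumptions and not the driver change itself: the
'lq_*' names and 'struct locked_queue' are hypothetical, and a C11
atomic_flag spinlock stands in for the list lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
	struct lnode *prev;
};

struct locked_queue {
	struct lnode head;	/* sentinel */
	atomic_uint qlen;	/* readable without the lock */
	atomic_flag lock;	/* stands in for the list's own lock */
};

static void lq_init(struct locked_queue *q)
{
	q->head.next = q->head.prev = &q->head;
	atomic_init(&q->qlen, 0);
	atomic_flag_clear(&q->lock);
}

static void lq_lock(struct locked_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void lq_unlock(struct locked_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Tail insert under the queue's own lock. */
static void lq_queue_tail(struct locked_queue *q, struct lnode *n)
{
	assert(n != NULL);
	lq_lock(q);
	n->next = &q->head;
	n->prev = q->head.prev;
	q->head.prev->next = n;
	q->head.prev = n;
	atomic_fetch_add_explicit(&q->qlen, 1, memory_order_relaxed);
	lq_unlock(q);
}

/* Head removal under the same lock; returns NULL when empty. */
static struct lnode *lq_dequeue(struct locked_queue *q)
{
	struct lnode *n = NULL;

	lq_lock(q);
	if (q->head.next != &q->head) {
		n = q->head.next;
		q->head.next = n->next;
		n->next->prev = &q->head;
		n->next = n->prev = NULL;
		atomic_fetch_sub_explicit(&q->qlen, 1, memory_order_relaxed);
	}
	lq_unlock(q);
	return n;
}

/* Length check that does not take the lock, e.g. for a "queue full" test. */
static unsigned int lq_len_lockless(struct locked_queue *q)
{
	return atomic_load_explicit(&q->qlen, memory_order_relaxed);
}
```

Because insert and removal serialize on one lock, the pointer updates and
the length update of each operation can no longer interleave, so the length
stays consistent with the list contents.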

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1.c2-00033-QCAHLSWMTPLZ-1

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1608618887-8857-1-git-send-email-miaoqing@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-07 12:20:45 +01:00
..
ahb.c
ahb.h
bmi.c
bmi.h
ce.c ath10k: Fix the size used in a 'dma_free_coherent()' call in an error handling path 2020-10-29 09:57:35 +01:00
ce.h
core.c ath10k: fix latency issue for QCA988x 2019-10-14 11:43:36 +03:00
core.h
coredump.c ath10k: fix backtrace on coredump 2019-12-31 16:43:14 +01:00
coredump.h
debug.c ath10k: fix memory leak for tpc_stats_final 2020-10-01 13:17:12 +02:00
debug.h
debugfs_sta.c
hif.h
htc.c
htc.h
htt_rx.c ath10k: fix VHT NSS calculation when STBC is enabled 2020-11-05 11:43:15 +01:00
htt_tx.c ath10k: Acquire tx_lock in tx error paths 2020-08-19 08:16:06 +02:00
htt.c
htt.h ath10k: add flush tx packets for SDIO chip 2020-06-22 09:30:59 +02:00
hw.c ath10k: enable transmit data ack RSSI for QCA9884 2020-08-05 09:59:41 +02:00
hw.h Revert "ath10k: fix DMA related firmware crashes on multiple devices" 2020-09-03 11:26:49 +02:00
Kconfig
mac.c ath10k: fix wmi mgmt tx queue full due to race condition 2021-03-07 12:20:45 +01:00
mac.h
Makefile
p2p.c
p2p.h
pci.c ath10k: Fix the race condition in firmware dump work queue 2020-06-22 09:30:49 +02:00
pci.h
qmi_wlfw_v01.c ath10k: Fix HOST capability QMI incompatibility 2019-11-29 10:09:41 +01:00
qmi_wlfw_v01.h ath10k: Fix HOST capability QMI incompatibility 2019-11-29 10:09:41 +01:00
qmi.c ath10k: Fix HOST capability QMI incompatibility 2019-11-29 10:09:41 +01:00
qmi.h
rx_desc.h
sdio.c ath10k: start recovery process when payload length exceeds max htc length for sdio 2020-11-05 11:43:15 +01:00
sdio.h
snoc.c ath10k: Fix error handling in case of CE pipe init failure 2021-03-04 10:26:11 +01:00
snoc.h ath10k: Fix HOST capability QMI incompatibility 2019-11-29 10:09:41 +01:00
spectral.c
spectral.h
swap.c
swap.h
targaddrs.h
testmode_i.h
testmode.c
testmode.h
thermal.c
thermal.h
trace.c
trace.h
txrx.c ath10k: fix kernel null pointer dereference 2020-06-22 09:30:57 +02:00
txrx.h
usb.c ath10k: Release some resources in an error handling path 2020-12-30 11:51:15 +01:00
usb.h
wmi-ops.h ath10k: Remove msdu from idr when management pkt send fails 2020-06-22 09:31:05 +02:00
wmi-tlv.c ath10k: Fix the parsing error in service available event 2020-12-30 11:51:15 +01:00
wmi-tlv.h ath10k: fix channel info parsing for non tlv target 2019-09-12 17:54:38 +03:00
wmi.c ath10k: Fix the parsing error in service available event 2020-12-30 11:51:15 +01:00
wmi.h ath10k: Fix the parsing error in service available event 2020-12-30 11:51:15 +01:00
wow.c
wow.h