linux-brain

Commit Graph

Author	SHA1	Message	Date
David Hildenbrand	e84c5b7617	mm/memory_hotplug: shrink zones when offlining memory commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream. We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: `f1dd2cd13c` ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after `d0dc12e86b`] Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: <stable@vger.kernel.org> [5.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Hans Verkuil	80d9e63714	media: cec: check 'transmit_in_progress', not 'transmitting' commit ac479b51f3f4aaa852b5d3f00ecfb9290230cf64 upstream. Currently wait_event_interruptible_timeout is called in cec_thread_func() when adap->transmitting is set. But if the adapter is unconfigured while transmitting, then adap->transmitting is set to NULL. But the hardware is still actually transmitting the message, and that's indicated by adap->transmit_in_progress and we should wait until that is finished or times out before transmitting new messages. As the original commit says: adap->transmitting is the userspace view, adap->transmit_in_progress reflects the hardware state. However, if adap->transmitting is NULL and adap->transmit_in_progress is true, then wait_event_interruptible is called (no timeout), which can get stuck indefinitely if the CEC driver is flaky and never marks the transmit-in-progress as 'done'. So test against transmit_in_progress when deciding whether to use the timeout variant or not, instead of testing against adap->transmitting. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: `32804fcb61` ("media: cec: keep track of outstanding transmits") Cc: <stable@vger.kernel.org> # for v4.19 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Hans Verkuil	9a7130220a	media: cec: avoid decrementing transmit_queue_sz if it is 0 commit 95c29d46ab2a517e4c26d0a07300edca6768db17 upstream. WARN if transmit_queue_sz is 0 but do not decrement it. The CEC adapter will become unresponsive if it goes below 0 since then it thinks there are 4 billion messages in the queue. Obviously this should not happen, but a driver bug could cause this. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: <stable@vger.kernel.org> # for v4.12 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Hans Verkuil	e572db9a4d	media: cec: CEC 2.0-only bcast messages were ignored commit cec935ce69fc386f13959578deb40963ebbb85c3 upstream. Some messages are allowed to be a broadcast message in CEC 2.0 only, and should be ignored by CEC 1.4 devices. Unfortunately, the check was wrong, causing such messages to be marked as invalid under CEC 2.0. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: <stable@vger.kernel.org> # for v4.10 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Hans Verkuil	f868e597a3	media: pulse8-cec: fix lost cec_transmit_attempt_done() call commit e5a52a1d15c79bb48a430fb263852263ec1d3f11 upstream. The periodic PING command could interfere with the result of a CEC transmit, causing a lost cec_transmit_attempt_done() call. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: <stable@vger.kernel.org> # for v4.10 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Paul Burton	5b004a2384	MIPS: Avoid VDSO ABI breakage due to global register variable commit bbcc5672b0063b0e9d65dc8787a4f09c3b5bb5cc upstream. Declaring __current_thread_info as a global register variable has the effect of preventing GCC from saving & restoring its value in cases where the ABI would typically do so. To quote GCC documentation: > If the register is a call-saved register, call ABI is affected: the > register will not be restored in function epilogue sequences after the > variable has been assigned. Therefore, functions cannot safely return > to callers that assume standard ABI. When our position independent VDSO is built for the n32 or n64 ABIs all functions it exposes should be preserving the value of $gp/$28 for their caller, but in the presence of the __current_thread_info global register variable GCC stops doing so & simply clobbers $gp/$28 when calculating the address of the GOT. In cases where the VDSO returns success this problem will typically be masked by the caller in libc returning & restoring $gp/$28 itself, but that is by no means guaranteed. In cases where the VDSO returns an error libc will typically contain a fallback path which will now fail (typically with a bad memory access) if it attempts anything which relies upon the value of $gp/$28 - eg. accessing anything via the GOT. One fix for this would be to move the declaration of __current_thread_info inside the current_thread_info() function, demoting it from global register variable to local register variable & avoiding inadvertently creating a non-standard calling ABI for the VDSO. Unfortunately this causes issues for clang, which doesn't support local register variables as pointed out by commit `fe92da0f35` ("MIPS: Changed current_thread_info() to an equivalent supported by both clang and GCC") which introduced the global register variable before we had a VDSO to worry about. Instead, fix this by continuing to use the global register variable for the kernel proper but declare __current_thread_info as a simple extern variable when building the VDSO. It should never be referenced, and will cause a link error if it is. This resolves the calling convention issue for the VDSO without having any impact upon the build of the kernel itself for either clang or gcc. Signed-off-by: Paul Burton <paulburton@kernel.org> Fixes: `ebb5e78cc6` ("MIPS: Initial implementation of a VDSO") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <christian.brauner@canonical.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> # v4.4+ Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Alexander Lobakin	2e0bee3669	MIPS: BPF: eBPF JIT: check for MIPS ISA compliance in Kconfig commit f596cf0d8062cb5d0a4513a8b3afca318c13be10 upstream. It is completely wrong to check for compile-time MIPS ISA revision in the body of bpf_int_jit_compile() as it may lead to get MIPS JIT fully omitted by the CC while the rest system will think that the JIT is actually present and works [1]. We can check if the selected CPU really supports MIPS eBPF JIT at configure time and avoid such situations when kernel can be built without both JIT and interpreter, but with CONFIG_BPF_SYSCALL=y. [1] https://lore.kernel.org/linux-mips/09d713a59665d745e21d021deeaebe0a@dlink.ru/ Fixes: `716850ab10` ("MIPS: eBPF: Initial eBPF support for MIPS32 architecture.") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: Hassan Naveed <hnaveed@wavecomp.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Paul Burton	adbe05523e	MIPS: BPF: Disable MIPS32 eBPF JIT commit f8fffebdea752a25757b906f3dffecf1a59a6194 upstream. Commit `716850ab10` ("MIPS: eBPF: Initial eBPF support for MIPS32 architecture.") enabled our eBPF JIT for MIPS32 kernels, whereas it has previously only been availailable for MIPS64. It was my understanding at the time that the BPF test suite was passing & JITing a comparable number of tests to our cBPF JIT [1], but it turns out that was not the case. The eBPF JIT has a number of problems on MIPS32: - Most notably various code paths still result in emission of MIPS64 instructions which will cause reserved instruction exceptions & kernel panics when run on MIPS32 CPUs. - The eBPF JIT doesn't account for differences between the O32 ABI used by MIPS32 kernels versus the N64 ABI used by MIPS64 kernels. Notably arguments beyond the first 4 are passed on the stack in O32, and this is entirely unhandled when JITing a BPF_CALL instruction. Stack space must be reserved for arguments even if they all fit in registers, and the callee is free to assume that stack space has been reserved for its use - with the eBPF JIT this is not the case, so calling any function can result in clobbering values on the stack & unpredictable behaviour. Function arguments in eBPF are always 64-bit values which is also entirely unhandled - the JIT still uses a single (32-bit) register per argument. As a result all function arguments are always passed incorrectly when JITing a BPF_CALL instruction, leading to kernel crashes or strange behavior. - The JIT attempts to bail our on use of ALU64 instructions or 64-bit memory access instructions. The code doing this at the start of build_one_insn() incorrectly checks whether BPF_OP() equals BPF_DW, when it should really be checking BPF_SIZE() & only doing so when BPF_CLASS() is one of BPF_{LD,LDX,ST,STX}. This results in false positives that cause more bailouts than intended, and that in turns hides some of the problems described above. - The kernel's cBPF->eBPF translation makes heavy use of 64-bit eBPF instructions that the MIPS32 eBPF JIT bails out on, leading to most cBPF programs not being JITed at all. Until these problems are resolved, revert the enabling of the eBPF JIT on MIPS32 done by commit `716850ab10` ("MIPS: eBPF: Initial eBPF support for MIPS32 architecture."). Note that this does not undo the changes made to the eBPF JIT by that commit, since they are a useful starting point to providing MIPS32 support - they're just not nearly complete. [1] https://lore.kernel.org/linux-mips/MWHPR2201MB13583388481F01A422CE7D66D4410@MWHPR2201MB1358.namprd22.prod.outlook.com/ Signed-off-by: Paul Burton <paulburton@kernel.org> Fixes: `716850ab10` ("MIPS: eBPF: Initial eBPF support for MIPS32 architecture.") Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Hassan Naveed <hnaveed@wavecomp.com> Cc: Tony Ambardar <itugrok@yahoo.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Cc: <stable@vger.kernel.org> # v5.2+ Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Alex Deucher	f72e33675f	drm/amdgpu/smu: add metrics table lock for vega20 (v2) commit 1c455101c6d10c99b310d6bcf613244c97854012 upstream. To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:56 +01:00
Alex Deucher	86164784cf	drm/amdgpu/smu: add metrics table lock for navi (v2) commit e0e384c398d4638e54b6d2098f0ceaafdab870ee upstream. To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Alex Deucher	881b399da3	drm/amdgpu/smu: add metrics table lock for arcturus (v2) commit 1da87c9f67c98d552679974dbfc1f0f65b6a0a53 upstream. To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Alex Deucher	7514bbe975	drm/amdgpu/smu: add metrics table lock commit 073d5eef9e043c2b7e3ef12bc6c879b1d248e831 upstream. This table is used for lots of things, add it's own lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Stefan Mavrodiev	55ab031c0a	drm/sun4i: hdmi: Remove duplicate cleanup calls commit 57177d214ee0816c4436c23d6c933ccb32c571f1 upstream. When the HDMI unbinds drm_connector_cleanup() and drm_encoder_cleanup() are called. This also happens when the connector and the encoder are destroyed. This double call triggers a NULL pointer exception. The patch fixes this by removing the cleanup calls in the unbind function. Cc: <stable@vger.kernel.org> Fixes: `9c5681011a` ("drm/sun4i: Add HDMI support") Signed-off-by: Stefan Mavrodiev <stefan@olimex.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20191217124632.20820-1-stefan@olimex.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Kailang Yang	52702a539c	ALSA: hda/realtek - Add headset Mic no shutup for ALC283 commit 66c5d718e5a6f80153b5e8d6ad8ba8e9c3320839 upstream. Chrome machine had humming noise from external speaker plugin at codec D3 state. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/2692449396954c6c968f5b75e2660358@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Takashi Iwai	0844feca25	ALSA: hda - Apply sync-write workaround to old Intel platforms, too commit c366b3dbbab14b28d044b94eb9ce77c23482ea35 upstream. Klaus Ethgen reported occasional high CPU usages in his system that seem caused by HD-audio driver. The perf output revealed that it's in the unsolicited event handling in the workqueue, and the problem seems triggered by some communication stall between the controller and the codec at the runtime or system resume. Actually a similar phenomenon was seen in the past for other Intel platforms, and we already applied the workaround to enforce sync-write for CORB/RIRB verbs for Skylake and newer chipsets (commit `2756d9143a` "ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips"). Fortunately, the same workaround is applicable to the old chipset, and the experiment showed the positive effect. Based on the experiment result, this patch enables the sync-write workaround for all Intel chipsets. The only reason I hesitated to apply this workaround was about the possibly slightly higher CPU usage. But if the lack of sync causes a much severer problem even for quite old chip, we should think this would be necessary for all Intel chips. Reported-by: Klaus Ethgen <Klaus@ethgen.ch> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191223171833.GA17053@chua Link: https://lore.kernel.org/r/20191223221816.32572-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Hui Wang	56f395fb0a	ALSA: usb-audio: set the interface format after resume on Dell WD19 commit 92adc96f8eecd9522a907c197cc3d62e405539fe upstream. Recently we found the headset-mic on the Dell Dock WD19 doesn't work anymore after s3 (s2i or deep), this problem could be workarounded by closing (pcm_close) the app and then reopening (pcm_open) the app, so this bug is not easy to be detected by users. When problem happens, retire_capture_urb() could still be called periodically, but the size of captured data is always 0, it could be a firmware bug on the dock. Anyway I found after resuming, the snd_usb_pcm_prepare() will be called, and if we forcibly run set_format() to set the interface and its endpoint, the capture size will be normal again. This problem and workaound also apply to playback. To fix it in the kernel, add a quirk to let set_format() run forcibly once after resume. Signed-off-by: Hui Wang <hui.wang@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191218132650.6303-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Johan Hovold	f47e521243	ALSA: usb-audio: fix set_format altsetting sanity check commit 0141254b0a74b37aa7eb13d42a56adba84d51c73 upstream. Make sure to check the return value of usb_altnum_to_altsetting() to avoid dereferencing a NULL pointer when the requested alternate settings is missing. The format altsetting number may come from a quirk table and there does not seem to be any other validation of it (the corresponding index is checked however). Fixes: `b099b9693d` ("ALSA: usb-audio: Avoid superfluous usb_set_interface() calls") Cc: stable <stable@vger.kernel.org> # 4.18 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191220093134.1248-1-johan@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Takashi Iwai	60a6c5d810	ALSA: ice1724: Fix sleep-in-atomic in Infrasonic Quartet support code commit 0aec96f5897ac16ad9945f531b4bef9a2edd2ebd upstream. Jia-Ju Bai reported a possible sleep-in-atomic scenario in the ice1724 driver with Infrasonic Quartet support code: namely, ice->set_rate callback gets called inside ice->reg_lock spinlock, while the callback in quartet.c holds ice->gpio_mutex. This patch fixes the invalid call: it simply moves the calls of ice->set_rate and ice->set_mclk callbacks outside the spinlock. Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/5d43135e-73b9-a46a-2155-9e91d0dcdf83@gmail.com Link: https://lore.kernel.org/r/20191218192606.12866-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:55 +01:00
Johannes Weiner	173fa52f7f	mm: drop mmap_sem before calling balance_dirty_pages() in write fault [ Upstream commit 89b15332af7c0312a41e50846819ca6613b58b4c ] One of our services is observing hanging ps/top/etc under heavy write IO, and the task states show this is an mmap_sem priority inversion: A write fault is holding the mmap_sem in read-mode and waiting for (heavily cgroup-limited) IO in balance_dirty_pages(): balance_dirty_pages+0x724/0x905 balance_dirty_pages_ratelimited+0x254/0x390 fault_dirty_shared_page.isra.96+0x4a/0x90 do_wp_page+0x33e/0x400 __handle_mm_fault+0x6f0/0xfa0 handle_mm_fault+0xe4/0x200 __do_page_fault+0x22b/0x4a0 page_fault+0x45/0x50 Somebody tries to change the address space, contending for the mmap_sem in write-mode: call_rwsem_down_write_failed_killable+0x13/0x20 do_mprotect_pkey+0xa8/0x330 SyS_mprotect+0xf/0x20 do_syscall_64+0x5b/0x100 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 The waiting writer locks out all subsequent readers to avoid lock starvation, and several threads can be seen hanging like this: call_rwsem_down_read_failed+0x14/0x30 proc_pid_cmdline_read+0xa0/0x480 __vfs_read+0x23/0x140 vfs_read+0x87/0x130 SyS_read+0x42/0x90 do_syscall_64+0x5b/0x100 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 To fix this, do what we do for cache read faults already: drop the mmap_sem before calling into anything IO bound, in this case the balance_dirty_pages() function, and return VM_FAULT_RETRY. Link: http://lkml.kernel.org/r/20190924194238.GA29030@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:55 +01:00
Ming Lei	943cd69efa	block: add bio_truncate to fix guard_bio_eod [ Upstream commit 85a8ce62c2eabe28b9d76ca4eecf37922402df93 ] Some filesystem, such as vfat, may send bio which crosses device boundary, and the worse thing is that the IO request starting within device boundaries can contain more than one segment past EOD. Commit `dce30ca9e3` ("fs: fix guard_bio_eod to check for real EOD errors") tries to fix this issue by returning -EIO for this situation. However, this way lets fs user code lose chance to handle -EIO, then sync_inodes_sb() may hang for ever. Also the current truncating on last segment is dangerous by updating the last bvec, given bvec table becomes not immutable any more, and fs bio users may not retrieve the truncated pages via bio_for_each_segment_all() in its .end_io callback. Fixes this issue by supporting multi-segment truncating. And the approach is simpler: - just update bio size since block layer can make correct bvec with the updated bio size. Then bvec table becomes really immutable. - zero all truncated segments for read bio Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: linux-fsdevel@vger.kernel.org Fixed-by: `dce30ca9e3` ("fs: fix guard_bio_eod to check for real EOD errors") Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Phil Sutter	2922cf593f	netfilter: nft_tproxy: Fix port selector on Big Endian [ Upstream commit 8cb4ec44de42b99b92399b4d1daf3dc430ed0186 ] On Big Endian architectures, u16 port value was extracted from the wrong parts of u32 sreg_port, just like commit `10596608c4` ("netfilter: nf_tables: fix mismatch in big-endian system") describes. Fixes: `4ed8eb6570` ("netfilter: nf_tables: Add native tproxy support") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Florian Westphal <fw@strlen.de> Acked-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Takashi Iwai	d8acc0f2c2	ALSA: hda - Downgrade error message for single-cmd fallback [ Upstream commit 475feec0c41ad71cb7d02f0310e56256606b57c5 ] We made the error message for the CORB/RIRB communication clearer by upgrading to dev_WARN() so that user can notice better. But this struck us like a boomerang: now it caught syzbot and reported back as a fatal issue although it's not really any too serious bug that worth for stopping the whole system. OK, OK, let's be softy, downgrade it to the standard dev_err() again. Fixes: `dd65f7e19c` ("ALSA: hda - Show the fatal CORB/RIRB error more clearly") Reported-by: syzbot+b3028ac3933f5c466389@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20191216151224.30013-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Christian Brauner	d25bf5a341	taskstats: fix data-race [ Upstream commit 0b8d616fb5a8ffa307b1d3af37f55c15dae14f28 ] When assiging and testing taskstats in taskstats_exit() there's a race when setting up and reading sig->stats when a thread-group with more than one thread exits: write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0: taskstats_tgid_alloc kernel/taskstats.c:567 [inline] taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596 do_exit+0x2c2/0x18e0 kernel/exit.c:864 do_group_exit+0xb4/0x1c0 kernel/exit.c:983 get_signal+0x2a2/0x1320 kernel/signal.c:2734 do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815 exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159 prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline] syscall_return_slowpath arch/x86/entry/common.c:274 [inline] do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1: taskstats_tgid_alloc kernel/taskstats.c:559 [inline] taskstats_exit+0xb2/0x717 kernel/taskstats.c:596 do_exit+0x2c2/0x18e0 kernel/exit.c:864 do_group_exit+0xb4/0x1c0 kernel/exit.c:983 __do_sys_exit_group kernel/exit.c:994 [inline] __se_sys_exit_group kernel/exit.c:992 [inline] __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by using smp_load_acquire() and smp_store_release(). Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: `34ec12349c` ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Marco Elver <elver@google.com> Reviewed-by: Will Deacon <will@kernel.org> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/20191009114809.8643-1-christian.brauner@ubuntu.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Kirill A. Shutemov	d7af03159b	shmem: pin the file in shmem_fault() if mmap_sem is dropped [ Upstream commit 8897c1b1a1795cab23d5ac13e4e23bf0b5f4e0c6 ] syzbot found the following crash: BUG: KASAN: use-after-free in perf_trace_lock_acquire+0x401/0x530 include/trace/events/lock.h:13 Read of size 8 at addr ffff8880a5cf2c50 by task syz-executor.0/26173 CPU: 0 PID: 26173 Comm: syz-executor.0 Not tainted 5.3.0-rc6 #146 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: perf_trace_lock_acquire+0x401/0x530 include/trace/events/lock.h:13 trace_lock_acquire include/trace/events/lock.h:13 [inline] lock_acquire+0x2de/0x410 kernel/locking/lockdep.c:4411 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] shmem_fault+0x5ec/0x7b0 mm/shmem.c:2034 __do_fault+0x111/0x540 mm/memory.c:3083 do_shared_fault mm/memory.c:3535 [inline] do_fault mm/memory.c:3613 [inline] handle_pte_fault mm/memory.c:3840 [inline] __handle_mm_fault+0x2adf/0x3f20 mm/memory.c:3964 handle_mm_fault+0x1b5/0x6b0 mm/memory.c:4001 do_user_addr_fault arch/x86/mm/fault.c:1441 [inline] __do_page_fault+0x536/0xdd0 arch/x86/mm/fault.c:1506 do_page_fault+0x38/0x590 arch/x86/mm/fault.c:1530 page_fault+0x39/0x40 arch/x86/entry/entry_64.S:1202 It happens if the VMA got unmapped under us while we dropped mmap_sem and inode got freed. Pinning the file if we drop mmap_sem fixes the issue. Link: http://lkml.kernel.org/r/20190927083908.rhifa4mmaxefc24r@box Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: syzbot+03ee87124ee05af991bd@syzkaller.appspotmail.com Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Eric Dumazet	d56c69c5ef	tcp: fix data-race in tcp_recvmsg() [ Upstream commit a5a7daa52edb5197a3b696afee13ef174dc2e993 ] Reading tp->recvmsg_inq after socket lock is released raises a KCSAN warning [1] Replace has_tss & has_cmsg by cmsg_flags and make sure to not read tp->recvmsg_inq a second time. [1] BUG: KCSAN: data-race in tcp_chrono_stop / tcp_recvmsg write to 0xffff888126adef24 of 2 bytes by interrupt on cpu 0: tcp_chrono_set net/ipv4/tcp_output.c:2309 [inline] tcp_chrono_stop+0x14c/0x280 net/ipv4/tcp_output.c:2338 tcp_clean_rtx_queue net/ipv4/tcp_input.c:3165 [inline] tcp_ack+0x274f/0x3170 net/ipv4/tcp_input.c:3688 tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561 tcp_v4_rcv+0x19dc/0x1bb0 net/ipv4/tcp_ipv4.c:1942 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5214 napi_skb_finish net/core/dev.c:5677 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5710 read to 0xffff888126adef25 of 1 bytes by task 7275 on cpu 1: tcp_recvmsg+0x77b/0x1a30 net/ipv4/tcp.c:2187 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1889 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 __vfs_read+0xb1/0xc0 fs/read_write.c:427 vfs_read fs/read_write.c:461 [inline] vfs_read+0x143/0x2c0 fs/read_write.c:446 ksys_read+0xd5/0x1b0 fs/read_write.c:587 __do_sys_read fs/read_write.c:597 [inline] __se_sys_read fs/read_write.c:595 [inline] __x64_sys_read+0x4c/0x60 fs/read_write.c:595 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7275 Comm: sshd Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: `b75eba76d3` ("tcp: send in-queue bytes in cmsg upon read") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:54 +01:00
Jaroslav Kysela	d53678610b	ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen [ Upstream commit d2cd795c4ece1a24fda170c35eeb4f17d9826cbb ] The auto-parser assigns the bass speaker to DAC3 (NID 0x06) which is without the volume control. I do not see a reason to use DAC2, because the shared output to all speakers produces the sufficient and well balanced sound. The stereo support is enough for this purpose (laptop). Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20191129144027.14765-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:53 +01:00
Takashi Iwai	9538659160	PCI: Fix missing inline for pci_pr3_present() [ Upstream commit 46b4bff6572b0552b1ee062043621e4b252638d8 ] The inline prefix was missing in the dummy function pci_pr3_present() definition. Fix it. Reported-by: kbuild test robot <lkp@intel.com> Fixes: 52525b7a3cf8 ("PCI: Add a helper to check Power Resource Requirements _PR3 existence") Link: https://lore.kernel.org/r/201910212111.qHm6OcWx%lkp@intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:53 +01:00
Kai-Heng Feng	891f29feeb	ALSA: hda: Allow HDA to be runtime suspended when dGPU is not bound to a driver [ Upstream commit bacd861452d2be86a4df341b12e32db7dac8021e ] Nvidia proprietary driver doesn't support runtime power management, so when a user only wants to use the integrated GPU, it's a common practice to let dGPU not to bind any driver, and let its upstream port to be runtime suspended. At the end of runtime suspension the port uses platform power management to disable power through _OFF method of power resource, which is listed by _PR3. After commit `b516ea586d` ("PCI: Enable NVIDIA HDA controllers"), when the dGPU comes with an HDA function, the HDA won't be suspended if the dGPU is unbound, so the power resource can't be turned off by its upstream port driver. Commit `37a3a98ef6` ("ALSA: hda - Enable runtime PM only for discrete GPU") only allows HDA to be runtime suspended once GPU is bound, to keep APU's HDA working. However, HDA on dGPU isn't that useful if dGPU is not bound to any driver. So let's relax the runtime suspend requirement for dGPU's HDA function, to disable the power source to save lots of power. BugLink: https://bugs.launchpad.net/bugs/1840835 Fixes: `b516ea586d` ("PCI: Enable NVIDIA HDA controllers") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20191018073848.14590-2-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:53 +01:00
Kai-Heng Feng	eef2e98832	PCI: Add a helper to check Power Resource Requirements _PR3 existence [ Upstream commit 52525b7a3cf82adec5c6cf0ecbd23ff228badc94 ] A driver may want to know the existence of _PR3, to choose different runtime suspend behavior. A user will be add in next patch. This is mostly the same as nouveau_pr3_present(). Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20191018073848.14590-1-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:52 +01:00
Chris Chiu	f1f7ec8e5b	ALSA: hda/realtek - Enable the bass speaker of ASUS UX431FLC [ Upstream commit 48e01504cf5315cbe6de9b7412e792bfcc3dd9e1 ] ASUS reported that there's an bass speaker in addition to internal speaker and it uses DAC 0x02. It was not enabled in the commit 436e25505f34 ("ALSA: hda/realtek - Enable internal speaker of ASUS UX431FLC") which only enables the amplifier and the front speaker. This commit enables the bass speaker on top of the aforementioned work to improve the acoustic experience. Fixes: 436e25505f34 ("ALSA: hda/realtek - Enable internal speaker of ASUS UX431FLC") Signed-off-by: Chris Chiu <chiu@endlessm.com> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191230031118.95076-1-chiu@endlessm.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:52 +01:00
Kailang Yang	65e8768eb2	ALSA: hda/realtek - Add Bass Speaker and fixed dac for bass speaker [ Upstream commit e79c22695abd3b75a6aecf4ea4b9607e8d82c49c ] Dell has new platform which has dual speaker connecting. They want dual speaker which use same dac for output. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/229c7efa2b474a16b7d8a916cd096b68@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:51 +01:00
Andy Whitcroft	0360ce1eaf	PM / hibernate: memory_bm_find_bit(): Tighten node optimisation [ Upstream commit da6043fe85eb5ec621e34a92540735dcebbea134 ] When looking for a bit by number we make use of the cached result from the preceding lookup to speed up operation. Firstly we check if the requested pfn is within the cached zone and if not lookup the new zone. We then check if the offset for that pfn falls within the existing cached node. This happens regardless of whether the node is within the zone we are now scanning. With certain memory layouts it is possible for this to false trigger creating a temporary alias for the pfn to a different bit. This leads the hibernation code to free memory which it was never allocated with the expected fallout. Ensure the zone we are scanning matches the cached zone before considering the cached node. Deep thanks go to Andrea for many, many, many hours of hacking and testing that went into cornering this bug. Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:51 +01:00
Juergen Gross	33fa919df6	xen/balloon: fix ballooned page accounting without hotplug enabled [ Upstream commit c673ec61ade89bf2f417960f986bc25671762efb ] When CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not defined reserve_additional_memory() will set balloon_stats.target_pages to a wrong value in case there are still some ballooned pages allocated via alloc_xenballooned_pages(). This will result in balloon_process() no longer be triggered when ballooned pages are freed in batches. Reported-by: Nicholas Tsirakis <niko.tsirakis@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:51 +01:00
Paul Durrant	ec177a46e9	xen-blkback: prevent premature module unload [ Upstream commit fa2ac657f9783f0891b2935490afe9a7fd29d3fa ] Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem cache. This cache is destoyed when xen-blkif is unloaded so it is necessary to wait for the deferred free routine used for such objects to complete. This necessity was missed in commit 14855954f636 "xen-blkback: allow module to be cleanly unloaded". This patch fixes the problem by taking/releasing extra module references in xen_blkif_alloc/free() respectively. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:51 +01:00
Maor Gottlieb	e56db866ce	IB/mlx5: Fix steering rule of drop and count [ Upstream commit ed9085fed9d95d5921582e3c8474f3736c5d2782 ] There are two flow rule destinations: QP and packet. While users are setting DROP packet rule, the QP should not be set as a destination. Fixes: `3b3233fbf0` ("IB/mlx5: Add flow counters binding support") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212091214.315005-4-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:50 +01:00
Parav Pandit	c251d5f5b1	IB/mlx4: Follow mirror sequence of device add during device removal [ Upstream commit 89f988d93c62384758b19323c886db917a80c371 ] Current code device add sequence is: ib_register_device() ib_mad_init() init_sriov_init() register_netdev_notifier() Therefore, the remove sequence should be, unregister_netdev_notifier() close_sriov() mad_cleanup() ib_unregister_device() However it is not above. Hence, make do above remove sequence. Fixes: `fa417f7b52` ("IB/mlx4: Add support for IBoE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212091214.315005-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:49 +01:00
Mark Zhang	e0f34320f4	RDMA/counter: Prevent auto-binding a QP which are not tracked with res [ Upstream commit 33df2f1929df4a1cb13303e344fbf8a75f0dc41f ] Some QPs (e.g. XRC QP) are not tracked in kernel, in this case they have an invalid res and should not be bound to any dynamically-allocated counter in auto mode. This fixes below call trace: BUG: kernel NULL pointer dereference, address: 0000000000000390 PGD 80000001a7233067 P4D 80000001a7233067 PUD 1a7215067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 24822 Comm: ibv_xsrq_pingpo Not tainted 5.4.0-rc5+ #21 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 RIP: 0010:rdma_counter_bind_qp_auto+0x142/0x270 [ib_core] Code: e1 48 85 c0 48 89 c2 0f 84 bc 00 00 00 49 8b 06 48 39 42 48 75 d6 40 3a aa 90 00 00 00 75 cd 49 8b 86 00 01 00 00 48 8b 4a 28 <8b> 80 90 03 00 00 39 81 90 03 00 00 75 b4 85 c0 74 b0 48 8b 04 24 RSP: 0018:ffffc900003f39c0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88820020ec00 RSI: 0000000000000004 RDI: ffffffffffffffc0 RBP: 0000000000000001 R08: ffff888224149ff0 R09: ffffc900003f3968 R10: ffffffffffffffff R11: ffff8882249c5848 R12: ffffffffffffffff R13: ffff88821d5aca50 R14: ffff8881f7690800 R15: ffff8881ff890000 FS: 00007fe53a3e1740(0000) GS:ffff888237b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000390 CR3: 00000001a7292006 CR4: 00000000003606a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: _ib_modify_qp+0x3a4/0x3f0 [ib_core] ? lookup_get_idr_uobject.part.8+0x23/0x40 [ib_uverbs] modify_qp+0x322/0x3e0 [ib_uverbs] ib_uverbs_modify_qp+0x43/0x70 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb1/0xf0 [ib_uverbs] ib_uverbs_run_method+0x6be/0x760 [ib_uverbs] ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs] ib_uverbs_cmd_verbs+0x18d/0x3a0 [ib_uverbs] ? get_acl+0x1a/0x120 ? __alloc_pages_nodemask+0x15d/0x2c0 ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs] do_vfs_ioctl+0xa5/0x610 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x48/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `99fa331dc8` ("RDMA/counter: Add "auto" configuration mode support") Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Ido Kalir <idok@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212091214.315005-2-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:49 +01:00
Thomas Richter	217c8169c6	s390/cpum_sf: Avoid SBD overflow condition in irq handler [ Upstream commit 0539ad0b22877225095d8adef0c376f52cc23834 ] The s390 CPU Measurement sampling facility has an overflow condition which fires when all entries in a SBD are used. The measurement alert interrupt is triggered and reads out all samples in this SDB. It then tests the successor SDB, if this SBD is not full, the interrupt handler does not read any samples at all from this SDB The design waits for the hardware to fill this SBD and then trigger another meassurement alert interrupt. This scheme works nicely until an perf_event_overflow() function call discards the sample due to a too high sampling rate. The interrupt handler has logic to read out a partially filled SDB when the perf event overflow condition in linux common code is met. This causes the CPUM sampling measurement hardware and the PMU device driver to operate on the same SBD's trailer entry. This should not happen. This can be seen here using this trace: cpumsf_pmu_add: tear:0xb5286000 hw_perf_event_update: sdbt 0xb5286000 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 1. interrupt hw_perf_event_update: sdbt 0xb5286008 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 2. interrupt ... this goes on fine until... hw_perf_event_update: sdbt 0xb5286068 full 1 over 0 flush_all:0 perf_push_sample1: overflow one or more samples read from the IRQ handler are rejected by perf_event_overflow() and the IRQ handler advances to the next SDB and modifies the trailer entry of a partially filled SDB. hw_perf_event_update: sdbt 0xb5286070 full 0 over 0 flush_all:1 timestamp: 14:32:52.519953 Next time the IRQ handler is called for this SDB the trailer entry shows an overflow count of 19 missed entries. hw_perf_event_update: sdbt 0xb5286070 full 1 over 19 flush_all:1 timestamp: 14:32:52.970058 Remove access to a follow on SDB when event overflow happened. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:49 +01:00
Thomas Richter	9c320bb692	s390/cpum_sf: Adjust sampling interval to avoid hitting sample limits [ Upstream commit 39d4a501a9ef55c57b51e3ef07fc2aeed7f30b3b ] Function perf_event_ever_overflow() and perf_event_account_interrupt() are called every time samples are processed by the interrupt handler. However function perf_event_account_interrupt() has checks to avoid being flooded with interrupts (more then 1000 samples are received per task_tick). Samples are then dropped and a PERF_RECORD_THROTTLED is added to the perf data. The perf subsystem limit calculation is: maximum sample frequency := 100000 --> 1 samples per 10 us task_tick = 10ms = 10000us --> 1000 samples per task_tick The work flow is measurement_alert() uses SDBT head and each SBDT points to 511 SDB pages, each with 126 sample entries. After processing 8 SBDs and for each valid sample calling: perf_event_overflow() perf_event_account_interrupts() there is a considerable amount of samples being dropped, especially when the sample frequency is very high and near the 100000 limit. To avoid the high amount of samples being dropped near the end of a task_tick time frame, increment the sampling interval in case of dropped events. The CPU Measurement sampling facility on the s390 supports only intervals, specifiing how many CPU cycles have to be executed before a sample is generated. Increase the interval when the samples being generated hit the task_tick limit. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:49 +01:00
Zhiqiang Liu	25432fa3ac	md: raid1: check rdev before reference in raid1_sync_request func [ Upstream commit 028288df635f5a9addd48ac4677b720192747944 ] In raid1_sync_request func, rdev should be checked before reference. Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:48 +01:00
Guoqing Jiang	aae93615aa	raid5: need to set STRIPE_HANDLE for batch head [ Upstream commit a7ede3d16808b8f3915c8572d783530a82b2f027 ] With commit `6ce220dd2f` ("raid5: don't set STRIPE_HANDLE to stripe which is in batch list"), we don't want to set STRIPE_HANDLE flag for sh which is already in batch list. However, the stripe which is the head of batch list should set this flag, otherwise panic could happen inside init_stripe at BUG_ON(sh->batch_head), it is reproducible with raid5 on top of nvdimm devices per Xiao oberserved. Thanks for Xiao's effort to verify the change. Fixes: `6ce220dd2f` ("raid5: don't set STRIPE_HANDLE to stripe which is in batch list") Reported-by: Xiao Ni <xni@redhat.com> Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:48 +01:00
David Howells	57a21cdbec	afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP [ Upstream commit 1da4bd9f9d187f53618890d7b66b9628bbec3c70 ] Fix the lookup method on the dynamic root directory such that creation calls, such as mkdir, open(O_CREAT), symlink, etc. fail with EOPNOTSUPP rather than failing with some odd error (such as EEXIST). lookup() itself tries to create automount directories when it is invoked. These are cached locally in RAM and not committed to storage. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:48 +01:00
David Howells	e4086478da	afs: Fix mountpoint parsing [ Upstream commit 158d58335393af3956a9c06f0816ee75ed1f1447 ] Each AFS mountpoint has strings that define the target to be mounted. This is required to end in a dot that is supposed to be stripped off. The string can include suffixes of ".readonly" or ".backup" - which are supposed to come before the terminal dot. To add to the confusion, the "fs lsmount" afs utility does not show the terminal dot when displaying the string. The kernel mount source string parser, however, assumes that the terminal dot marks the suffix and that the suffix is always "" and is thus ignored. In most cases, there is no suffix and this is not a problem - but if there is a suffix, it is lost and this affects the ability to mount the correct volume. The command line mount command, on the other hand, is expected not to include a terminal dot - so the problem doesn't arise there. Fix this by making sure that the dot exists and then stripping it when passing the string to the mount configuration. Fixes: `bec5eb6141` ("AFS: Implement an autocell mount capability [ver #2]") Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:48 +01:00
Jens Axboe	b1954fda6b	net: make socket read/write_iter() honor IOCB_NOWAIT [ Upstream commit ebfcd8955c0b52eb793bcbc9e71140e3d0cdb228 ] The socket read/write helpers only look at the file O_NONBLOCK. not the iocb IOCB_NOWAIT flag. This breaks users like preadv2/pwritev2 and io_uring that rely on not having the file itself marked nonblocking, but rather the iocb itself. Cc: netdev@vger.kernel.org Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:47 +01:00
EJ Hsu	ae6e5f8d51	usb: gadget: fix wrong endpoint desc [ Upstream commit e5b5da96da50ef30abb39cb9f694e99366404d24 ] Gadget driver should always use config_ep_by_speed() to initialize usb_ep struct according to usb device's operating speed. Otherwise, usb_ep struct may be wrong if usb devcie's operating speed is changed. The key point in this patch is that we want to make sure the desc pointer in usb_ep struct will be set to NULL when gadget is disconnected. This will force it to call config_ep_by_speed() to correctly initialize usb_ep struct based on the new operating speed when gadget is re-connected later. Reviewed-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: EJ Hsu <ejh@nvidia.com> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:47 +01:00
Ben Skeggs	0f7cb06912	drm/nouveau/kms/nv50-: fix panel scaling [ Upstream commit 3d1890ef8023e61934e070021b06cc9f417260c0 ] Under certain circumstances, encoder atomic_check() can be entered without adjusted_mode having been reset to the same as mode, which confuses the scaling logic and can lead to a misprogrammed display. Fix this by checking against the user-provided mode directly. Link: https://bugs.freedesktop.org/show_bug.cgi?id=108615 Link: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/464 Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:47 +01:00
Hans de Goede	bcfa071bfe	drm/nouveau: Fix drm-core using atomic code-paths on pre-nv50 hardware [ Upstream commit 64d17f25dcad518461ccf0c260544e1e379c5b35 ] We do not support atomic modesetting on pre-nv50 hardware, but until now our connector code was setting drm_connector->state on pre-nv50 hardware. This causes the core to enter atomic modesetting paths in at least: 1. drm_connector_get_encoder(), returning connector->state->best_encoder which is always 0, causing us to always report 0 as encoder_id in the drmModeConnector struct returned by drmModeGetConnector(). 2. drm_encoder_get_crtc(), returning NULL because uses_atomic get set, causing us to always report 0 as crtc_id in the drmModeEncoder struct returned by drmModeGetEncoder() Which in turn confuses userspace, at least plymouth thinks that the pipe has changed because of this and tries to reconfigure it unnecessarily. More in general we should not set drm_connector->state in the non-atomic code as this violates the drm-core's expectations. This commit fixes this by using a nouveau_conn_atom struct embedded in the nouveau_connector struct for property handling in the non-atomic case. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1706557 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:47 +01:00
Hans de Goede	29da513a33	drm/nouveau: Move the declaration of struct nouveau_conn_atom up a bit [ Upstream commit 37a68eab4cd92b507c9e8afd760fdc18e4fecac6 ] Place the declaration of struct nouveau_conn_atom above that of struct nouveau_connector. This commit makes no changes to the moved block what so ever, it just moves it up a bit. This is a preparation patch to fix some issues with connector handling on pre nv50 displays (which do not use atomic modesetting). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:46 +01:00
Kay Friedrich	742d03aff8	staging/wlan-ng: add CRC32 dependency in Kconfig [ Upstream commit 2740bd3351cd5a4351f458aabaa1c9b77de3867b ] wlan-ng uses the function crc32_le, but CRC32 wasn't a dependency of wlan-ng Co-developed-by: Michael Kupfer <michael.kupfer@fau.de> Signed-off-by: Michael Kupfer <michael.kupfer@fau.de> Signed-off-by: Kay Friedrich <kay.friedrich@fau.de> Link: https://lore.kernel.org/r/20191127112457.2301-1-kay.friedrich@fau.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:46 +01:00
Bo Wu	d45a917138	scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func [ Upstream commit bba340c79bfe3644829db5c852fdfa9e33837d6d ] In iscsi_if_rx func, after receiving one request through iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to reply to the request in a do-while loop. If the iscsi_if_send_reply function keeps returning -EAGAIN, a deadlock will occur. For example, a client only send msg without calling recvmsg func, then it will result in the watchdog soft lockup. The details are given as follows: sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI); retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr); while (1) { state_msg = sendmsg(sock_fd, &msg, 0); //Note: recvmsg(sock_fd, &msg, 0) is not processed here. } close(sock_fd); watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat: curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0 deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq: TIMER: 992 SCHED: 8 Sample irqstat: irq 2: delta 1003, curr: 3103802, arch_timer CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G OE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 40400005 (nZcv daif +PAN -UAO) pc : __alloc_skb+0x104/0x1b0 lr : __alloc_skb+0x9c/0x1b0 sp : ffff000033603a30 x29: ffff000033603a30 x28: 00000000000002dd x27: ffff800b34ced810 x26: ffff800ba7569f00 x25: 00000000ffffffff x24: 0000000000000000 x23: ffff800f7c43f600 x22: 0000000000480020 x21: ffff0000091d9000 x20: ffff800b34eff200 x19: ffff800ba7569f00 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0001000101000100 x13: 0000000101010000 x12: 0101000001010100 x11: 0001010101010001 x10: 00000000000002dd x9 : ffff000033603d58 x8 : ffff800b34eff400 x7 : ffff800ba7569200 x6 : ffff800b34eff400 x5 : 0000000000000000 x4 : 00000000ffffffff x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace: __alloc_skb+0x104/0x1b0 iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi] netlink_unicast+0x1e0/0x258 netlink_sendmsg+0x310/0x378 sock_sendmsg+0x4c/0x70 sock_write_iter+0x90/0xf0 __vfs_write+0x11c/0x190 vfs_write+0xac/0x1c0 ksys_write+0x6c/0xd8 __arm64_sys_write+0x24/0x30 el0_svc_common+0x78/0x130 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-09 10:19:46 +01:00

1 2 3 4 5 ...

874627 Commits All Branches Search

874627 Commits

All Branches