linux-brain/arch/x86
Mike Rapoport 08f33350ed x86/mm: Fix kern_addr_valid() to cope with existing but not present entries
commit 34b1999da935a33be6239226bfa6cd4f704c5c88 upstream.

Jiri Olsa reported a fault when running:

  # cat /proc/kallsyms | grep ksys_read
  ffffffff8136d580 T ksys_read
  # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore

  /proc/kcore:     file format elf64-x86-64

  Segmentation fault

  general protection fault, probably for non-canonical address 0xf887ffcbff000: 0000 [#1] SMP PTI
  CPU: 12 PID: 1079 Comm: objdump Not tainted 5.14.0-rc5qemu+ #508
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
  RIP: 0010:kern_addr_valid
  Call Trace:
   read_kcore
   ? rcu_read_lock_sched_held
   ? rcu_read_lock_sched_held
   ? rcu_read_lock_sched_held
   ? trace_hardirqs_on
   ? rcu_read_lock_sched_held
   ? lock_acquire
   ? lock_acquire
   ? rcu_read_lock_sched_held
   ? lock_acquire
   ? rcu_read_lock_sched_held
   ? rcu_read_lock_sched_held
   ? rcu_read_lock_sched_held
   ? lock_release
   ? _raw_spin_unlock
   ? __handle_mm_fault
   ? rcu_read_lock_sched_held
   ? lock_acquire
   ? rcu_read_lock_sched_held
   ? lock_release
   proc_reg_read
   ? vfs_read
   vfs_read
   ksys_read
   do_syscall_64
   entry_SYSCALL_64_after_hwframe

The fault happens because kern_addr_valid() dereferences existent but not
present PMD in the high kernel mappings.

Such PMDs are created when free_kernel_image_pages() frees regions larger
than 2Mb. In this case, a part of the freed memory is mapped with PMDs and
the set_memory_np_noalias() -> ... -> __change_page_attr() sequence will
mark the PMD as not present rather than wipe it completely.

Have kern_addr_valid() check whether higher level page table entries are
present before trying to dereference them to fix this issue and to avoid
similar issues in the future.

Stable backporting note:
------------------------

Note that the stable marking is for all active stable branches because
there could be cases where pagetable entries exist but are not valid -
see 9a14aefc1d ("x86: cpa, fix lookup_address"), for example. So make
sure to be on the safe side here and use pXY_present() accessors rather
than pXY_none() which could #GP when accessing pages in the direct map.

Also see:

  c40a56a781 ("x86/mm/init: Remove freed kernel image areas from alias mapping")

for more info.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: <stable@vger.kernel.org>	# 4.4+
Link: https://lkml.kernel.org/r/20210819132717.19358-1-rppt@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22 12:26:40 +02:00
..
boot x86/boot: Add .text.* to setup.ld 2021-06-16 11:59:38 +02:00
configs vgacon: remove software scrollback support 2020-09-17 13:47:54 +02:00
crypto crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:39:47 +01:00
entry x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
events perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op 2021-09-15 09:47:39 +02:00
hyperv x86/hyperv: check cpu mask after interrupt has been disabled 2021-01-19 18:26:12 +01:00
ia32 syscalls/x86: Use COMPAT_SYSCALL_DEFINE0 for IA32 (rt_)sigreturn 2020-01-17 19:48:30 +01:00
include x86/fpu: Make init_fpstate correct with optimized XSAVE 2021-08-26 08:36:11 -04:00
kernel x86/resctrl: Fix a maybe-uninitialized build warning treated as error 2021-09-15 09:47:40 +02:00
kvm KVM: nVMX: Unconditionally clear nested.pi_pending on nested VM-Enter 2021-09-15 09:47:41 +02:00
lib x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes 2021-05-22 11:38:27 +02:00
math-emu x86: math-emu: Fix up 'cmp' insn for clang ias 2020-07-29 10:18:40 +02:00
mm x86/mm: Fix kern_addr_valid() to cope with existing but not present entries 2021-09-22 12:26:40 +02:00
net bpf: Introduce BPF nospec instruction for mitigating Spectre v4 2021-09-15 09:47:38 +02:00
oprofile
pci PCI: Add AMD RS690 quirk to enable 64-bit DMA 2021-06-30 08:47:49 -04:00
platform efi/x86: Free efi_pgd with free_pages() 2020-11-24 13:29:19 +01:00
power x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
purgatory x86/purgatory: Disable various profiling and sanitizing options 2020-06-24 17:50:20 +02:00
ras RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there 2019-06-08 17:39:24 +02:00
realmode x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
tools x86/tools: Fix objdump version check again 2021-08-18 08:57:01 +02:00
um um: Implement copy_thread_tls 2020-01-14 20:08:35 +01:00
video treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
xen xen: reset legacy rtc flag for PV domU 2021-09-22 12:26:39 +02:00
.gitignore
Kbuild treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
Kconfig x86/platform/uv: Fix !KEXEC build failure 2021-05-14 09:44:22 +02:00
Kconfig.cpu x86/cpu: Create Zhaoxin processors architecture support file 2019-06-22 11:45:57 +02:00
Kconfig.debug x86, perf: Fix the dependency of the x86 insn decoder selftest 2019-09-02 20:05:58 +02:00
Makefile x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS) 2021-05-11 14:04:06 +02:00
Makefile.um x86, powerpc: Remove -funit-at-a-time compiler option entirely 2018-12-09 11:55:32 +01:00
Makefile_32.cpu