linux-brain/Documentation/filesystems
Michal Hocko 696ce77bba mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
[ Upstream commit 7550c60798 ]

Patch series "THP eligibility reporting via proc".

This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable.  The trigger for the change is a
regression report [2] and the long follow up discussion.  In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.

A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings.  Again a lack of a proper API led to an
abuse.

The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.

The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well.  [1]

http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz

This patch (of 3):

Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags.  And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.

Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.

http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps.  After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.

In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.

Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-26 09:37:07 +01:00
..
caching fscache: remove unused ->now_uncached callback 2017-09-06 17:27:26 -07:00
cifs CIFS/SMB3: Update documentation to reflect SMB3 and various changes 2017-09-17 10:48:00 -05:00
configfs fs: configfs: don't return anything from drop_link 2016-12-01 10:50:49 +01:00
nfs doc: ReSTify keys-request-key.txt 2017-05-18 10:33:51 -06:00
pohmelfs Documentation: fix common spelling mistakes 2016-04-28 07:51:59 -06:00
9p.txt 9p: update documentation 2014-01-24 10:55:21 -06:00
00-INDEX logfs: remove from tree 2016-12-14 23:48:11 -05:00
adfs.txt
affs.txt affs: add mount option to avoid filename truncates 2014-04-07 16:36:08 -07:00
afs.txt rxrpc: Change module filename to rxrpc.ko 2017-02-17 15:09:19 -05:00
autofs4-mount-control.txt autofs: update ioctl documentation regarding struct autofs_dev_ioctl 2017-02-27 18:43:45 -08:00
autofs4.txt Fix up over-eager 'wait_queue_t' renaming 2017-07-10 11:40:19 -07:00
automount-support.txt Documentation: remove outdated information from automount-support.txt 2015-05-15 01:10:38 -04:00
befs.txt
bfs.txt Tigran has moved 2017-05-12 15:57:15 -07:00
btrfs.txt Documentation: btrfs: remove usage specific information 2016-03-11 17:02:09 +01:00
ceph.txt ceph: set io_pages bdi hint 2017-02-20 12:16:05 +01:00
coda.txt
conf.py docs-rst: convert filesystems book to ReST 2017-05-16 08:44:08 -03:00
cramfs.txt mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
dax.txt dax: use common 4k zero page for dax mmap reads 2017-09-06 17:27:24 -07:00
debugfs.txt debugfs: Pass bool pointer to debugfs_create_bool() 2015-10-04 11:36:07 +01:00
devpts.txt devpts: Make each mount of devpts an independent filesystem. 2016-06-05 10:36:01 -07:00
directory-locking vfs: remove unused i_op->rename 2016-09-27 11:03:58 +02:00
dlmfs.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
dnotify.txt
ecryptfs.txt
efivarfs.txt efi: Make efivarfs entries immutable by default 2016-02-10 16:25:52 +00:00
exofs.txt
ext2.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext3.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext4.txt ext4: correct documentation for grpid mount option 2018-02-22 15:42:26 +01:00
f2fs.txt f2fs: support journalled quota 2017-08-21 15:54:48 -07:00
fiemap.txt fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
files.txt
fuse.txt
gfs2-glocks.txt gfs2: Remove gl_spin define 2015-10-29 12:57:48 -05:00
gfs2-uevents.txt
gfs2.txt
hfs.txt
hfsplus.txt Documentation: update URL to hfsplus Technote 1150 2014-02-20 14:48:51 +01:00
hpfs.txt
index.rst docs-rst: don't ignore internal functions for jbd2 docs 2017-05-16 08:44:09 -03:00
inotify.txt inotify: update documentation to reflect code changes 2015-02-10 14:30:28 -08:00
isofs.txt
jfs.txt doc: fix misspellings with 'codespell' tool 2013-05-28 12:02:12 +02:00
Locking vfs: add flags to d_real() 2017-09-04 21:42:22 +02:00
locks.txt docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
mandatory-locking.txt
ncpfs.txt
nilfs2.txt nilfs2: move ioctl interface and disk layout to uapi separately 2016-08-02 19:35:21 -04:00
ntfs.txt NTFS: Remove changelog from Documentation/filesystems/ntfs.txt. 2014-10-16 12:43:57 +01:00
ocfs2-online-filecheck.txt Doc: ocfs: Fix typo in filesystems/ocfs2-online-filecheck.txt 2016-07-01 16:17:15 -06:00
ocfs2.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
omfs.txt
orangefs.txt orangefs: documentation clean up 2017-09-14 14:54:39 -04:00
overlayfs.txt ovl: fix regression caused by exclusive upper/work dir protection 2017-10-05 15:53:18 +02:00
path-lookup.md Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
path-lookup.txt Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
porting VFS: Differentiate mount flags (MS_*) from internal superblock flags 2017-07-17 08:45:35 +01:00
proc.txt mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps 2019-01-26 09:37:07 +01:00
qnx6.txt Documentation: fix common spelling mistakes 2016-04-28 07:51:59 -06:00
quota.txt scripts/spelling.txt: add "an user" pattern and fix typo instances 2017-02-27 18:43:46 -08:00
ramfs-rootfs-initramfs.txt initmpfs: use initramfs if rootfstype= or root= specified 2013-09-11 15:59:38 -07:00
relay.txt doc: Fix typo in doucmentations 2013-07-25 12:34:15 +02:00
romfs.txt
seq_file.txt Documentation: update seq_file 2014-12-29 15:40:18 -07:00
sharedsubtree.txt doc: fix grammar 2016-03-09 15:33:06 -07:00
spufs.txt
squashfs.txt Squashfs: Add LZ4 compression configuration option 2014-11-27 18:48:44 +00:00
sysfs-pci.txt PCI: Add pci_mmap_resource_range() and use it for ARM64 2017-04-20 08:47:47 -05:00
sysfs-tagging.txt sysfs-tagging.txt: fix pre-kernfs references 2015-09-13 14:38:51 -06:00
sysfs.txt driver core: remove DRIVER_ATTR 2017-09-19 09:20:33 +02:00
sysv-fs.txt
tmpfs.txt Documenation: update cgroup's document path 2016-08-03 15:43:58 -06:00
ubifs.txt
udf.txt
ufs.txt
vfat.txt fat: add config option to set UTF-8 mount option by default 2016-03-22 15:36:02 -07:00
vfs.txt Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2017-09-13 09:11:44 -07:00
xfs-delayed-logging-design.txt
xfs-self-describing-metadata.txt
xfs.txt xfs: deprecate barrier/nobarrier mount option 2016-12-09 16:49:54 +11:00