Commit Graph

5821 Commits

Author SHA1 Message Date
Dave Chinner aee7754bbe xfs: move xfs_dir2_addname()
This gets rid of the need for a forward  declaration of the static
function xfs_dir2_addname_int() and readies the code for factoring
of xfs_dir2_addname_int().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-30 22:43:56 -07:00
Darrick J. Wong 39ee2239a5 xfs: remove all *_ITER_CONTINUE values
Iterator functions already use 0 to signal "continue iterating", so get
rid of the #defines and just do it directly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-30 22:43:56 -07:00
Al Viro ce6595a28a kill the last users of user_{path,lpath,path_dir}()
old wrappers with few callers remaining; put them out of their misery...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-08-30 21:30:13 -04:00
Deepa Dinamani 22b139691f fs: Fill in max and min timestamps in superblock
Fill in the appropriate limits to avoid inconsistencies
in the vfs cached inode times when timestamps are
outside the permitted range.

Even though some filesystems are read-only, fill in the
timestamps to reflect the on-disk representation.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-By: Tigran Aivazian <aivazian.tigran@gmail.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: aivazian.tigran@gmail.com
Cc: al@alarsen.net
Cc: coda@cs.cmu.edu
Cc: darrick.wong@oracle.com
Cc: dushistov@mail.ru
Cc: dwmw2@infradead.org
Cc: hch@infradead.org
Cc: jack@suse.com
Cc: jaharkes@cs.cmu.edu
Cc: luisbg@kernel.org
Cc: nico@fluxnic.net
Cc: phillip@squashfs.org.uk
Cc: richard@nod.at
Cc: salah.triki@gmail.com
Cc: shaggy@kernel.org
Cc: linux-xfs@vger.kernel.org
Cc: codalist@coda.cs.cmu.edu
Cc: linux-ext4@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: jfs-discussion@lists.sourceforge.net
Cc: reiserfs-devel@vger.kernel.org
2019-08-30 07:27:17 -07:00
Darrick J. Wong e7ee96dfb8 xfs: remove all *_ITER_ABORT values
Use -ECANCELED to signal "stop iterating" instead of these magical
*_ITER_ABORT values, since it's duplicative.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-29 21:22:41 -07:00
Eric Sandeen 7f313eda8f xfs: log proper length of btree block in scrub/repair
xfs_trans_log_buf() takes a final argument of the last byte to
log in the buffer; b_length is in basic blocks, so this isn't
the correct last byte.  Fix it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-28 08:31:02 -07:00
Darrick J. Wong ffb5696f75 xfs: reinitialize rm_flags when unpacking an offset into an rmap irec
In xfs_rmap_irec_offset_unpack, we should always clear the contents of
rm_flags before we begin unpacking the encoded (ondisk) offset into the
incore rm_offset and incore rm_flags fields.  Remove the open-coded
field zeroing as this encourages api misuse.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:02 -07:00
Darrick J. Wong 3e08f42ae7 xfs: remove unnecessary int returns from deferred bmap functions
Remove the return value from the functions that schedule deferred bmap
operations since they never fail and do not return status.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:02 -07:00
Darrick J. Wong 74b4c5d4a9 xfs: remove unnecessary int returns from deferred refcount functions
Remove the return value from the functions that schedule deferred
refcount operations since they never fail and do not return status.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:02 -07:00
Darrick J. Wong bc46ac6471 xfs: remove unnecessary int returns from deferred rmap functions
Remove the return value from the functions that schedule deferred rmap
operations since they never fail and do not return status.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:01 -07:00
Darrick J. Wong 2ca09177ab xfs: remove unnecessary parameter from xfs_iext_inc_seq
This function doesn't use the @state parameter, so get rid of it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:01 -07:00
Darrick J. Wong b521c89027 xfs: fix sign handling problem in xfs_bmbt_diff_two_keys
In xfs_bmbt_diff_two_keys, we perform a signed int64_t subtraction with
two unsigned 64-bit quantities.  If the second quantity is actually the
"maximum" key (all ones) as used in _query_all, the subtraction
effectively becomes addition of two positive numbers and the function
returns incorrect results.  Fix this with explicit comparisons of the
unsigned values.  Nobody needs this now, but the online repair patches
will need this to work properly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:01 -07:00
Darrick J. Wong 7380e8fec1 xfs: don't return _QUERY_ABORT from xfs_rmap_has_other_keys
The xfs_rmap_has_other_keys helper aborts the iteration as soon as it
has an answer.  Don't let this abort leak out to callers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28 08:31:01 -07:00
Darrick J. Wong c94613feef xfs: fix maxicount division by zero error
In xfs_ialloc_setup_geometry, it's possible for a malicious/corrupt fs
image to set an unreasonably large value for sb_inopblog which will
cause ialloc_blks to be zero.  If sb_imax_pct is also set, this results
in a division by zero error in the second do_div call.  Therefore, force
maxicount to zero if ialloc_blks is zero.

Note that the kernel metadata verifiers will catch the garbage inopblog
value and abort the fs mount long before it tries to set up the inode
geometry; this is needed to avoid a crash in xfs_db while setting up the
xfs_mount structure.

Found by fuzzing sb_inopblog to 122 in xfs/350.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2019-08-28 08:31:01 -07:00
Darrick J. Wong 519e5869d5 xfs: bmap scrub should only scrub records once
The inode block mapping scrub function does more work for btree format
extent maps than is absolutely necessary -- first it will walk the bmbt
and check all the entries, and then it will load the incore tree and
check every entry in that tree, possibly for a second time.

Simplify the code and decrease check runtime by separating the two
responsibilities.  The bmbt walk will make sure the incore extent
mappings are loaded, check the shape of the bmap btree (via xchk_btree)
and check that every bmbt record has a corresponding incore extent map;
and the incore extent map walk takes all the responsibility for checking
the mapping records and cross referencing them with other AG metadata.

This enables us to clean up some messy parameter handling and reduce
redundant code.  Rename a few functions to make the split of
responsibilities clearer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-08-26 17:43:15 -07:00
zhengbin 71912e08e0 xfs: remove excess function parameter description in 'xfs_btree_sblock_v5hdr_verify'
Fixes gcc warning:

fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'max_recs' description in 'xfs_btree_sblock_v5hdr_verify'
fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'pag_max_level' description in 'xfs_btree_sblock_v5hdr_verify'

Fixes: c5ab131ba0 ("libxfs: refactor short btree block verification")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26 17:43:15 -07:00
Dave Chinner f8f9ee4794 xfs: add kmem_alloc_io()
Memory we use to submit for IO needs strict alignment to the
underlying driver contraints. Worst case, this is 512 bytes. Given
that all allocations for IO are always a power of 2 multiple of 512
bytes, the kernel heap provides natural alignment for objects of
these sizes and that suffices.

Until, of course, memory debugging of some kind is turned on (e.g.
red zones, poisoning, KASAN) and then the alignment of the heap
objects is thrown out the window. Then we get weird IO errors and
data corruption problems because drivers don't validate alignment
and do the wrong thing when passed unaligned memory buffers in bios.

TO fix this, introduce kmem_alloc_io(), which will guaranteeat least
512 byte alignment of buffers for IO, even if memory debugging
options are turned on. It is assumed that the minimum allocation
size will be 512 bytes, and that sizes will be power of 2 mulitples
of 512 bytes.

Use this everywhere we allocate buffers for IO.

This no longer fails with log recovery errors when KASAN is enabled
due to the brd driver not handling unaligned memory buffers:

# mkfs.xfs -f /dev/ram0 ; mount /dev/ram0 /mnt/test

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26 17:43:15 -07:00
Dave Chinner d916275aa4 xfs: get allocation alignment from the buftarg
Needed to feed into the allocation routine to guarantee the memory
buffers we add to bios are correctly aligned to the underlying
device.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26 17:43:14 -07:00
Dave Chinner 0ad95687c3 xfs: add kmem allocation trace points
When trying to correlate XFS kernel allocations to memory reclaim
behaviour, it is useful to know what allocations XFS is actually
attempting. This information is not directly available from
tracepoints in the generic memory allocation and reclaim
tracepoints, so these new trace points provide a high level
indication of what the XFS memory demand actually is.

There is no per-filesystem context in this code, so we just trace
the type of allocation, the size and the allocation constraints.
The kmem code also doesn't include much of the common XFS headers,
so there are a few definitions that need to be added to the trace
headers and a couple of types that need to be made common to avoid
needing to include the whole world in the kmem code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26 17:43:14 -07:00
Tetsuo Handa 707e0ddaf6 fs: xfs: Remove KM_NOSLEEP and KM_SLEEP.
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP,
we can remove KM_NOSLEEP and replace KM_SLEEP with 0.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26 12:06:22 -07:00
Darrick J. Wong 1fb254aa98 xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
fails on account of being out of disk quota.  I ran his reproducer
script:

# adduser dummy
# adduser dummy plugdev

# dd if=/dev/zero bs=1M count=100 of=test.img
# mkfs.xfs test.img
# mount -t xfs -o gquota test.img /mnt
# mkdir -p /mnt/dummy
# chown -c dummy /mnt/dummy
# xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt

(and then as user dummy)

$ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
$ chgrp plugdev /mnt/dummy/foo

and saw:

================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G        W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
 #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]

...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails.  Add the missing
unlock.

Reported-by: benjamin.moody@gmail.com
Fixes: 253f4911f2 ("xfs: better xfs_trans_alloc interface")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
2019-08-22 20:55:54 -07:00
Ira Weiny b68271609c fs/xfs: Fix return code of xfs_break_leased_layouts()
The parens used in the while loop would result in error being assigned
the value 1 rather than the intended errno value.

This is required to return -ETXTBSY from follow on break_layout()
changes.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-19 18:15:28 -07:00
Darrick J. Wong 5d888b481e xfs: fix reflink source file racing with directio writes
While trawling through the dedupe file comparison code trying to fix
page deadlocking problems, Dave Chinner noticed that the reflink code
only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
page_mkwrite and directio writes do not take the EXCL versions of those
locks, this means that reflink can race with writer processes.

For pure remapping this can lead to undefined behavior and file
corruption; for dedupe this means that we cannot be sure that the
contents are identical when we decide to go ahead with the remapping.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-08-18 18:53:25 -07:00
Christoph Hellwig 4529e6d7a6 xfs: compat_ioctl: use compat_ptr()
For 31-bit s390 user space, we have to pass pointer arguments through
compat_ptr() in the compat_ioctl handler.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-16 18:42:59 -07:00
Christoph Hellwig 314e01a6d7 xfs: fall back to native ioctls for unhandled compat ones
Always try the native ioctl if we don't have a compat handler.  This
removes a lot of boilerplate code as 'modern' ioctls should generally
be compat clean, and fixes the missing entries for the recently added
FS_IOC_GETFSLABEL/FS_IOC_SETFSLABEL ioctls.

Fixes: f7664b3197 ("xfs: implement online get/set fs label")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-16 18:42:59 -07:00
Darrick J. Wong 8612de3f7b xfs: don't crash on null attr fork xfs_bmapi_read
Zorro Lang reported a crash in generic/475 if we try to inactivate a
corrupt inode with a NULL attr fork (stack trace shortened somewhat):

RIP: 0010:xfs_bmapi_read+0x311/0xb00 [xfs]
RSP: 0018:ffff888047f9ed68 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888047f9f038 RCX: 1ffffffff5f99f51
RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000012
RBP: ffff888002a41f00 R08: ffffed10005483f0 R09: ffffed10005483ef
R10: ffffed10005483ef R11: ffff888002a41f7f R12: 0000000000000004
R13: ffffe8fff53b5768 R14: 0000000000000005 R15: 0000000000000001
FS:  00007f11d44b5b80(0000) GS:ffff888114200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000ef6000 CR3: 000000002e176003 CR4: 00000000001606e0
Call Trace:
 xfs_dabuf_map.constprop.18+0x696/0xe50 [xfs]
 xfs_da_read_buf+0xf5/0x2c0 [xfs]
 xfs_da3_node_read+0x1d/0x230 [xfs]
 xfs_attr_inactive+0x3cc/0x5e0 [xfs]
 xfs_inactive+0x4c8/0x5b0 [xfs]
 xfs_fs_destroy_inode+0x31b/0x8e0 [xfs]
 destroy_inode+0xbc/0x190
 xfs_bulkstat_one_int+0xa8c/0x1200 [xfs]
 xfs_bulkstat_one+0x16/0x20 [xfs]
 xfs_bulkstat+0x6fa/0xf20 [xfs]
 xfs_ioc_bulkstat+0x182/0x2b0 [xfs]
 xfs_file_ioctl+0xee0/0x12a0 [xfs]
 do_vfs_ioctl+0x193/0x1000
 ksys_ioctl+0x60/0x90
 __x64_sys_ioctl+0x6f/0xb0
 do_syscall_64+0x9f/0x4d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f11d39a3e5b

The "obvious" cause is that the attr ifork is null despite the inode
claiming an attr fork having at least one extent, but it's not so
obvious why we ended up with an inode in that state.

Reported-by: Zorro Lang <zlang@redhat.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204031
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-08-12 09:32:44 -07:00
Darrick J. Wong 858b44dc62 xfs: remove more ondisk directory corruption asserts
Continue our game of replacing ASSERTs for corrupt ondisk metadata with
EFSCORRUPTED returns.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-08-12 09:32:44 -07:00
Tetsuo Handa 294fc7a4c8 fs: xfs: xfs_log: Don't use KM_MAYFAIL at xfs_log_reserve().
When the system is close-to-OOM, fsync() may fail due to -ENOMEM because
xfs_log_reserve() is using KM_MAYFAIL. It is a bad thing to fail writeback
operation due to user-triggerable OOM condition. Since we are not using
KM_MAYFAIL at xfs_trans_alloc() before calling xfs_log_reserve(), let's
use the same flags at xfs_log_reserve().

  oom-torture: page allocation failure: order:0, mode:0x46c40(GFP_NOFS|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_COMP), nodemask=(null)
  CPU: 7 PID: 1662 Comm: oom-torture Kdump: loaded Not tainted 5.3.0-rc2+ #925
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00
  Call Trace:
   dump_stack+0x67/0x95
   warn_alloc+0xa9/0x140
   __alloc_pages_slowpath+0x9a8/0xbce
   __alloc_pages_nodemask+0x372/0x3b0
   alloc_slab_page+0x3a/0x8d0
   new_slab+0x330/0x420
   ___slab_alloc.constprop.94+0x879/0xb00
   __slab_alloc.isra.89.constprop.93+0x43/0x6f
   kmem_cache_alloc+0x331/0x390
   kmem_zone_alloc+0x9f/0x110 [xfs]
   kmem_zone_alloc+0x9f/0x110 [xfs]
   xlog_ticket_alloc+0x33/0xd0 [xfs]
   xfs_log_reserve+0xb4/0x410 [xfs]
   xfs_trans_reserve+0x1d1/0x2b0 [xfs]
   xfs_trans_alloc+0xc9/0x250 [xfs]
   xfs_setfilesize_trans_alloc.isra.27+0x44/0xc0 [xfs]
   xfs_submit_ioend.isra.28+0xa5/0x180 [xfs]
   xfs_vm_writepages+0x76/0xa0 [xfs]
   do_writepages+0x17/0x80
   __filemap_fdatawrite_range+0xc1/0xf0
   file_write_and_wait_range+0x53/0xa0
   xfs_file_fsync+0x87/0x290 [xfs]
   vfs_fsync_range+0x37/0x80
   do_fsync+0x38/0x60
   __x64_sys_fsync+0xf/0x20
   do_syscall_64+0x4a/0x1c0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: eb01c9cd87 ("[XFS] Remove the xlog_ticket allocator")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-03 09:36:43 -07:00
Jia-Ju Bai afa1d96d14 xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling()
In xchk_da_btree_block_check_sibling(), there is an if statement on
line 274 to check whether ds->state->altpath.blk[level].bp is NULL:
    if (ds->state->altpath.blk[level].bp)

When ds->state->altpath.blk[level].bp is NULL, it is used on line 281:
    xfs_trans_brelse(..., ds->state->altpath.blk[level].bp);
        struct xfs_buf_log_item *bip = bp->b_log_item;
        ASSERT(bp->b_transp == tp);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, ds->state->altpath.blk[level].bp is checked before
being used.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-30 11:28:20 -07:00
Darrick J. Wong 2e616d9f9c xfs: fix stack contents leakage in the v1 inumber ioctls
Explicitly initialize the onstack structures to zero so we don't leak
kernel memory into userspace when converting the in-core inumbers
structure to the v1 inogrp ioctl structure.  Add a comment about why we
have to use memset to ensure that the padding holes in the structures
are set to zero.

Fixes: 5f19c7fc68 ("xfs: introduce v5 inode group structure")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-07-28 21:12:32 -07:00
Linus Torvalds 366a4e38b8 Also new for 5.3:
- Bring fs/xfs/libxfs/xfs_trans_inode.c in sync with userspace libxfs.
 - Convert the xfs administrator guide to rst and move it into the
   official admin guide under Documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0t6wIACgkQ+H93GTRK
 tOuCqQ//TN2065e+0avyxuwNgUQIS73ujC/R8KivO3f9PlhMzjiMhgCBiIurGsi5
 8rInaCDr4L0DRL67W+3d72iWAc1eSP+l/zp2vEMUs/2YgDTGli5JfoWR0JeuKaiD
 7ymSxPvShfrwDwidMn6S7tXvcEjUEJrms4jeuERBrHwZMOvxzMiM3910oG12AfbM
 Fipvn9eDTmaInHX9T9nuBhvK077/1bXF1J2r/4XieOuhPnNjV5mLH3fJnPswu3uL
 RCSS1WYuJMBV/EQbLCgiM47dgVAtig8QSacqfpGtAZrDLAXy/Y6T+AwvclPFTE5c
 HlHLHf1+zI6FB8t2BNtnapMrRm1wmhCCHEEJLa+iHENY+J113R0I7c6Jscs6tnJd
 oW253Mc9juiB3bnC3iIAgcXrny68qPPpwozzYktQ9XsPxz5XEDEsWBpQS9rzoJRk
 u1zaU3VUxAO2sTJpKE4OXs5DLWvM5C5Yz8jbPXX3U1GBtNzk8EM9SmI0nA1+OVCA
 B4kbN6jKWMsG13bIIb16TbfkrhAucK/09Pbp2/4spayf6PYFci6E6Bg5NuVp2iaH
 n7oIk5WaobFPDbtdIaG27awAX0UCgItFrPMqLN+bBqeiYhNPuxc+ycmyb16VlRWf
 ob0OzWm6+nRJCNGQ1yBd3flILAPxV/Fa7Fegtf/HIDKCLfDW4xM=
 =+DS3
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.3-merge-13' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs cleanups from Darrick Wong:
 "We had a few more lateish cleanup patches come in for 5.3 -- a couple
  of syncups with the userspace libxfs code and a conversion of the XFS
  administrator's guide to ReST format.

  Summary:

   - Bring fs/xfs/libxfs/xfs_trans_inode.c in sync with userspace
     libxfs.

   - Convert the xfs administrator guide to rst and move it into the
     official admin guide under Documentation"

* tag 'xfs-5.3-merge-13' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  Documentation: filesystem: Convert xfs.txt to ReST
  xfs: sync up xfs_trans_inode with userspace
  xfs: move xfs_trans_inode.c to libxfs/
2019-07-18 11:18:00 -07:00
Linus Torvalds f8c3500cd1 - virtio_pmem: The new virtio_pmem facility introduces a paravirtualized
persistent memory device that allows a guest VM to use DAX mechanisms to
   access a host-file with host-page-cache. It arranges for MAP_SYNC to
   be disabled and instead triggers a host fsync() when a 'write-cache
   flush' command is sent to the virtual disk device.
 
 - Miscellaneous small fixups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdMHwpAAoJEB7SkWpmfYgCUYoP/3vcgYBAaXNksyALF0iowPoP
 z4J0KoaOA1CzRFEQtCWUQa84CWj+XoSewwSeyrIkqKQvx/gghXblK+GVjVzBn0BD
 hmmiKr8af4DdxfzYdEXJp65cCpIiVMaJiGr20Aj9ObwvWJb4QZbz9q7hnPt6KgiI
 jVND3BpP3OERb4ZFcibdmJT5foKooMcXVG6+luVe+hc1+ZZQxJBsBaqie4brQIFq
 j59NX3HfHH2fr1vVwnVH0CO4tgbgYg9wZ2EivGu6wBWvORjrr7KiSSbOYP68EBtd
 lUoNps+vQtGnfXGwNzAjp1wuknrQYYh4/KMKjep7hiZD39rgyvBpbHbyynKzQCWV
 REe8cXr/nwphsENvBAUBiqY999EWVIxdT2iaVaSA6K/31JQAC5AFyxVK/P2Ke1SK
 rvePZ++iLQ1o4phTxQPNlVUqF9jOrFVVICGwMDqaqSkOsD9YKQdFClfOF/1ntlDz
 V0bs+Y0Pe8AJCd9ESep4X+vHAWRRIb4EQIuwLaX8RJoY+r1fGye9RPthpYYzvXKp
 DI2iJztFO3anzj2i9htNPUFIaiUmIhzEvG32O2If2yc5FL02hMpHPoFx6vHhe6s3
 f8OJ+olsJK+/IIrV8+DHqYvhzylOYIhmRTvIxIxaNDPHkhR1i2RDQ6KKK1YZmsr8
 MjAZ+Ym0GadDivs+wcM6
 =uAMG
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "Primarily just the virtio_pmem driver:

   - virtio_pmem

     The new virtio_pmem facility introduces a paravirtualized
     persistent memory device that allows a guest VM to use DAX
     mechanisms to access a host-file with host-page-cache. It arranges
     for MAP_SYNC to be disabled and instead triggers a host fsync()
     when a 'write-cache flush' command is sent to the virtual disk
     device.

   - Miscellaneous small fixups"

* tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  virtio_pmem: fix sparse warning
  xfs: disable map_sync for async flush
  ext4: disable map_sync for async flush
  dax: check synchronous mapping is supported
  dm: enable synchronous dax
  libnvdimm: add dax_dev sync flag
  virtio-pmem: Add virtio pmem driver
  libnvdimm: nd_region flush callback support
  libnvdimm, namespace: Drop uuid_t implementation detail
2019-07-18 10:52:08 -07:00
Linus Torvalds 9637d51734 for-linus-20190715
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0s1ZEQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpiCEEACE9H/pXoegTTWIVPVajMlsa19UHIeilk4N
 GI7oKSiirQEMZnAOmrEzgB4/0zyYQsVypys0gZlYUD3GJVsXDT3zzjNXL5NpVg/O
 nqwSGWMHBSjWkLbaM40Pb2QLXsYgveptNL+9PtxrgtoYPoT5/+TyrJMFrRfi72EK
 WFeNDKOu6aJxpJ26JSsckJ0gluKeeEpRoEqsgHGIwaMIGHQf+b+ikk7tel5FAIgA
 uDwwD+Oxsdgh/ChsXL0d90GkcbcSp6GQ7GybxVmw/tPijx6mpeIY72xY3Zx+t8zF
 b71UNk6NmCKjOPO/6fiuYKKTYw+KhzlyEKO0j675HKfx2AhchEwKw0irp4yUlydA
 zxWYmz4U7iRgktJtymv3J4FEQQ3S6d1EnuQkQNX1LwiOsEsfzhkWi+7jy7KFhZoJ
 AqtYzqnOXvLx92q0vloj06HtK6zo+I/MINldy0+qn9lq0N0VF+dctyztAHLsF7P6
 pUtS6i7l1JSFKAmMhC31sIj5TImaehM2e/TWMUPEDZaO96oKCmQwOF1oiloc6vlW
 h4xWsxP/9zOFcWNyPzy6Vo3JUXWRvFA7K+jV3Hsukw6rVHiNCGVYGSlTv8Roi5b7
 I4ggu9R2JOGyku7UIlL50IRxEyjAp11LaO8yHhcCnRB65rmyBuNMQNcfOsfxpZ5Y
 1mtSNhm5TQ==
 =g8xI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190715' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
 "A later pull request with some followup items. I had some vacation
  coming up to the merge window, so certain things items were delayed a
  bit. This pull request also contains fixes that came in within the
  last few days of the merge window, which I didn't want to push right
  before sending you a pull request.

  This contains:

   - NVMe pull request, mostly fixes, but also a few minor items on the
     feature side that were timing constrained (Christoph et al)

   - Report zones fixes (Damien)

   - Removal of dead code (Damien)

   - Turn on cgroup psi memstall (Josef)

   - block cgroup MAINTAINERS entry (Konstantin)

   - Flush init fix (Josef)

   - blk-throttle low iops timing fix (Konstantin)

   - nbd resize fixes (Mike)

   - nbd 0 blocksize crash fix (Xiubo)

   - block integrity error leak fix (Wenwen)

   - blk-cgroup writeback and priority inheritance fixes (Tejun)"

* tag 'for-linus-20190715' of git://git.kernel.dk/linux-block: (42 commits)
  MAINTAINERS: add entry for block io cgroup
  null_blk: fixup ->report_zones() for !CONFIG_BLK_DEV_ZONED
  block: Limit zone array allocation size
  sd_zbc: Fix report zones buffer allocation
  block: Kill gfp_t argument of blkdev_report_zones()
  block: Allow mapping of vmalloc-ed buffers
  block/bio-integrity: fix a memory leak bug
  nvme: fix NULL deref for fabrics options
  nbd: add netlink reconfigure resize support
  nbd: fix crash when the blksize is zero
  block: Disable write plugging for zoned block devices
  block: Fix elevator name declaration
  block: Remove unused definitions
  nvme: fix regression upon hot device removal and insertion
  blk-throttle: fix zero wait time for iops throttled group
  block: Fix potential overflow in blk_report_zones()
  blkcg: implement REQ_CGROUP_PUNT
  blkcg, writeback: Implement wbc_blkcg_css()
  blkcg, writeback: Add wbc->no_cgroup_owner
  blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner()
  ...
2019-07-15 21:20:52 -07:00
Eric Sandeen 79ba2a2185 xfs: sync up xfs_trans_inode with userspace
Add an XFS_ICHGTIME_CREATE case to xfs_trans_ichgtime() to keep in
sync with userspace.  (Currently no kernel caller sends this flag.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-15 08:10:34 -07:00
Eric Sandeen 3f6d70e885 xfs: move xfs_trans_inode.c to libxfs/
Userspace now has an identical xfs_trans_inode.c which it has already
moved to libxfs/ so do the same move for kernelspace.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-15 08:10:18 -07:00
Linus Torvalds 4ce9d181eb New stuff for 5.3:
- Refactor inode geometry calculation into a single structure instead of
   open-coding pieces everywhere.
 - Add online repair to build options.
 - Remove unnecessary function call flags and functions.
 - Claim maintainership of various loose xfs documentation and header
   files.
 - Use struct bio directly for log buffer IOs instead of struct xfs_buf.
 - Reduce log item boilerplate code requirements.
 - Merge log item code spread across too many files.
 - Further distinguish between log item commits and cancellations.
 - Various small cleanups to the ag small allocator.
 - Support cgroup-aware writeback
 - libxfs refactoring for mkfs cleanup
 - Remove unneeded #includes
 - Fix a memory allocation miscalculation in the new log bio code
 - Fix bisection problems
 - Fix a crash in ioend processing caused by tripping over freeing of
   preallocated transactions
 - Split out a generic inode walk mechanism from the bulkstat code, hook
   up all the internal users to use the walking code, then clean up
   bulkstat to serve only the bulkstat ioctls.
 - Add a multithreaded iwalk implementation to speed up quotacheck on
   fast storage with many CPUs.
 - Remove unnecessary return values in logging teardown functions.
 - Supplement the bstat and inogrp structures with new bulkstat and
   inumbers structures that have all the fields we need for v5
   filesystem features and none of the padding problems of their
   predecessors.
 - Wire up new ioctls that use the new structures with a much simpler
   bulk_ireq structure at the head instead of the pointerhappy mess we
   had before.
 - Enable userspace to constrain bulkstat returns to a single AG or a
   single special inode so that we can phase out a lot of geometry
   guesswork in userspace.
 - Reduce memory consumption and zeroing overhead in extended attribute
   scrub code.
 - Fix some behavioral regressions in the new bulkstat backend code.
 - Fix some behavioral regressions in the new log bio code.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0mGqwACgkQ+H93GTRK
 tOvUGA//d9+SVDaisbFzwxQVHX2/MPJExKA56kFE4PvSQhIpvPZeGNXUXCZfQXGt
 Hkwm/uEVUdndH9ERo8Z4L2nmT1WQCt3ONUihHn/aE8NSh/g1Bb1J+/t2Wr+Cdqic
 sgU5imhEoMTh0cknoFcGg9PdAJ7ZOv4mr2wzDrj3odq0etP63U+zabxxlX2Lg0VW
 3EXu6YhY+bmq54jdyc4xIL930opfYC64FDVmG7iNK4xlX1M2wJjdEGEP5fEchxf8
 XgsWgJqxRyxJWVH3KTbpXREQkrbj+uDM8MqqFF9uYzjUt2UHc9H206BxJCUfWo7/
 HGhTQmWqXmYMlA+ngEhJ3lcwsgAmHnnLL4YczbprgrQcwIbxMkpw2leEWUGOiLhH
 HdNCnJNVgPcAtmjjDxENqGUR19qslrexgSH2eUrFZFM15/3GHMYT3wdEHyqiZDya
 LiTHllcLDvauUmv8df51jJ+l5dd2Zj9c5iLQUOV5nRGVUj7Fyi7kAuFL+4SVcIbI
 VJVFgtHKsSwKYRPMqb7iTtDnTHkwZNTtQQ1Ptpe+TUH5ngTmqNn5EQVeoUtG6HUi
 0xEaA2MrhoA3VkHk1zFfzStlR8g1YsX5ajBq7luotYY+i3cC4oHLlrvIx9zI7elI
 WMc6Wnn7ozetwehYROVXvgsV+wiqC4D7yWGsnFyhqaS1QakwIM0=
 =VEdn
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "In this release there are a significant amounts of consolidations and
  cleanups in the log code; restructuring of the log to issue struct
  bios directly; new bulkstat ioctls to return v5 fs inode information
  (and fix all the padding problems of the old ioctl); the beginnings of
  multithreaded inode walks (e.g. quotacheck); and a reduction in memory
  usage in the online scrub code leading to reduced runtimes.

   - Refactor inode geometry calculation into a single structure instead
     of open-coding pieces everywhere.

   - Add online repair to build options.

   - Remove unnecessary function call flags and functions.

   - Claim maintainership of various loose xfs documentation and header
     files.

   - Use struct bio directly for log buffer IOs instead of struct
     xfs_buf.

   - Reduce log item boilerplate code requirements.

   - Merge log item code spread across too many files.

   - Further distinguish between log item commits and cancellations.

   - Various small cleanups to the ag small allocator.

   - Support cgroup-aware writeback

   - libxfs refactoring for mkfs cleanup

   - Remove unneeded #includes

   - Fix a memory allocation miscalculation in the new log bio code

   - Fix bisection problems

   - Fix a crash in ioend processing caused by tripping over freeing of
     preallocated transactions

   - Split out a generic inode walk mechanism from the bulkstat code,
     hook up all the internal users to use the walking code, then clean
     up bulkstat to serve only the bulkstat ioctls.

   - Add a multithreaded iwalk implementation to speed up quotacheck on
     fast storage with many CPUs.

   - Remove unnecessary return values in logging teardown functions.

   - Supplement the bstat and inogrp structures with new bulkstat and
     inumbers structures that have all the fields we need for v5
     filesystem features and none of the padding problems of their
     predecessors.

   - Wire up new ioctls that use the new structures with a much simpler
     bulk_ireq structure at the head instead of the pointerhappy mess we
     had before.

   - Enable userspace to constrain bulkstat returns to a single AG or a
     single special inode so that we can phase out a lot of geometry
     guesswork in userspace.

   - Reduce memory consumption and zeroing overhead in extended
     attribute scrub code.

   - Fix some behavioral regressions in the new bulkstat backend code.

   - Fix some behavioral regressions in the new log bio code"

* tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (100 commits)
  xfs: chain bios the right way around in xfs_rw_bdev
  xfs: bump INUMBERS cursor correctly in xfs_inumbers_walk
  xfs: don't update lastino for FSBULKSTAT_SINGLE
  xfs: online scrub needn't bother zeroing its temporary buffer
  xfs: only allocate memory for scrubbing attributes when we need it
  xfs: refactor attr scrub memory allocation function
  xfs: refactor extended attribute buffer pointer functions
  xfs: attribute scrub should use seen_enough to pass error values
  xfs: allow single bulkstat of special inodes
  xfs: specify AG in bulk req
  xfs: wire up the v5 inumbers ioctl
  xfs: wire up new v5 bulkstat ioctls
  xfs: introduce v5 inode group structure
  xfs: introduce new v5 bulkstat structure
  xfs: rename bulkstat functions
  xfs: remove various bulk request typedef usage
  fs: xfs: xfs_log: Change return type from int to void
  xfs: poll waiting for quotacheck
  xfs: multithreaded iwalk implementation
  xfs: refactor INUMBERS to use iwalk functions
  ...
2019-07-12 17:17:51 -07:00
Linus Torvalds 5010fe9f09 New for 5.3:
- Standardize parameter checking for the SETFLAGS and FSSETXATTR ioctls
   (which were the file attribute setters for ext4 and xfs and have now
   been hoisted to the vfs)
 - Only allow the DAX flag to be set on files and directories.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0aJgMACgkQ+H93GTRK
 tOuKkg//SJaxcB63uVPZk9hDraYTmyo9OXRRX6X9WwDKPTWwa88CUwS1ny1QF7Mt
 zMkgzG2/y2Rs9PQ0ARoPbh1hNb2CXnvA+xnzUEev1MW6UN/nTFMZEOPn2ZQ+DxQE
 gg/0U56kKgtjtXzBZVpTgHzSETivdXwHxFW3hiTtyRXg+4ulgDIZLOjN2wRB+Pdb
 X8ZmM6MqKOTbhQEXlw13TlCKBzoMjC1w4UU4rkZPjoSjAaUWiPfrk/XU7qgguf9p
 v1dbSN2dADQ19jzZ1dmggXnlJsRMZjk/ls5rxJlB5DHDbh6YgnA2TE+tYrtH28eB
 uyKfD+RQnMzRVdmH8PsMQRQQFXR2UYyprVP7a6wi6TkB+gytn7sR5uT4sbAhmhcF
 TiTYfYNRXzemHCewyOwOsUE/7oCeiJcdbqiPAHHD/jYLZfRjSXDcGzz3+7ZYZ3GO
 hRxUhpxHPbkmK4T2OxhzReCbRsLN/0BeEcDdLkNWmi2FTh3V1gYzMGkgI9wsVbsd
 pHjoGIHbMPWqktF/obuGq96WVfYBBaWJ6WNzQqKT4dQYAJBW2omxitXQHLpi6cjt
 hG5ncxa3cPpWx4t3Lx2hb0TPS7RyYvuoQIcS/Me2RWioxrwWrgnOqdHFfLEwWpfN
 jRowdWiGgOIsq8hMt7qycmGCXzbgsbaA/7oRqh8TiwM9taPOM4c=
 =uH2E
 -----END PGP SIGNATURE-----

Merge tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull common SETFLAGS/FSSETXATTR parameter checking from Darrick Wong:
 "Here's a patch series that sets up common parameter checking functions
  for the FS_IOC_SETFLAGS and FS_IOC_FSSETXATTR ioctl implementations.

  The goal here is to reduce the amount of behaviorial variance between
  the filesystems where those ioctls originated (ext2 and XFS,
  respectively) and everybody else.

   - Standardize parameter checking for the SETFLAGS and FSSETXATTR
     ioctls (which were the file attribute setters for ext4 and xfs and
     have now been hoisted to the vfs)

   - Only allow the DAX flag to be set on files and directories"

* tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  vfs: only allow FSSETXATTR to set DAX flag on files and dirs
  vfs: teach vfs_ioc_fssetxattr_check to check extent size hints
  vfs: teach vfs_ioc_fssetxattr_check to check project id info
  vfs: create a generic checking function for FS_IOC_FSSETXATTR
  vfs: create a generic checking and prep function for FS_IOC_SETFLAGS
2019-07-12 16:54:37 -07:00
Linus Torvalds 40f06c7995 Changes to copy_file_range for 5.3 from Dave and Amir:
- Create a generic copy_file_range handler and make individual
   filesystems responsible for calling it (i.e. no more assuming that
   do_splice_direct will work or is appropriate)
 - Refactor copy_file_range and remap_range parameter checking where they
   are the same
 - Install missing copy_file_range parameter checking(!)
 - Remove suid/sgid and update mtime like any other file write
 - Change the behavior so that a copy range crossing the source file's
   eof will result in a short copy to the source file's eof instead of
   EINVAL
 - Permit filesystems to decide if they want to handle cross-superblock
   copy_file_range in their local handlers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0BGvAACgkQ+H93GTRK
 tOu2aw/+KGG7PiXm9ED3ZXUppKVddrZMOgqM7mSfHo6TBgW3pJUJcRIhawK0Wz/P
 stgTsOkurHSl3iT3vQyX4GTZvLoGN/rfsRLPxogJptBUqVv3BOrXsrI53f7V/kbm
 rtjlYsgExji7VBUiMTe5kOWWqxyR7B4nXyvY/8rier57rW/8C1I58B0OrxAmTK0k
 rz1e5BtE1dg91xA7cSdEc38FInz8MW8cvsrEzW9vyYY4IVE0PBuhhA1EvryxTrAZ
 hfthHFfzwxhJkI0mdha8uqNufNWrHLSqiwyjYC7pwAwSQzQPiQz9U17flu+URnfF
 kXaR5LdXbBP3pl46RdthrfuonWsv612cC1Qwfjs8PBG9lG7b9PGJ40MGVTiw7LlQ
 924/03ho0zAnV0E8Qn5O9nPshQNDJhwhzMS39EmMyFKb1D5XGzdMV0gDdIfx6hdO
 HDbw6VQ33S59gvk7v/gxsFB5Bs4PKfamHx/QmwQwpqWM5XExcr0yJ90OTBtAuY4r
 S+9gwG6uED3aPh8HbQ5UgnA8bZmMmi8AkcBvqJ9GgNw5SbZl0oyv9Sj6JNpoOejV
 8y9JkhoZUxqiihnKTw/vtMrj5RCOfifNBjMSwrShfLdLKtK0AZl1mXC0/1Q3VnEQ
 TUcyRHEzrtHgJ9/AK9xIyDNvNYzvHSLZj7maoZZumgQa2FOFrmw=
 =qM44
 -----END PGP SIGNATURE-----

Merge tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull copy_file_range updates from Darrick Wong:
 "This fixes numerous parameter checking problems and inconsistent
  behaviors in the new(ish) copy_file_range system call.

  Now the system call will actually check its range parameters
  correctly; refuse to copy into files for which the caller does not
  have sufficient privileges; update mtime and strip setuid like file
  writes are supposed to do; and allows copying up to the EOF of the
  source file instead of failing the call like we used to.

  Summary:

   - Create a generic copy_file_range handler and make individual
     filesystems responsible for calling it (i.e. no more assuming that
     do_splice_direct will work or is appropriate)

   - Refactor copy_file_range and remap_range parameter checking where
     they are the same

   - Install missing copy_file_range parameter checking(!)

   - Remove suid/sgid and update mtime like any other file write

   - Change the behavior so that a copy range crossing the source file's
     eof will result in a short copy to the source file's eof instead of
     EINVAL

   - Permit filesystems to decide if they want to handle
     cross-superblock copy_file_range in their local handlers"

* tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fuse: copy_file_range needs to strip setuid bits and update timestamps
  vfs: allow copy_file_range to copy across devices
  xfs: use file_modified() helper
  vfs: introduce file_modified() helper
  vfs: add missing checks to copy_file_range
  vfs: remove redundant checks from generic_remap_checks()
  vfs: introduce generic_file_rw_checks()
  vfs: no fallback for ->copy_file_range
  vfs: introduce generic_copy_file_range()
2019-07-10 20:32:37 -07:00
Christoph Hellwig 488ca3d8d0 xfs: chain bios the right way around in xfs_rw_bdev
We need to chain the earlier bios to the later ones, so that
submit_bio_wait waits on the bio that all the completions are
dispatched to.

Fixes: 6ad5b3255b ("xfs: use bios directly to read and write the log recovery buffers")
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-10 10:04:16 -07:00
Darrick J. Wong 0df5c39b3e xfs: bump INUMBERS cursor correctly in xfs_inumbers_walk
There's a subtle unit conversion error when we increment the INUMBERS
cursor at the end of xfs_inumbers_walk.  If there's an inode chunk at
the very end of the AG /and/ the AG size is a perfect power of two, the
startino of that last chunk (which is in units of AG inodes) will be 63
less than (1 << agino_log).  If we add XFS_INODES_PER_CHUNK to the
startino, we end up with a startino that's larger than (1 << agino_log)
and when we convert that back to fs inode units we'll rip off that upper
bit and wind up back at the start of the AG.

Fix this by converting to units of fs inodes before adding
XFS_INODES_PER_CHUNK so that we'll harmlessly end up pointing to the
next AG.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-09 08:06:20 -07:00
Darrick J. Wong 211bbf3c38 xfs: don't update lastino for FSBULKSTAT_SINGLE
The kernel test robot found a regression of xfs/054 in the conversion of
bulkstat to use the new iwalk infrastructure -- if a caller set *lastip
= 128 and invoked FSBULKSTAT_SINGLE, the bstat info would be for inode
128, but *lastip would be increased by the kernel to 129.

FSBULKSTAT_SINGLE never incremented lastip before, so it's incorrect to
make such an update to the internal lastino value now.

Fixes: 2810bd6840 ("xfs: convert bulkstat to new iwalk infrastructure")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-06 21:05:10 -07:00
Pankaj Gupta b21fec4140 xfs: disable map_sync for async flush
Dont support 'MAP_SYNC' with non-DAX files and DAX files
with asynchronous dax_device. Virtio pmem provides
asynchronous host page cache flush mechanism. We don't
support 'MAP_SYNC' with virtio pmem and xfs.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-05 15:19:10 -07:00
Darrick J. Wong 036f463fe1 xfs: online scrub needn't bother zeroing its temporary buffer
The xattr scrubber functions use the temporary memory buffer either for
storing bitmaps or for testing if attribute value extraction works.  The
bitmap code always zeroes what it needs and the value extraction sets
the buffer contents, so it's not necessary to waste CPU time zeroing on
allocation.

Note that while we never read the contents that the attr value
extraction function sets, we do need to call it to check the remote
attribute header and CRCs to check for corruption.

A flame graph analysis showed that we were spending 7% of a xfs_scrub
run (the whole program, not just the attr scrubber itself) allocating
and zeroing 64k segments needlessly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-05 10:29:56 -07:00
Darrick J. Wong 6d6ccedd76 xfs: only allocate memory for scrubbing attributes when we need it
In examining a flame graph of time spent running xfs_scrub on various
filesystems, I noticed that we spent nearly 7% of the total runtime on
allocating a zeroed 65k buffer for every SCRUB_TYPE_XATTR invocation.
We do this even if none of the attribute values were anywhere near 64k
in size, even if there were no attribute blocks to check space on, and
even if it just turns out there are no attributes at all.

Therefore, rearrange the xattr buffer setup code to support reallocating
with a bigger buffer and redistribute the callers of that function so
that we only allocate memory just prior to needing it, and only allocate
as much as we need.  If we can't get memory with the ILOCK held we'll
bail out with EDEADLOCK which will allocate the maximum memory.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-05 10:29:56 -07:00
Darrick J. Wong 0081675933 xfs: refactor attr scrub memory allocation function
Move the code that allocates memory buffers for the extended attribute
scrub code into a separate function so we can reduce memory allocations
in the next patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-05 10:29:55 -07:00
Darrick J. Wong 3addd24880 xfs: refactor extended attribute buffer pointer functions
Replace the open-coded attribute buffer pointer calculations with helper
functions to make it more obvious what we're doing with our freeform
memory allocation w.r.t. either storing xattr values or computing btree
block free space.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-05 10:29:55 -07:00
Darrick J. Wong 2c3b83d7ca xfs: attribute scrub should use seen_enough to pass error values
When we're iterating all the attributes using the built-in xattr
iterator, we can use the seen_enough variable to pass error codes back
to the main scrub function instead of flattening them into 0/1.  This
will be used in a more exciting fashion in upcoming patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-05 10:29:54 -07:00
Darrick J. Wong bf3cb39447 xfs: allow single bulkstat of special inodes
Create a new bulk ireq flag that enables userspace to ask us for a
special inode number instead of interpreting @ino as a literal inode
number.  This enables us to query the root inode easily.

The reason for adding the ability to query specifically the root
directory inode is that certain programs (xfsdump and xfsrestore) want
to confirm when they've been pointed to the root directory.  The
userspace code assumes the root directory is always the first result
from calling bulkstat with lastino == 0, but this isn't true if the
(initial btree roots + initial AGFL + inode alignment padding) is itself
long enough to be allocated to new inodes if all of those blocks should
happen to be free at the same time.  Rather than make userspace guess
at internal filesystem state, we provide a direct query.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-04 07:52:24 -07:00
Darrick J. Wong 13d59a2a61 xfs: specify AG in bulk req
Add a new xfs_bulk_ireq flag to constrain the iteration to a single AG.
If the passed-in startino value is zero then we start with the first
inode in the AG that the user passes in; otherwise, we iterate only
within the same AG as the passed-in inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-04 07:52:23 -07:00
Darrick J. Wong fba9760a43 xfs: wire up the v5 inumbers ioctl
Wire up the v5 INUMBERS ioctl.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:28 -07:00
Darrick J. Wong 0448b6f488 xfs: wire up new v5 bulkstat ioctls
Wire up the new v5 BULKSTAT ioctl.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:27 -07:00
Darrick J. Wong 5f19c7fc68 xfs: introduce v5 inode group structure
Introduce a new "v5" inode group structure that fixes the alignment
and padding problems of the existing structure.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:27 -07:00
Darrick J. Wong 7035f9724f xfs: introduce new v5 bulkstat structure
Introduce a new version of the in-core bulkstat structure that supports
our new v5 format features.  This structure also fills the gaps in the
previous structure.  We leave wiring up the ioctls for the next patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:26 -07:00
Darrick J. Wong 8bfe9d1810 xfs: rename bulkstat functions
Rename the bulkstat functions to 'fsbulkstat' so that they match the
ioctl names.  We will be introducing a new set of bulkstat/inumbers
ioctls soon, and it will be important to keep the names straight.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:26 -07:00
Darrick J. Wong 6f71fb6838 xfs: remove various bulk request typedef usage
Remove xfs_bstat_t, xfs_fsop_bulkreq_t, xfs_inogrp_t, and similarly
named compat typedefs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 20:36:25 -07:00
Hariprasad Kelam a7a9250e18 fs: xfs: xfs_log: Change return type from int to void
Change return types of below functions as they never fails
xfs_log_mount_cancel
xlog_recover_cancel
xlog_recover_cancel_intents

fix below issue reported by coccicheck
fs/xfs/xfs_log_recover.c:4886:7-12: Unneeded variable: "error". Return
"0" on line 4926

Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-07-03 08:21:58 -07:00
Darrick J. Wong 3e5a428b26 xfs: poll waiting for quotacheck
Create a pwork destroy function that uses polling instead of
uninterruptible sleep to wait for work items to finish so that we can
touch the softlockup watchdog.  IOWs, gross hack.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 08:21:58 -07:00
Darrick J. Wong 40786717c8 xfs: multithreaded iwalk implementation
Create a parallel iwalk implementation and switch quotacheck to use it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-03 07:33:26 -07:00
Darrick J. Wong 677717fbd4 xfs: refactor INUMBERS to use iwalk functions
Now that we have generic functions to walk inode records, refactor the
INUMBERS implementation to use it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:06 -07:00
Darrick J. Wong 04b8fba2e1 xfs: refactor iwalk code to handle walking inobt records
Refactor xfs_iwalk_ag_start and xfs_iwalk_ag so that the bits that are
particular to bulkstat (trimming the start irec, starting inode
readahead, and skipping empty groups) can be controlled via flags in the
iwag structure.

This enables us to add a new function to walk all inobt records which
will be used for the new INUMBERS implementation in the next patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:06 -07:00
Darrick J. Wong 2b5eb82601 xfs: refactor xfs_iwalk_grab_ichunk
In preparation for reusing the iwalk code for the inogrp walking code
(aka INUMBERS), move the initial inobt lookup and retrieval code out of
xfs_iwalk_grab_ichunk so that we call the masking code only when we need
to trim out the inodes that came before the cursor in the inobt record
(aka BULKSTAT).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:06 -07:00
Darrick J. Wong 688f7c3678 xfs: clean up long conditionals in xfs_iwalk_ichunk_ra
Refactor xfs_iwalk_ichunk_ra to avoid long conditionals.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:06 -07:00
Darrick J. Wong 5e29f3b720 xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino
Now that the inode chunk grabbing function is a static function in the
iwalk code, change its behavior so that @agino is the inode where we
want to /start/ the iteration.  This reduces cognitive friction with the
callers and simplifes the code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong da1d9e5912 xfs: move bulkstat ichunk helpers to iwalk code
Now that we've reworked the bulkstat code to use iwalk, we can move the
old bulkstat ichunk helpers to xfs_iwalk.c.  No functional changes here.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong 938c710d99 xfs: calculate inode walk prefetch more carefully
The existing inode walk prefetch is based on the old bulkstat code,
which simply allocated 4 pages worth of memory and prefetched that many
inobt records, regardless of however many inodes the caller requested.
65536 inodes is a lot to prefetch (~32M on x64, ~512M on arm64) so let's
scale things down a little more intelligently based on the number of
inodes requested, etc.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong 2810bd6840 xfs: convert bulkstat to new iwalk infrastructure
Create a new ibulk structure incore to help us deal with bulk inode stat
state tracking and then convert the bulkstat code to use the new iwalk
iterator.  This disentangles inode walking from bulk stat control for
simpler code and enables us to isolate the formatter functions to the
ioctl handling code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong f16fe3ecde xfs: bulkstat should copy lastip whenever userspace supplies one
When userspace passes in a @lastip pointer we should copy the results
back, even if the @ocount pointer is NULL.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong ebd126a651 xfs: convert quotacheck to use the new iwalk functions
Convert quotacheck to use the new iwalk iterator to dig through the
inodes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong a211432c27 xfs: create simplified inode walk function
Create a new iterator function to simplify walking inodes in an XFS
filesystem.  This new iterator will replace the existing open-coded
walking that goes on in various places.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong 5bb46e3e18 xfs: create iterator error codes
Currently, xfs doesn't have generic error codes defined for "stop
iterating"; we just reuse the XFS_BTREE_QUERY_* return values.  This
looks a little weird if we're not actually iterating a btree index.
Before we start adding more iterators, we should create general
XFS_ITER_{CONTINUE,ABORT} return values and define the XFS_BTREE_QUERY_*
ones from that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-07-02 09:40:05 -07:00
Darrick J. Wong ca29be7534 vfs: teach vfs_ioc_fssetxattr_check to check extent size hints
Move the extent size hint checks that aren't xfs-specific to the vfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-07-01 08:25:36 -07:00
Darrick J. Wong f991492ed1 vfs: teach vfs_ioc_fssetxattr_check to check project id info
Standardize the project id checks for FSSETXATTR.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-07-01 08:25:35 -07:00
Darrick J. Wong 7b0e492e6b vfs: create a generic checking function for FS_IOC_FSSETXATTR
Create a generic checking function for the incoming FS_IOC_FSSETXATTR
fsxattr values so that we can standardize some of the implementation
behaviors.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-07-01 08:25:35 -07:00
Ming Lei 79d08f89bb block: fix .bi_size overflow
'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
bytes.

Before 07173c3ec2 ("block: enable multipage bvecs"), one bio can
include very limited pages, and usually at most 256, so the fs bio
size won't be bigger than 1M bytes most of times.

Since we support multi-page bvec, in theory one fs bio really can
be added > 1M pages, especially in case of hugepage, or big writeback
with too many dirty pages. Then there is chance in which .bi_size
is overflowed.

Fixes this issue by using bio_full() to check if the added segment may
overflow .bi_size.

Cc: Liu Yiding <liuyd.fnst@cn.fujitsu.com>
Cc: kernel test robot <rong.a.chen@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 07173c3ec2 ("block: enable multipage bvecs")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-01 08:18:54 -06:00
Christoph Hellwig 73d30d4874 xfs: remove XFS_TRANS_NOFS
Instead of a magic flag for xfs_trans_alloc, just ensure all callers
that can't relclaim through the file system use memalloc_nofs_save to
set the per-task nofs flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-30 09:05:17 -07:00
Christoph Hellwig fe64e0d26b xfs: simplify xfs_ioend_can_merge
Compare the block layer status directly instead of converting it to
an errno first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-30 09:05:17 -07:00
Christoph Hellwig 7dbae9fbde xfs: allow merging ioends over append boundaries
There is no real problem merging ioends that go beyond i_size into an
ioend that doesn't.  We just need to move the append transaction to the
base ioend.  Also use the opportunity to use a real error code instead
of the magic 1 to cancel the transactions, and write a comment
explaining the scheme.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-30 09:05:17 -07:00
Christoph Hellwig 0290d9c1e5 xfs: fix a comment typo in xfs_submit_ioend
The fail argument is long gone, update the comment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-30 09:05:17 -07:00
Christoph Hellwig 1fdafce55c xfs: remove the unused xfs_count_page_state declaration
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-30 09:05:17 -07:00
Christoph Hellwig 89b171acb2 xfs: fix iclog allocation size
Properly allocate the space for the bio_vecs instead of just one byte
per bio_vec.

Fixes: 79b54d9bfc ("xfs: use bios directly to write log buffers")
Reported-by: syzbot+b75afdbe271a0d7ac4f6@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 21:02:45 -07:00
Eric Sandeen 250d4b4c40 xfs: remove unused header files
There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.

nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this.  I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:43 -07:00
Christoph Hellwig adfb5fb46a xfs: implement cgroup aware writeback
Link every newly allocated writeback bio to cgroup pointed to by the
writeback control structure, and charge every byte written back to it.

Tested-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:22 -07:00
Christoph Hellwig a247373596 xfs: simplify xfs_chain_bio
Move setting up operation and write hint to xfs_alloc_ioend, and
then just copy over all needed information from the previous bio
in xfs_chain_bio and stop passing various parameters to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:22 -07:00
Darrick J. Wong f327a00745 xfs: account for log space when formatting new AGs
When we're writing out a fresh new AG, make sure that we don't list an
internal log as free and that we create the rmap for the region.  growfs
never does this, but we will need it when we hook up mkfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:30:21 -07:00
Darrick J. Wong 8d90857cff xfs: refactor free space btree record initialization
Refactor the code that populates the free space btrees of a new AG so
that we can avoid code duplication once things start getting
complicated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:30:21 -07:00
Brian Foster 7e36a3a63d xfs: always update params on small allocation
xfs_alloc_ag_vextent_small() doesn't update the output parameters in
the event of an AGFL allocation. Instead, it updates the
xfs_alloc_arg structure directly to complete the allocation.

Update both args and the output params to provide consistent
behavior for future callers.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:20 -07:00
Brian Foster 6691cd9267 xfs: skip small alloc cntbt logic on NULL cursor
The small allocation helper is implemented in a way that is fairly
tightly integrated to the existing allocation algorithms. It expects
a cntbt cursor beyond the end of the tree, attempts to locate the
last record in the tree and only attempts an AGFL allocation if the
cntbt is empty.

The upcoming generic algorithm doesn't rely on the cntbt processing
of this function. It will only call this function when the cntbt
doesn't have a big enough extent or is empty and thus AGFL
allocation is the only remaining option. Tweak
xfs_alloc_ag_vextent_small() to handle a NULL cntbt cursor and skip
the cntbt logic. This facilitates use by the existing allocation
code and new code that only requires an AGFL allocation attempt.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:20 -07:00
Brian Foster c63cdd4fc9 xfs: move small allocation helper
Move the small allocation helper further up in the file to avoid the
need for a function declaration. The remaining declarations will be
removed by followup patches. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:19 -07:00
Brian Foster 2a4f35f984 xfs: clean up small allocation helper
xfs_alloc_ag_vextent_small() is kind of a mess. Clean it up in
preparation for future changes. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:19 -07:00
Christoph Hellwig caeaea9858 xfs: merge xfs_trans_bmap.c into xfs_bmap_item.c
Keep all bmap item related code together.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:42 -07:00
Christoph Hellwig 3cfce1e3ce xfs: merge xfs_trans_rmap.c into xfs_rmap_item.c
Keep all rmap item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:41 -07:00
Christoph Hellwig effd5e96e7 xfs: merge xfs_trans_refcount.c into xfs_refcount_item.c
Keep all the refcount item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:41 -07:00
Christoph Hellwig 81f4004173 xfs: merge xfs_trans_extfree.c into xfs_extfree_item.c
Keep all the extree item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:28:17 -07:00
Christoph Hellwig 73f0d23633 xfs: merge xfs_bud_init into xfs_trans_get_bud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:36 -07:00
Christoph Hellwig 60883447f4 xfs: merge xfs_rud_init into xfs_trans_get_rud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:36 -07:00
Christoph Hellwig ebeb8e0629 xfs: merge xfs_cud_init into xfs_trans_get_cud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:35 -07:00
Christoph Hellwig 9c5e7c2ae3 xfs: merge xfs_efd_init into xfs_trans_get_efd
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:35 -07:00
Christoph Hellwig 95cf0e4a0d xfs: remove a pointless comment duplicated above all xfs_item_ops instances
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:34 -07:00
Christoph Hellwig 89ae379d56 xfs: use a list_head for iclog callbacks
Replace the hand grown linked list handling and cil context attachment
with the standard list_head structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:34 -07:00
Christoph Hellwig efe2330fdc xfs: remove the xfs_log_item_t typedef
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:33 -07:00
Christoph Hellwig b3b14aacc6 xfs: don't cast inode_log_items to get the log_item
The cast is not type safe, and we can just dereference the first
member instead to start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:33 -07:00
Christoph Hellwig 9ce632a28a xfs: add a flag to release log items on commit
We have various items that are released from ->iop_comitting.  Add a
flag to just call ->iop_release from the commit path to avoid tons
of boilerplate code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:32 -07:00
Christoph Hellwig ddf92053e4 xfs: split iop_unlock
The iop_unlock method is called when comitting or cancelling a
transaction.  In the latter case, the transaction may or may not be
aborted.  While there is no known problem with the current code in
practice, this implementation is limited in that any log item
implementation that might want to differentiate between a commit and a
cancellation must rely on the aborted state.  The aborted bit is only
set when the cancelled transaction is dirty, however.  This means that
there is no way to distinguish between a commit and a clean transaction
cancellation.

For example, intent log items currently rely on this distinction.  The
log item is either transferred to the CIL on commit or released on
transaction cancel. There is currently no possibility for a clean intent
log item in a transaction, but if that state is ever introduced a cancel
of such a transaction will immediately result in memory leaks of the
associated log item(s).  This is an interface deficiency and landmine.

To clean this up, replace the iop_unlock method with an iop_release
method that is specific to transaction cancel.  The existing
iop_committing method occurs at the same time as iop_unlock in the
commit path and there is no need for two separate callbacks here.
Overload the iop_committing method with the current commit time
iop_unlock implementations to eliminate the need for the latter and
further simplify the interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:32 -07:00
Christoph Hellwig 195cd83d1b xfs: don't use xfs_trans_free_items in the commit path
While commiting items looks very similar to freeing them on error it is
a different operation, and they will diverge a bit soon.

Split out the commit case from xfs_trans_free_items, inline it into
xfs_log_commit_cil and give it a separate trace point.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:31 -07:00
Christoph Hellwig 8e4b20ea83 xfs: remove the dummy iop_push implementation for inode creation items
This method should never be called, so don't waste code on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:31 -07:00
Christoph Hellwig e8b78db77d xfs: don't require log items to implement optional methods
Just check if they are present first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:30 -07:00
Christoph Hellwig d15cbf2f38 xfs: stop using XFS_LI_ABORTED as a parameter flag
Just pass a straight bool aborted instead of abusing XFS_LI_ABORTED as a
flag in function parameters.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:30 -07:00
Christoph Hellwig 086252c34b xfs: fix a trivial comment typo in xfs_trans_committed_bulk
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:29 -07:00
Christoph Hellwig dbd329f1e4 xfs: add struct xfs_mount pointer to struct xfs_buf
We need to derive the mount pointer from a buffer in a lot of place.
Add a direct pointer to short cut the pointer chasing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:29 -07:00
Christoph Hellwig 8124b9b601 xfs: remove the b_io_length field in struct xfs_buf
This field is now always idential to b_length.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:28 -07:00
Christoph Hellwig e99b4bd0cb xfs: properly type the b_log_item field in struct xfs_buf
Now that the log code doesn't abuse this field any more we can
declare it as a struct xfs_buf_log_item pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:28 -07:00
Christoph Hellwig 0564501ff5 xfs: remove unused buffer cache APIs
Now that the log code uses bios directly we can drop various special
cases in the buffer cache code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:27 -07:00
Christoph Hellwig 6e9b3dd80f xfs: stop using bp naming for log recovery buffers
Now that we don't use struct xfs_buf to hold log recovery buffer rename
the related functions and variables to just talk of a buffer instead of
using the bp name that we usually use for xfs_buf related functionality.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:27 -07:00
Christoph Hellwig 6ad5b3255b xfs: use bios directly to read and write the log recovery buffers
The xfs_buf structure is basically used as a glorified container for
a memory allocation in the log recovery code.  Replace it with a
call to kmem_alloc_large and a simple abstraction to read into or
write from it synchronously using chained bios.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:26 -07:00
Christoph Hellwig 18ffb8c3f0 xfs: return an offset instead of a pointer from xlog_align
This simplifies both the helper and the callers.  We lost a bit of
size sanity checking, but that is already covered by KASAN if needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:26 -07:00
Christoph Hellwig 1058d0f5ee xfs: move the log ioend workqueue to struct xlog
Move the workqueue used for log I/O completions from struct xfs_mount
to struct xlog to keep it self contained in the log code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: destroy the log workqueue after ensuring log ios are done]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:25 -07:00
Christoph Hellwig 79b54d9bfc xfs: use bios directly to write log buffers
Currently the XFS logging code uses the xfs_buf structure and
associated APIs to write the log buffers to disk.  This requires
various special cases in the log code and is generally not very
optimal.

Instead of using a buffer just allocate a kmem_alloc_larger region for
each log buffer, and use a bio and bio_vec array embedded in the iclog
structure to write the buffer to disk.  This also allows for using
the bio split and chaining case to deal with the case of a log
buffer wrapping around the end of the log.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: don't split if/else with an #endif]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:25 -07:00
Christoph Hellwig 2d15d2c0e0 xfs: make use of the l_targ field in struct xlog
Use the slightly shorter way to get at the buftarg for the log device
wherever we can in the log and log recovery code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:24 -07:00
Christoph Hellwig abca1f33f8 xfs: remove the syncing argument from xlog_verify_iclog
The only caller unconditionally passes true here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:24 -07:00
Christoph Hellwig 9b0489c1d1 xfs: update both stat counters together in xlog_sync
Just a small bit of code tidying up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:23 -07:00
Christoph Hellwig db0a6faf93 xfs: factor out iclog size calculation from xlog_sync
Split out another self-contained bit of code from xlog_sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:23 -07:00
Christoph Hellwig 5693384805 xfs: factor out splitting of an iclog from xlog_sync
Split out a self-contained chunk of code from xlog_sync that calculates
the split offset for an iclog that wraps the log end and bumps the
cycles for the second half.

Use the chance to bring some sanity to the variables used to track the
split in xlog_sync by not changing the count variable, and instead use
split as the offset for the split and use those to calculate the
sizes and offsets for the two write buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:22 -07:00
Christoph Hellwig 94860a301b xfs: factor out log buffer writing from xlog_sync
Replace the not very useful xlog_bdstrat wrapper with a new version that
that takes care of all the common logic for writing log buffers.  Use
the opportunity to avoid overloading the buffer address with the log
relative address, and to shed the unused return value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:22 -07:00
Christoph Hellwig 1f9489be02 xfs: don't use REQ_PREFLUSH for split log writes
If we have to split a log write because it wraps the end of the log we
can't just use REQ_PREFLUSH to flush before the first log write,
as the writes might get reordered somewhere in the I/O stack.  Issue
a manual flush in that case so that the ordering of the two log I/Os
doesn't matter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:21 -07:00
Christoph Hellwig 366fc4b898 xfs: remove XLOG_STATE_IOABORT
This value is the only flag in ic_state, which we otherwise use as
a state.  Switch it to a new debug-only field and also report and
actual error in the buffer in the I/O completion path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:21 -07:00
Christoph Hellwig 9bff313253 xfs: reformat xlog_get_lowest_lsn
Reformat xlog_get_lowest_lsn to our usual style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:20 -07:00
Christoph Hellwig 4f62282a36 xfs: cleanup xlog_get_iclog_buffer_size
We don't really need all the messy branches in the function, as it
really does three things, out of which 2 are common for all branches:

 1) set up mount point log buffer size and count values if not already
    done from mount options
 2) calculate the number of log headers
 3) set up all the values in struct xlog based on the above

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:20 -07:00
Christoph Hellwig 76ce9823ac xfs: remove the l_iclog_size_log field from struct xlog
This field is never used, so we can simply kill it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:19 -07:00
Christoph Hellwig 72945d86dd xfs: make mem_to_page available outside of xfs_buf.c
Rename the function to kmem_to_page and move it to kmem.h together
with our kmem_large allocator that may either return kmalloced or
vmalloc pages.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:19 -07:00
Christoph Hellwig ce89755cdf xfs: renumber XBF_WRITE_FAIL
Assining a numerical value that is not close to the flags
defined near by is just asking for conflicts later on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:18 -07:00
Christoph Hellwig 153fd7b57c xfs: remove the never used _XBF_COMPOUND flag
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:18 -07:00
Christoph Hellwig 1e85a3670d xfs: remove the no-op spinlock_destroy stub
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:17 -07:00
Darrick J. Wong 5467b34bd1 xfs: move xfs_ino_geometry to xfs_shared.h
The inode geometry structure isn't related to ondisk format; it's
support for the mount structure.  Move it to xfs_shared.h.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:25:35 -07:00
Christoph Hellwig ff896738be block: return from __bio_try_merge_page if merging occured in the same page
We currently have an input same_page parameter to __bio_try_merge_page
to prohibit merging in the same page.  The rationale for that is that
some callers need to account for every page added to a bio.  Instead of
letting these callers call twice into the merge code to account for the
new vs existing page cases, just turn the paramter into an output one that
returns if a merge in the same page occured and let them act accordingly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-17 09:33:02 -06:00
Eric Sandeen f5b999c03f xfs: remove unused flag arguments
There are several functions which take a flag argument that is
only ever passed as "0," so remove these arguments.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 09:00:00 -07:00
Christoph Hellwig 76dee76921 xfs: remove the debug-only q_transp field from struct xfs_dquot
The field is only used for a few assertations.  Shrink the dqout
structure instead, similarly to what commit f3ca87389d
("xfs: remove i_transp") did for the xfs_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:59 -07:00
Christoph Hellwig f9a196ee5a xfs: merge xfs_buf_zero and xfs_buf_iomove
xfs_buf_zero is the only caller of xfs_buf_iomove.  Remove support
for copying from or to the buffer in xfs_buf_iomove and merge the
two functions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:59 -07:00
Eric Sandeen 8c9ce2f707 xfs: remove unused flags arg from getsb interfaces
The flags value is always passed as 0 so remove the argument.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:58 -07:00
Eric Sandeen d03a2f1b9f xfs: include WARN, REPAIR build options in XFS_BUILD_OPTIONS
The XFS_BUILD_OPTIONS string, shown at module init time and
in modinfo output, does not currently include all available
build options.  So, add in CONFIG_XFS_WARN and CONFIG_XFS_REPAIR.

It has been suggested in some quarters
That this is not enough.
Well ...

Anybody who would like to see this in a sysfs file can send
a patch.  :)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 4b4d98cca3 xfs: finish converting to inodes_per_cluster
Finish converting all the old inode_cluster_size >> inopblog users to
inodes_per_cluster.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 490d451fa5 xfs: fix inode_cluster_size rounding mayhem
inode_cluster_size is supposed to represent the size (in bytes) of an
inode cluster buffer.  We avoid having to handle multiple clusters per
filesystem block on filesystems with large blocks by openly rounding
this value up to 1 FSB when necessary.  However, we never reset
inode_cluster_size to reflect this new rounded value, which adds to the
potential for mistakes in calculating geometries.

Fix this by setting inode_cluster_size to reflect the rounded-up size if
needed, and special-case the few places in the sparse inodes code where
we actually need the smaller value to validate on-disk metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 494dba7b27 xfs: refactor inode geometry setup routines
Migrate all of the inode geometry setup code from xfs_mount.c into a
single libxfs function that we can share with xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong ef32595999 xfs: separate inode geometry
Separate the inode geometry information into a distinct structure.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Amir Goldstein 8c3f406c09 xfs: use file_modified() helper
Note that by using the helper, the order of calling file_remove_privs()
after file_update_mtime() in xfs_file_aio_write_checks() has changed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Linus Torvalds 01047631df Changes since last update:
- Fix some forgotten strings in a log debugging function
 - Fix incorrect unit conversion in online fsck code
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlz1Ur0ACgkQ+H93GTRK
 tOshcA//X+tEFZGF+oosQ3h/IJqr9pODZfVf76mQIpg/5IU7WUzONGjTsjiHyH93
 zC2qw7l7m/3LyCSXfuJfuWOCyPXRRj7Dv+iUCbjmAh0OAn6Alpa1VN9jxABTeTuY
 xeCoNdRBCg6wF2XLLVgEN+VEdrFc7F7x/4OyoW5dmmBQakNOWlVgLtQqO+7gK/hS
 Qc+xw9ekKvkceHi//NXJTSXAZ0EfntzNZ/Fg41O2cztWV4oKxl2a4ej+j0g8/u9e
 rabTP54RH8dNJIejRmWU3dxz/w7OHSOO84LW/Q4LMKykfhLV6lhPL25igy1eLNhY
 OpcCWQio4ZKBwd8UoXCH0HA4dprusPipI1JIXp/YDHGFb0PumhkZaO/1xM0G5J+o
 3n1A1m7MW16jXAwlwBq0A6UNUFGyMfhfQiCluikBwkpR5dhWQoLQ5/IWZxs7ibUy
 0M5POwnYUsx7xcx3/TjUsLaGbBrqD/F4LXXGmxVJQa6Dt6B0ctyLDnjN08mNwR0e
 4Kvck4yxUHrUoXGaOhtddZmloVcrY58CwwHPabwoA3qC39ktKvp5PPrTNPiao8Ew
 oILteFrpQxPfg1AItXO/QMcGvj7ragHSA2O0XqD5Tm61jWErfpIAyRQ89oXT5GUt
 Z52uOI164u/8T4S/HUvi9tqYuZ+xudG4krIClDCXBaww1W2xNjo=
 =WFWW
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are a couple more bug fixes for 5.2. Changes since last update:

   - Fix some forgotten strings in a log debugging function

   - Fix incorrect unit conversion in online fsck code"

* tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: inode btree scrubber should calculate im_boffset correctly
  xfs: fix broken log reservation debugging
2019-06-06 12:36:54 -07:00
Darrick J. Wong 025197ebb0 xfs: inode btree scrubber should calculate im_boffset correctly
The im_boffset field is in units of bytes, whereas XFS_INO_OFFSET
returns a value in units of inodes.  Convert the units so that scrub on
a 64k-block filesystem works correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-06-03 09:18:40 -07:00
Darrick J. Wong d31d718528 xfs: fix broken log reservation debugging
xlog_print_tic_res() is supposed to print a human readable string for
each element of the log ticket reservation array.  Unfortunately, I
forgot to update the string array when we added rmap & reflink support,
so the debug message prints "region[3]: (null) - 352 bytes" which isn't
useful at all.  Add the missing elements and add a build check so that
we don't forget again to add a string when adding a new XLOG_REG_TYPE.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-05-24 07:32:01 -07:00
Linus Torvalds 4dde821e42 Fixes for 5.1:
- Fix an accounting mistake where we included the log space when
   calculating the reserve space for metadata expansion.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlzkEpIACgkQ+H93GTRK
 tOuuXg//Zjuz7KxWtfMRNvNPL3w9YTuu0FyvlQRwFeIKOu9oc2bq+3JIEjakFNM0
 C4htke3Q0CWEokLGNMxjTg53/IbWlrq2G3mszjyQbH/xTidO+ULm6A6z7aIYa4Wj
 hShpHnkd12F543nUgsDUg4+caHrfqlU2ZOkpn7HtmVsCASzNOeBBINSFDUAGqouB
 gWAq17PeBwnUMJLBFvOW4CMcfYfxOkhM4XsAnmJn1+Mk+CVVichgXy8HsxIqOlGT
 wUbt4iVYLcoAd9dsRkTpiCE+ni5AtfyF2aCtMb39Vm18wzGhokIvHlH+IVqdR0k8
 fUoexMPx7XsDWbm0MR5n8uirQUfL6RIBxxVdSwFRNvngmYiAdvxiNlC+rzZm+kdb
 r4tKk+SNXMIonv73bmwBBEuu4LRhb77Ls4jc+Q/6oRp1N6lf16QlmPzAyTmu0jRH
 Uxe3Nqa0PlerjiU6sgC45Dtym3loF9Cvun2f8JJHEYrapVpDgmWdH+l5bQoENpV5
 GH5T90RjjEE+L2PGg83xVBLIBElJ05npzT73SkyfMacBPTR4ziYNX2blPnHFU+es
 VyZG6IqyjINTsiS3Y13SBd658BhEUrSsxaMJo3zwOMJRfVcqEPOIa+DLPt3Mok5i
 Pw382Ym4aNdHNTO0g83XOdtyHt8qAxM330Sayiw2ac/FGwtnoEM=
 =C52a
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.2-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Fix an accounting mistake where we included the log space when
  calculating the reserve space for metadata expansion"

* tag 'xfs-5.2-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: don't reserve per-AG space for an internal log
2019-05-23 11:18:18 -07:00
Thomas Gleixner ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Darrick J. Wong 5cd213b0fe xfs: don't reserve per-AG space for an internal log
It turns out that the log can consume nearly all the space in an AG, and
when this happens this it's possible that there will be less free space
in the AG than the reservation would try to hide.  On a debug kernel
this can trigger an ASSERT in xfs/250:

XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319

The log is permanently allocated, so we know we're never going to have
to expand the btrees to hold any records associated with the log space.
We therefore can treat the space as if it doesn't exist.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-05-20 11:25:39 -07:00