linux-brain/fs
Anand Jain fb4c7d2923 btrfs: fix lockdep warning while mounting sprout fs
[ Upstream commit c124706900c20dee70f921bb3a90492431561a0a ]

Following test case reproduces lockdep warning.

  Test case:

  $ mkfs.btrfs -f <dev1>
  $ btrfstune -S 1 <dev1>
  $ mount <dev1> <mnt>
  $ btrfs device add <dev2> <mnt> -f
  $ umount <mnt>
  $ mount <dev2> <mnt>
  $ umount <mnt>

The warning claims a possible ABBA deadlock between the threads
initiated by [#1] btrfs device add and [#0] the mount.

  [ 540.743122] WARNING: possible circular locking dependency detected
  [ 540.743129] 5.11.0-rc7+ #5 Not tainted
  [ 540.743135] ------------------------------------------------------
  [ 540.743142] mount/2515 is trying to acquire lock:
  [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs]
  [ 540.743458] but task is already holding lock:
  [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs]
  [ 540.743541] which lock already depends on the new lock.
  [ 540.743543] the existing dependency chain (in reverse order) is:

  [ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}:
  [ 540.743566] down_read_nested+0x48/0x2b0
  [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs]
  [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs]
  [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs]
  [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs]
  [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex
  [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs]
  [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs]
  [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs]
  [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs]
  [ 540.744166] __x64_sys_ioctl+0xe4/0x140
  [ 540.744170] do_syscall_64+0x4b/0x80
  [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9

  [ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}:
  [ 540.744184] __lock_acquire+0x155f/0x2360
  [ 540.744188] lock_acquire+0x10b/0x5c0
  [ 540.744190] __mutex_lock+0xb1/0xf80
  [ 540.744193] mutex_lock_nested+0x27/0x30
  [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs]
  [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs]
  [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs]
  [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs]
  [ 540.744472] legacy_get_tree+0x38/0x90
  [ 540.744475] vfs_get_tree+0x30/0x140
  [ 540.744478] fc_mount+0x16/0x60
  [ 540.744482] vfs_kern_mount+0x91/0x100
  [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs]
  [ 540.744536] legacy_get_tree+0x38/0x90
  [ 540.744537] vfs_get_tree+0x30/0x140
  [ 540.744539] path_mount+0x8d8/0x1070
  [ 540.744541] do_mount+0x8d/0xc0
  [ 540.744543] __x64_sys_mount+0x125/0x160
  [ 540.744545] do_syscall_64+0x4b/0x80
  [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9

  [ 540.744551] other info that might help us debug this:
  [ 540.744552] Possible unsafe locking scenario:

  [ 540.744553] CPU0 				CPU1
  [ 540.744554] ---- 				----
  [ 540.744555] lock(btrfs-chunk-00);
  [ 540.744557] 					lock(&fs_devs->device_list_mutex);
  [ 540.744560] 					lock(btrfs-chunk-00);
  [ 540.744562] lock(&fs_devs->device_list_mutex);
  [ 540.744564]
   *** DEADLOCK ***

  [ 540.744565] 3 locks held by mount/2515:
  [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450
  [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs]
  [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs]
  [ 540.744708]
   stack backtrace:
  [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5

But the device_list_mutex in clone_fs_devices() is redundant, as
explained below.  Two threads [1]  and [2] (below) could lead to
clone_fs_device().

  [1]
  open_ctree <== mount sprout fs
   btrfs_read_chunk_tree()
    mutex_lock(&uuid_mutex) <== global lock
    read_one_dev()
     open_seed_devices()
      clone_fs_devices() <== seed fs_devices
       mutex_lock(&orig->device_list_mutex) <== seed fs_devices

  [2]
  btrfs_init_new_device() <== sprouting
   mutex_lock(&uuid_mutex); <== global lock
   btrfs_prepare_sprout()
     lockdep_assert_held(&uuid_mutex)
     clone_fs_devices(seed_fs_device) <== seed fs_devices

Both of these threads hold uuid_mutex which is sufficient to protect
getting the seed device(s) freed while we are trying to clone it for
sprouting [2] or mounting a sprout [1] (as above). A mounted seed device
can not free/write/replace because it is read-only. An unmounted seed
device can be freed by btrfs_free_stale_devices(), but it needs
uuid_mutex.  So this patch removes the unnecessary device_list_mutex in
clone_fs_devices().  And adds a lockdep_assert_held(&uuid_mutex) in
clone_fs_devices().

Reported-by: Su Yue <l@damenly.su>
Tested-by: Su Yue <l@damenly.su>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-26 14:07:12 +02:00
..
9p 9P: Cast to loff_t before multiplying 2020-11-05 11:43:34 +01:00
adfs Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 11:33:22 -07:00
affs fs/affs: release old buffer head on error path 2021-03-04 10:26:48 +01:00
afs afs: Fix tracepoint string placement with built-in AFS 2021-07-28 13:30:58 +02:00
autofs autofs: fix a leak in autofs_expire_indirect() 2019-10-25 00:03:11 -04:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:48:39 +01:00
btrfs btrfs: fix lockdep warning while mounting sprout fs 2021-09-26 14:07:12 +02:00
cachefiles cachefiles: Handle readpage error correctly 2020-11-05 11:43:36 +01:00
ceph ceph: lockdep annotations for try_nonblocking_invalidate 2021-09-26 14:07:11 +02:00
cifs cifs: fix wrong release in sess_alloc_buffer() failed path 2021-09-22 12:26:35 +02:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: fix memleak in configfs_release_bin_file 2021-07-14 16:53:46 +02:00
cramfs cramfs: fix usage on non-MTD device 2019-11-23 21:44:49 -05:00
crypto fscrypt: add fscrypt_symlink_getattr() for computing st_size 2021-09-12 08:56:38 +02:00
debugfs debugfs: Return error during {full/open}_proxy_open() on rmmod 2021-09-15 09:47:33 +02:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm fs: dlm: fix memory leak when fenced 2021-07-14 16:53:17 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:05:19 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-12-02 08:49:53 +01:00
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: add unsupported inode i_format check 2021-05-11 14:04:02 +02:00
exportfs exportfs_decode_fh(): negative pinned may become positive without the parent locked 2019-11-10 11:56:05 -05:00
ext2 ext2: don't update mtime on COW faults 2020-09-09 19:12:30 +02:00
ext4 ext4: report correct st_size for encrypted symlinks 2021-09-12 08:56:38 +02:00
f2fs f2fs: fix to unmap pages from userspace process in punch_hole() 2021-09-22 12:26:26 +02:00
fat fat: don't allow to mount if the FAT length == 0 2020-06-17 16:40:36 +02:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache fscache: Fix cookie key hashing 2021-09-22 12:26:25 +02:00
fuse fuse: fix use after free in fuse_read_interrupt() 2021-09-22 12:26:43 +02:00
gfs2 gfs2: Don't call dlm after protocol is unmounted 2021-09-22 12:26:33 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:19:38 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:08:29 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:24:14 +02:00
hpfs fs: hpfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
hugetlbfs hugetlbfs: fix mount mode command line processing 2021-07-28 13:31:01 +02:00
iomap mm/swap: consider max pages in iomap_swapfile_add_extent 2021-09-15 09:47:35 +02:00
isofs isofs: joliet: Fix iocharset=utf8 mount option 2021-09-15 09:47:27 +02:00
jbd2 jbd2: fix up sparse warnings in checkpoint code 2020-11-18 19:20:30 +01:00
jffs2 jffs2: check the validity of dstlen in jffs2_zlib_compress() 2021-05-11 14:04:16 +02:00
jfs fs/jfs: Fix missing error code in lmLogInit() 2021-07-20 16:10:42 +02:00
kernfs kernfs: do not call fsnotify() with name without a parent 2020-08-19 08:16:12 +02:00
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-22 12:26:34 +02:00
minix fs/minix: remove expected error message in block_to_path() 2020-08-21 13:05:37 +02:00
nfs NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times 2021-07-20 16:10:50 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:51:22 +01:00
nfsd nfsd4: Fix forced-expiry locking 2021-09-15 09:47:35 +02:00
nilfs2 nilfs2: use refcount_dec_and_lock() to fix potential UAF 2021-09-26 14:07:09 +02:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify fanotify: fix ignore mask logic for events on child and on dir 2020-06-17 16:40:24 +02:00
ntfs ntfs: fix validity check for file name attribute 2021-07-14 16:53:01 +02:00
ocfs2 ocfs2: issue zeroout to EOF blocks 2021-08-04 12:27:37 +02:00
omfs fs: omfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs orangefs: fix orangefs df output. 2021-07-20 16:10:48 +02:00
overlayfs ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup() 2021-09-22 12:26:37 +02:00
proc mm, oom: make the calculation of oom badness more accurate 2021-09-03 10:08:12 +02:00
pstore pstore: Fix typo in compression option name 2021-03-04 10:26:45 +01:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
quota quota: Fix memory leak when handling corrupted quota file 2021-03-04 10:26:26 +01:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-29 09:57:53 +01:00
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:21:05 +02:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-26 10:40:51 +02:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:08:29 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2021-03-07 12:20:48 +01:00
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs tracing: Do not create tracefs files if tracefs lockdown is in effect 2019-10-12 20:49:07 -04:00
ubifs ubifs: report correct st_size for encrypted symlinks 2021-09-12 08:56:39 +02:00
udf udf_get_extendedattr() had no boundary checks. 2021-09-15 09:47:28 +02:00
ufs fs/ufs: avoid potential u32 multiplication overflow 2020-08-21 13:05:37 +02:00
unicode unicode: make array 'token' static const, makes object smaller 2019-09-17 11:48:24 -04:00
verity fs-verity: support builtin file signatures 2019-08-12 19:33:50 -07:00
xfs xfs: Fix assert failure in xfs_setattr_size() 2021-03-07 12:20:42 +01:00
aio.c aio: fix async fsync creds 2020-06-17 16:40:24 +02:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c utimes: Clamp the timestamps in notify_change() 2020-02-11 04:35:12 -08:00
bad_inode.c
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf_fdpic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
binfmt_elf.c fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() 2020-06-03 08:21:27 +02:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-09-03 11:26:39 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:03:57 +01:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c block: reexpand iov_iter after read/write 2021-05-22 11:38:29 +02:00
buffer.c fs: Don't invalidate page buffers in block_write_full_page() 2020-11-05 11:43:24 +01:00
char_dev.c chardev: Avoid potential use-after-free in 'chrdev_open()' 2020-01-14 20:08:18 +01:00
compat_binfmt_elf.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
compat_ioctl.c fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP 2020-01-09 10:20:05 +01:00
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
coredump.c coredump: fix core_pattern parse error 2020-12-11 13:23:30 +01:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-29 09:57:45 +01:00
dax.c dax: fix ENOMEM handling in grab_mapping_entry() 2021-07-14 16:53:25 +02:00
dcache.c fix dget_parent() fastpath race 2020-10-01 13:17:19 +02:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:24:11 +02:00
drop_caches.c fs: avoid softlockups in s_inodes iterators 2020-01-12 12:21:37 +01:00
eventfd.c eventfd: track eventfd_signal() recursion depth 2020-02-11 04:35:37 -08:00
eventpoll.c ep_create_wakeup_source(): dentry name can change under you... 2020-10-07 08:01:31 +02:00
exec.c exec: Transform exec_update_mutex into a rw_semaphore 2021-01-09 13:44:55 +01:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-15 09:47:28 +02:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file_table.c vfs: Export flush_delayed_fput for use by knfsd. 2019-08-19 11:00:39 -04:00
file.c fix multiplication overflow in copy_fdtable() 2020-05-27 17:46:12 +02:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-17 10:50:21 +02:00
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c vfs: Make fs_parse() handle fs_param_is_fd-type params better 2019-09-12 21:06:14 -04:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c
fs-writeback.c writeback: fix obtain a reference to a freeing memcg css 2021-07-14 16:53:35 +02:00
fsopen.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
inode.c futex: Fix inode life-time issue 2020-03-25 08:25:58 +01:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:19:37 +02:00
io_uring.c io_uring: Fix current->fs handling in io_sq_wq_submit_work() 2021-01-30 13:54:10 +01:00
ioctl.c compat_ioctl: add compat_ptr_ioctl() 2019-12-17 19:55:30 +01:00
Kconfig fs-verity for 5.4 2019-09-18 16:59:14 -07:00
Kconfig.binfmt binfmt_flat: make support for old format binaries optional 2019-06-24 09:16:47 +10:00
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-24 13:29:19 +01:00
locks.c locks: reinstate locks_delete_block optimization 2020-03-25 08:25:41 +01:00
Makefile fs-verity for 5.4 2019-09-18 16:59:14 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c fs: move guard_bio_eod() after bio_set_op_attrs 2020-01-17 19:48:21 +01:00
namei.c namei: only return -ECHILD from follow_dotdot_rcu() 2020-03-05 16:43:48 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:36:22 -04:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c vfs: Convert nsfs to use the new mount API 2019-05-25 18:00:06 -04:00
open.c cifs_atomic_open(): fix double-put on late allocation failure 2020-03-18 07:17:51 +01:00
pipe.c pipe: increase minimum default pipe size to 2 pages 2021-08-12 13:21:02 +02:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-05-02 08:48:44 +02:00
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:03:33 +01:00
posix_acl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c fs: allow deduplication of eof block into the end of the destination file 2020-02-11 04:35:23 -08:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 12:56:16 +02:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-24 11:26:44 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:10:54 +02:00
signalfd.c fs/signalfd.c: fix inconsistent return codes for signalfd4 2020-08-26 10:40:58 +02:00
splice.c splice: only read in as much information as there is pipe buffer space 2019-12-17 19:56:52 +01:00
stack.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stat.c
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c vfs: remove lockdep bogosity in __sb_start_write 2020-11-24 13:29:01 +01:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c timerfd: Prepare for PREEMPT_RT 2019-08-01 20:51:23 +02:00
userfaultfd.c userfaultfd: prevent concurrent API initialization 2021-09-22 12:26:26 +02:00
utimes.c utimes: Clamp the timestamps in notify_change() 2020-02-11 04:35:12 -08:00
xattr.c xattr: break delegations in {set,remove}xattr 2020-08-11 15:33:39 +02:00