linux-brain/fs/btrfs
ethanwu 27afc71283 btrfs: backref, use correct count to resolve normal data refs
commit b25b0b871f206936d5bca02b80d38c05623e27da upstream.

With the following patches:

- btrfs: backref, only collect file extent items matching backref offset
- btrfs: backref, not adding refs from shared block when resolving normal backref
- btrfs: backref, only search backref entries from leaves of the same root

we only collect the normal data refs we want, so the imprecise upper
bound, total_refs of that EXTENT_ITEM, can now be replaced by the count
of the normal backref entry we want to search.

Background and how the patches fit together:

Btrfs has two types of data backref.
For the BTRFS_EXTENT_DATA_REF_KEY type of backref, we don't have the
exact block number, so we need to call resolve_indirect_refs.
It uses btrfs_search_slot to locate the leaf block. Then
we walk through the leaves to search for the EXTENT_DATA items
whose disk bytenr matches the extent item (add_all_parents).

When resolving indirect refs, we could pick up entries that don't
belong to the backref entry we are currently searching for.
For that reason, when searching for a backref entry, we always use the
total refs of that EXTENT_ITEM rather than the individual count.

For example:
item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize
  extent refs 24 gen 7302 flags DATA
  shared data backref parent 394985472 count 10 #1
  extent data backref root 257 objectid 260 offset 1048576 count 3 #2
  extent data backref root 256 objectid 260 offset 65536 count 6 #3
  extent data backref root 257 objectid 260 offset 65536 count 5 #4

When searching for backref entry #4 above, we use total_refs = 24,
a very loose loop ending condition, instead of the entry's own count
of 5.

But using total_refs = 24 is not accurate. Often we will never find
that many refs from a specific root. As a result, the loop keeps going
until we reach the end of that inode.

The first 3 patches handle 3 different types of refs we might encounter.
These refs do not belong to the normal backref we are searching for, and
hence need to be skipped.

This patch changes total_refs to the correct number so that we can
end the loop as soon as we find all the refs we want.
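
The effect of the loop ending condition can be sketched with a toy
model. This is not the kernel code: struct file_extent, scan_inode and
demo_visited are hypothetical names made up for illustration, and the
real walk lives in add_all_parents in backref.c. The sketch assumes an
inode with 1000 file extent items, of which only the first 5 match
backref entry #4 from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a file extent item in a leaf. */
struct file_extent {
	uint64_t disk_bytenr;    /* start of the referenced data extent */
	uint64_t backref_offset; /* file offset minus offset into the extent */
};

/*
 * Walk the items of one inode in order and count those matching the
 * wanted extent and backref offset. The walk stops once `limit` matches
 * were found or the items run out. Returns the number of items visited
 * and stores the number of matches in *found.
 */
static size_t scan_inode(const struct file_extent *items, size_t nr,
			 uint64_t bytenr, uint64_t offset,
			 uint64_t limit, uint64_t *found)
{
	size_t visited = 0;
	uint64_t count = 0;

	assert(items != NULL && found != NULL);
	while (visited < nr && count < limit) {
		const struct file_extent *fe = &items[visited++];

		if (fe->disk_bytenr == bytenr && fe->backref_offset == offset)
			count++;
	}
	*found = count;
	return visited;
}

#define NR_ITEMS 1000

/* Build the assumed inode and report how many items a given limit visits. */
static size_t demo_visited(uint64_t limit)
{
	struct file_extent items[NR_ITEMS];
	uint64_t found;
	size_t i;

	for (i = 0; i < NR_ITEMS; i++) {
		/* Only the first 5 items reference the extent of entry #4. */
		items[i].disk_bytenr = (i < 5) ? 40831553536ULL : 0;
		items[i].backref_offset = 65536;
	}
	return scan_inode(items, NR_ITEMS, 40831553536ULL, 65536, limit, &found);
}
```

In this model demo_visited(24) walks all 1000 items because 24 matches
are never reached, while demo_visited(5) stops after the 5th item.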

btrfs send uses backref to find possible clone sources. The following
is a simple test to compare the results with and without this patch:

 $ btrfs subvolume create /sub1
 $ for i in `seq 1 163840`; do
     dd if=/dev/zero of=/sub1/file bs=64K count=1 seek=$((i-1)) conv=notrunc oflag=direct
   done
 $ btrfs subvolume snapshot /sub1 /sub2
 $ for i in `seq 1 163840`; do
     dd if=/dev/zero of=/sub1/file bs=4K count=1 seek=$(((i-1)*16+10)) conv=notrunc oflag=direct
   done
 $ btrfs subvolume snapshot -r /sub1 /snap1
 $ time btrfs send /snap1 | btrfs receive /volume2
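
The offsets used by the two dd loops above can be checked with a small
sketch. The helper names are hypothetical, and it assumes that each 64K
direct write ends up as one 64K extent. Under that assumption the file
is 10 GiB, and every 4K overwrite lands 40K into its own 64K extent,
leaving 40K in front of it and 20K behind it.

```c
#include <assert.h>
#include <stdint.h>

#define EXTENT_SIZE (64 * 1024ULL) /* bs=64K in the first loop */
#define BLOCK_SIZE  (4 * 1024ULL)  /* bs=4K in the second loop */
#define NR_WRITES   163840ULL      /* seq 1 163840 */

/* Total file size after the first loop. */
static uint64_t file_size(void)
{
	return NR_WRITES * EXTENT_SIZE;
}

/* Byte offset of the i-th (1-based) 4K overwrite: seek=(i-1)*16+10. */
static uint64_t overwrite_offset(uint64_t i)
{
	assert(i >= 1 && i <= NR_WRITES);
	return ((i - 1) * 16 + 10) * BLOCK_SIZE;
}

/* 0-based index of the 64K extent that contains a byte offset. */
static uint64_t extent_index(uint64_t byte_offset)
{
	return byte_offset / EXTENT_SIZE;
}

/* Offset of a byte position inside its 64K extent. */
static uint64_t offset_in_extent(uint64_t byte_offset)
{
	return byte_offset % EXTENT_SIZE;
}
```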

Without this patch:

real    69m48.124s
user    0m50.199s
sys     70m15.600s

With this patch:

real    1m59.683s
user    0m35.421s
sys     2m42.684s

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: ethanwu <ethanwu@synology.com>
[ add patchset cover letter with background and numbers ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07 15:35:47 +01:00
tests btrfs: Correctly handle empty trees in find_first_clear_extent_bit 2020-02-11 04:35:34 -08:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-04-17 10:50:15 +02:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-04-17 10:50:15 +02:00
backref.c btrfs: backref, use correct count to resolve normal data refs 2021-02-07 15:35:47 +01:00
backref.h btrfs: fiemap: preallocate ulists for btrfs_check_shared 2019-07-01 13:34:53 +02:00
block-group.c btrfs: fix possible free space tree corruption with online conversion 2021-02-03 23:25:57 +01:00
block-group.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
block-rsv.c btrfs: force chunk allocation if our global rsv is larger than metadata 2020-06-22 09:31:13 +02:00
block-rsv.h btrfs: migrate the global_block_rsv helpers to block-rsv.c 2019-07-02 12:30:55 +02:00
btrfs_inode.h btrfs: remove assumption about csum type form btrfs_print_data_csum_error() 2019-07-01 13:35:02 +02:00
check-integrity.c btrfs: fix possible NULL-pointer dereference in integrity checks 2020-02-24 08:36:53 +01:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
compression.h btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
ctree.c btrfs: cleanup cow block on error 2020-11-05 11:43:27 +01:00
ctree.h btrfs: fix possible free space tree corruption with online conversion 2021-02-03 23:25:57 +01:00
delalloc-space.c Btrfs: fix qgroup double free after failure to reserve metadata for delalloc 2019-10-17 20:13:44 +02:00
delalloc-space.h btrfs: migrate the delalloc space stuff to it's own home 2019-07-04 17:26:17 +02:00
delayed-inode.c btrfs: qgroup: fix wrong qgroup metadata reserve for delayed inode 2020-11-05 11:43:26 +01:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-02-11 04:35:34 -08:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: dev-replace: fail mount if we don't have replace item with target device 2020-11-18 19:20:29 +01:00
dev-replace.h btrfs: get fs_info from trans in btrfs_run_dev_replace 2019-04-29 19:02:43 +02:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
disk-io.c btrfs: fix overflow when copying corrupt csums for a message 2020-10-01 13:18:24 +02:00
disk-io.h btrfs: Make reada_tree_block_flagged private 2019-09-09 14:59:11 +02:00
export.c btrfs: export helpers for subvolume name/id resolution 2020-08-26 10:40:49 +02:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-08-26 10:40:49 +02:00
extent_io.c btrfs: prevent NULL pointer dereference in extent_io_tree_panic 2021-01-19 18:26:11 +01:00
extent_io.h btrfs: trim: fix underflow in trim length to prevent access beyond device boundary 2020-12-30 11:51:37 +01:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-19 19:53:00 +01:00
extent_map.h btrfs: Remove impossible condition from mergable_maps 2019-02-25 14:13:21 +01:00
extent-tree.c btrfs: don't get an EINTR during drop_snapshot for reloc 2021-01-27 11:47:40 +01:00
file-item.c btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums 2020-06-22 09:30:55 +02:00
file.c btrfs: allow btrfs_truncate_block() to fallback to nocow for data space reservation 2020-10-14 10:33:00 +02:00
free-space-cache.c btrfs: fix space cache memory leak after transaction abort 2020-09-03 11:27:02 +02:00
free-space-cache.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
free-space-tree.c btrfs: fix possible free space tree corruption with online conversion 2021-02-03 23:25:57 +01:00
free-space-tree.h btrfs: move basic block_group definitions to their own header 2019-09-09 14:59:03 +02:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c btrfs: allow btrfs_truncate_block() to fallback to nocow for data space reservation 2020-10-14 10:33:00 +02:00
ioctl.c btrfs: fix race when defragmenting leads to unnecessary IO 2021-01-06 14:48:36 +01:00
Kconfig btrfs: Fix build error while LIBCRC32C is module 2019-07-17 17:03:30 +02:00
locking.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
locking.h btrfs: Remove unused locking functions 2019-09-09 14:58:59 +02:00
lzo.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
Makefile btrfs: migrate the block group lookup code 2019-09-09 14:59:04 +02:00
misc.h btrfs: move math functions to misc.h 2019-09-09 14:59:15 +02:00
ordered-data.c Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents 2020-02-28 17:22:24 +01:00
ordered-data.h btrfs: don't assume ordered sums to be 4 bytes 2019-07-01 13:35:00 +02:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: require only sector size alignment for parent eb bytenr 2020-09-17 13:47:51 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: rename the btrfs_calc_*_metadata_size helpers 2019-09-09 14:59:13 +02:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan 2021-01-19 18:26:14 +01:00
qgroup.h btrfs: make btrfs_qgroup_check_reserved_leak take btrfs_inode 2020-09-03 11:26:47 +02:00
raid56.c btrfs: get rid of unique workqueue helper functions 2020-01-09 10:20:06 +01:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: fix readahead hang and use-after-free after removing a device 2020-11-05 11:43:27 +01:00
ref-verify.c btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod 2020-11-18 19:20:28 +01:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
relocation.c btrfs: fix min reserved size calculation in merge_reloc_root 2020-11-18 19:20:28 +01:00
root-tree.c btrfs: do not delete mismatched root refs 2020-01-23 08:22:40 +01:00
scrub.c btrfs: allocate scrub workqueues outside of locks 2020-09-09 19:12:31 +02:00
send.c btrfs: send: fix invalid clone operations when cloning from the same file and root 2021-01-27 11:47:41 +01:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
space-info.c btrfs: take overcommit into account in inc_block_group_ro 2020-10-17 10:11:21 +02:00
space-info.h btrfs: take overcommit into account in inc_block_group_ro 2020-10-17 10:11:21 +02:00
struct-funcs.c btrfs: tie extent buffer and it's token together 2019-09-09 14:59:16 +02:00
super.c btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan 2021-01-19 18:26:14 +01:00
sysfs.c btrfs: sysfs: use NOFS for device creation 2020-08-21 13:05:22 +02:00
sysfs.h btrfs: sysfs: move helper macros to sysfs.c 2019-09-09 14:59:08 +02:00
transaction.c btrfs: add wrapper for transaction abort predicate 2020-08-26 10:40:49 +02:00
transaction.h btrfs: add wrapper for transaction abort predicate 2020-08-26 10:40:49 +02:00
tree-checker.c btrfs: tree-checker: check if chunk item end overflows 2021-01-19 18:26:13 +01:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: open code now trivial btrfs_set_lock_blocking 2019-02-25 14:13:27 +01:00
tree-log.c btrfs: reschedule if necessary when logging directory items 2020-11-05 11:43:26 +01:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: handle ENOENT in btrfs_uuid_tree_iterate 2019-12-31 16:42:05 +01:00
volumes.c btrfs: fix lockdep splat in btrfs_recover_relocation 2021-01-27 11:47:40 +01:00
volumes.h btrfs: fix readahead hang and use-after-free after removing a device 2020-11-05 11:43:27 +01:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
zstd.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00