linux-brain/arch/x86/crypto
Ard Biesheuvel 6c3d86e6ff crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 upstream.

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).
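
As a rough illustration of the call-count arithmetic above, the following C sketch counts indirect calls per request under the old and new schemes. The per-chunk figures (3 per 64 bytes, 1 per 512 bytes) come from the message above. The helper names are hypothetical and not part of the kernel sources. Rounding partial chunks up to a whole chunk is a simplifying assumption.

```c
#include <stddef.h>

/* Figures taken from the commit message. */
#define OLD_CHUNK_BYTES      64   /* old scheme: 8-way stride, 8 * 16-byte blocks */
#define OLD_CALLS_PER_CHUNK   3   /* indirect calls per 64 bytes */
#define NEW_CHUNK_BYTES     512   /* new scheme: stride exported to the glue helper */
#define NEW_CALLS_PER_CHUNK   1   /* indirect calls per 512 bytes */

/* Hypothetical helper: number of chunks needed to cover nbytes, rounding up
 * (assumption: a partial chunk costs as much as a full one). */
static size_t chunk_count(size_t nbytes, size_t chunk_bytes)
{
	return (nbytes + chunk_bytes - 1) / chunk_bytes;
}

/* Hypothetical helper: indirect calls under the old 8-way stride arrangement. */
size_t indirect_calls_old(size_t nbytes)
{
	return chunk_count(nbytes, OLD_CHUNK_BYTES) * OLD_CALLS_PER_CHUNK;
}

/* Hypothetical helper: indirect calls under the new 512-byte stride. */
size_t indirect_calls_new(size_t nbytes)
{
	return chunk_count(nbytes, NEW_CHUNK_BYTES) * NEW_CALLS_PER_CHUNK;
}
```

Under these assumptions, a 1 KB block costs 48 indirect calls before the change and 2 after it, and a 4 KB block costs 192 versus 8. The sketch only counts calls and does not model the retpoline overhead per call.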

Fixes: 9697fa39ef ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[ardb: rebase onto stable/linux-5.4.y]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:39:47 +01:00
..
Makefile crypto: aegis128l/aegis256 - remove x86 and generic implementations 2019-07-26 15:03:56 +10:00
aegis128-aesni-asm.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
aegis128-aesni-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
aes_ctrby8_avx-x86_64.S crypto: aesni - add compatibility with IAS 2020-08-19 08:16:22 +02:00
aes_glue.c crypto: x86/aes - drop scalar assembler implementations 2019-07-26 14:56:02 +10:00
aesni-intel_asm.S crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:39:47 +01:00
aesni-intel_avx-x86_64.S crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg 2021-03-20 10:39:47 +01:00
aesni-intel_glue.c crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:39:47 +01:00
blowfish-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
blowfish_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
camellia-aesni-avx-asm_64.S x86/retpoline/crypto: Convert crypto assembler indirect jumps 2018-01-12 00:14:29 +01:00
camellia-aesni-avx2-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
camellia-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
camellia_aesni_avx2_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
camellia_aesni_avx_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
camellia_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
cast5-avx-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
cast5_avx_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
cast6-avx-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
cast6_avx_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
chacha-avx2-x86_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
chacha-avx512vl-x86_64.S crypto: x86/chacha20 - refactor to allow varying number of rounds 2018-12-13 18:24:58 +08:00
chacha-ssse3-x86_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
chacha_glue.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
crc32-pclmul_asm.S crypto: crc32-pclmul - remove useless relative addressing 2017-10-07 12:10:30 +08:00
crc32-pclmul_glue.c crypto: x86 - convert to use crypto_simd_usable() 2019-03-22 20:57:27 +08:00
crc32c-intel_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 335 2019-06-05 17:37:06 +02:00
crc32c-pcl-intel-asm_64.S crypto: x86/crc32c - fix building with clang ias 2020-11-01 12:01:06 +01:00
crct10dif-pcl-asm_64.S crypto: x86/crct10dif-pcl - cleanup and optimizations 2019-02-08 15:29:48 +08:00
crct10dif-pclmul_glue.c crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest() 2019-04-08 14:42:54 +08:00
des3_ede-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
des3_ede_glue.c crypto: x86/des - switch to library interface 2019-08-22 14:57:33 +10:00
ghash-clmulni-intel_asm.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
ghash-clmulni-intel_glue.c crypto: ghash - add comment and improve help text 2019-07-27 21:08:38 +10:00
glue_helper-asm-avx.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
glue_helper-asm-avx2.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
glue_helper.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
nh-avx2-x86_64.S crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305 2018-12-13 18:24:57 +08:00
nh-sse2-x86_64.S crypto: x86/nhpoly1305 - add SSE2 accelerated NHPoly1305 2018-12-13 18:24:57 +08:00
nhpoly1305-avx2-glue.c crypto: arch/nhpoly1305 - process in explicit 4k chunks 2020-05-14 07:58:25 +02:00
nhpoly1305-sse2-glue.c crypto: arch/nhpoly1305 - process in explicit 4k chunks 2020-05-14 07:58:25 +02:00
poly1305-avx2-x86_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
poly1305-sse2-x86_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
poly1305_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
serpent-avx-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
serpent-avx2-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
serpent-sse2-i586-asm_32.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
serpent-sse2-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
serpent_avx2_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
serpent_avx_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
serpent_sse2_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
sha1_avx2_x86_64_asm.S crypto: x86/sha1-avx2 - Fix RBP usage 2017-09-20 17:42:34 +08:00
sha1_ni_asm.S crypto: x86 - make constants readonly, allow linker to merge them 2017-01-23 22:50:29 +08:00
sha1_ssse3_asm.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sha1_ssse3_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sha256-avx-asm.S crypto: x86/sha256-avx - Fix RBP usage 2017-09-20 17:42:36 +08:00
sha256-avx2-asm.S crypto: x86/sha256-avx2 - Fix RBP usage 2017-09-20 17:42:36 +08:00
sha256-ssse3-asm.S crypto: x86/sha256-ssse3 - Fix RBP usage 2017-09-20 17:42:37 +08:00
sha256_ni_asm.S crypto: x86 - make constants readonly, allow linker to merge them 2017-01-23 22:50:29 +08:00
sha256_ssse3_glue.c crypto: x86 - Rename functions to avoid conflict with crypto/sha256.h 2019-09-05 14:37:30 +10:00
sha512-avx-asm.S crypto: x86 - make constants readonly, allow linker to merge them 2017-01-23 22:50:29 +08:00
sha512-avx2-asm.S crypto: sha512-avx2 - Fix RBP usage 2017-09-20 17:42:37 +08:00
sha512-ssse3-asm.S crypto: x86 - make constants readonly, allow linker to merge them 2017-01-23 22:50:29 +08:00
sha512_ssse3_glue.c crypto: x86 - convert to use crypto_simd_usable() 2019-03-22 20:57:27 +08:00
twofish-avx-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
twofish-i586-asm_32.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
twofish-x86_64-asm_64-3way.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
twofish-x86_64-asm_64.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
twofish_avx_glue.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
twofish_glue.c crypto: prefix module autoloading with "crypto-" 2014-11-24 22:43:57 +08:00
twofish_glue_3way.c crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00