author: Eric Biggers <[email protected]> 2025-07-06 23:11:00 +0000
committer: Eric Biggers <[email protected]> 2025-07-11 21:29:42 +0000
commit: 9f65592b7e1f24569bb6ced064df5b4319f725ce
tree: 52f0b37eeb16174beb4f22d2bed273d9a7d10e3d /lib/net_utils.c
parent: lib/crypto: x86/poly1305: Fix register corruption in no-SIMD contexts
lib/crypto: x86/poly1305: Fix performance regression on short messages

Restore the len >= 288 condition on using the AVX implementation, which
was incidentally removed by commit 318c53ae02f2 ("crypto: x86/poly1305 -
Add block-only interface").  This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code.  Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:

    Length      Before       After
    ======  ==========  ==========
         1     30 MB/s     36 MB/s
        16    516 MB/s    598 MB/s
        64   1700 MB/s   1882 MB/s
       127   2265 MB/s   2651 MB/s
       128   2457 MB/s   2827 MB/s
       200   2702 MB/s   3238 MB/s
       256   3841 MB/s   3768 MB/s
       511   4580 MB/s   4585 MB/s
       512   5430 MB/s   5398 MB/s
      1024   7268 MB/s   7305 MB/s
      3173   8999 MB/s   8948 MB/s
      4096   9942 MB/s   9921 MB/s
     16384  10557 MB/s  10545 MB/s

While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU".  Therefore, for now just restore the 288 threshold.

Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: [email protected]
Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
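
For illustration only, the kind of length-threshold dispatch the message
describes might be sketched as below.  This is not the code from the
patch: the names SIMD_MIN_LEN, choose_impl() and simd_usable are
hypothetical, and only the 288-byte value is taken from the message
above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical macro name; 288 is the threshold the commit message restores. */
#define SIMD_MIN_LEN 288

enum impl { IMPL_SCALAR, IMPL_SIMD };

/*
 * Pick an implementation for a message of len bytes.
 *
 * simd_usable is an assumption standing in for whatever CPU-feature and
 * execution-context checks the real code performs before touching
 * vector registers.
 */
enum impl choose_impl(size_t len, bool simd_usable)
{
	/*
	 * Short inputs stay on the scalar path: the fixed costs of the
	 * vector path (key power computation, kernel-mode "FPU" entry and
	 * exit, tail handling) are not amortized below the threshold.
	 */
	if (simd_usable && len >= SIMD_MIN_LEN)
		return IMPL_SIMD;
	return IMPL_SCALAR;
}
```

Under these assumptions, a 287-byte message takes the scalar path and a
288-byte message takes the vector path, provided the vector unit is
usable at all.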
Diffstat (limited to 'lib/net_utils.c')
0 files changed, 0 insertions, 0 deletions