riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release} - kernel


author	Jisheng Zhang <[email protected]>	2024-03-25 11:10:38 +0000
committer	Palmer Dabbelt <[email protected]>	2024-04-24 19:57:49 +0000
commit	79d6e4eae9662b9103fecf94d52b44deca56743c (patch)
tree	23e09e4534c78b161086b2b1edd773defb04add9 /scripts/generate_rust_target.rs
parent	riscv: select ARCH_USE_CMPXCHG_LOCKREF (diff)

riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}

After selecting ARCH_USE_CMPXCHG_LOCKREF, one straightforward further optimization is to implement arch_cmpxchg64_relaxed(), because the lockref code does not need the cmpxchg to have barrier semantics. At the same time, implement arch_cmpxchg64_acquire() and arch_cmpxchg64_release() as well.

However, on both the TH1520 and JH7110 platforms, I didn't see an obvious performance improvement with Linus' test case [1]. IMHO, this may be related to the fence and lr.d/sc.d hardware implementations. In theory, lr/sc without a fence could perform better than lr/sc plus a fence, so add the code here to leave room for performance improvement on newer hardware platforms.

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Andrea Parri <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
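
For readers unfamiliar with why a lockref-style update can get by without barrier semantics, here is a minimal user-space sketch using C11 atomics. It is not the kernel's lockref code or the RISC-V arch_cmpxchg64_relaxed() implementation; the struct layout, the SKETCH_* constants and all function names are hypothetical and chosen only for illustration. The idea shown is a 64-bit compare-and-exchange loop with relaxed ordering that bumps a reference count only while the packed lock bit is clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing: lock flag in bit 0, reference count in the high 32 bits. */
#define SKETCH_LOCKED   UINT64_C(1)
#define SKETCH_COUNT_1  (UINT64_C(1) << 32)

struct lockref_sketch {
    _Atomic uint64_t lock_count;
};

/*
 * Try to increment the count without taking the lock.
 * Returns true on success, false if the lock bit was observed set
 * (a real implementation would then fall back to taking the lock).
 */
static bool lockref_sketch_get(struct lockref_sketch *ref)
{
    uint64_t old = atomic_load_explicit(&ref->lock_count, memory_order_relaxed);

    while (!(old & SKETCH_LOCKED)) {
        uint64_t next = old + SKETCH_COUNT_1;

        /* Relaxed on success and failure: no ordering is requested here. */
        if (atomic_compare_exchange_weak_explicit(&ref->lock_count, &old, next,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
        /* On failure, 'old' now holds the current value; re-check and retry. */
    }
    return false;
}

/* Read the current reference count from the packed word. */
static uint32_t lockref_sketch_count(struct lockref_sketch *ref)
{
    uint64_t v = atomic_load_explicit(&ref->lock_count, memory_order_relaxed);

    return (uint32_t)(v >> 32);
}

/* Small self-check: one successful get, then a refused get once "locked". */
static void lockref_sketch_demo(void)
{
    struct lockref_sketch ref;

    atomic_init(&ref.lock_count, 0);
    assert(lockref_sketch_get(&ref));
    assert(lockref_sketch_count(&ref) == 1);

    atomic_fetch_or_explicit(&ref.lock_count, SKETCH_LOCKED, memory_order_relaxed);
    assert(!lockref_sketch_get(&ref));
    assert(lockref_sketch_count(&ref) == 1);
}
```

Whether the relaxed form is actually faster than a fully ordered one depends on the hardware, which matches the measurements reported above.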

Diffstat (limited to 'scripts/generate_rust_target.rs')

0 files changed, 0 insertions, 0 deletions

