| author | Eric Dumazet <[email protected]> | 2010-10-04 05:17:54 +0000 |
|---|---|---|
| committer | David S. Miller <[email protected]> | 2010-10-04 05:17:54 +0000 |
| commit | c7d4426a98a5f6654cd0b4b33d9dab2e77192c18 (patch) | |
| tree | 0db2524e6f3f742861765dd6aa696a9271767056 /net/ipv4/tcp_input.c | |
| parent | sctp: Fix break indentation in sctp_ioctl(). (diff) | |
net: introduce DST_NOCACHE flag
While doing stress tests with the IP route cache disabled and multiqueue
devices, I noticed very high contention on one rwlock used in the
neighbour code.
When many cpus are trying to send frames (possibly using a high
performance multiqueue device) to the same neighbour, they fight for the
neigh->lock rwlock in order to call neigh_hh_init(), and fight on
hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()).
But we don't need to call neigh_hh_init() for dst entries that are used
only once. The call costs at least four atomic operations, on two
contended cache lines, plus the high contention on the neigh->lock rwlock.
Introduce a new dst flag, DST_NOCACHE, that is set when a dst was not
inserted in the route cache.
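The idea can be sketched as a small userspace model. This is an illustration under stated assumptions, not the kernel code: `struct dst_model`, `model_hh_init()`, `model_mark_uncached()`, `model_output()` and the numeric flag value are all hypothetical stand-ins. Only the DST_NOCACHE name and the notion of skipping neigh_hh_init() for single-use dst entries come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real kernel constant may differ. */
#define DST_NOCACHE 0x0010

/* Hypothetical, heavily simplified stand-in for a dst entry. */
struct dst_model {
    unsigned int flags;
    bool hh_initialized; /* models "a cached hardware header exists" */
};

/* Counts how often the expensive setup path (modelling neigh_hh_init()) runs. */
static unsigned long hh_init_calls;

static void model_hh_init(struct dst_model *dst)
{
    hh_init_calls++;
    dst->hh_initialized = true;
}

/* Mark a dst that was NOT inserted in the route cache (used only once). */
static void model_mark_uncached(struct dst_model *dst)
{
    assert(dst != NULL);
    dst->flags |= DST_NOCACHE;
}

/* Output path: set up the cached header only for dsts that can be reused. */
static void model_output(struct dst_model *dst)
{
    assert(dst != NULL);
    if (!(dst->flags & DST_NOCACHE) && !dst->hh_initialized)
        model_hh_init(dst);
    /* ... the frame would be handed to the device here ... */
}
```

In this model a dst marked with `model_mark_uncached()` never reaches the expensive setup path, while a cached dst pays for it once and reuses the result on later sends.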
With the stress test bench, sending 160000000 frames on one neighbour,
results are:
Before patch:
real 2m28.406s
user 0m11.781s
sys 36m17.964s
After patch:
real 1m26.532s
user 0m12.185s
sys 20m3.903s
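To put the timings in proportion, the small sketch below converts the `time` readings to seconds and derives ratios from them. The helper names (`to_seconds`, `ratio`, `frames_per_second`) are hypothetical; the inputs are the figures quoted above.

```c
#include <assert.h>

/* Convert a time(1)-style "XmY.ZZZs" reading to seconds. */
static double to_seconds(unsigned int minutes, double seconds)
{
    assert(seconds >= 0.0 && seconds < 60.0);
    return minutes * 60.0 + seconds;
}

/* How many times larger `before` is than `after`. */
static double ratio(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}

/* Average send rate over the whole run. */
static double frames_per_second(double frames, double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    return frames / elapsed_seconds;
}
```

Calling `ratio(to_seconds(2, 28.406), to_seconds(1, 26.532))` gives roughly 1.7 for wall-clock time, and `ratio(to_seconds(36, 17.964), to_seconds(20, 3.903))` gives roughly 1.8 for system time. With 160000000 frames, `frames_per_second()` works out to about 1.08 million frames per second before the patch and about 1.85 million after.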
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
