aboutsummaryrefslogtreecommitdiffstats
path: root/net/core/secure_seq.c
Commit message (Collapse)AuthorAgeFilesLines
* net: Retire DCCP socket.Kuniyuki Iwashima2025-04-121-42/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DCCP was orphaned in 2021 by commit 054c4610bd05 ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS"), which noted that the last maintainer had been inactive for five years. In recent years, it has become a playground for syzbot, and most changes to DCCP have been odd bug fixes triggered by syzbot. Apart from that, the only changes have been driven by treewide or networking API updates or adjustments related to TCP. Thus, in 2023, we announced we would remove DCCP in 2025 via commit b144fcaf46d4 ("dccp: Print deprecation notice."). Since then, only one individual has contacted the netdev mailing list. [0] There is ongoing research for Multipath DCCP. The repository is hosted on GitHub [1], and development is not taking place through the upstream community. While the repository is published under the GPLv2 license, the scheduling part remains proprietary, with a LICENSE file [2] stating: "This is not Open Source software." The researcher mentioned a plan to address the licensing issue, upstream the patches, and step up as a maintainer, but there has been no further communication since then. Maintaining DCCP for a decade without any real users has become a burden. Therefore, it's time to remove it. Removing DCCP will also provide significant benefits to TCP. It allows us to freely reorganize the layout of struct inet_connection_sock, which is currently shared with DCCP, and optimize it to reduce the number of cachelines accessed in the TCP fast path. Note that we keep DCCP netfilter modules as requested. [3] Link: https://lore.kernel.org/netdev/[email protected]/T/#u #[0] Link: https://github.com/telekom/mp-dccp #[1] Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2] Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3] Signed-off-by: Kuniyuki Iwashima <[email protected]> Acked-by: Paul Moore <[email protected]> (LSM and SELinux) Acked-by: Casey Schaufler <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
* tcp: use EXPORT_IPV6_MOD[_GPL]()Eric Dumazet2025-02-141-1/+1
| | | | | | | | | | | | | | Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need to be exported unless CONFIG_IPV6=m tcp_hashinfo and tcp_openreq_init_rwin() are no longer used from any module anyway. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Mateusz Polchlopek <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
* tcp: Fix data-races around sysctl knobs related to SYN option.Kuniyuki Iwashima2022-07-201-2/+2
| | | | | | | | | | | | | While reading these knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - tcp_sack - tcp_window_scaling - tcp_timestamps Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* tcp: resalt the secret every 10 secondsEric Dumazet2022-05-051-3/+9
| | | | | | | | | | | | | | | In order to limit the ability for an observer to recognize the source ports sequence used to contact a set of destinations, we should periodically shuffle the secret. 10 seconds looks effective enough without causing particular issues. Cc: Moshe Kol <[email protected]> Cc: Yossi Gilad <[email protected]> Cc: Amit Klein <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Tested-by: Willy Tarreau <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
* secure_seq: use the 64 bits of the siphash for port offset calculationWilly Tarreau2022-05-051-2/+2
| | | | | | | | | | | | | | | | | | SipHash replaced MD5 in secure_ipv{4,6}_port_ephemeral() via commit 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5"), but the output remained truncated to 32-bit only. In order to exploit more bits from the hash, let's make the functions return the full 64-bit of siphash_3u32(). We also make sure the port offset calculation in __inet_hash_connect() remains done on 32-bit to avoid the need for div_u64_rem() and an extra cost on 32-bit systems. Cc: Jason A. Donenfeld <[email protected]> Cc: Moshe Kol <[email protected]> Cc: Yossi Gilad <[email protected]> Cc: Amit Klein <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Willy Tarreau <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
* net: align static siphash keysEric Dumazet2021-11-171-2/+2
| | | | | | | | | | siphash keys use 16 bytes. Define siphash_aligned_key_t macro so that we can make sure they are not crossing a cache line boundary. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
* crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.hEric Biggers2020-05-081-1/+0
| | | | | | | | | | | | | | | <linux/cryptohash.h> sounds very generic and important, like it's the header to include if you're doing cryptographic hashing in the kernel. But actually it only includes the library implementation of the SHA-1 compression function (not even the full SHA-1). This should basically never be used anymore; SHA-1 is no longer considered secure, and there are much better ways to do cryptographic hashing in the kernel. Most files that include this header don't actually need it. So in preparation for removing it, remove all these unneeded includes of it. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
* treewide: Add SPDX license identifier for missed filesThomas Gleixner2019-05-211-0/+1
| | | | | | | | | | | | | | | | | Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
* infiniband: i40iw, nes: don't use wall time for TCP sequence numbersArnd Bergmann2018-07-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The nes infiniband driver uses current_kernel_time() to get a nanosecond granunarity timestamp to initialize its tcp sequence counters. This is one of only a few remaining users of that deprecated function, so we should try to get rid of it. Aside from using a deprecated API, there are several problems I see here: - Using a CLOCK_REALTIME based time source makes it predictable in case the time base is synchronized. - Using a coarse timestamp means it only gets updated once per jiffie, making it even more predictable in order to avoid having to access the hardware clock source - The upper 2 bits are always zero because the nanoseconds are at most 999999999. For the Linux TCP implementation, we use secure_tcp_seq(), which appears to be appropriate here as well, and solves all the above problems. i40iw uses a variant of the same code, so I do that same thing there for ipv4. Unlike nes, i40e also supports ipv6, which needs to call secure_tcpv6_seq instead. Acked-by: Shiraz Saleem <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
* tcp: Namespaceify sysctl_tcp_timestampsEric Dumazet2017-06-081-4/+5
| | | | | Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* tcp: randomize timestamps on syncookiesEric Dumazet2017-05-051-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Florian Westphal <[email protected]> Tested-by: Florian Westphal <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-04-061-2/+29
|\ | | | | | | | | | | | | Mostly simple cases of overlapping changes (adding code nearby, a function whose name changes, for example). Signed-off-by: David S. Miller <[email protected]>
| * secure_seq: downgrade to per-host timestamp offsetsFlorian Westphal2017-03-251-2/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unfortunately too many devices (not under our control) use tcp_tw_recycle=1, which depends on timestamps being identical of the same saddr. Although tcp_tw_recycle got removed in net-next we can't make such end hosts disappear so downgrade to per-host timestamp offsets. Cc: Soheil Hassas Yeganeh <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Reported-by: Yvan Vanrossomme <[email protected]> Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* | tcp: rename *_sequence_number() to *_seq_and_tsoff()Alexey Kodanev2017-03-101-7/+6
|/ | | | | | | | | | | The functions that are returning tcp sequence number also setup TS offset value, so rename them to better describe their purpose. No functional changes in this patch. Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Alexey Kodanev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* secure_seq: fix sparse errorsEric Dumazet2017-01-121-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Fixes following warnings : net/core/secure_seq.c:125:28: warning: incorrect type in argument 1 (different base types) net/core/secure_seq.c:125:28: expected unsigned int const [unsigned] [usertype] a net/core/secure_seq.c:125:28: got restricted __be32 [usertype] saddr net/core/secure_seq.c:125:35: warning: incorrect type in argument 2 (different base types) net/core/secure_seq.c:125:35: expected unsigned int const [unsigned] [usertype] b net/core/secure_seq.c:125:35: got restricted __be32 [usertype] daddr net/core/secure_seq.c:125:43: warning: cast from restricted __be16 net/core/secure_seq.c:125:61: warning: restricted __be16 degrades to integer Fixes: 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* secure_seq: use SipHash in place of MD5Jason A. Donenfeld2017-01-091-82/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. 64-bit x86_64: [ 1.683628] secure_tcpv6_sequence_number_md5# cycles: 99563527 [ 1.717350] secure_tcp_sequence_number_md5# cycles: 92890502 [ 1.741968] secure_tcpv6_sequence_number_siphash# cycles: 67825362 [ 1.762048] secure_tcp_sequence_number_siphash# cycles: 67485526 32-bit x86: [ 1.600012] secure_tcpv6_sequence_number_md5# cycles: 103227892 [ 1.634219] secure_tcp_sequence_number_md5# cycles: 94732544 [ 1.669102] secure_tcpv6_sequence_number_siphash# cycles: 96299384 [ 1.700165] secure_tcp_sequence_number_siphash# cycles: 86015473 Signed-off-by: Jason A. Donenfeld <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Miller <[email protected]> Cc: David Laight <[email protected]> Cc: Tom Herbert <[email protected]> Cc: Hannes Frederic Sowa <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* tcp: allow to turn tcp timestamp randomization offFlorian Westphal2016-12-021-2/+3
| | | | | | | | | | | | | | Eric says: "By looking at tcpdump, and TS val of xmit packets of multiple flows, we can deduct the relative qdisc delays (think of fq pacing). This should work even if we have one flow per remote peer." Having random per flow (or host) offsets doesn't allow that anymore so add a way to turn this off. Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Acked-by: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* tcp: randomize tcp timestamp offsets for each connectionFlorian Westphal2016-12-021-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | jiffies based timestamps allow for easy inference of number of devices behind NAT translators and also makes tracking of hosts simpler. commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset") added the main infrastructure that is needed for per-connection ts randomization, in particular writing/reading the on-wire tcp header format takes the offset into account so rest of stack can use normal tcp_time_stamp (jiffies). So only two items are left: - add a tsoffset for request sockets - extend the tcp isn generator to also return another 32bit number in addition to the ISN. Re-use of ISN generator also means timestamps are still monotonically increasing for same connection quadruple, i.e. PAWS will still work. Includes fixes from Eric Dumazet. Signed-off-by: Florian Westphal <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: remove a sparse error in secure_dccpv6_sequence_number()Eric Dumazet2015-05-261-1/+1
| | | | | | | | | make C=2 CF=-D__CHECK_ENDIAN__ net/core/secure_seq.o net/core/secure_seq.c:157:50: warning: restricted __be32 degrades to integer Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: use ktime_get_ns() and ktime_get_real_ns() helpersEric Dumazet2014-08-231-3/+3
| | | | | | | | | ktime_get_ns() replaces ktime_to_ns(ktime_get()) ktime_get_real_ns() replaces ktime_to_ns(ktime_get_real()) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* inetpeer: get rid of ip_id_countEric Dumazet2014-06-021-25/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ideally, we would need to generate IP ID using a per destination IP generator. linux kernels used inet_peer cache for this purpose, but this had a huge cost on servers disabling MTU discovery. 1) each inet_peer struct consumes 192 bytes 2) inetpeer cache uses a binary tree of inet_peer structs, with a nominal size of ~66000 elements under load. 3) lookups in this tree are hitting a lot of cache lines, as tree depth is about 20. 4) If server deals with many tcp flows, we have a high probability of not finding the inet_peer, allocating a fresh one, inserting it in the tree with same initial ip_id_count, (cf secure_ip_id()) 5) We garbage collect inet_peer aggressively. IP ID generation do not have to be 'perfect' Goal is trying to avoid duplicates in a short period of time, so that reassembly units have a chance to complete reassembly of fragments belonging to one message before receiving other fragments with a recycled ID. We simply use an array of generators, and a Jenkin hash using the dst IP as a key. ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it belongs (it is only used from this file) secure_ip_id() and secure_ipv6_id() no longer are needed. Rename ip_select_ident_more() to ip_select_ident_segs() to avoid unnecessary decrement/increment of the number of segments. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-10-231-0/+2
|\ | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/usb/qmi_wwan.c include/net/dst.h Trivial merge conflicts, both were overlapping changes. Signed-off-by: David S. Miller <[email protected]>
| * net: secure_seq: Fix warning when CONFIG_IPV6 and CONFIG_INET are not selectedFabio Estevam2013-10-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net_secret() is only used when CONFIG_IPV6 or CONFIG_INET are selected. Building a defconfig with both of these symbols unselected (Using the ARM at91sam9rl_defconfig, for example) leads to the following build warning: $ make at91sam9rl_defconfig # # configuration written to .config # $ make net/core/secure_seq.o scripts/kconfig/conf --silentoldconfig Kconfig CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h make[1]: `include/generated/mach-types.h' is up to date. CALL scripts/checksyscalls.sh CC net/core/secure_seq.o net/core/secure_seq.c:17:13: warning: 'net_secret_init' defined but not used [-Wunused-function] Fix this warning by protecting the definition of net_secret() with these symbols. Reported-by: Olof Johansson <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* | net: always inline net_secret_initHannes Frederic Sowa2013-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently net_secret_init does not get inlined, so we always have a call to net_secret_init even in the fast path. Let's specify net_secret_init as __always_inline so we have the nop in the fast-path without the call to net_secret_init and the unlikely path at the epilogue of the function. jump_labels handle the inlining correctly. Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* | net: switch net_secret key generation to net_get_random_onceHannes Frederic Sowa2013-10-191-12/+2
|/ | | | | | | Cc: Eric Dumazet <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: net_secret should not depend on TCPEric Dumazet2013-09-281-3/+24
| | | | | | | | | | | | | | A host might need net_secret[] and never open a single socket. Problem added in commit aebda156a570782 ("net: defer net_secret[] initialization") Based on prior patch from Hannes Frederic Sowa. Reported-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: defer net_secret[] initializationEric Dumazet2013-04-291-3/+1
| | | | | | | | | | | Instead of feeding net_secret[] at boot time, defer the init at the point first socket is created. This permits some platforms to use better entropy sources than the ones available at boot time. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* netfilter: ipv6: add IPv6 NAT supportPatrick McHardy2012-08-301-0/+1
| | | | Signed-off-by: Patrick McHardy <[email protected]>
* net: fix some sparse errorsEric Dumazet2012-01-171-1/+1
| | | | | | | | | make C=2 CF="-D__CHECK_ENDIAN__" M=net And fix flowi4_init_output() prototype for sport Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net:core: use IS_ENABLEDIgor Maravić2011-12-161-1/+1
| | | | | | | | Use IS_ENABLED(CONFIG_FOO) instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE) Signed-off-by: Igor Maravić <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: use IS_ENABLED(CONFIG_IPV6)Eric Dumazet2011-12-111-2/+2
| | | | | | | Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: Silence seq_scale() unused warningStephen Boyd2011-12-061-0/+2
| | | | | | | | | | On a CONFIG_NET=y build net/core/secure_seq.c:22: warning: 'seq_scale' defined but not used Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* tcp: add const qualifiers where possibleEric Dumazet2011-10-211-1/+1
| | | | | | | | | | | | Adding const qualifiers to pointers can ease code review, and spot some bugs. It might allow compiler to optimize code further. For example, is it legal to temporary write a null cksum into tcphdr in tcp_md5_hash_header() ? I am afraid a sniffer could catch the temporary null value... Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
* net: Compute protocol sequence numbers and fragment IDs using MD5.David S. Miller2011-08-071-0/+184
Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <[email protected]> Tested-by: Willy Tarreau <[email protected]> Signed-off-by: David S. Miller <[email protected]>