The initial TCP RTO value of 3s is too long for most LAN-based applications. How can I tune it lower? Is there a sysctl?
3 Answers
Nope, you can't; it is hardcoded in the kernel. So change the kernel and recompile.
#define TCP_TIMEOUT_INIT ((unsigned)(3*HZ)) /* RFC 1122 initial RTO value */
This is the definition you should find in include/net/tcp.h.
Someone has provided a patch that makes it tunable, though I have never tried it myself.
-
This is no longer true! You can now set it using eBPF: https://blog.habets.se/2020/11/BPF-the-future-of-configs.html – Thomas Nov 25 '20 at 11:26
The initial setting should not affect your overall performance much, as RTO self-adjusts to network conditions. If you do change RTO, you can set it to 1 sec (but no lower).
There is a discussion of this in RFC 1122:
The following values SHOULD be used to initialize the estimation parameters for a new connection:
(a) RTT = 0 seconds.
(b) RTO = 3 seconds. (The smoothed variance is to be
initialized to the value that will result in this RTO).
The recommended upper and lower bounds on the RTO are known
to be inadequate on large internets. The lower bound SHOULD
be measured in fractions of a second (to accommodate high
speed LANs) and the upper bound should be 2*MSL, i.e., 240
seconds.
DISCUSSION:
Experience has shown that these initialization values
are reasonable, and that in any case the Karn and
Jacobson algorithms make TCP behavior reasonably
insensitive to the initial parameter choices.
RFC 6298 (published June 2011) updates RFC 1122: it says that the RTO can be initialized to a lower value (but no lower than 1 sec), and its appendix contains data that justifies 1 sec as a reasonable initial value.
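To illustrate why the RTO "self-adjusts", here is a minimal sketch of the estimator rules from RFC 6298 section 2. The struct and function names (`rto_state`, `rto_init`, `rto_update`) and the `min_rto` parameter are hypothetical and chosen only for this example; this is not how any particular kernel implements it.

```c
#include <assert.h>

/* Constants from RFC 6298, section 2. */
#define RTO_ALPHA   0.125  /* gain for SRTT (1/8) */
#define RTO_BETA    0.25   /* gain for RTTVAR (1/4) */
#define RTO_K       4.0    /* variance multiplier */
#define RTO_INITIAL 1.0    /* seconds, used before any RTT sample exists */

/* Hypothetical container for the estimator state; all times in seconds. */
struct rto_state {
    double srtt;     /* smoothed round-trip time */
    double rttvar;   /* round-trip time variation */
    double rto;      /* current retransmission timeout */
    int has_sample;  /* 0 until the first RTT measurement arrives */
};

void rto_init(struct rto_state *s)
{
    s->srtt = 0.0;
    s->rttvar = 0.0;
    s->rto = RTO_INITIAL;
    s->has_sample = 0;
}

/*
 * Feed one RTT measurement r into the estimator.
 * g is the clock granularity; min_rto is the lower bound applied to the
 * result (RFC 6298 says 1.0; it is a parameter here only for comparison).
 * Returns the new RTO.
 */
double rto_update(struct rto_state *s, double r, double g, double min_rto)
{
    double var_term;

    assert(r >= 0.0 && g >= 0.0);

    if (!s->has_sample) {
        /* First measurement: SRTT = R, RTTVAR = R/2. */
        s->srtt = r;
        s->rttvar = r / 2.0;
        s->has_sample = 1;
    } else {
        /* Later measurements: update RTTVAR first, using the old SRTT. */
        double err = s->srtt - r;
        if (err < 0.0)
            err = -err;
        s->rttvar = (1.0 - RTO_BETA) * s->rttvar + RTO_BETA * err;
        s->srtt = (1.0 - RTO_ALPHA) * s->srtt + RTO_ALPHA * r;
    }

    /* RTO = SRTT + max(G, K * RTTVAR), then apply the lower bound. */
    var_term = RTO_K * s->rttvar;
    if (var_term < g)
        var_term = g;

    s->rto = s->srtt + var_term;
    if (s->rto < min_rto)
        s->rto = min_rto;
    return s->rto;
}
```

With a 1 ms RTT and the 1 second floor, the computed RTO stays pinned at 1 second after the first sample. The initial value only matters for packets lost before any RTT sample has been taken, which is exactly the SYN case the question is about.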
-
1 second is SHOULD, not MUST; btw you can look at rto of one well-known search engine front-ends =) – SaveTheRbtz Oct 14 '11 at 00:50
-
I disagree with the statement "The initial setting should not affect your overall performance much". It can affect your application's error rate on the initial communication. When the backend application sets a read timeout of 3 seconds or less, a packet drop (a normal event with any congestion) during the initial TCP communication leaves no time for the dropped packet to be retransmitted. The initial value must be lower than the read timeout set by the receiving end and should be chosen based on the QoS of the network you are running on. – Joe Jun 14 '17 at 19:50
-
3 seconds is an eternity on local networks, and packet drops happen really fast on a network where the round trip time is in the milliseconds. – Joe Jun 14 '17 at 19:50
-
I agree modern CPUs can get a lot done in 3 seconds. My understanding is that this initial delay is only applied when the driver initializes, which only occurs when a system first boots up. – Jay Elston Dec 22 '17 at 21:18
See this blog post for how you can make an eBPF program that will override the timeout.
In short, you need to load this sockops program:
#include <linux/bpf.h>

#define SEC(NAME) __attribute__((section(NAME), used))

SEC("sockops")
int bpf_sockmap(struct bpf_sock_ops *skops)
{
    const int op = (int) skops->op;

    if (op == BPF_SOCK_OPS_TIMEOUT_INIT) {
        // The reply is in jiffies. `getconf CLK_TCK` returns 100, but HZ is
        // clearly 250 on my kernel, so 5000 / 250 = 20 seconds.
        // To lower the initial RTO instead, reply with a smaller value
        // (e.g. 250 for 1 second at HZ=250).
        skops->reply = 5000;
        return 1;
    }
    return 0;
}

char _license[] SEC("license") = "GPL";
int _version SEC("version") = 1;
You can compile and load it with:
clang $CFLAGS -target bpf -Wall -g -O2 -c set_rto.c -o set_rto.o
sudo bpftool prog load set_rto.o /sys/fs/bpf/bpf_sockop
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ sock_ops pinned /sys/fs/bpf/bpf_sockop
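Since the reply value in the program above is in jiffies, choosing it means knowing the kernel's HZ. The following is a small sketch of that arithmetic in plain userspace C; the helper names `ms_to_jiffies` and `jiffies_to_ms` are made up for this example, and the HZ value has to be supplied by you (for instance from `CONFIG_HZ` in your kernel config).

```c
#include <assert.h>

/*
 * Convert a timeout in milliseconds to jiffies for a given HZ.
 * Rounds up so that a small nonzero timeout never becomes 0 jiffies.
 */
unsigned long ms_to_jiffies(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999) / 1000;
}

/* Convert jiffies back to milliseconds for a given HZ (rounds down). */
unsigned long jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
    assert(hz > 0);
    return jiffies * 1000 / hz;
}
```

For example, `jiffies_to_ms(5000, 250)` gives 20000, matching the 20 seconds noted in the program's comment, and `ms_to_jiffies(1000, 250)` gives 250, which is the value to reply with for a 1 second initial RTO on an HZ=250 kernel.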