6

I'm seeing the error message "No buffer space available" when processes call "connect" on a Linux virtual machine. I'm having trouble tracking down the cause - hopefully someone can help!

I've checked the following:

(1) File handles:

cat /proc/sys/fs/file-nr
4672 0 810707 

I'm reading this as (allocated, allocated but unused, system-wide maximum), so this looks OK.
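For anyone repeating this check, here is a minimal Python sketch of the same reading of file-nr. It assumes the three fields are allocated handles, allocated-but-unused handles, and the system-wide maximum; the function names are made up for illustration, and the sample string is the output quoted above.

```python
def parse_file_nr(text):
    """Split the contents of /proc/sys/fs/file-nr into its three counters."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "maximum": maximum}


def file_handle_usage(text):
    """Return the fraction of the system-wide file handle limit in use."""
    counters = parse_file_nr(text)
    return (counters["allocated"] - counters["unused"]) / counters["maximum"]


# Sample taken from the output above; on a live system, pass in the
# contents of /proc/sys/fs/file-nr instead.
sample = "4672 0 810707\n"
print(parse_file_nr(sample))
print(file_handle_usage(sample))
```

A usage fraction well below 1.0, as here, suggests file handles are not the exhausted resource.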

(2) Sockets or TCP memory:

cat /proc/sys/net/ipv4/tcp_mem
191889 255854 383778

cat /proc/net/sockstat
sockets: used 579
TCP: inuse 169 orphan 0 tw 245 alloc 187 mem 5
UDP: inuse 31 mem 4
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

I'm reading this as only 579 sockets in use in total, with the TCP page count far below the tcp_mem limits.
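The same comparison can be scripted. This is a rough sketch that assumes both the `mem` counter in sockstat and the three tcp_mem values (low, pressure, high) are in pages; the helper names are hypothetical, and the sample data is the output quoted above.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat into {protocol: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        proto, _, rest = line.partition(":")
        fields = rest.split()
        # Fields come in "name value" pairs, e.g. "inuse 169 orphan 0".
        stats[proto.strip()] = {
            name: int(value) for name, value in zip(fields[0::2], fields[1::2])
        }
    return stats


def tcp_memory_status(sockstat_text, tcp_mem_text):
    """Compare TCP pages in use against the tcp_mem thresholds."""
    used_pages = parse_sockstat(sockstat_text)["TCP"]["mem"]
    low, pressure, high = (int(value) for value in tcp_mem_text.split())
    return {
        "used_pages": used_pages,
        "low": low,
        "pressure": pressure,
        "high": high,
        "over_pressure": used_pages >= pressure,
    }


# Samples copied from the output above.
SAMPLE_SOCKSTAT = """sockets: used 579
TCP: inuse 169 orphan 0 tw 245 alloc 187 mem 5
UDP: inuse 31 mem 4
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
"""
SAMPLE_TCP_MEM = "191889 255854 383778"

print(tcp_memory_status(SAMPLE_SOCKSTAT, SAMPLE_TCP_MEM))
```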

There are lots of random TCP tweaks shown on Google - what I'm hoping for in an answer is (1) the resource I'm running out of, (2) how to determine the current value and (3) how to adjust the ceiling. Most of the pages I've found are missing everything except (3)!

** Update #1 **

On Flup's suggestion I ran strace when it happened (using ping):

socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.140.0.65")}, 16) = -1 ENOBUFS (No buffer space available)
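Since the failure is intermittent, a small probe that repeats the same two calls may help catch it. The sketch below mirrors the traced sequence (a UDP socket followed by connect()); the function name `probe_udp_connect` and the default port are illustrative assumptions, not part of ping itself.

```python
import errno
import socket


def probe_udp_connect(host, port=1025):
    """Return None if connect() succeeds, else the symbolic errno name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # connect() on a UDP socket sends no packets; it only makes the
            # kernel resolve a route for the destination address.
            sock.connect((host, port))
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Calling `probe_udp_connect("10.140.0.65")` periodically and logging any non-None result would give a timestamp for each failure, which can then be lined up against other counters on the machine.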

** Update #2 **

I don't know much about the linux kernel source, but I had a dig around and the only place in the connect() path I can see ENOBUFS is here: http://lxr.free-electrons.com/source/net/ipv4/af_inet.c?v=3.11#L353

This looks like it is allocating things in the kernel though with kmem_cache_alloc and security_sk_alloc...?

masegaloeh
user611942
  • Can you show us the output of `strace` covering the `socket()` and `connect()` calls (and their return values)? – Flup Jul 22 '14 at 10:58
  • I'll try and catch it in the act and strace - it happens fairly randomly. – user611942 Jul 22 '14 at 11:05
  • `socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4` `connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("10.140.0.65")}, 16) = -1 ENOBUFS (No buffer space available)` – user611942 Jul 22 '14 at 15:56
  • Did you check `sysctl -a | grep somaxconn`? Try giving it a larger number. – Adrian Jun 21 '16 at 11:56

2 Answers

4

In kernels prior to 3.6, you could hit ENOBUFS for regular IPv4/IPv6 traffic when the net.ipv4.route.max_size or net.ipv6.route.max_size limit, respectively, was exhausted.

Starting with kernel 3.6, the routing cache was removed, and net.ipv4.route.max_size lost its influence on the number of dst entries. So, in general, that is no longer possible.

However, you can still run into this error when using IPSec, as I did. After a certain number of IPSec tunnels had been created, I was unable to ping the remote host:

# ping 10.100.0.1
connect: No buffer space available

ping from iputils creates a test file descriptor and calls connect() on it to bind the destination IP. When that happens, the kernel creates a dst cache entry for the given address family, and in my case the limit on xfrm4 dst cache entries was already exhausted. This limit is controlled by a sysctl setting:

xfrm4_gc_thresh - INTEGER
    The threshold at which we will start garbage collecting for IPv4
    destination cache entries.  At twice this value the system will
    refuse new allocations.

I ran into this on kernel 3.10.59, where the default limit is very low: 1024. Starting with kernel 3.10.83, the limit was increased to 32768, which makes it much harder to hit.

So, I issued:

# sysctl net.ipv4.xfrm4_gc_thresh=32768

and that fixed it for me.

Approximate path in the kernel for my IPSec case:
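To estimate how close a machine is to this limit, one option is to compare the threshold with the number of live xfrm dst entries. The sketch below is a rough approximation under several assumptions: that the entries live in a slab cache named `xfrm_dst_cache`, that this cache appears in /proc/slabinfo (it may be merged with another cache, and reading the file usually needs root), and that the count includes IPv6 entries as well. The helper names and the sample slabinfo line are hypothetical.

```python
def xfrm4_hard_limit(gc_thresh):
    """Per the sysctl description, allocations are refused at twice the threshold."""
    return 2 * gc_thresh


def slab_active_objects(slabinfo_text, cache_name="xfrm_dst_cache"):
    """Return the active object count for one cache in /proc/slabinfo, or None."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Data lines look like: name <active_objs> <num_objs> <objsize> ...
        if fields and fields[0] == cache_name:
            return int(fields[1])
    return None


def xfrm4_headroom(slabinfo_text, gc_thresh):
    """Return how many more entries fit before the hard limit, or None if unknown."""
    active = slab_active_objects(slabinfo_text)
    if active is None:
        return None
    return xfrm4_hard_limit(gc_thresh) - active


# Hypothetical slabinfo excerpt, for illustration only.
sample_slabinfo = (
    "# name  <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>\n"
    "xfrm_dst_cache  2040  2040  448  36  4\n"
)
print(xfrm4_headroom(sample_slabinfo, gc_thresh=1024))
```

On a live system the threshold would come from /proc/sys/net/ipv4/xfrm4_gc_thresh and the text from /proc/slabinfo; a headroom near zero would be consistent with the ENOBUFS described above.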

ip4_datagram_connect() -> ip_route_connect() -> ip_route_output_flow() ->
xfrm_lookup() -> xfrm_resolve_and_create_bundle() ->
... -> xfrm_alloc_dst() -> dst_alloc() with xfrm4_dst_ops, where gc is set.
3

Well, I don't know what exactly the problem is, but I'll try to figure out the right direction for solving it.

The ENOBUFS code is returned when either sk_alloc() or dst_alloc() fails. I can't find any other occurrences of ENOBUFS in the socket-related source code.

I also can't find any path from SYSCALL_DEFINE3(connect) to sk_alloc(), and I don't think a socket should be allocated during the connect() call where you get the error, so sk_alloc() is unlikely to be the cause.

dst_alloc() is likely used for checking routes during connect(). I can't find the exact path to it, but it must be somewhere inside: SYSCALL_DEFINE3(connect) -> .connect() -> ip4_datagram_connect() -> ip_route_connect()

dst_alloc() allocates an entry in the corresponding SLAB cache, and it may fail if the cache is full. Old entries should be purged when that happens, but perhaps there are cases where it still returns an error.

So I think this is the direction to investigate. The dst cache size can be changed through /proc/sys/net/ipv4/route/max_size. First, check whether that setting (or any other setting under net.ipv4.route) has been changed by the "random TCP tweaks shown on Google".
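A quick way to review those settings is to dump everything under the route directory and compare it with a machine that has not been tuned. This is a minimal sketch; the function name `read_sysctl_dir` is made up, and it simply records None for entries that cannot be read (some, such as flush, are write-only).

```python
import os


def read_sysctl_dir(path="/proc/sys/net/ipv4/route"):
    """Return {name: value} for every regular file directly under path."""
    values = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue
        try:
            with open(full) as f:
                values[name] = f.read().strip()
        except OSError:
            # Unreadable entry: keep the name so it still shows up in the dump.
            values[name] = None
    return values
```

Printing `read_sysctl_dir()` on the affected VM and on a freshly installed one would show which values, if any, differ from the defaults.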

Dmitry
  • `$ cat /proc/sys/net/ipv4/route/max_size 2147483647` This number feels quite high already for me, can it really be the bottleneck in my case? – Nemo Feb 19 '17 at 09:52