Error opening specified endpoint "udp:127.0.0.1:161"
This looks like quite a slippery area in the snmpd codebase - not sure if this is a property of snmpd (as part of Net-SNMP) or Debian and friends.
I've encountered this just now in Debian 11.3.
Too lazy to dive into the source code, I've managed to find another symptom of the error using strace, namely: strace -f -o /var/trace.txt snmpd -u root -g root . In the output trace, you get two consecutive bind() calls on 0.0.0.0:161: the first one succeeds, and the second one fails with EADDRINUSE (address already in use). Initially I didn't notice that there were two consecutive bind() attempts, so I went looking for the culprit using netstat -lnp, which yielded no candidates. Then I had a flurry of socket-related keywords emanate from my ageing grey matter, along the lines of TIME_WAIT, SO_LINGER, SO_REUSEADDR (mind that this probably points to TCP only, not UDP) - until I noticed that snmpd itself is actually the culprit!
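To make the symptom concrete, here is a minimal Python sketch of what two consecutive bind() calls on the same UDP address look like from inside a single process. It is not snmpd's code, only an illustration of the kernel behaviour seen in the trace. The function name double_bind_udp is made up for this example, and it uses a kernel-assigned port instead of 161 so that it runs without root.

```python
import errno
import socket


def double_bind_udp(host="0.0.0.0"):
    """Bind two UDP sockets to the same host and port; report both outcomes."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the kernel for a free port, so no root is needed
        # (the real snmpd port, 161, is privileged).
        first.bind((host, 0))
        port = first.getsockname()[1]
        try:
            # Same address, same port, same process: mirrors the second
            # bind() call in the strace output.
            second.bind((host, port))
            second_result = "ok"
        except OSError as exc:
            second_result = errno.errorcode.get(exc.errno, str(exc.errno))
        return "ok", second_result
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(double_bind_udp())
```

On Linux I would expect the second element to be EADDRINUSE. The process holding the port is the very process that is complaining, which is why hunting for some other listener with netstat leads nowhere.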
There's a Debian bug report along those lines, filed in 2017 for snmpd v5.7. I am already at v5.9, and apparently the bug (or similar) is still there.
There's another bug report filed with Net-SNMP in 2019 for v5.8 - which claims that the first bind() attempt actually stems from the trapsink keyword somewhere in the config files, specifying the localhost address, and defaulting to port 161 as well. I've tried following that advice and it doesn't seem to apply to my case.
Mine appears to be the "debian flavoured" bug above. Actually my best bet is to exclude (comment out) any references to agentAddress anywhere in the config files - in that case, snmpd will end up starting, reported by netstat as listening on 0.0.0.0:161. If I add my own declaration of agentAddress, it must either not overlap with the default IP address (0.0.0.0), which is impossible, or it must use a different UDP port (or TCP instead of UDP). If I meet that rule, both sockets get opened and listen, and I get the same response to my SNMP queries from both.
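The overlap rule can be tried out at the socket level with a small Python sketch. This only imitates the kernel side of things, under the assumption that the default listener behaves like a plain UDP socket bound to 0.0.0.0; it says nothing about how snmpd parses agentAddress. The names try_bind and overlap_demo are invented for this example, and a kernel-assigned port stands in for 161.

```python
import errno
import socket


def try_bind(sock_type, host, port):
    """Try to bind a socket; return ('ok' or errno name, socket or None)."""
    sock = socket.socket(socket.AF_INET, sock_type)
    try:
        sock.bind((host, port))
        return "ok", sock
    except OSError as exc:
        sock.close()
        return errno.errorcode.get(exc.errno, str(exc.errno)), None


def overlap_demo():
    """Hold UDP 0.0.0.0:<port>, then try three additional listeners."""
    # Stand-in for the default listener: UDP on the wildcard address.
    status, default = try_bind(socket.SOCK_DGRAM, "0.0.0.0", 0)
    if default is None:
        raise OSError("could not bind the stand-in default listener: " + status)
    port = default.getsockname()[1]
    attempts = {
        # A specific IP on the same UDP port still overlaps the wildcard.
        "udp_same_port": (socket.SOCK_DGRAM, "127.0.0.1", port),
        # Port 0 lets the kernel pick a different free UDP port.
        "udp_other_port": (socket.SOCK_DGRAM, "127.0.0.1", 0),
        # TCP has its own port space, so the same number can be reused.
        "tcp_same_port": (socket.SOCK_STREAM, "127.0.0.1", port),
    }
    results, held = {}, [default]
    try:
        for label, (sock_type, host, wanted_port) in attempts.items():
            status, sock = try_bind(sock_type, host, wanted_port)
            results[label] = status
            if sock is not None:
                held.append(sock)
    finally:
        for sock in held:
            sock.close()
    return results


if __name__ == "__main__":
    for label, status in overlap_demo().items():
        print(label, status)
```

On Linux I would expect udp_same_port to fail with EADDRINUSE and udp_other_port to succeed. The TCP attempt normally succeeds as well, unless some unrelated process happens to hold that TCP port.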
Coupled to that, I've noticed some peculiarities in the "precedence of config files", such as:
- if I specify agentAddress in both /usr/share/snmp/snmpd.conf and /etc/snmp/snmpd.conf, the declaration in /etc/snmp/snmpd.conf gets ignored, and the one in /usr/share/snmp/snmpd.conf prevails - despite the fact that I can see both these files getting opened in the strace output.
- the debianese systemd wrapper around snmpd, called /etc/systemd/system/multi-user.target.wants/snmpd.service, contains some extra cmdline args: ExecStart=/usr/sbin/snmpd -LOw -I -smux,mteTrigger,mteTriggerConf . I find it kind of difficult to follow all their respective effects, and somehow they do affect the MIBs that get loaded (compared to me just running snmpd by hand at the cmdline) - I get the lmSensors MIB loaded if I start snmpd via systemd, otherwise not. Possibly the modules excluded by the -I - option are troublesome. Also, the systemd wrapper contains an extra condition: ConditionPathExists=/etc/snmp/snmpd.conf . So I cannot just erase /etc/snmp/snmpd.conf. And if I just "touch" an empty file, snmpd starts, but doesn't respond to the lmSensors OIDs.
Interesting stuff.
In other words, there are several similar misbehaviors around the default IP and socket to bind, and whether this gets overridden by an explicit agentAddress (or collides with it). Plus, based on what different people report as workarounds, the precedence of config files can also differ between distros. It is curious to me that this sounds like a fairly boring area of the codebase, and it has bugs - and yet the arcane SNMP engine and modular architecture of the thing (pluggable sub-agents) seem rock solid.