16

I'm implementing a network monitoring solution for a very large network (approximately 5000 network devices). We'd like to have all devices on our network send SNMP traps to a single box (technically this will probably be an HA pair of boxes) and then have that box pass the SNMP traps on to the real processing boxes. This will allow us to have multiple back-end boxes handling traps, and to distribute load among those back end boxes.

One key feature that we need is the ability to forward the traps to a specific box depending on the source address of the trap. Any suggestions for the best way to handle this?

Among the things we've considered are:

  • Using snmptrapd to accept the traps, and have it pass them off to a custom-written Perl handler script to rewrite the trap and send it to the proper processing box
  • Using some sort of load balancing software running on a Linux box to handle this (having some difficulty finding many load balancing programs that will handle UDP)
  • Using a Load Balancing Appliance (F5, etc)
  • Using IPTables on a Linux box to route the SNMP traps with NATing

We've currently implemented and are testing the last solution: a Linux box with IPTables configured to receive the traps and then, depending on the source address of the trap, rewrite the destination with a destination NAT (DNAT) rule so the packet gets sent to the proper server. For example:

# Range: 10.0.0.0/19       Site: abc01    Destination: foo01
iptables -t nat -A PREROUTING -p udp --dport 162 -s 10.0.0.0/19 -j DNAT --to-destination 10.1.2.3
# Range: 10.0.33.0/21       Site: abc01    Destination: foo01
iptables -t nat -A PREROUTING -p udp --dport 162 -s 10.0.33.0/21 -j DNAT --to-destination 10.1.2.3
# Range: 10.1.0.0/16       Site: xyz01    Destination: bar01
iptables -t nat -A PREROUTING -p udp --dport 162 -s 10.1.0.0/16 -j DNAT --to-destination 10.3.2.1

This should handle basic trap routing very efficiently, but it limits us to whatever IPTables can match and filter on, so we're concerned about flexibility for the future.
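For comparison, the same source-based decision can be sketched in a few lines of Ruby. This is only an illustration of the lookup logic, not something we run: the table reuses the ranges and destinations from the rules above, `ROUTES` and `backend_for` are names made up for this sketch, and the first matching range wins, as it does in the iptables chain.

```ruby
require 'ipaddr'

# Hypothetical routing table mirroring the iptables rules: first match wins.
ROUTES = [
  [IPAddr.new('10.0.0.0/19'),  '10.1.2.3'],  # site abc01 -> foo01
  [IPAddr.new('10.0.33.0/21'), '10.1.2.3'],  # site abc01 -> foo01
  [IPAddr.new('10.1.0.0/16'),  '10.3.2.1'],  # site xyz01 -> bar01
].freeze

# Return the back-end address for a trap source, or nil if no range matches.
def backend_for(source_ip, routes = ROUTES)
  addr = IPAddr.new(source_ip)
  match = routes.find { |net, _dest| net.include?(addr) }
  match && match[1]
end
```

Doing the lookup in code like this is where the extra flexibility would come from, since the match could later key on anything parsed out of the trap rather than only on what iptables can see.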

Another feature that we'd really like, but that isn't quite a "must have", is the ability to duplicate or mirror the UDP packets. Being able to take one incoming trap and route it to multiple destinations would be very useful.
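To make the mirroring idea concrete, here is a minimal sketch of a fan-out step using only Ruby's standard `socket` library. `fan_out` is a hypothetical helper name, and the copies will carry the relay's own source address (nothing here spoofs the original sender), so treat it as an illustration rather than a finished relay.

```ruby
require 'socket'

# Send one copy of +datagram+ to every [host, port] pair in +destinations+.
# Returns the number of copies sent. The copies originate from this host,
# so receivers see the relay's address, not the original device's.
def fan_out(datagram, destinations)
  socket = UDPSocket.new
  destinations.each { |host, port| socket.send(datagram, 0, host, port) }
  destinations.size
ensure
  socket&.close
end
```

A receive loop would call something like `fan_out(data, [['10.1.2.3', 162], ['10.3.2.1', 162]])` for each datagram it reads.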

Has anyone tried any of the possible solutions above for SNMP traps (or Netflow, general UDP, etc) load balancing? Or can anyone think of any other alternatives to solve this?

masegaloeh
Christopher Cashell

6 Answers

4

A co-worker just showed me samplicator. This tool looks to be just about a perfect solution for what I was looking for. From the tool's website:

This simple program listens for UDP datagrams on a network port, and sends copies of these datagrams on to a set of destinations. Optionally, it can perform sampling, i.e. rather than forwarding every packet, forward only 1 in N. Another option is that it can "spoof" the IP source address, so that the copies appear to come from the original source, rather than the relay. Currently only supports IPv4.

It can be used to distribute e.g. Netflow packets, SNMP traps (but not informs), or Syslog messages to multiple receivers.
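The "1 in N" sampling mentioned in that description can be pictured with a simple counter. I don't know how samplicator implements it internally, so the `Sampler` class below is purely a hypothetical sketch of the idea.

```ruby
# Counter-based sampler: lets through every Nth datagram it is asked about.
class Sampler
  def initialize(n)
    raise ArgumentError, 'n must be an integer >= 1' unless n.is_a?(Integer) && n >= 1

    @n = n
    @seen = 0
  end

  # True when the current datagram should be forwarded.
  def forward?
    @seen += 1
    (@seen % @n).zero?
  end
end
```

With `Sampler.new(1)` every datagram is forwarded, which is what you would want for traps; sampling makes more sense for high-volume feeds such as Netflow.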

Christopher Cashell
3

I would implement the solution myself, as I doubt you will find something as specific as you want.

I would use a high-level language like Ruby to implement the balance rules and even the trap listener. For instance, using this library seems easy.

Listen to traps:

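require 'snmp'  # provided by the snmp gem

# Port 1062 avoids needing root; the standard trap port is 162.
m = SNMP::TrapListener.new(:Port => 1062, :Community => 'public') do |manager|
  manager.on_trap_default { |trap| p trap }
end
m.join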

You should add the balance logic in the on_trap_default block.

Send traps:

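require 'snmp'  # provided by the snmp gem

# Classes are qualified with SNMP:: so that Integer is SNMP::Integer,
# not Ruby's built-in Integer (which has no .new).
SNMP::Manager.open(:Version => :SNMPv1) do |snmp|
  snmp.trap_v1(
    "enterprises.9",       # enterprise OID
    "10.1.2.3",            # agent address
    :enterpriseSpecific,   # generic trap type
    42,                    # specific trap type
    12345,                 # timestamp
    [SNMP::VarBind.new("1.3.6.1.2.3.4", SNMP::Integer.new(1))])
end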

To build the daemon you could use the daemon-kit Ruby gem.

If you keep it simple and define good objects, you can maintain the software without much effort.

chmeee
  • I appreciate the answer, but honestly if I build something myself, it'll be based around Net-SNMP's snmptrapd, and implemented in Perl, as snmptrapd has built-in support for accepting traps and calling Perl modules to handle them. That keeps it simpler and much better supported (we have a dozen guys who can handle basic Perl, and one guy who's (barely) toyed with Ruby). – Christopher Cashell Jul 07 '09 at 22:53
1

Your main problem is going to be: how do you know the actual IP of the device you are receiving the traps from?

If you are using SNMP v1, you can get the IP from the header of the trap. If you are using v2 or v3 traps, you will need to correlate the SNMP engine ID with an IP that you have previously fetched from the device. The engine ID is typically not a mandatory config item in most SNMP implementations, so you can't fully rely on that alone.

The fallback is to use the source IP from the UDP packet header. Of course, this will fail if your trap is routed through another EMS/NMS or if you have a NAT between the device and your management application.

  1. If you don't need to support NAT or traps forwarded from another NMS, then just make a copy of the UDP packet and route it based on the source IP.

  2. If you do need to support that, you have to parse the SNMP trap: for v2/v3, check for an engine ID match; for v1, read the address from the agent-address field in the SNMP header.
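The decision described in the two cases above could look roughly like this once the trap has already been parsed. Everything here is a hypothetical sketch: `origin_address`, its keyword arguments, the version symbols, and the engine-ID-to-IP map are illustrative names, and the parsing itself is left out.

```ruby
# Pick the address to route on, given fields already extracted from a trap.
# version:       :v1, :v2c or :v3 (symbols are this sketch's own convention)
# agent_address: the v1 agent-address field, if present
# engine_id:     the engine ID carried by the trap, if any
# engine_map:    engine ID => device IP, collected from the devices beforehand
def origin_address(udp_source:, version:, agent_address: nil, engine_id: nil, engine_map: {})
  if version == :v1
    agent_address || udp_source
  else
    # Fall back to the UDP source when the engine ID is unknown.
    engine_map.fetch(engine_id, udp_source)
  end
end
```

The fallback to `udp_source` is exactly the case that breaks behind NAT or another NMS, so a real implementation would probably want to log whenever it is taken.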

0

one more netfilter-based hack:

# each rule only sees packets that the earlier rules did not match, so the
# second rule takes 50% of the remainder to get an even three-way split
iptables -t nat -A PREROUTING -d 10.0.0.1 -p udp --dport 162 -m random --average 33 -j DNAT --to-destination 10.0.0.2:162
iptables -t nat -A PREROUTING -d 10.0.0.1 -p udp --dport 162 -m random --average 50 -j DNAT --to-destination 10.0.0.3:162
# everything else goes to other consumer
iptables -t nat -A PREROUTING -d 10.0.0.1 -p udp --dport 162 -j DNAT --to-destination 10.0.0.4:162

[ assumption - all traps are sent to 10.0.0.1, which then redirects them to 10.0.0.2, 10.0.0.3, 10.0.0.4 ]

as long as you have one-packet-long snmp traps - this should spread load nicely - in this case across 3 machines. [ although i have not tested it ].
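One thing worth checking with sequential random rules is the arithmetic: each rule's percentage applies only to packets that earlier rules let through. The sketch below (helper names are made up for illustration) computes the per-rule probability needed for an even split and the overall share that a given cascade actually produces.

```ruby
# Per-rule probability so that each of n sequential rules ends up
# with 1/n of the total traffic (the last rule is a catch-all).
def cascade_probabilities(n)
  (0...n).map { |k| Rational(1, n - k) }
end

# Overall share of traffic each rule receives, given per-rule probabilities.
def overall_shares(probs)
  remaining = Rational(1)
  probs.map do |p|
    share = remaining * p
    remaining -= share
    share
  end
end
```

For three targets this gives 1/3, 1/2 and 1, whereas a cascade of 33%, 33% and a catch-all works out to roughly 33%, 22% and 45%.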

pQd
  • Actually, we very much don't want the load spread randomly. We want all traps from a given subnet to hit the same machine so we can correlate events to specific sites. Right now my IPTables rules set the DNAT destination based on the source of the trap. – Christopher Cashell Jul 07 '09 at 22:56
  • @Christopher Cashell - then, as an alternative to your solution, you can use the u32 netfilter module to 'hash' the destination server based on the src ip address, e.g. take the last 2 bits of the src ip address and spread load across 4 snmp 'consumers'. http://www.netfilter.org/documentation/HOWTO/netfilter-extensions-HOWTO-3.html#ss3.21 – pQd Jul 08 '09 at 06:55
  • @Christopher Cashell http://www.stearns.org/doc/iptables-u32.v0.1.html is a nice tutorial for the u32 match. alternatively - look at the "linux virtual server" project - they can do load balancing for udp packets based on src/dst ip as well. – pQd Jul 08 '09 at 07:04
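The hashing idea from these comments, written out as a small Ruby sketch rather than a u32 rule. `consumer_for` and its `shift` parameter are hypothetical: with four consumers and `shift: 0` it keys on the last 2 bits of the source address, and a larger shift keys on network bits instead, so that hosts in the same subnet land on the same consumer.

```ruby
require 'ipaddr'

# Deterministically map a source address to one of the consumers.
# shift: number of low-order bits to discard before hashing
#        (0 = hash on the host bits, 8 = hash on the /24 network, ...).
def consumer_for(source_ip, consumers, shift: 0)
  key = IPAddr.new(source_ip).to_i >> shift
  consumers[key % consumers.size]
end
```

Because the mapping depends only on the address, every trap from a given device always reaches the same consumer, unlike the random match.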
0

I think the answer from chmeee is the right way to go. Get rid of UDP and SNMP as early in the process as you can; they are horrible to manage.

I'm now building a system that will put all events (including traps) on a JMS queue and then use all the wonders of enterprise messaging to do load balancing and failover.

Aleksandar Ivanisevic
  • I think you're misunderstanding. . . I'm not trying to build a full monitoring system, just an SNMP trap router. We've got 5000 network devices and hundreds of thousands of ports we're monitoring here. There's no way I'm reinventing that wheel. . . just trying to make the tools we have work better. – Christopher Cashell Jul 08 '09 at 17:50
  • I understood you right, probably you didn't understand me ;) JMS is used as a transport because modern brokers have all those nice failover, persistence and balancing features. You can POST to a URL, send an email, SOAP, whatever works. UDP was never built to be reliable or balanceable since it has no concept of data stream or flow control. You'll just be screwed in the long run trying to make UDP do what it was not designed to do. – Aleksandar Ivanisevic Jul 09 '09 at 08:38
  • I appreciate the suggestion, but I really have absolutely no intention or interest in building my own enterprise level network monitoring system. There are plenty of them already available, and implementing one with the feature set and scalability that we require would need a team of a dozen programmers for 2-4 years. It's not feasible or desirable. That leaves me with interacting with existing systems, and that leaves me dealing with a *lot* of SNMP over UDP. – Christopher Cashell Mar 22 '10 at 16:41
0

Your main problem is going to be, how do you know the actual ip of the device you are receiving the traps from?

To get the original sender's IP, you could try patching snmptrapd with this patch - https://sourceforge.net/p/net-snmp/patches/1320/#6afe.

The patch modifies the payload and leaves the IP headers intact, so it doesn't interfere with your routing and/or NAT.

Piotr Kierklo