18

What is the best tool to monitor/analyze network traffic on an entire network (several subnets)?

I'm looking for something that will help me troubleshoot bandwidth problems when, for instance, users start complaining that the "network is slow".

Brent

10 Answers

10

I'm assuming you have a commercial router/switch; it most likely supports SNMP, which you can combine with MRTG for a nice traffic graph.
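
For anyone wondering what sits behind such a graph, here is a minimal sketch of the arithmetic: take two samples of an interface octet counter (for example IF-MIB `ifInOctets`), divide the difference by the polling interval, and compare it with the link speed. The function name, the parameters, and the 32-bit wrap handling are my own assumptions for illustration; this is not MRTG's actual code.

```python
COUNTER32_MAX = 2 ** 32  # ifInOctets/ifOutOctets are 32-bit counters and wrap


def utilization_percent(octets_before, octets_after, interval_s, link_bps,
                        counter_max=COUNTER32_MAX):
    """Return link utilization (%) between two octet-counter samples.

    Hypothetical helper: assumes at most one counter wrap between samples.
    """
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    # The modulo recovers the true delta if the counter wrapped once.
    delta_octets = (octets_after - octets_before) % counter_max
    bits_per_second = delta_octets * 8 / interval_s
    return 100.0 * bits_per_second / link_bps
```

With samples taken 300 seconds apart on a 100 Mbit/s link, a counter that grows by 375,000,000 octets works out to 10% utilization. On fast links a 32-bit counter can wrap more than once per interval, in which case the 64-bit `ifHCInOctets` counter (pass `counter_max=2**64`) is the safer thing to poll.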

Adam Gibbins
10

I think your best bet is going to be a mixture of Cacti and Ntop.

ntop will give you information about the traffic on your network: which hosts are consuming the most bandwidth, what traffic is causing slowdowns, and so on.

Cacti is going to give you long-term trends in your bandwidth consumption, so you can tell how your network's traffic has changed over time.

Mark Amerine Turner
4

When you have users reporting 'network issues', the problem could relate to a multitude of issues (routing, switching, host configuration, unicast, multicast, security policy, hardware failure). It's very unlikely that you'll find one piece of software to monitor all your different potential problems.

Instead, focus on two things:

  • Instrumentation: come up with a monitoring strategy that allows you to proactively monitor for those faults that occur regularly. See this previous answer for more detail.

  • Troubleshooting: come up with a quick, standard series of tests that you can run to immediately try and isolate where the problem might be, and publish it to your users.

Some example tests:

  • ping your default gateway
  • ping another host on the same subnet
  • ping an off subnet host
  • what kind of packet loss are you getting?
  • do results vary with packet size?
  • can you successfully telnet from the command line to the destination IP/port?
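
The last test (can you open a TCP connection to the destination IP/port?) is easy to script if telnet isn't installed. A minimal sketch using only the Python standard library; the function name and the default timeout are my own choices, not part of any standard tool:

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper standing in for 'telnet host port'.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `tcp_port_open("192.0.2.10", 443)` (a documentation address, substitute your own destination) tells you whether the handshake completes. A `False` result doesn't distinguish "refused" from "filtered" or "timed out", so treat it as a first pass only.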

These kinds of simple diagnostics can often point you very quickly in the right direction. Finally, if you can, always get a source IP, a destination IP, and a destination port. Try to educate your users; ambiguous complaints like 'the network is slow' can't be easily diagnosed.

Murali Suriar
3

Try MRTG and/or ntop.

Node
2

I'm working at an organization that has a small to medium-sized network (~500 users) and about a dozen /24 subnets (and a handful of smaller ones behind NAT). We use a variety of monitoring software that allows us to keep tabs on remote parts of the network and respond to problems proactively.

  • SNMP - This forms the basis of our monitoring system. All the network infrastructure, at a minimum, needs to support SNMP and logging to a central server via syslog.
  • OpenNMS - Primarily used for event monitoring, although we're beginning to use it for asset and performance tracking. I constantly monitor OpenNMS. If there's a problem with the network, I want to know about it before someone calls me.
  • SFlow/Netflow - This is really useful to determine how much traffic is flowing through which piece of the network and which host is generating that traffic (i.e., top talkers/top listeners).
  • Smokeping - This is mostly used for latency and connectivity tracking, particularly for wireless bridges or other troublesome connections.
  • MRTG - Traffic monitoring on infrastructure devices that don't support SFlow/Netflow is done with MRTG.
  • Linux Network "Probes" - Some parts of our network are not reachable by design and have separate physically discrete connections. An old workstation with a Linux install that has a point of presence on both network segments allows us to keep an eye on these segments using tools like the aforementioned Smokeping and MRTG, but also any of the useful command line tools such as ntop, tcpdump, tcptraceroute, httping, and the venerable ping.
  • TippingPoint IPS System - It's basically Snort in a black box. While it's completely dependent on pattern recognition, the TippingPoint system sits on the network edge and allows us to look for interesting Layer-7 events (malware, scanning, TCP/IP weirdness, etc.).
  • BlueCoat Packeteer - This is mostly a QoS and web filtering device, but it does give a nice high-level view of what the Layer-7 ingress and egress traffic breaks down into. For example: it's not surprising that 80% of our ingress traffic is HTTP, but how much of that is Facebook, Pandora, YouTube, etc.? It also provides a list of top talkers/top listeners on a per-application basis, which again is interesting information.
  • Wavemon and a laptop with a decent wireless card are used for 802.11 wireless monitoring and troubleshooting as a substantially less expensive replacement for a Fluke AirCheck. The Fluke supports 5 GHz (which some of our wireless bridges use), can pick up non-802.11 traffic, and is an all-around useful RF tool, but I have a hard time recommending it because of the cost.
2

I have been using Smoothwall at home with great success; it does a great job monitoring traffic and a ton more.

It comes in a corporate edition as well that does some more fancy stuff.

I was trying to figure out why I kept running out of bandwidth (in Australia we have limits). Turns out it was my fault :)

Sam Saffron
1

Check out the products from VSS Monitoring. They have several different in-line fail safe products for monitoring network traffic remotely. Once you have them peered into your network(s) and on the backbone, it is as good as being there.

Tall Jeff
1

If you have a router capable of reporting netflows, look into a netflow handler. Where MRTG will provide link utilization, netflows report IP and protocol usage flowing through the router. So, instead of "Suzy in accounting is using a lot of traffic" or "The port the WAP is on has high utilization", you could see "Suzy in accounting is 10% LAN traffic, 40% streaming media, and 50% internet HTTP traffic".
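
To make the "Suzy" breakdown concrete, here is a rough sketch of the aggregation a flow handler performs once flows have been collected and classified. The `(host, category, bytes)` tuple layout and the category names are invented for illustration; real NetFlow/sFlow records carry addresses, ports, and protocol numbers that a collector has to map to categories first.

```python
from collections import defaultdict


def traffic_breakdown(flows):
    """Return {host: {category: percent_of_that_host's_bytes}}.

    `flows` is an iterable of (host, category, byte_count) tuples,
    a simplified, hypothetical stand-in for classified flow records.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for host, category, byte_count in flows:
        totals[host][category] += byte_count

    breakdown = {}
    for host, per_category in totals.items():
        host_total = sum(per_category.values())
        if host_total == 0:
            continue  # avoid dividing by zero for hosts with empty flows
        breakdown[host] = {
            category: 100.0 * n / host_total
            for category, n in per_category.items()
        }
    return breakdown
```

Feeding it flows totalling 100 bytes of "lan", 400 of "streaming", and 500 of "http" for one host yields 10%, 40%, and 50% for that host, which is the kind of per-user picture link utilization alone can't give you.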

Unfortunately I don't have a recommendation for a free flow aggregator. After a net monitoring company tried to sell my company a solution and I determined that their whole product was based on netflows, I made a note to research them. Before I got around to it we bought another NOC solution that also included a flow aggregator.

jj33
1

I've been using Wireshark for years. Love it.

Spencer Ruport
1

First of all, are the users complaining about your local network?

The fileserver is slow!

Or are they complaining about remote websites?

Facebook is slow! I can't do my work!

If it's the former, then I would start with the fileserver in question and work backwards. First, check the fileserver: is its utilization out of the ordinary? Check the interface that user traffic flows over. Is it pegged? Is auto-negotiation enabled? Is it enabled on both ends?

If everything looks OK there and the server is not under any undue load, try the routers and switches in the path between the user and the server. Are they overloaded? Is auto-negotiation enabled? Check the interface counters for errors.
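
On a Linux host such as the fileserver itself, the same error counters can be read from `/proc/net/dev`; on routers and switches you would use the device CLI or SNMP instead. A small sketch of a parser, assuming the usual 16-column layout of that file (8 receive fields followed by 8 transmit fields); the function name and the returned keys are my own:

```python
def interface_errors(proc_net_dev_text):
    """Parse the text of /proc/net/dev into per-interface error counters.

    Hypothetical helper; assumes 8 receive columns followed by 8 transmit
    columns after the 'name:' prefix.
    """
    stats = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
            "collisions": int(fields[13]),
        }
    return stats
```

Read the file with `open("/proc/net/dev").read()`, pass the text in, and sample it twice a few minutes apart. Counters that keep climbing, especially errors and collisions, often point at a duplex mismatch or a bad cable.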

If there appears to be nothing wrong, then the problem may be local to the user's workstation. Is it under undue load? Are there any hardware errors (disk errors causing blocking while the firmware retries)? Is the machine low on real memory (Firefox paging hard)?

This usually solves 99% of the problems.

Depending on how often you have to deal with these requests, you may prefer to reverse the order of these steps.

Alternatively, if it's a problem with a remote site, then after debugging your network and the user's workstation, try tools like mtr to detect packet loss between you and the remote site. If the problem is not local to your network, then your options are probably limited to logging a case with your provider or waiting until the remote site gets over whatever tizzy it's having.

Dave Cheney