I have been tasked with implementing an 'on the wire' monitoring solution for a large Hadoop installation. The source of data will be a combination of taps and SPANs throughout the environment. My team's usual charter is packet analysis and network performance analysis. Given the architecture of this implementation (and the volume of data), raw packet analysis with tools like Wireshark is just not feasible.

What are my options?

We are looking to monitor things like:

- How is Sqoop/JDBC performing?

- How is connectivity performing between the control tier and the data tier?

- DNS is key to this implementation. Are network services responding in an appropriate manner?
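
To make the DNS item concrete, here is a rough sketch of the kind of metric we would like a tool to produce for us. It assumes the capture layer can already export one record per DNS packet; the `DnsPacket` shape and the `dns_latencies` function are hypothetical names made up for illustration, not part of any product listed here.

```python
from collections import namedtuple

# Hypothetical record shape: one entry per DNS packet seen on a tap or SPAN.
# ts is the capture timestamp in seconds; txid is the DNS transaction ID.
DnsPacket = namedtuple("DnsPacket", "ts client server txid is_response")


def dns_latencies(packets):
    """Pair each DNS query with its response.

    Returns (latencies_in_seconds, unanswered_query_count).
    """
    pending = {}    # (client, server, txid) -> timestamp of first query
    latencies = []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        key = (p.client, p.server, p.txid)
        if not p.is_response:
            # Keep the first query only, so a retransmit does not hide
            # the delay the client actually experienced.
            pending.setdefault(key, p.ts)
        elif key in pending:
            latencies.append(p.ts - pending.pop(key))
    return latencies, len(pending)


# Made-up sample data: two answered queries and one that never got a reply.
sample = [
    DnsPacket(1.0, "10.0.0.5", "10.0.0.2", 101, False),
    DnsPacket(1.25, "10.0.0.5", "10.0.0.2", 101, True),
    DnsPacket(2.0, "10.0.0.6", "10.0.0.2", 202, False),
    DnsPacket(2.5, "10.0.0.6", "10.0.0.2", 202, True),
    DnsPacket(3.0, "10.0.0.7", "10.0.0.2", 303, False),
]
latencies, unanswered = dns_latencies(sample)
```

The same query/response pairing idea would presumably apply to the Sqoop/JDBC and tier-to-tier questions, just keyed on TCP connections rather than DNS transaction IDs.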

While we will be installing the standard suite of monitoring tools (Ganglia and Nagios), we would also like hard network data from an external source to validate performance. Some of the tools we have experience with are Compuware DCRUM/Dynatrace, NetScout, Network Instruments, ExtraHop, and Riverbed.

What are your experiences?

  • Are you committed to using enterprise-grade switches in the tiers, or are you directly intermeshing a large number of interfaces? Most large clusters I've worked with have been deliberately designed to have broken pieces more or less all the time. Taps get very expensive very quickly unless you're doing crazy things with very high-end BSD boxes stuffed with interfaces, and unlike a good tap, those don't fail open. Just a couple of thoughts. I usually want to know how much is going in and out of a rack; I don't really care about the little things if I'll pick 'em up sooner or later. – quadruplebucky Mar 15 '14 at 06:00
  • Yep, we're using enterprise-grade switching gear as well as ingress/egress taps into the new zone. SPANs will be the source of traffic from the top of rack. Either way, we will be collecting the data. I'm just hoping there is some type of tool that can generate actionable metrics from the data. – user212869 Mar 15 '14 at 13:19
