5

I'm running Ganglia 3.1.2 on a network that has no multicast, and I cannot turn it on. Does anyone have an elegant solution for getting Ganglia to work correctly without it? I found this:

http://code.google.com/p/ganglia-multicast-hack/

but it does not scale very well.

Right now, I have a separate data_source line for each host on my network in my gmetad.conf file, but that does not scale well either, and I can't get accurate summary statistics because gmetad keeps overwriting the RRDs (although the per-host statistics work just fine).

Any pointers would be greatly appreciated (or confirmation that I have found the best solution already).

Thanks!

jedberg

4 Answers

5

After further research, I found the answer. On my clients, I added the following to gmond.conf:

udp_send_channel {
  host = monitoring-host
  port = 8666
  ttl = 1
}

udp_send_channel {
  host = monitoring-host-backup
  port = 8666
  ttl = 1
}

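This sends the metrics via unicast UDP to both the monitoring host and its backup. Note that ttl = 1 is a hop limit, not a send interval; how often each metric is sent is controlled by the collection_group settings in gmond.conf.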

Then on the monitoring host, I added this:

udp_recv_channel {
  port = 8666
}

The key is to remove the multicast (mcast_join) entries, which are present in the default configuration.
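To tie this back to gmetad, here is a minimal sketch of how the monitoring host might be wired up. It assumes gmond on that host still exposes its stock tcp_accept_channel on port 8649, and the data source name "my cluster" is only a placeholder:

```
# gmond.conf on the monitoring host
udp_recv_channel {
  port = 8666   # must match the udp_send_channel port on the clients
}

tcp_accept_channel {
  port = 8649   # assumed: gmond's usual TCP port, which gmetad polls for XML
}

# gmetad.conf: one data_source for the whole cluster instead of one per host
# ("my cluster" is a placeholder name)
data_source "my cluster" localhost:8649
```

With a single data source for the cluster, gmetad no longer needs a data_source line per host.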

jedberg
2

This works, but the problem is that all the nodes end up in the same default data source, so their cluster information is lost, which is a drawback in multi-cluster environments.

I haven't tried it yet, but a possible workaround would be to create a UDP channel for each cluster, which becomes unwieldy if you have many of them.

Later Edit:

My current setup uses unicast at the cluster level because of networking limitations: within each cluster, all nodes send their data to one node of that cluster. gmetad then polls each of those nodes to get all the data for its cluster.

This way the clusters will be assigned to their own data sources, and their full information will be there.

The config would look like this:

# on each node in the cluster
udp_send_channel {
  host = 1.2.3.4 # this is a member of the cluster, not a gmetad server
  port = 8650
}

Then on the gmetad host:

data_source "My Cluster" 1.2.3.4

For redundancy you can have multiple udp_send_channel entries and multiple IPs listed in the data_source. I personally use two for each cluster.
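As a rough sketch of that redundancy, assuming a second cluster member at the hypothetical address 1.2.3.6 that also runs gmond with a udp_recv_channel on port 8650:

```
# on each node in the cluster: send to both collectors
udp_send_channel {
  host = 1.2.3.4
  port = 8650
}

udp_send_channel {
  host = 1.2.3.6 # hypothetical second collector in the same cluster
  port = 8650
}

# in gmetad.conf: list both collectors for the cluster
data_source "My Cluster" 1.2.3.4 1.2.3.6
```

As far as I know, gmetad treats the extra addresses as failover candidates and polls the next one only when the first does not respond.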

For federation I use something like this:

data_source "My Grid" 1.2.3.5:8651

This is valid only if you have a gmetad listening on port 8651 on that host.

2

I faced the same problem while configuring Ganglia on Amazon EC2, which does not allow multicast on its network. The solution is to switch to unicast mode, which fortunately works.

In short, the steps below replace multicast with unicast.

  1. Make one of your nodes the master; it runs the gmond (Ganglia data collector) daemon.

Example: 10 nodes are running the gmond daemon. Pick any one of the 10 and make it the master, which will receive the data from all 10 nodes, including itself.

# Define the cluster.
cluster {
  name = "Yellow"
  owner = "Your Company"
  latlong = "N34.02 W118.45"
  url = "http://yourcompany.com/"
}

# Disable multicast and define the host, the yellow master, where nodes in the cluster send data.

udp_send_channel {
  # mcast_join = 239.2.11.71  # not needed, multicast is not used
  host = master.among10node.com  # IP or hostname of the node chosen to act as master
  port = 8649
  ttl = 1
}

udp_recv_channel {
  # mcast_join = 239.2.11.71  # disabled, multicast is not used
  port = 8649
  # bind = 239.2.11.71  # not needed, multicast is not used
}

Note: Copy the same configuration to all 10 nodes running the gmond daemon. Restart the master first, then all the others. The master node should then hold the data from all the other nodes.

Now configure Ganglia data consolidator (gmetad) daemon to use your Master Node as a primary data source.

Example:

data_source "Yellow" master.among10node.com 

# default port is 8649, define here if you are using non default
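For instance, if the master's gmond served its XML on a non-default TCP port, the line might look like the sketch below. Port 8650 and the 15-second polling interval are purely illustrative values, not taken from the setup above:

```
# gmetad.conf (sketch; port 8650 and the 15-second interval are assumed values)
data_source "Yellow" 15 master.among10node.com:8650
```

The port given here is the TCP port on which the master's gmond accepts connections, not the UDP port the nodes send to.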

Now restart the gmetad daemon and let the magic begin.

Cheers Mohd Mozammil Khan

quanta
0

See also:

https://github.com/ganglia/monitor-core/tree/feature/cloud

I installed it today and got it working on EC2, which doesn't allow multicast.

dmourati