I am trying to add the NVIDIA GPU module to Ganglia (/ganglia/gmond_python_modules/gpu/nvidia/).
Do we need to apply the ganglia_web.patch patch?
If I do not apply the patch, I don't see any GPU metrics when I go to http://localhost/ganglia/
If I try to apply the patch, I have the following issue:
ubuntu@server:/usr/share/ganglia-webfrontend$ sudo patch -p0 < /home/ubuntu/gmond_python_modules/gpu/nvidia/ganglia_web.patch
sudo: unable to resolve host server
patching file host_view.php
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 37.
Hunk #3 FAILED at 144.
Hunk #4 FAILED at 153.
Hunk #5 FAILED at 169.
5 out of 5 hunks FAILED -- saving rejects to file host_view.php.rej
patching file templates/default/host_view.tpl
Hunk #1 FAILED at 80.
Hunk #2 FAILED at 89.
2 out of 2 hunks FAILED -- saving rejects to file templates/default/host_view.tpl.rej
ubuntu@server:/usr/share/ganglia-webfrontend$ cd /usr/share/ganglia-webfrontend
The readme does not mention what to do with the patch file.
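Before retrying, it may help to check whether the patch would apply at all without touching the web frontend files. The sketch below relies on the --dry-run option of GNU patch; the function name patch_dry_run is made up for illustration, and the patch file should be given as an absolute path because the function changes directory first.

```shell
# Hypothetical helper: report whether a patch would apply, without changing any files.
# Usage: patch_dry_run TARGET_DIR PATCH_FILE [STRIP_LEVEL]
patch_dry_run() {
    target_dir=$1
    patch_file=$2          # use an absolute path, since we cd below
    strip_level=${3:-0}    # same meaning as the N in patch -pN
    # Run in a subshell so the caller's working directory is left alone.
    # --dry-run prints the per-hunk results but does not modify the files.
    ( cd "$target_dir" && patch --dry-run -p"$strip_level" < "$patch_file" )
}
```

For the setup above this would be called as: patch_dry_run /usr/share/ganglia-webfrontend /home/ubuntu/gmond_python_modules/gpu/nvidia/ganglia_web.patch. The exit status is non-zero when any hunk would fail, which is what the FAILED lines above suggest would happen here.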
The web interface does contain the GPU metric, but all graph images return 404.
When I go to Grid > [name] > [gpu node], I don't see any GPU option.
On the Ganglia server (i.e., on the server where gmetad is running), I ran:
git clone https://github.com/ganglia/gmond_python_modules.git
sudo cp gmond_python_modules/gpu/nvidia/graph.d/* /usr/share/ganglia-webfrontend/graph.d/
sudo /etc/init.d/gmetad restart
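To confirm that the copy in the server steps above really put every graph definition in place, something like the following could be used. It is only a sketch: missing_graph_files is a hypothetical helper name, and it compares file names only, not contents.

```shell
# Hypothetical helper: print the name of every file that exists in SRC_DIR
# but not in DST_DIR. Prints nothing when all files are present.
# Usage: missing_graph_files SRC_DIR DST_DIR
missing_graph_files() {
    src_dir=$1
    dst_dir=$2
    for f in "$src_dir"/*; do
        [ -e "$f" ] || continue            # skip the unexpanded glob of an empty directory
        name=$(basename "$f")
        [ -e "$dst_dir/$name" ] || echo "$name"
    done
}
```

With the paths used above: missing_graph_files gmond_python_modules/gpu/nvidia/graph.d /usr/share/ganglia-webfrontend/graph.d. Empty output means all graph files were copied.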
On the Ganglia client (i.e., on the server where gmond is running, and where the GPU is located), I ran:
git clone https://github.com/ganglia/gmond_python_modules.git
sudo pip install nvidia-ml-py
sudo cp gmond_python_modules/gpu/nvidia/python_modules/nvidia.py /usr/lib/ganglia/nvidia.py
sudo cp gmond_python_modules/gpu/nvidia/conf.d/nvidia.pyconf /etc/ganglia/conf.d
sudo /etc/init.d/ganglia-monitor restart
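One thing that may be worth checking after the client steps is whether gmond actually searches the directory that nvidia.py was copied to. The sketch below assumes the Python module loader is configured in a file containing a line of the form params = "/some/dir"; the file name used in the usage note (/etc/ganglia/conf.d/modpython.conf) and both function names are assumptions, not something taken from the steps above.

```shell
# Hypothetical helper: print the directory named on the first
# params = "..." line of the given gmond Python loader config file.
python_module_dir() {
    sed -n 's/^[[:space:]]*params[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed only if MODULE_FILE exists in that directory.
# Usage: module_is_visible CONF_FILE MODULE_FILE
module_is_visible() {
    dir=$(python_module_dir "$1")
    [ -n "$dir" ] && [ -f "$dir/$2" ]
}
```

For example: module_is_visible /etc/ganglia/conf.d/modpython.conf nvidia.py && echo found. If the directory printed by python_module_dir differs from /usr/lib/ganglia, the module copied above would not be picked up from there.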
I use:
- Ganglia Web Frontend version 3.6.1
- Ganglia Web Backend (gmetad) version 3.6.0
- RRDtool version 1.4.7
- Ubuntu 14.04.3 LTS x64 server