I am monitoring my server infrastructure using Icinga2 with some master/satellite configurations.
On Linux and Windows hosts I am monitoring the default system metrics, such as CPU usage and free system memory. On worker nodes, CPU usage often reaches 100% (or free RAM drops to 5%), so I receive many CRITICAL alarms that do not indicate a real problem.
So, would it be better to:
- simply avoid monitoring free memory and CPU usage
- set the critical thresholds to 0% for free memory and 100% for CPU usage
- continue to monitor them but without receiving any alerts
- simply discard alerts
- something else?