I monitor approx. 10 Linux servers with 4 CPU cores each with Zabbix.
I have been receiving way too many false alarms from the "Processor load is too high" trigger lately.
The "Processor load is too high" trigger expression was:
{Template OS Linux:system.cpu.load[percpu,avg1].avg(5m)}>5
which is the default.
Then I raised 5 to 12 to get fewer alarms, but I suspected this was not the best way to deal with it. So I did some Googling and constructed a new trigger:
{Template OS Linux:system.cpu.util[,user].max(5m)}>75
I'd like to ask the community:
- Will the new expression reflect REAL CPU overload better than the original one?
- Would you do it differently, or in a better or more optimized way?
How would you compose an expression that does the following?
The trigger should fire if:
- the 5-minute average number of processes waiting in the per-CPU queue is more than 3
AND
- the maximum CPU utilization during the last 5 minutes is higher than 75%
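To make the intended logic concrete, here is a rough Python sketch of what I want the trigger to evaluate. This is not Zabbix syntax; the function name `should_fire`, its parameters, and the sample values are made up for illustration, and I'm assuming each list holds the values collected during the last 5 minutes.

```python
from statistics import mean


def should_fire(load_samples, util_samples, load_limit=3.0, util_limit=75.0):
    """Return True only when BOTH conditions hold over the 5-minute window.

    load_samples: per-CPU load values (like system.cpu.load[percpu,avg1])
    util_samples: user CPU utilization in percent (like system.cpu.util[,user])
    """
    avg_load = mean(load_samples)  # plays the role of avg(5m)
    max_util = max(util_samples)   # plays the role of max(5m)
    return avg_load > load_limit and max_util > util_limit


# Hypothetical sample windows, not real measurements.
print(should_fire([3.5, 4.0, 3.2], [60, 80, 70]))  # both conditions met
print(should_fire([3.5, 4.0, 3.2], [60, 70, 50]))  # load high, utilization not
```

The point is that a high load alone, or a short utilization spike alone, should not raise an alarm; only the combination should.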
I followed the examples in an article and tried:
({Template OS Linux:system.cpu.load[percpu,avg1].avg(5m)}>3
&
{Template OS Linux:system.cpu.util[,user].max(5m)}>75)
but it failed.
The Zabbix server returned this error:
Incorrect trigger expression. Check expression part starting from " & {Template OS Linux:system.cpu.util[,user].max(5m)}>75)".
Since I'm not a Zabbix expert (yet), any comments will be greatly appreciated.
Thanks.