I have a query that returns used memory in Prometheus as a fraction between 0.0 and 1.0. I can alert on this directly, but I don't want alerts for a short burst, only for a sustained high level or an average over time that exceeds the limit.

I was hoping to do this in the query, but if Alertmanager can do it, that is acceptable; I just can't find how.

The query

(node_memory_MemTotal - node_memory_MemFree - node_memory_Buffers - node_memory_Cached) / node_memory_MemTotal

The question

How can I take the average over i minutes of that query result?

virullius

1 Answer

I seem to have found a way to do this but I'm not sure it's the best.

(((node_memory_MemTotal offset 5m - node_memory_MemFree offset 5m - node_memory_Buffers offset 5m - node_memory_Cached offset 5m) / node_memory_MemTotal offset 5m) + ((node_memory_MemTotal - node_memory_MemFree - node_memory_Buffers - node_memory_Cached) / node_memory_MemTotal)) / 2

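This uses the offset modifier to take the same measurement 5 minutes ago and at query time, then adds the two results and divides by 2 to get their average. Note that this averages only those two samples, not every sample in the 5-minute window.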
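
A possible alternative, as a sketch only: on Prometheus 2.7 or later, which supports subqueries, the whole expression can be wrapped in avg_over_time so that it is evaluated at regular steps across the window and those results are averaged. The 5m window, the 1m resolution, and the 0.9 threshold below are illustrative assumptions, not values from the question.

```promql
# Average the used-memory fraction over the last 5 minutes,
# evaluating the inner expression once per minute (subquery syntax).
# The window (5m), step (1m), and threshold (0.9) are assumed values.
avg_over_time(
  (
    (node_memory_MemTotal - node_memory_MemFree - node_memory_Buffers - node_memory_Cached)
    / node_memory_MemTotal
  )[5m:1m]
) > 0.9
```

With a 1m step this averages roughly five evaluations rather than two, so a single short spike has less influence on the result. Choosing a step close to the scrape interval would use more of the collected samples.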

virullius