
We operate a SaaS business, and we have hundreds of processes that can roam from server to server. They are .NET processes that can be started on any one of a bank of machines, run for a period of time (typically weeks), and then be migrated to another machine.

These processes have many different time-series outputs (which are broadcast using RabbitMQ) and we have our own bespoke system for monitoring the application processes.

We have a variety of monitoring tools (for example LogicMonitor), but we're starting to use Zabbix for server monitoring.

It makes sense to me to put all time-series data from all sources (switches, servers, hosts, VMs, applications) into one place, because then we can compare it against the server-wide data (for example CPU load, memory load).

I'm considering using Zabbix for this.

I can see that Zabbix supports sending time-series data using the Zabbix sender (https://www.zabbix.com/documentation/3.0/manual/concepts/sender), so I know I can get data into it.

I'm struggling to understand how to set up Zabbix for this, given that Zabbix is server-centric, with keys for each time series. I expect this is a common scenario, but I'm new to Zabbix.

I imagine a hierarchy along the following lines:

DataCenter (1 of n)
  -> Rack (1 of n)
       keys (e.g. power used)
     -> Physical Machine (1 of n) "The hosts"
          keys (e.g. CPU, Memory, Network Bandwidth)
        -> VM (1 of n)
             keys (e.g. CPU, Memory, Network Bandwidth)
           -> Application
                keys (e.g. CPU, Memory, Network Bandwidth, Jobs per second etc.)

Is this something Zabbix supports? I thought about perhaps using a naming convention for the host or keys but it feels like I'm doing something wrong.

1 Answer


As you mentioned, Zabbix is designed around hosts/servers and keys, so as a first step to model your hierarchy you could create a host for every VM and then use host groups as needed for datacenters or racks.

Zabbix has no built-in support for clusters or roaming applications. To monitor those I usually create "meta-hosts", which are basically empty host entries without any agent. Then I use a monitoring script to send values to Zabbix trapper items on that host.

For example, take three VMs (app1, app2, app3) with normal system monitoring (CPU, memory), plus one "meta-host", service1, that has my application template. The roaming application then sends its monitoring data with `zabbix_sender -z zabbixserver -s service1 -k service.some.stat -o 42` (or the equivalent library call for the programming language).

As a result, I have system stats for all VMs and continuous application stats, instead of intermittent application stats spread across three VMs.
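The push described above could be wrapped in a small helper along these lines. This is only a sketch in Python rather than .NET, and it assumes the `zabbix_sender` binary is on the PATH; the function names `build_sender_command` and `send_stat` are made up for illustration. The server, host and key values are the ones from the example command.

```python
import subprocess
from typing import List, Union


def build_sender_command(server: str, host: str, key: str,
                         value: Union[int, float, str]) -> List[str]:
    """Build the zabbix_sender argument list for a single value.

    Only the flags from the command above are used:
    -z (server), -s (host name as configured in Zabbix), -k (key), -o (value).
    """
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", str(value)]


def send_stat(server: str, host: str, key: str,
              value: Union[int, float, str]) -> bool:
    """Run zabbix_sender and report whether it exited with status 0.

    Assumption: zabbix_sender is installed on the machine running the
    application. Returns False if the binary is missing or times out.
    """
    cmd = build_sender_command(server, host, key, value)
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Show the command that would be run for the meta-host "service1".
    print(" ".join(build_sender_command(
        "zabbixserver", "service1", "service.some.stat", 42)))
```

Because the host name passed with `-s` is always the meta-host, the data lands in the same place no matter which VM the application is currently running on.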

mschuett
  • I have been struggling with the exact same issue for containers, see http://serverfault.com/questions/753659/how-do-i-configure-zabbix-to-add-containers-dynamically-and-monitor-them-across. When you configure a "meta-host" as an "empty host entry without any agent", how do you do it? I cannot add a host without having some interface (Agent, SNMP, IPMI, JMX). – deitch Feb 04 '16 at 11:49
  • I usually configure an Agent or SNMP entry with 127.0.0.1. As long as there is no real data item there should not be any query. – mschuett Feb 04 '16 at 12:43
  • OK, so it really is a dummy. And then you don't have the Zabbix server querying for anything at all? Instead, the agents (or the app itself) are configured to run some monitoring script or binary that pushes data to Zabbix server? So every host in "Docker servers" group has an item "run_container_tests.sh" (or whatever), which itself returns no data, but runs zabbix_sender for all necessary data? – deitch Feb 04 '16 at 12:58
  • Yes, that is the general idea. – mschuett Feb 04 '16 at 13:12
  • Thank you. So my options are one of: deploy a container monitoring script/module to every agent in the group "container hosts", which when run returns no real data, but instead calls `zabbix_sender`; or use some external system that runs independently. – deitch Feb 04 '16 at 13:17
  • Is there any way to add the `service1` automatically? E.g. use some discovery that uses logic to realize that `service1` exists and matters, and thus add it? – deitch Feb 04 '16 at 13:18
  • BTW, just upvoted your answer. Very helpful, thank you. – deitch Feb 04 '16 at 13:20
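
For the dummy interface mentioned in the comments, the host entry could also be created through the Zabbix JSON-RPC API instead of the web UI. The following is a rough sketch that only builds a `host.create` request body and sends nothing. The group ID and auth token are placeholders, the function name `build_meta_host_request` is hypothetical, and the top-level `auth` field matches the 3.0-era API; newer versions may expect the token elsewhere.

```python
import json
from typing import Any, Dict


def build_meta_host_request(host_name: str, group_id: str,
                            auth_token: str, request_id: int = 1) -> Dict[str, Any]:
    """Build a host.create JSON-RPC request for an agent-less "meta-host".

    The single interface is a dummy agent interface on 127.0.0.1; as long
    as the host has no polled items, the server should not query it.
    """
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host_name,
            "interfaces": [{
                "type": 1,        # 1 = Zabbix agent interface
                "main": 1,        # default interface of this type
                "useip": 1,       # connect by IP rather than DNS name
                "ip": "127.0.0.1",
                "dns": "",
                "port": "10050",
            }],
            "groups": [{"groupid": group_id}],  # placeholder group ID
        },
        "auth": auth_token,       # placeholder token from user.login
        "id": request_id,
    }


if __name__ == "__main__":
    # "50" and "<token>" are placeholders, not real values.
    print(json.dumps(build_meta_host_request("service1", "50", "<token>"), indent=2))
```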