11

Despite researching this topic quite a bit online (to be fair I'm not a full time sysadmin) I'm unable to figure this out.

We have a bunch of VMWare ESXi 5.5 servers, some of which are integrated into vSphere, some of which are not (for cost reasons).

All of them run the standard ESXi image, with the exception of one machine which is actually running the DELL VMWare ESXi image.

What I would like to accomplish seems simple: Configure the system so that it can be queried via SNMP from a remote host, whether it's snmpwalk, Nagios, PRTG etc. I'd like to see information from temperature sensors, installed disks and their status, fan speed, PSU status etc.

I was under the impression that installing the VMWare version from DELL would automagically enable the necessary modules (OpenManage most importantly), but it seems like that is not the case.

I have conflicting information whether this is even possible at all, some documents say that you cannot query a DELL VMWare ESXi server via SNMP and you need to use a CIM client. Then there is the OMSA VIBs one can install, etc.

I imagine this being a fairly common requirement, yet the docs available pull one in all different directions.

Is what I am trying to do even possible without a complete vSphere environment?

ewwhite
Lucky Luke
  • With OMSA, our Dell servers report advanced status to our internal tool, which uses SNMP. – yagmoth555 Mar 28 '16 at 10:25
  • And they are running VMWare ESXi? Can you tell me what exactly you installed? – Lucky Luke Mar 28 '16 at 12:38
  • If you used the Dell ESXi ISO it should already be there. Try HTTPS to your server on port 1311 to see the status, and activate SNMP. – yagmoth555 Mar 28 '16 at 13:14
  • It's not responding on port 1311 unfortunately (and we wouldn't have changed the port). SNMP is enabled on the host already, but it's only returning data from VMWare, not DELL. – Lucky Luke Mar 28 '16 at 13:20
  • You need to make OMSA work. There were two files to install, but I forgot the names. The web page on port 1311 will work once the add-on is installed correctly. – yagmoth555 Mar 29 '16 at 09:35
  • I guess this is where I am confused, I thought the DELL VMWare ESXi image already includes OMSA, after all why else would I pick the DELL VMWare image? Or does it just include more drivers? – Lucky Luke Mar 29 '16 at 16:13
  • It should, unless you have a low-end server that does not support it. Dell usually recommends the 3xx series and up for enterprise virtualization. Which model do you have? – yagmoth555 Mar 30 '16 at 13:45
  • Well, funny you say that. The model I have to test with is not exactly the newest piece of hardware, it's a PowerEdge 2970 and at least 3-4 years old. – Lucky Luke Mar 31 '16 at 02:44
  • I went to the Dell download page for their customized ESXi 6.0 (http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=HJFY8) and sure enough, my newer but "entry level" server was not listed under "Compatible systems". – Steve Bonds Apr 12 '16 at 16:38

4 Answers

5

Yes, you can monitor the standalone ESXi Host using any SNMP monitoring software but some items may only be visible using a monitoring tool that supports the CIM protocol.

All of my ESXi Hosts are part of vCenter, but we monitor them directly (using the VMkernel Host IP address) with SolarWinds NPM. There are 5 or 6 CIM modules built into ESXi 5.5 that give you hardware health, but RAID card health is not one of them. You will need to add the Dell OMSA VIB, which adds the additional CIM agents, including the one for the RAID array. Brian Atkinson's post is still the best description of the process I have found:

https://communities.vmware.com/people/vmroyale/blog/2012/07/26/how-to-use-dell-dset-with-esxi

If you are going to use a third-party monitoring tool that keeps historical information and does alerting, you only need to follow the instructions for installing the OMSA ESXi VIB. If you wish to use the Dell OMSA Server, you can install it on a separate bare-bones server, in a VM elsewhere, or in a VM on the Host itself.

You can use the OMSA Server to connect to DRAC and iDRAC out-of-band (OOB/IPMI/iLO) management cards, or to the ESXi Host after you install the OMSA VIB on the ESXi Host. You will not see the RAID health information in the DRAC or iDRAC though, only when connecting the OMSA Server to an ESXi Host. I keep repeating the word Server so there is no confusion between the OMSA Server, which acts as a client, and the OMSA VIB that is installed on the ESXi Host.

Some useful resources:

Show the current CIM providers on an ESXi Host https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053715

Show the currently installed VIBs on the ESXi Host from the Host's CLI: esxcli software vib list
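If you capture that listing from several hosts, a small script can tell you which ones are missing the Dell VIB. The following is only a sketch: the sample text, the column layout, the VIB name "OpenManage" and the version strings are assumptions for illustration, so compare them with what your own hosts actually print.

```python
# Hypothetical sample of `esxcli software vib list` output. The column layout,
# the VIB name "OpenManage" and all version strings are illustrative assumptions.
SAMPLE_OUTPUT = """\
Name        Version             Vendor  Acceptance Level  Install Date
----------  ------------------  ------  ----------------  ------------
OpenManage  7.4-0000            Dell    PartnerSupported  2016-04-01
esx-base    5.5.0-3.78.3248547  VMware  VMwareCertified   2016-03-20
net-igb     5.0.5.1.1-1vmw.550  VMware  VMwareCertified   2016-03-20
"""


def parse_vib_list(text):
    """Return (name, version, vendor) tuples from the table text."""
    vibs = []
    for line in text.splitlines():
        # Skip blank lines, the header row and the dashed separator row.
        if not line.strip() or line.startswith("Name") or line.startswith("-"):
            continue
        fields = line.split()
        # Name, version and vendor contain no spaces, so the first three
        # whitespace-separated fields are enough.
        if len(fields) >= 3:
            vibs.append((fields[0], fields[1], fields[2]))
    return vibs


def vibs_from_vendor(vibs, vendor):
    """Filter the parsed list down to one vendor (case-insensitive)."""
    return [vib for vib in vibs if vib[2].lower() == vendor.lower()]


all_vibs = parse_vib_list(SAMPLE_OUTPUT)
dell_vibs = vibs_from_vendor(all_vibs, "Dell")
print(dell_vibs if dell_vibs else "No Dell VIBs found; OMSA is probably not installed")
```

An empty result for the vendor "Dell" would suggest the OMSA VIB still needs to be installed on that host.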

You do see some minor additional hardware health details when you connect to a vCenter server versus the ESXi Host directly but generally if you do not see the hardware health you are looking for in the Configuration/ Health Status panel then you are missing a CIM provider and you need to locate and install the VIB on the ESXi Host. When you add the Dell OMSA VIB to the ESXi Host you will see a Storage sensor added to the Health Status page which shows the RAID volumes, drives, controller and battery health for your storage controller. You may need to reset the sensors for it to show up and sometimes it takes 15 to 20 minutes the first time after the VIB install and reboot of the ESXi Host.

If you do not see a sensor on the ESXi Host's Health Status page when you connect with the vSphere Client then you are most likely not going to see it when you are remotely polling the sensors with monitoring software.

Also you should note that not all servers have the same sensors and you may not be able to get the same health status from all depending on the Server hardware, RAID card and version of the CIM available for the combination. You may also need to upgrade or change the VIBs for the RAID card in order for the health status to work. The CIM provider (which is the OMSA VIB in this case) talks to the hardware through the device VIB (the real device driver) and passes this information to the CIM Broker on the ESXi Host - also known as the Small Footprint CIM Broker Daemon (sfcbd). When you poll the ESXi Host for hardware health using robust monitoring software it will get some information using SNMP queries, some using CIM and some using the ESXi API (which are SOAP requests). The CIM client talks to the sfcbd process on the ESXi Host.

Sometimes the CIM process just stops working. When that happens, restart the sfcbd-watchdog process on the ESXi Host. This restarts the sfcbd service and CIM polling will work again. From the CLI of the Host: /etc/init.d/sfcbd-watchdog restart

I think that covers most of the items you need to get you running.

mhughesnp
  • After installing the DELL VIB I'm seeing certain HW info now in the vSphere client when connecting remotely, excellent. Unfortunately it's not providing the data via SNMP for some reason, I suspect that's not possible and that I will have to revert to CIM. – Lucky Luke Apr 04 '16 at 21:47
4

I understand what you're looking for: specific notes on how to manage and monitor the health of a standalone VMware ESXi host.

In practice, the approach should be slightly different. I'll explain how I manage hosts.

In a situation where you have multiple ESXi hosts under vCenter management, the assumption is that you leverage the vCenter for monitoring and health status, versus querying the individual hosts. There's a catch-all alarm that I configure in vCenter to alert on "Host Hardware Health". I typically don't care if it's a power supply, RAM, disk or any other component, but rather that the host is unhealthy.

Monitoring a standalone ESXi host isn't going to be very helpful, as the point of the Dell/HP drivers is to expose information to vCenter. And I don't believe it's the best practice to query individual hosts in this manner. Granted, that's because you ideally want your VM hosts centrally managed.

If you run vCenter with a single host, you DO get this ability, so maybe that's an option for your environment.

If you need some sort of out-of-band monitoring, couldn't you query the DRAC instead?

ewwhite
  • Thanks. We do have vCenter, but not for all hosts. VMWare's licensing is rather odd and very expensive (I was told) once you exceed 3 hosts. So I am trying to find a different way for those hosts which are not covered by vCenter. I'll have to look into the DRAC route, will that tell me things like RAID failure etc? – Lucky Luke Mar 28 '16 at 14:43
  • OMSA will give you disk status and RAID alerts, unlike the DRAC. – yagmoth555 Mar 29 '16 at 09:37
0

You can use the excellent check_vmware_api plugin (https://exchange.nagios.org/directory/Plugins/Operating-Systems/*-Virtual-Environments/VMWare/check_vmware_api/details), with or without Nagios. It leverages the VMware API to get all the info you require for hardware monitoring:

check_vmware_api -H esxhost -u esx_user_read_only_role -p passwd -l runtime -s health [enter]
OK - All 450 health checks are Green 

You need the VMware Perl SDK, but other than that it's pretty straightforward. It works for all types of hardware (as long as the sensors are seen by the VMware API, they are checked).
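If you run the plugin without Nagios, you have to interpret its result yourself. Below is a rough sketch of how that could look, assuming the status line has the "STATE - message" shape shown above and that the plugin follows the usual Nagios convention of exit codes 0 to 3. The helper name parse_status_line is made up for this example. When you can read the process exit code directly, prefer that over parsing text.

```python
# Conventional Nagios plugin states and their exit codes.
NAGIOS_EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def parse_status_line(line):
    """Split 'STATE - message' into (state, message).

    Anything that does not start with a known state is reported as UNKNOWN,
    with the raw text kept as the message.
    """
    text = line.strip()
    state, separator, message = text.partition(" - ")
    if not separator or state not in NAGIOS_EXIT_CODES:
        return "UNKNOWN", text
    return state, message


# Status line copied from the plugin run shown above.
state, message = parse_status_line("OK - All 450 health checks are Green ")
print(state, NAGIOS_EXIT_CODES[state], message)
```

A cron job or a home-grown script could use the returned state to decide whether to send an alert.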

natxo asenjo
-1

Try zabbix (http://zabbix.com):

1) It's perfect: well-known, world-class monitoring software.

2) You can easily get started with the Zabbix appliance, which is also available as a pre-configured virtual image (based on openSUSE).

3) It can monitor ESX(i) hosts and machines using VMware web services (like the web client). You can use low-level discovery rules to automatically discover VMware hypervisors and virtual machines and create hosts to monitor them, based on pre-defined host prototypes.

4) You'll be able to monitor all the hardware of your Dell servers using SNMP via iDRAC, including the RAID controller and its volume status, physical disks, memory modules, PSUs and so on.

All kinds of hardware status information available in iDRAC can be accessed via SNMP (at least on servers with iDRAC 7/8; I've implemented hardware monitoring for 50+ Dell 12th/13th-generation servers at my company this way).

With Zabbix's LLD (low-level discovery) feature you can easily collect all hardware components for monitoring without manual enumeration, and automatically create items (statuses, temperatures, fan speeds, disk sizes and serials and so on), triggers (expressions to process monitoring data) and various actions.
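To give an idea of what the raw SNMP data from an iDRAC looks like outside Zabbix, here is a small sketch that decodes a single status reply. It rests on assumptions you should check against the Dell MIB shipped for your iDRAC: that 1.3.6.1.4.1.674.10892.5.2.1.0 is the global system status OID, and that Dell's status enumeration runs from 1 (other) to 6 (nonRecoverable). The sample reply line is illustrative, in the format net-snmp prints when no MIB is loaded.

```python
import re

# Assumed OID for the iDRAC global system status (verify against your MIB).
GLOBAL_SYSTEM_STATUS_OID = "1.3.6.1.4.1.674.10892.5.2.1.0"

# Assumed Dell status enumeration (verify against your MIB).
DELL_STATUS = {
    1: "other",
    2: "unknown",
    3: "ok",
    4: "nonCritical",
    5: "critical",
    6: "nonRecoverable",
}


def decode_status_reply(line):
    """Return the status name from an snmpget-style reply, or None.

    Accepts both 'INTEGER: 3' (no MIB loaded) and 'INTEGER: ok(3)'.
    """
    match = re.search(r"=\s*INTEGER:\s*(?:\w+\()?(\d+)", line)
    if match is None:
        return None
    return DELL_STATUS.get(int(match.group(1)), "unknown")


# Illustrative reply for the OID above, not captured from a real device.
sample_reply = "SNMPv2-SMI::enterprises.674.10892.5.2.1.0 = INTEGER: 3"
print(GLOBAL_SYSTEM_STATUS_OID, "->", decode_status_reply(sample_reply))
```

In practice Zabbix templates do this mapping for you; the sketch is only meant to show that the value is a plain integer you can interpret with any SNMP-capable tool.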

Sergey