OSSEC Windows Agent Fails to Sync Configuration

Question

This has proved an annoyance for the past several days, and I have yet to figure out the root cause.

In a lab, I've set up two virtual machines: an OSSEC Server Appliance and a Windows 7 x64 Enterprise SP1 client.

Both seem to work quite well when they do their own things. If I have an extensive configuration file on the Windows client, the agent reads it, and does what is required.

The issue comes about when I attempt to centralize the configuration to the "manager" or OSSEC Server Appliance.

[root@ossec etc]# md5sum /var/ossec/etc/shared/agent.conf
9cc4c937f4eae011ecbccf4468973133  /var/ossec/etc/shared/agent.conf
[root@ossec etc]# /var/ossec/bin/agent_control -i 004

OSSEC HIDS agent_control. Agent information:
   Agent ID:   004
   Agent Name: ABC
   IP address: 192.168.0.93
   Status:     Active

   Operating system:    Microsoft Windows 7 Enterprise Edition Professional ..
   Client version:      OSSEC HIDS v2.9.0 / cd66e10fca4cc1dc4c459a1f05f9b2d1
   Last keep alive:     Sat Oct  7 22:52:09 2017

   Syscheck last started  at: Sat Oct  7 21:35:12 2017
   Rootcheck last started at: Sat Oct  7 22:27:19 2017
[root@ossec etc]#

To no surprise, the configurations are not at the same version.
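The comparison above can also be scripted. Here is a minimal Python sketch, assuming (as this question does) that the hex string after the slash on the "Client version:" line is the checksum the agent reports for its shared configuration. The helper names are made up for illustration.

```python
import hashlib
import re


def file_md5(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reported_md5(agent_control_output):
    """Extract the 32-character hex string after the slash on the 'Client version:' line."""
    match = re.search(r"Client version:[^\n]*/\s*([0-9a-f]{32})", agent_control_output)
    return match.group(1) if match else None


def configs_in_sync(manager_file, agent_control_output):
    """True when the manager's file hash equals the hash the agent reported."""
    return file_md5(manager_file) == reported_md5(agent_control_output)
```

On the manager, configs_in_sync could be fed the path /var/ossec/etc/shared/agent.conf and the captured text of agent_control -i 004.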

Restarting both the appliance and the Windows agent (and waiting a few minutes) should have been an easy fix, but it made no difference.

From reading the documentation, my understanding is that the agent will attempt to merge the centralized configuration:

<agent_config name="ABC">
    <localfile>
        <location>/var/log/my.log2</location>
        <log_format>syslog2</log_format>
    </localfile>
</agent_config>


<agent_config os="Linux">
    <localfile>
        <location>/var/log/my.log2</location>
        <log_format>syslog</log_format>
    </localfile>
</agent_config>


<agent_config os="Windows">
 <!-- This is a test config -->

  <!-- One entry for each file/Event log to monitor. -->
  <localfile>
    <location>Application</location>
    <log_format>eventlog</log_format>
  </localfile>

  <!-- Additional contents are in here. -->

  <active-response>
    <disabled>no</disabled>
  </active-response>

</agent_config>

with the one it has locally. Here is the agent's configuration (ossec.conf):

<ossec_config>
  <active-response>
    <disabled>no</disabled>
  </active-response>
  <client>
        <server-ip>192.168.0.21</server-ip>
        <notify_time>120</notify_time>
        <time-reconnect>240</time-reconnect>
  </client>
</ossec_config>

and the agent.conf file in the shared folder on the agent:

<agent_config>
    <localfile>
        <location>/var/log/my.log</location>
        <log_format>syslog</log_format>
    </localfile>
</agent_config>
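To reason about which blocks of the manager's agent.conf should end up applying to this agent, here is a rough Python sketch. It assumes a block applies when it has no name or os attribute, when its name equals the agent name, or when its os value appears in the agent's operating system string. The real agent's matching rules may differ, and the function name is invented for illustration.

```python
import xml.etree.ElementTree as ET


def applicable_blocks(agent_conf_text, agent_name, agent_os):
    """Return the <agent_config> elements that would apply under the stated assumptions."""
    # agent.conf holds several top-level <agent_config> elements, so wrap
    # them in a dummy root to make the text parseable as one XML document.
    root = ET.fromstring("<root>" + agent_conf_text + "</root>")
    selected = []
    for block in root.findall("agent_config"):
        name = block.get("name")
        os_filter = block.get("os")
        # Skip blocks that are addressed to a different agent name.
        if name is not None and name != agent_name:
            continue
        # Skip blocks whose os filter does not appear in the agent's OS string.
        if os_filter is not None and os_filter.lower() not in agent_os.lower():
            continue
        selected.append(block)
    return selected
```

Under these assumptions, agent ABC on Windows 7 would pick up the name="ABC" and os="Windows" blocks from the manager's file, and skip the os="Linux" one.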

I can see from the log that the merging is not taking place; the agent is running its local copy:

2017/10/08 00:06:52 ossec-agentd: INFO: Trying to connect to server 192.168.0.21, port 1514.
2017/10/08 00:06:52 INFO: Connected to 192.168.0.21 at address 192.168.0.21:1514, port 1514
2017/10/08 00:06:52 ossec-agent: Starting syscheckd thread.
2017/10/08 00:06:52 ossec-syscheckd(1702): INFO: No directory provided for syscheck to monitor.
2017/10/08 00:06:52 ossec-syscheckd: WARN: Syscheck disabled.
2017/10/08 00:06:52 ossec-rootcheck: INFO: Started (pid: 2512).
2017/10/08 00:06:52 ossec-syscheckd: INFO: Started (pid: 2512).
2017/10/08 00:06:53 ossec-agentd(4102): INFO: Connected to server 192.168.0.21, port 1514.
2017/10/08 00:06:53 ossec-agent: INFO: System is Vista or newer (Microsoft Windows 7 Enterprise Edition Professional Service Pack 1 (Build 7601) - OSSEC HIDS v2.9.0).
2017/10/08 00:06:53 ossec-logcollector(1103): ERROR: Could not open file '/var/log/my.log' due to [(9)-(Bad file descriptor)].
2017/10/08 00:06:53 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/my.log'.

In the end it doesn't seem to be a case of the agent/manager being unable to:

Connect to each other.
Parse the configuration files.
Send data back and forth (triggered rules).
Verify which version of the configuration file it's using.
Merge configurations (I periodically see a 0 KB merged.mg file on the agent).

Did I fail to set an option on the appliance/manager, or is the problem elsewhere?

dark_st3alth · Accepted Answer · 2017-10-23T22:37:28.707

So after having no success on security.stackexchange.com, the question was migrated here. After spending a few extra days on this, I've found the "solution".

You can boil it down to: find another HIDS solution.

I came to this conclusion after trying an extensive list of things:

Ran the OVA as is, directly off the project's website (2.8.3).
Updated/upgraded the OVA provided on the OSSEC project website.
Installed the OSSEC server/manager on a fresh install of CentOS 7.
Installed the server with the "Server GUI" and "Minimal" installs of CentOS 7.
Tried updating the Windows 7 client VM.
Used other fresh Windows-based VMs.
Changed ports, firewall rules, and static IP addresses.
Disabled the firewalls on both the server and client.
Increased the UDP buffer within the Windows client via the registry.
Disabled SELinux (Permissive Mode active).
Verified there were agents listed on the server and restarted to detect changes.
Installed the server from the RPM sources.
Compiled and installed from the source code.
Tried Windows agent versions 2.9.0 and 2.9.2.

To get a reasonable install going that at least somewhat worked, I followed these steps:

Boot the server from the CentOS 7 install media.
Choose a Minimal Install.
Connect to your network; a static IP is best.
After the install, log in as root.
Open the firewall: firewall-cmd --permanent --zone=public --add-port=1514/udp
Commit the changes: firewall-cmd --reload
Install some extras: yum install mysql-devel postgresql-devel gcc wget vim
Grab the source code: wget https://github.com/ossec/ossec-hids/archive/2.9.2.tar.gz
Extract the archive: tar -zxvf 2.9.2.tar.gz
Go into the new directory: cd ossec-hids-2.9.2
Run the installer: ./install.sh
Choose the server type for the install.
Configure the install; I accepted the defaults for every option except email, which I set to no.
Set up the agents: /var/ossec/bin/manage_agents
Edit the new centralized config file: vim /var/ossec/etc/shared/agent.conf
Start the server: /var/ossec/bin/ossec-control start
Install the Windows client with the latest version (2.9.2).

What was great, after spending hours and hours, was that all my work was wasted. I found how to set the Windows client to debug level 2, and discovered the message:

2017/10/20 02:13:40 ossec-agentd: Failed md5 for: shared/merged.mg -- deleting.

Turns out that there is no warning thrown at the "normal" log level that a critical merge of configuration failed (seriously!?).
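Since this message only shows up with debugging enabled, it can help to scan the agent log for it explicitly. Here is a minimal Python sketch, assuming log lines in the format shown above. The function name is illustrative and not part of OSSEC.

```python
import re

# Matches lines such as:
# 2017/10/20 02:13:40 ossec-agentd: Failed md5 for: shared/merged.mg -- deleting.
FAILED_MD5 = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<source>[^:]+): Failed md5 for: (?P<file>\S+)"
)


def find_failed_merges(log_lines):
    """Return (timestamp, file) pairs for every failed-md5 line in the log."""
    hits = []
    for line in log_lines:
        match = FAILED_MD5.match(line)
        if match:
            hits.append((match.group("timestamp"), match.group("file")))
    return hits
```

Passing the lines of the agent's ossec.log to find_failed_merges returns an empty list when no shared file was rejected.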

I was further impressed by the fact that the server was unable to retrieve the md5 hash of the client's configuration after restarting the server and client (attempts #2 to #14).

In one run with the OVA (attempt #1), the server was able to grab the client's md5 of the config, but it did not match the server's. You can see this in my original question. I think the md5 from the agent was sent because I added some additional files to the conf directory on the agent (mainly agent.conf).

In pure annoyance, I turned to the internet, and found the Google Group discussion for OSSEC. After reading the full chain of messages, it became quite apparent there is a serious flaw in OSSEC:

As I said before IMHO the issue affects to Windows and UNIX agents but it's more common in Windows because the default buffer is shorter. We had this problem with a Windows agent on a private VirtualBox network: the shared file didn't arrive. With debugging enabled, we saw the message:
ossec-agent: Failed md5 for: merged.mg -- deleting.
So we did this test: we modified the source code to prevent file from being deleted although it was corrupted, and compared the received file with the original one: some chunks of the file were indeed lost, it was not a line ending issue.

Shared file chunks may be lost due to the UDP protocol, as well as any other agent event or control message. In fact using TCP seems to be a good solution for this problem. We implemented TCP communication in Wazuh a year ago from version 1.1 and we reached some advantages:
No event losing. Communication is reliable for events, control messages and Active response requests.
Agents detect that a manager is down immediately, so they are able to "lock" the transmission in order to prevent events from being dropped.
Agents with TCP connection are working properly in many systems using Linux, Windows, OpenBSD, macOS, AIX, etc.

This isn't what I expected to read. What worries me most is the fact that an OSSEC infrastructure could be brought down simply by packet loss. It is even more worrisome that at the normal log level, failing to merge configuration doesn't even show up.
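The failure mode described in the quoted discussion can be illustrated with a toy Python simulation: a file is split into chunks, one chunk is dropped as an unreliable transport might drop it, and the reassembled file no longer matches the expected MD5. The chunk size, sample data, and function names are arbitrary choices for illustration, not OSSEC's actual wire format.

```python
import hashlib


def split_chunks(data, size):
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def receive(chunks, lost_indexes, expected_md5):
    """Reassemble whatever arrived; keep it only if the MD5 matches."""
    received = b"".join(c for i, c in enumerate(chunks) if i not in lost_indexes)
    if hashlib.md5(received).hexdigest() != expected_md5:
        return None  # mirrors "Failed md5 ... deleting": the file is discarded
    return received


shared_file = b"line of shared configuration\n" * 100
expected = hashlib.md5(shared_file).hexdigest()
chunks = split_chunks(shared_file, 512)

complete = receive(chunks, set(), expected)  # every chunk arrived
damaged = receive(chunks, {1}, expected)     # the second chunk was lost
```

In this sketch, complete holds the original bytes while damaged is None, which matches the symptom of the agent never keeping the centralized configuration.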

While I have only tested Windows agents, I have no doubt the Linux agents work. Perhaps in the future OSSEC will move to TCP connections, but for now, OSSEC is lacking a critical piece of functionality.

TL;DR: What it comes down to (at least in my opinion) is poor software testing/quality assurance. I found out from the Google Group discussions that UDP connections cause problems and that there is limited verification of data transmissions. Due to corruption of the manager's configuration in transit, the client refuses to merge it. This only seems to happen on Windows clients.
