How to configure a log aggregator to authenticate data?

Question

Background: Remote log aggregation is regarded as a way to improve security. Generally, this addresses the risk that an attacker who compromises a system can edit or delete logs to frustrate forensic analysis. I've been researching security options in common log tools.

But something feels wrong. I can't see how to configure any of the common remote loggers (e.g. rsyslog, syslog-ng, logstash) to verify that an incoming message truly originates from the purported host. Without some kind of policy constraint, one log-originator could forge messages on behalf of another log-originator.

The author of rsyslog seems to warn about authenticating log data:

One final word of caution: transport-tls protects the connection between the sender and the receiver. It does not necessarily protect against attacks that are present in the message itself. Especially in a relay environment, the message may have been originated from a malicious system, which placed invalid hostnames and/or other content into it. If there is no provisioning against such things, these records may show up in the receivers’ repository. -transport-tls does not protect against this (but it may help, properly used). Keep in mind that syslog-transport-tls provides hop-by-hop security. It does not provide end-to-end security and it does not authenticate the message itself (just the last sender).

So the follow-up question is: what is a good/practical configuration (in any common log tool of your choice -- rsyslog, syslog-ng, logstash, etc) which provides some amount of authenticity?

Or... if nobody authenticates log data, then why not?

--

(Aside: In discussing/comparing, it may help to use some diagrams or terminology from RFC 5424: Section 4.1: Example Deployment Scenarios -- e.g. "originator" vs "relay" vs "collector")

What part are you trying to secure? The log aggregate receiving data from a correct host, or the data itself? — Shane Andrie, Aug 27 '15 at 19:50
Receiving from the correct host. If Alice and Bob are both log-originators, and Trent is the log-collector, Alice should be able to give Trent logs with "hostname=alice" but not "hostname=bob". But I *think* the default setup is designed to assume that Alice could be a log-relay, so they would allow her to submit anything. — Tim Otten, Aug 27 '15 at 19:58

score 3 · Answer 1 · answered Aug 28 '15 at 22:31

The right thing to use for this is TLS with machine client certificates.

rsyslog has supported this since about 2008, and has great instructions: http://www.rsyslog.com/doc/v8-stable/tutorials/tls_cert_summary.html

The process is extremely simple, as these things go:

1. Set up a CA
2. Issue certificates to all your computers that you want logs from
3. Configure rsyslog to use that authentication

Then, your computers can't impersonate each other and nobody can log to your log server without one of your certificates.
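
To make the policy concrete, here is a minimal Python sketch of the check the question asks for: the collector requires a client certificate from your CA, then accepts a message only if the hostname it claims matches a name in that certificate. This is not rsyslog's implementation or configuration; the function names and file-path parameters are hypothetical, and the parsing assumes RFC 5424 framing where HOSTNAME is the third space-separated field.

```python
import ssl


def make_collector_context(ca_file, cert_file, key_file):
    """Server-side TLS context that rejects clients lacking a cert from our CA.

    The three file paths are placeholders for your own CA and collector files.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # client certificate is mandatory
    ctx.load_verify_locations(cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def cert_names(peer_cert):
    """Collect commonName and DNS subjectAltName values.

    peer_cert is the dict returned by ssl.SSLSocket.getpeercert().
    """
    names = set()
    for rdn in peer_cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.add(value.lower())
    for kind, value in peer_cert.get("subjectAltName", ()):
        if kind == "DNS":
            names.add(value.lower())
    return names


def claimed_hostname(line):
    """Return HOSTNAME from '<PRI>VERSION TIMESTAMP HOSTNAME ...', or None."""
    fields = line.split(" ", 3)
    if len(fields) < 3:
        return None
    return fields[2].lower()


def accept(peer_cert, line):
    """Accept a message only if its claimed hostname appears in the sender's cert."""
    host = claimed_hostname(line)
    return host is not None and host in cert_names(peer_cert)
```

With a certificate issued to alice, `accept` passes a line whose HOSTNAME is `alice` and rejects one whose HOSTNAME is `bob`. A check like this only makes sense when originators connect directly to the collector, since a relay legitimately forwards messages carrying other hosts' names.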

I see you found that already, but you're still worried about their caveat. I wouldn't worry too much about that. Log injection is certainly a thing, but it is many things, including injection through the application and injection into the logging process. Authenticated rsyslog won't protect you if someone has a log injection attack in your application software, but nothing will or can; only fixing the application can help that. This will just protect you against spoofed logs.

The other caveats are easily mitigated by not using relays, which there is rarely a good reason to do anyway. If you don't have relays, and you use the x509/name authentication mode of the gtls network stream driver on the rsyslog server, you should have no trouble.

See also the gtls config doc: http://www.rsyslog.com/doc/v8-stable/concepts/ns_gtls.html

score 1 · Accepted Answer · answered Aug 28 '15 at 11:27

This is a great question.

I use logstash to accomplish something like what you're proposing. Ship logs to your central collection system with logstash (or logstash-forwarder), and configure each sender to add a key field to every message, with its value being a long, random string that is unique to that server.

Then on the receiving side, you can add a corresponding rule to discard (or alert on) any messages where a specific host's key doesn't match what you expect for its hostname.

This is not bullet-proof, but it's a solid step in the right direction.
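
The receiving-side rule can be sketched in Python rather than logstash configuration. This is an illustration under assumptions, not a logstash filter: the field names `host` and `key`, the key table, and the placeholder key values are all hypothetical.

```python
import hmac

# Hypothetical table of per-host keys; real values would be long random strings.
EXPECTED_KEYS = {
    "alice": "placeholder-random-string-for-alice",
    "bob": "placeholder-random-string-for-bob",
}


def is_authentic(event, expected_keys=EXPECTED_KEYS):
    """Return True only if the event's key matches the one on file for its host."""
    host = event.get("host")
    key = event.get("key")
    if not isinstance(host, str) or not isinstance(key, str):
        return False
    expected = expected_keys.get(host)
    if expected is None:
        return False  # unknown host: discard or alert
    # Constant-time comparison avoids leaking how much of the key matched.
    return hmac.compare_digest(key.encode(), expected.encode())
```

Because the key travels inside every message, it should only be sent over an encrypted transport, and anyone who can read one host's messages can replay its key. That is one reason this approach is a step in the right direction and not a complete defence.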
