Logging/capturing STDERR/STDOUT on Amazon EC2

Question

I'm looking for a solution that would allow me to automatically capture the STDOUT/STDERR of a process running on Amazon EC2, and send it (remotely) to another server.

Sounds simple, except:

I will be using spot instances, which means I don't control exactly when they start, and they can terminate at any minute (without a proper shutdown).
Because there's no shutdown, I can't write to a local file and transmit it (e.g. to s3) when the process is done.
The output is not well structured (e.g. no tabulated fields in a log file), so "Standard" cloud logging solutions aren't trivial, and using one of the cloud databases is not ideal.

A couple of ideas I considered, but each has a problem:

Appending to a file on "s3" is not possible, and rewriting files is too slow for logging.
Sharing EBS volumes (as drives) is not possible to the best of my knowledge.
Using "simple_db" sounds too slow (and "simple_db" has been in Beta for years, so I'm not sure it's usable).
Using SQS (e.g. one message per line of output?) is very slow.
Redirecting to a network socket will fail if the connection drops for a second (e.g. "myprogram 2>&1 | nc my.log.server 7070").
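To illustrate the last point: the plain "nc" pipeline dies when the connection drops, but a small forwarder can hold unsent lines in memory and reconnect. The following is only a minimal sketch under assumptions, not a tested production tool. The class name, the buffer size and the 5-second timeout are all made up for illustration. Lines that are still in memory are lost if the spot instance is terminated, and a line may be sent twice if the connection breaks mid-send.

```python
import collections
import socket
import sys


class ReconnectingForwarder:
    """Forward lines to a TCP log server, keeping unsent lines in memory."""

    def __init__(self, host, port, connect=socket.create_connection,
                 max_buffered=10000):
        self.address = (host, port)
        self.connect = connect  # injectable so the logic can be exercised offline
        # Bounded buffer (size is an assumption): oldest lines are dropped first.
        self.pending = collections.deque(maxlen=max_buffered)
        self.sock = None

    def write(self, line):
        """Queue one line (bytes) and try to send everything queued so far."""
        self.pending.append(line)
        return self.flush()

    def flush(self):
        """Try to send all pending lines; return True if the buffer is empty."""
        while self.pending:
            try:
                if self.sock is None:
                    self.sock = self.connect(self.address, timeout=5)
                self.sock.sendall(self.pending[0])
            except OSError:
                # Drop the broken socket; the next call will reconnect.
                if self.sock is not None:
                    self.sock.close()
                self.sock = None
                return False
            # Only discard a line after sendall() returned without error.
            self.pending.popleft()
        return True


def forward_stdin(host, port):
    """Read lines from stdin and forward them until EOF."""
    forwarder = ReconnectingForwarder(host, port)
    for line in sys.stdin.buffer:
        forwarder.write(line)
    forwarder.flush()  # one last attempt for anything still queued
```

A script that calls forward_stdin("my.log.server", 7070) could replace "nc" at the end of the pipeline. Note that there is no backoff here: while the server is unreachable, every new line triggers another connection attempt, which can stall the pipeline for up to the timeout.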

Perhaps there's a "syslog" solution with remote logging? But would that require a separate "on demand" instance to collect the information?
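For reference, the sending side of the syslog idea might look roughly like this with Python's standard library. It is a sketch under assumptions: the function names are hypothetical, it uses UDP (the default for SysLogHandler, so lines can be dropped silently), and it still needs some syslog server listening at the given host and port.

```python
import logging
import logging.handlers


def make_remote_logger(host, port=514, name="myprogram"):
    """Return a logger that sends each record to a remote syslog server."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP unless told otherwise; delivery is not guaranteed.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def ship_lines(stream, logger):
    """Log every non-empty line of a text stream; return how many were sent."""
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if line:
            logger.info(line)
            count += 1
    return count
```

With something like ship_lines(sys.stdin, make_remote_logger("my.log.server")) at the end of the pipeline, each line of output becomes one syslog message.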

Any tips and ideas would be appreciated.

Thanks, -g

score 1 · Accepted Answer · edited Jun 11 '20 at 10:02

I was hoping there's some "append only" or "mostly append" service by amazon that is designed for logging.

Like Amazon Kinesis, maybe?

"With Amazon Kinesis you can have producers push data directly into an Amazon Kinesis stream. For example, system and application logs can be submitted to Amazon Kinesis and be available for processing in seconds. This prevents the log data from being lost if the front end or application server fails. Amazon Kinesis provides accelerated data feed intake because you are not batching up the data on the servers before you submit them for intake."

(Source: http://aws.amazon.com/kinesis)

I haven't tried this yet, because I have a homebrew supervisory process that uses S3 and SQS. At the beginning of a stream, it creates unique names for the temporary files (on the instance) that will capture the logs, and sends a message via SQS that results in the information about the process and its log file locations being stored in a database. When the process stops (these are scheduled or event-driven jobs, rather than continuously running ones), another SQS message is sent, which contains redundant information about where the temporary files were and gives me the exit status of the process. Then both logs (out and error) are compressed and uploaded to S3, with each of those uploads generating an additional SQS message reporting on the S3 upload status.

The SQS messages, as you might observe, are largely redundant, but this is designed to virtually eliminate the chance that I wouldn't know something about the existence of the process, since all 4 messages (start, stop, stdout-upload-info, stderr-upload-info) contain enough information to identify the host, the process, the arguments, and where the log files will go or have gone or should have gone, in S3. Of course, all of this redundancy has been almost totally unnecessary, since the process and SQS/S3 are very stable, but the redundancy exists if it's needed.
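To make the redundancy concrete, here is a rough sketch of how the four message bodies could be built so that each one carries the full identity of the job. This is not my actual code: the field names, the S3 key layout and the function names are invented for illustration, and actually sending the body to SQS is left out.

```python
import json
import socket
import uuid

# The four message types described above.
EVENTS = ("start", "stop", "stdout-upload-info", "stderr-upload-info")


def make_job_identity(argv, bucket, host=None, job_id=None):
    """Describe the job once; every message repeats this whole block."""
    job_id = job_id or uuid.uuid4().hex
    host = host or socket.gethostname()
    prefix = f"logs/{host}/{job_id}"  # hypothetical key layout
    return {
        "job_id": job_id,
        "host": host,
        "argv": list(argv),
        "stdout_s3": f"s3://{bucket}/{prefix}/stdout.gz",
        "stderr_s3": f"s3://{bucket}/{prefix}/stderr.gz",
    }


def make_message(event, identity, **details):
    """Build the JSON body for one event, including the full job identity."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    body = {**identity, "event": event, **details}
    return json.dumps(body, sort_keys=True)
```

For example, make_message("stop", identity, exit_status=0) yields a body that still names the host, the arguments and both S3 locations, so any single message that arrives is enough to find the job.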

I don't need real-time logging for these jobs, but if I did, another option would be to modify the log collector so that instead of saving up the logs and then sending them en bloc to S3, I could, for every "x" bytes of log collected or every "y" seconds of runtime -- whichever occurred first -- "flush" the accumulated data into an SQS message... there would be no need to send an SQS message for every line.
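The "x bytes or y seconds, whichever comes first" rule could be sketched like this. It is an illustration under assumptions, not something I run: the class name and the default thresholds are arbitrary, and "send" stands for whatever function posts one batch as an SQS message. Check the current SQS message size limit before picking a value for max_bytes.

```python
import time


class BatchingCollector:
    """Accumulate log data; flush after max_bytes or max_seconds."""

    def __init__(self, send, max_bytes=64 * 1024, max_seconds=10.0,
                 clock=time.monotonic):
        self.send = send  # callable that ships one batch (e.g. as one message)
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.clock = clock  # injectable so the timing can be tested
        self.chunks = []
        self.size = 0
        self.started = None  # time the oldest unsent chunk arrived

    def add(self, data):
        """Buffer one chunk of output (bytes) and flush if a limit is hit."""
        if self.started is None:
            self.started = self.clock()
        self.chunks.append(data)
        self.size += len(data)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if either limit is reached; call this from a timer as well."""
        if self.started is None:
            return
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_seconds
        if too_big or too_old:
            self.flush()

    def flush(self):
        """Send everything buffered as one batch."""
        if not self.chunks:
            return
        # If send() raises, the buffer is left untouched so nothing is lost.
        self.send(b"".join(self.chunks))
        self.chunks, self.size, self.started = [], 0, None
```

The age check only runs when add() or maybe_flush() is called, so a quiet process needs a periodic maybe_flush() call for the "y seconds" limit to take effect.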

Thanks! I was thinking of developing something very similar to your SQS+S3 solution - is it something you're willing to share? — user205122, Jan 12 '14 at 15:20

EEAA · Answer 2 · 2014-01-11T01:07:58.093

First off, there's nothing overly special about the fact that you're running on EC2. With any centralized logging infrastructure, you want to minimize the chances of log loss and as such, need to get logs shipped ASAP.

Second, don't expect magic here. You need to persist your log messages somewhere, so you're likely going to need to run a long-running instance (either inside EC2 or elsewhere) to collect and store your messages.

Here's what I'd recommend:

Run your application using supervisord. This will not only give you some rudimentary process monitoring/restart capabilities, but more importantly, supervisord will handle collection of your output streams and writing to logfiles.
On each application server, use logstash forwarder to read the log files that supervisord writes and ship them to...
A logstash/elasticsearch server, in which logstash receives logs from your nodes, organizes them (if needed), and submits them to elasticsearch for long-term storage and search.

A few extra comments:

Logstash forwarder is able to encrypt its communications with logstash, so you can ship your logs across public networks if needed, without worrying about information leakage.
Elasticsearch is quite simple to implement, and does an amazing job of indexing your messages.
Elasticsearch provides a REST interface which you can use to issue queries, but if you want a web GUI, Kibana3 is an excellent option.
If you need to monitor your logs and alert/notify on certain patterns, logstash can be configured to do so.

Regarding **somewhere**: I realize they have to go somewhere, but I was hoping that there is an existing solution that would save me the trouble of managing that "Somewhere". Example: with Amazon's RDS or SQS or S3 - I just "post" the data there (and pay for it, naturally) - but I don't need to worry about extra administration. I was hoping there's some "append only" or "mostly append" service by amazon that is designed for logging. I could use RDS, but it looks like overkill (and expensive as well, for this purpose). — user205122, Jan 11 '14 at 01:18
Nope, AWS does not (yet) have a good solution for log collection and archiving. — EEAA, Jan 11 '14 at 01:20
