47

Listening to the Secure code lessons from Have I Been Pwned made me really think about logging.

It appears that in the real world many data breaches are discovered long after they happened, which makes investigation and recovery much more difficult because there are often no logs left to follow and research.

What can we do about it? Should we keep all the application/system/webserver logs forever?

alecxe
  • 1
    I wouldn't bother keeping them once the server they relate to has been decommissioned, at least... I've certainly seen breaches that weren't discovered until several years after the fact though. – Matthew Dec 20 '17 at 17:00
  • 7
    For reference: EFF's [.well-known/dnt-policy.txt](https://www.eff.org/dnt-policy)'s 2.a includes a promise to limit logs to 10 days unless there's an ongoing attack. –  Dec 20 '17 at 20:28
  • 2
    You're adding risk, at least when you keep the logs on the production systems. An attacker can get juicy details about your users from the logs. Account names alone are valuable information, but think about your mail.log, which lists every incoming and outgoing mail. Do you really want to risk this getting into the hands of an attacker? Such logs should rather be deleted as soon as possible, which usually means you need to define how long you want to be able to debug user issues. – allo Dec 22 '17 at 13:34

3 Answers

44

There is no "correct" answer to your question, unfortunately. Data retention policies are specific to the needs of an organization, and are often implemented out of necessity to comply with various legal requirements, which vary depending on the nature of the data being stored, as well as the jurisdiction that the data falls under.

Retaining log data can allow for post-mortem analysis if a breach is discovered, as you're alluding to in your question. However, retained data can also be a security risk in its own right if the logs contain sensitive information, so steps must be taken to secure log files if necessary. The other obvious factor in play is the cost of keeping the logs. Depending on availability requirements, different backup solutions may be more cost effective than others, such as keeping old logs offsite on tape storage rather than using disk redundancy.

John
  • 5
    I'd add that 'forever' is probably not applicable. Would we want that data in 20 years? 50? Would it be relevant at either those points, much less readable? – baldPrussian Dec 20 '17 at 18:46
  • Yes, they're readable, since any enterprise retention policy would store one copy on the SAN and the other at an industry-grade, humidity- and temperature-controlled site like Iron Mountain. This is standard for any network larger than a coffee-table LAN inside a hobbyist's kitchen. – 123456789123456789123456789 Dec 20 '17 at 19:23
  • 18
    @UncertainWhatNameToPickHere: I tried to store my backups indefinitely. After about 15 years, the devices to read the backup media simply didn't exist any more. I still keep some 8", 5.25", and 3.5" floppies and some Iomega Zip and Bernoulli disks for sentimental value, but the drives simply don't exist any more, and if they did, I don't have the interfaces to plug them into, and even if I had, there are no drivers anymore, and even if I were to run a 15-year-old OS to get the drivers to work, I wouldn't have any hardware to run that OS on. – Jörg W Mittag Dec 20 '17 at 19:42
  • 8
    I'm not saying it is impossible, but 50 years is an awful lot of time. You would not have to archive just the logs, but pretty much the entire ecosystem: the storage devices, operating systems, computers, etc. 50 years ago would have been before the introduction of the floppy disk! – Jörg W Mittag Dec 20 '17 at 19:47
  • 15
    @JörgWMittag Leeching into the irrelevant, A significantly sized enterprise migrates obsolete medium onto contemporary architecture, an enterprise has more than ONE employee with Zip disks while globally practicing retention tasks in adherence to a formal policy to address this. Major institutions aren't mothballing old gear like a Terminator 3. And your Zip and Bernoulli's would've nicely robocopy'd over to an increasingly inexpensive USB Sandisk Cruzer and network RAID array just like my Syquest SparQ cartridges did. I really can't address irrelevant hobbyist/homeowner challenges. – 123456789123456789123456789 Dec 20 '17 at 19:56
  • 14
    @UncertainWhatNameToPickHere That's exactly the objection I have to people naively assuming that format rot is a massive deal. It's only a big deal for those who don't care one bit about their data. It's so so easy to migrate to new media, especially given that new media will always have a higher capacity than older media. – forest Dec 21 '17 at 01:17
  • 3
    @JörgWMittag [I challenge your assertion](https://retrocomputing.stackexchange.com/q/551/278). The data isn't inaccessible... "just" _hard_ to reach. – wizzwizz4 Dec 21 '17 at 21:52
  • @JörgWMittag At that time disk space was expensive. Today you store the logs on your hard drive, and when you migrate to SSD-only storage you copy them over there, and from there to the next medium. Media only get larger and larger, so there isn't any need to keep stuff on older media anymore. And a text file will be readable many decades from now. If formats like ASCII or UTF-8 really do change, there are ways to convert the files, even 50 years later. – allo Dec 22 '17 at 13:38
  • @JörgWMittag Sure you have the drivers. It's not like floppies used unique drivers. – forest Dec 27 '18 at 02:51
16

10 Years

Storing logs is cheap: most often they're ASCII/Unicode text and easily compressed for long-term archival.
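To illustrate the compression point, here is a minimal sketch in Python that gzips a text log into an archive directory and checks that the archive decompresses back to the same bytes. The function names (`archive_log`, `sha256_of`), the file name `access.log` and the `archive` directory are made up for the example; only the standard-library `gzip`, `hashlib` and `shutil` modules are used. Treat it as one possible approach, not a recommended archival pipeline.

```python
import gzip
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(stream) -> str:
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(65536), b""):
        digest.update(chunk)
    return digest.hexdigest()


def archive_log(src: Path, dest_dir: Path) -> Path:
    """Gzip `src` into `dest_dir` and verify the archive round-trips.

    Hypothetical helper: the original log is left in place.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".gz")

    # Compress the log file.
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)

    # Decompress again and compare checksums with the original.
    with src.open("rb") as original, gzip.open(dest, "rb") as restored:
        if sha256_of(original) != sha256_of(restored):
            raise IOError(f"archive {dest} does not match {src}")
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "access.log"  # made-up sample log
        log.write_text("127.0.0.1 - - GET /index.html 200\n" * 1000)
        archived = archive_log(log, Path(tmp) / "archive")
        print(log.stat().st_size, "->", archived.stat().st_size, "bytes")
```

Repetitive text logs like the sample above tend to compress very well, and the checksum comparison can be rerun later on stored archives as a cheap readability check.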

Keeping your logs is better than purging them, for reasons you can't anticipate.

At a minimum, a ten-year retention policy is an industry best practice for US-based businesses, since the statute of limitations under federal law and in most states is at most a decade for non-grave crimes against persons.

Specific industry sectors go further: medical clinicians, including hospitals, retain health records and the associated electronic log data for 50 years.

Telecommunications providers such as NYNEX (acquired by Verizon) and other "Baby Bells" retained their pen registers, the logs of their subscribers' phone calls, forever.

Records retention, mirroring and safe off-site archival are practices that every sizeable organization has to tackle, but they become routine once implemented.

If you're a service provider, hosting company or in any way a custodian of personally identifiable data, a 10-year retention policy will keep you in compliance with every well-known, industry-accepted security standard, including PCI-DSS and the rest of the phonebook of industry best practices.

Demonstrating a uniform, ironclad retention policy helps a business quickly settle the topic in the RFP selection process and marks you as "up to par".

  • 11
    If the storage time is more than about a year, you should probably periodically check that you can still read the archives. – Jonathan Leffler Dec 20 '17 at 18:40
  • 8
    There is a difference between logs relating to ongoing processes or contracts (e.g. Health records, tax records) and system processes (e.g. automated logs relating to web visits). In many cases, the second category can't be directly linked to a specific user. I'm dubious that they're useful to any process after such a length of time, unlike, say, records of when a given person had particular vaccinations, which could potentially have life long impact, and deserve longer term storage as a result. – Matthew Dec 20 '17 at 19:53
  • 17
    If you're a custodian of personal data then a 10-year retention policy may be compliant with US standards but risks being seen as excessive by EU data protection regulators: in the EU, the standard is that you should keep personal data only for as long as you need it for the registered purposes for which you process it. – Peter Taylor Dec 20 '17 at 21:01
  • 6
    If your logs contain any personal information (as defined by the legislation), then the likes of GDPR in Europe will mean you have to delete that information on request of its owner (i.e. the person to whom it relates). Thus, unless you've taken the time to anonymise your logs, you could be in contravention of one law while trying to stay within another. – Ralph Bolton Dec 21 '17 at 11:09
  • 5
    Some companies have a deliberate, documented policy of deleting information once it is no longer required. This is to stop somebody else's lawyer accessing it through a subpoena. It's much more difficult for someone to sue you over something you did 5 years ago if you've deleted all records that you ever did it. – Simon B Dec 21 '17 at 11:52
8

Storing these log files indefinitely MAY BE illegal in the EU. I am saying MAY BE, since the new data protection legislation comes into effect in May 2018 and there are still some unclear areas. However, the rules are as follows:

If you don't have explicit consent (which, I presume, you don't have), you are allowed to keep personal data only for purposes allowed by the law. Keeping log files for the purpose of investigation of data breaches is allowed, since the following exception applies: "processing is necessary in order to protect the vital interests of the data subject or of another natural person".

However, you are still bound by the principle of proportionality, so you can store log data only to the extent that it is "necessary". At some point, the usefulness of the data is only theoretical, so the legal ground for processing disappears. There is no hard-set limit, but in any case, the burden of proof is on your side: you have to prove that storing log files is necessary to protect security.
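Since there is no hard-set limit, any concrete retention window is your own choice to justify. As a rough sketch of what enforcing one could look like, the Python below deletes log files whose modification time is older than a cutoff. The 365-day value, the `*.log*` file pattern and the function names are assumptions made up for the example, not figures from any regulation.

```python
import os
import tempfile
import time
from pathlib import Path

RETENTION_DAYS = 365  # hypothetical window; choose and document your own
SECONDS_PER_DAY = 86400


def expired_logs(log_dir, retention_days, now=None):
    """Return log files in `log_dir` last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(log_dir).glob("*.log*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_expired(log_dir, retention_days, now=None):
    """Delete expired log files and return the paths that were removed."""
    removed = expired_logs(log_dir, retention_days, now)
    for path in removed:
        path.unlink()
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "old.log"
        new = Path(tmp) / "new.log"
        old.write_text("stale entry\n")
        new.write_text("fresh entry\n")
        # Backdate one file by 400 days to simulate an old log.
        past = time.time() - 400 * SECONDS_PER_DAY
        os.utime(old, (past, past))
        print([p.name for p in purge_expired(tmp, RETENTION_DAYS)])
```

Running something like this on a schedule, with the chosen window written down alongside the reason for it, is one way to show that retention is bounded rather than indefinite.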

You should be concerned about this regulation even if you are operating in the US, since it applies very widely (for example, if you have clients in the EU).

Anyway, there is a way around this regulation: if your logs don't contain personal data (e.g. the user cannot be identified), the regulation does not apply. However, since an IP address is considered personal data (eve

MikiRaven