Why would properly logging full http requests be bad practice?

Question

I'm currently looking at Secure Coding Practices documentation provided by Veracode with their code analysis toolsuite.

In a "secure logging practices" section, they mention that logging full HTTP requests in case of error is a common mistake, but they don't explain why.

I'm working on a personal website where I have 2 separate log files :

errors.log : any unexpected exception ends up being caught and logged in that file. There, I simply log the stacktrace (classic simple basic exception logging)
security.log : any request that could not be made via the UI, which is a sign of a forged request (example : IDOR attempts like someone trying to access data from another user), leads to a custom runtime security exception being thrown. That Security Exception notably stores the http request that was made, and is then logged. Basically, all my backend validators (that make the same checks as the front-end validators) throw this Security Exception when something goes wrong - the idea is that I (or a Cronned task ;) ) can regularly verify that security.log is empty.

I decided to log the full http request (by that I mean : not the raw request, but I extract all the headers / cookies and parameters and display them in a readable manner, as well as info like timestamp, origin IP and such things) for the security exceptions only (to facilitate the analysis of potential biz-logic-related attacks).

That log file will be opened in a text editor only (VI most probably), and will not be automatically parsed by tools or displayed in a webapp.

Now, I understand that logging full http requests can be leveraged under some conditions. A classic example is a log-analysis webapp prone to XSS attacks that is used by a helpdesk - in this case an attacker could forge a malicious request to let the payload explode once the helpdesk guys check the content of the logs via the vulnerable webapp.

I also understand that logging too much stuff can lead to Denial Of Service due to disks becoming full, but this is already the case with stacktraces.

In my case, what dangers could arise from logging requests, in particular cases ? The only (technically valid but unrealistic, given the low sensitivity of the website) thing I see would be a specially crafted payload that could cause some sort of buffer overflow when parsed by VI (the attacker would have to know that I use VI + use a 0-day etc.. Ok possible for NSA but unrealistic for this small site which is not a target of interest besides for some script kiddies).

I guess someone could do some log forging but good luck finding out my unique request display :P Since I extract data in the request and display them in my own way, it's practically impossible to fool me via log forging.

Should I specifically check for end-of-file VI control characters (does that even exist ?) ?

What else could go wrong ? Am I missing something here ? Now I realize that my question could be paraphrased : "what can go wrong if I let users write text (via the request content..) to a single controlled file on my machine that will only be opened by a up-to-date well-proven text editor"

UPDATE

Update to provide more info in regards to the 3 first feedbacks (thanks for that great feedback btw !)

The login form is not covered by the mechanism (but the registration form is !)
I don't run a shop, there are no highly sensitive info. The most sensitive infos would be the first/lastname and birthdate during user registration. There is a game with points and rankings, so security is important to prevent cheating (therefore this custom logging)
Insisting on the fact that only malicious-looking (front-end-validation bypassing) requests will be logged, in an utopian world the secure-log-file would stay empty forever

Thanks to your feedback I realize the following, amongst other things :

Disclosure of the content of this file could allow session hijacking due to session cookies being extracted from the request
A bug in the request parser could make a request not get logged
A bug in my code could log a malformed user-registration request which would store sensitive info in the secu-log (password clearly being the most sensitive)

Handling :

Regarding disclosure of the content of the file, I assume that if someone gets there, I'm pwned already :)
Regarding the bug in the parser, indeed, but at least it would trigger an exception which would be logged in the "normal" logger, this is acceptable
Regarding the registration form, I consider that even if such a rare bug would occur I should not have access to a plaintext password, even by accident. I'll adapt the parser not to store any request parameter that matches a *password* pattern.

I guess that I can't realistically go further than that to protect my users (if I want to be able to detect biz-logic attacks live). I suppose that 99% of sites on the web don't go half as far as those considerations :) ). It seems clear to me that the small added attack surface is outweighed by the security benefits that this custom logging provides.

I'll add additional thorough unit tests to that part of the codebase to ultimately reduce the risks of a bug.

In a corporate environment I would guess twice regarding the logging of sensitive data - I guess I would discuss the matter with someone like a Data Compliance Officer

If you're logging all parameters, does that include password data for failed attempts, such as when you mistype your password? Are you running a shop? Logging the card number fields? The data itself can be dangerous - and people tend to forget about it when thinking of things to protect. — Matthew, Mar 30 '16 at 15:17
No, mistyping a password is a normal use-case and can be done via the UI, so it would not trigger the special security-logging. In fact the login form is the only form that does not use this mechanism to avoid storing plaintext passwords, I should have mentionned that, will update post. No I'm not running a shop, not logging any payment info, there are no sensitive data. Thanks for the good questions I'll update the main post after I read the 2 current answers — niilzon, Mar 30 '16 at 16:34
now that I think of it, it could trigger for the registration form and actually store a password in some logs. That is, if the user deliberately messes with the site (e.g. disabling javascript validation) or if there is a bug in my code. I'll expand on this with the edit of the original post ! — niilzon, Mar 30 '16 at 16:39
If you have user registration, username and password are "sensitive." Many people will [reuse credentials](http://security.stackexchange.com/q/101385/46979) from other systems, so you *must* protect them. Certainly don't log this data in plain text, as paj28 mentions. — jpmc26, Mar 30 '16 at 22:05
@niilzon If your login form crashes for some other reason that *isn't* a mistyped password, would you log the password? — user253751, Mar 31 '16 at 05:36
@jpmc26 indeed, as stated in the "update" part of the main post, the registration form could indeed cause a password to be logged. Therefore I'll adapt the request parser not to log the value of any parameter that contains the "password" word in the name (to cover password + passwordConfirmation params) — niilzon, Mar 31 '16 at 06:35
@immibis no, cf the "update" part of the main post, the login form is not covered by the mechanism (its the only form that isn't). But the registration form is - therefore it will be solved, check the update part :) — niilzon, Mar 31 '16 at 06:39
@niilzon So it'll still log pwd and passwd and pswd and pass? — user253751, Mar 31 '16 at 06:41
@immibis no, because there are no such request parameters in the app. The only password-related parameters that could be logged are "password" and "passwordConfirmation" from the registration form, and as described in the update I'll adapt the parser not to log any value of any parameter for which the name matches a ´*password*´ pattern. So for example you would have &password=secret&passwordConfirmation=secret <== since both param names contain "password", their value will not be logged — niilzon, Mar 31 '16 at 06:53

score 29 · Accepted Answer · answered Mar 30 '16 at 15:42

The key word is properly.

Properly logging HTTP requests when there is a need for it is not bad practice. I am a pen tester and I log all the HTTP requests that I make as part of a test; doing so is expected. I also work on a server system that integrates with a number of complex legacy systems. Logging full HTTP requests on error is a necessary feature. It can show details like which system in a cluster made the erroneous request, which would otherwise be lost.

Veracode is an automated source code analysis system. It can tell if you're passing a HTTP request to a log function. But it cannot really tell if you are logging "properly". So they have come up with this rather vague finding, because that's the best their system can do. Don't put too much weight on it. Does the issue have a risk rating? I suspect it would be low risk. Most people are less thorough than yourself and pay little attention to low risk issues.

They key parts to logging properly are:

Log injection / forging
Denial of service
Confidentiality of logs

You mention the first two in your question. For a personal website, the third one is a non-issue - you're the owner, sys-admin, everything - and only you will have access. This is a much bigger issue in a corporate environment, where you certainly don't want to let everyone in the company have access to the confidential data (like user passwords) in the requests. Some systems deliberately mask parts of data in logs - especially credit card numbers, for PCI compliance.

You mention you extract headers, cookies and parameters and format them in the log file. I recommend you log the raw requests, and have a separate post-processor to format them. There will be odd situations - e.g. parameter duplicated in URL and POST body - that may cause errors. Extracting and formatting can cause the feature in the original request to be lost.

amazing feedback! Actually I did not run Veracode against that codebase, but while reading the Veracode document I wondered if the logging of my personal home project was good enough :) You have a good point regarding logging the raw request in case of error in my request parser. I'll update the main post in regards to this and to the comment of Matthew — niilzon, Mar 30 '16 at 16:45

score 13 · Answer 2 · answered Mar 30 '16 at 15:27

The problem with logging everything is not that you could implement it wrong (eg allow XSS, code execution, buffer overflow, etc), because the solution to that would simply be to do it right, as it is with all code. [You do provide an additional attack surface, but if you decide that keeping additional logs is important, that may well be worth it]

The problem is with what you log: all the headers / cookies and parameters. These obviously may contain sensitive information, such as session ids, login cookies, passwords, credit card information, and so on, in plaintext.

Now, if you only log requests that you know are malicious, it may be ok to log these. But can you be sure of that? Maybe you accidental throw a security exception on a valid action, or you throw one on a security incident that can be produces by a valid user - such as an invalid login because of a typo.

Even if you store these logs securely, you may have vulnerabilities in your code that allow an attacker to read them anyways (eg via LFI). You may also have backups of these logs, which may be compromised. Or an attacker may have gained limited access to the server and gain other useful information from these logs.

Updating original post in regards to your remarks + remarks of other contributors, thanks ! — niilzon, Mar 30 '16 at 16:46

Why would properly logging full http requests be bad practice?

2 Answers2