I'm currently looking at the Secure Coding Practices documentation that Veracode provides with its code analysis tool suite.
In a "secure logging practices" section, they mention that logging full HTTP requests in case of error is a common mistake, but they don't explain why.
I'm working on a personal website where I have two separate log files:
errors.log: any unexpected exception ends up being caught and logged in that file. There, I simply log the stack trace (classic basic exception logging).
security.log: any request that could not have been made via the UI, which is a sign of a forged request (for example IDOR attempts, like someone trying to access another user's data), leads to a custom runtime security exception being thrown. That security exception stores, among other things, the HTTP request that was made, and is then logged. Basically, all my backend validators (which make the same checks as the front-end validators) throw this security exception when something goes wrong. The idea is that I (or a cron job ;) ) can regularly verify that security.log is empty.
I decided to log the full HTTP request for the security exceptions only, to facilitate the analysis of potential business-logic attacks. By "full" I don't mean the raw request: I extract all the headers, cookies and parameters and display them in a readable manner, along with info such as the timestamp and origin IP.
That log file will only be opened in a text editor (most probably vi); it will not be automatically parsed by tools or displayed in a webapp.
Now, I understand that logging full HTTP requests can be exploited under some conditions. A classic example is a log-analysis webapp that is vulnerable to XSS and used by a helpdesk: an attacker could forge a malicious request so that the payload fires once the helpdesk staff view the logs through the vulnerable webapp.
I also understand that logging too much can lead to denial of service when the disk fills up, but that is already the case with stack traces.
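To make the disk-filling concern concrete, here is a minimal sketch (in Python, purely for illustration; my site's actual stack is not implied) of how the size of security.log could be bounded. The size limits, the `build_security_logger` and `truncate` names, and the logger name are all assumptions made for this example.

```python
import logging
from logging.handlers import RotatingFileHandler

# All three limits are assumed values, to be tuned to the available disk space.
MAX_BYTES = 1_000_000    # rotate once the current file reaches about 1 MB
BACKUP_COUNT = 3         # keep at most 3 rotated files next to the current one
MAX_VALUE_LEN = 512      # cap for a single logged header/cookie/parameter value


def build_security_logger(path: str) -> logging.Logger:
    """Return a logger whose total on-disk size is bounded by rotation."""
    logger = logging.getLogger("security")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.propagate = False
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            path, maxBytes=MAX_BYTES, backupCount=BACKUP_COUNT, encoding="utf-8"
        )
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def truncate(value: str, limit: int = MAX_VALUE_LEN) -> str:
    """Cut an attacker-controlled value so one request cannot write megabytes."""
    if len(value) <= limit:
        return value
    return value[:limit] + f"...[truncated {len(value) - limit} chars]"
```

With rotation, a flood of forged requests overwrites the oldest entries instead of filling the disk, which trades completeness of the log for availability.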
In my particular case, what dangers could arise from logging requests? The only thing I see (technically valid but unrealistic, given the low sensitivity of the website) would be a specially crafted payload that triggers some sort of buffer overflow when parsed by vi. The attacker would have to know that I use vi, have a 0-day for it, etc. OK, possible for the NSA, but unrealistic for this small site, which is not a target of interest for anyone besides some script kiddies.
I guess someone could attempt log forging, but good luck guessing my custom request display format :P Since I extract the data from the request and display it in my own way, it's practically impossible to fool me via log forging.
Should I specifically check for control characters that vi treats specially, such as an end-of-file marker (does that even exist?)?
What else could go wrong? Am I missing something here? I now realize that my question could be paraphrased as: "What can go wrong if I let users write text (via the request content) to a single controlled file on my machine that will only be opened by an up-to-date, well-proven text editor?"
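For illustration, a minimal sketch of what such a check could look like: rather than hunting for one specific character, every control character in a logged value is escaped. This is Python and the `neutralize` name is hypothetical. It only covers the C0 range, DEL and the C1 range; Unicode line separators and bidirectional overrides are not handled here.

```python
def neutralize(value: str) -> str:
    """Escape control characters in an attacker-controlled value.

    CR/LF can no longer start a fake log line, and ESC can no longer
    begin a terminal escape sequence if the file is ever viewed with
    something other than an editor (cat, tail, ...).
    """
    out = []
    for ch in value:
        code = ord(ch)
        if ch == "\\":
            # Escape the escape character itself so the output stays unambiguous.
            out.append("\\\\")
        elif code < 0x20 or 0x7F <= code <= 0x9F:
            # C0 controls, DEL and C1 controls become a visible \xNN marker.
            out.append(f"\\x{code:02x}")
        else:
            out.append(ch)
    return "".join(out)


print(neutralize("id=42\r\nfake entry"))
print(neutralize("\x1b[2Jhidden"))
```

Escaping rather than dropping the characters keeps the evidence of what the attacker actually sent, which is the whole point of the security log.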
UPDATE
Update to provide more info in response to the first three pieces of feedback (thanks for the great feedback, btw!):
- The login form is not covered by the mechanism (but the registration form is !)
- I don't run a shop, and there is no highly sensitive info. The most sensitive data would be the first/last name and birthdate provided during user registration. There is a game with points and rankings, so security is important to prevent cheating (hence this custom logging)
- To insist: only malicious-looking requests (those bypassing front-end validation) will be logged. In a utopian world, the security log file would stay empty forever
Thanks to your feedback I realize the following, among other things:
- Disclosure of the content of this file could allow session hijacking due to session cookies being extracted from the request
- A bug in the request parser could cause a request not to be logged
- A bug in my code could log a malformed user-registration request, which would store sensitive info in security.log (the password clearly being the most sensitive)
Handling :
- Regarding disclosure of the content of the file, I assume that if someone gets there, I'm pwned already :)
- Regarding the bug in the parser: true, but at least it would trigger an exception that would be logged by the "normal" logger, which is acceptable
- Regarding the registration form: I consider that even if such a rare bug occurred, I should not have access to a plaintext password, even by accident. I'll adapt the parser not to store any request parameter whose name matches a *password* pattern.
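A minimal sketch of that filtering step, written in Python for illustration only. The regular expression, the `[REDACTED]` marker and the function names are assumptions for this example, not my actual parser. The optional `fingerprint` helper shows one way session cookies could be made non-replayable in the log while still letting entries from the same session be correlated.

```python
import hashlib
import re

# Assumed pattern: parameter names containing "password", "passwd" or "pwd".
SENSITIVE_NAME = re.compile(r"pass(word|wd)|pwd", re.IGNORECASE)
REDACTED = "[REDACTED]"


def redact_params(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of the parameters with password-like values masked."""
    return {
        name: (REDACTED if SENSITIVE_NAME.search(name) else value)
        for name, value in params.items()
    }


def fingerprint(value: str) -> str:
    """Replace a secret (e.g. a session cookie) with a short, stable hash."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "sha256:" + digest[:12]


print(redact_params({"username": "bob", "confirmPassword": "hunter2"}))
```

Matching on the parameter name is a deny-list and will miss a password sent under an unexpected name; an allow-list of parameters that are safe to log would be stricter, at the cost of losing detail about unexpected parameters.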
I guess that I can't realistically go further than that to protect my users (if I want to be able to detect business-logic attacks live). I suppose that 99% of sites on the web don't go half as far with these considerations :) It seems clear to me that the small added attack surface is outweighed by the security benefits that this custom logging provides.
I'll add thorough unit tests to that part of the codebase to further reduce the risk of a bug.
In a corporate environment I would think twice about logging sensitive data. I guess I would discuss the matter with someone like a Data Compliance Officer.