I was reading a document about logging and analysis. The document talks about statistical analysis and machine learning techniques to detect some attack scenarios. For instance, if you want to detect a possible brute-force login, you might look at the following features:
- Firewall Accepts, Multiple Failed Logins in a Row, At Least 1 Successful Login.
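To make the brute-force example concrete, here is a minimal sketch of how I understand such cross-source features could be computed per source IP. The event layout (`ts`, `source`, `src_ip`, `action`), the action names, and the `min_failed` threshold are my own assumptions for illustration; they are not taken from the document or from any particular log format.

```python
from collections import defaultdict


def brute_force_features(events):
    """Aggregate per-source-IP features from mixed firewall and auth events.

    Each event is a dict with hypothetical keys: ts, source, src_ip, action.
    """
    features = defaultdict(lambda: {
        "firewall_accepts": 0,
        "max_failed_streak": 0,
        "successful_logins": 0,
    })
    streak = defaultdict(int)  # current run of consecutive failed logins per IP

    # Process events in time order so "in a row" is meaningful.
    for ev in sorted(events, key=lambda e: e["ts"]):
        ip = ev["src_ip"]
        f = features[ip]
        if ev["source"] == "firewall" and ev["action"] == "accept":
            f["firewall_accepts"] += 1
        elif ev["source"] == "auth" and ev["action"] == "login_failed":
            streak[ip] += 1
            f["max_failed_streak"] = max(f["max_failed_streak"], streak[ip])
        elif ev["source"] == "auth" and ev["action"] == "login_success":
            f["successful_logins"] += 1
            streak[ip] = 0  # a success ends the current failure streak
    return dict(features)


def looks_like_brute_force(f, min_failed=3):
    """Rule of thumb combining the three features; min_failed is an assumed threshold."""
    return (f["firewall_accepts"] > 0
            and f["max_failed_streak"] >= min_failed
            and f["successful_logins"] >= 1)
```

The resulting dictionary per IP could be used directly as a feature vector for a model, instead of (or in addition to) the hand-written rule.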
What interests me is that these features are collected from different sources (firewall, source machine). I have a use case where I want to detect attacks that try to download and install backdoors. I have logs collected from an IDS, a firewall, an HTTP server and a syslog server. I want to find some indicative features that I can feed to my machine learning model. My problem is shown in the picture below:
This fellow researcher manually analyzed the logs and provided some useful insights, but he only used one source machine (the HTTP server), specifically the data field in its logs.
Does this mean that backdoors are hard for security devices to detect? If I want to use other features, as in the brute-force example, to detect backdoors in an automated manner, what would you propose?
PS: I only want some general ideas about these features. I know that backdoor detection can be hard. Fortunately, I only have to study the backdoor in the dataset I have x).
Best.