I have plenty of ways to search for malicious code in the file system, monitor traffic, scan log files, check for suspicious or masked processes, etc.

However, scanning a relational database such as MySQL is no easy task. Some exploits, such as the Magento Shoplift exploit from 2015, inject malicious code into the database with knowledge of its structure and of how it interacts with the server-side application(s). Dumping the database and then running a signature-based search would be very inefficient, since some of the data is stored as BLOBs and other non-text types. Also, the injected content does not necessarily look suspicious and is far from what a PHP webshell might look like, for instance.

My question is: what is the most practical and efficient way to detect anomalies and spot malicious code in a MySQL database?

Could the following be considered a decent approach?

  1. Dump the database;
  2. Compare it line by line with a dump of the same database from a backup archive;
    • say, using functionality similar to that of diffchecker.com;
  3. Analyze the newly inserted/updated data;
    • this might involve skipping large amounts of data that clearly do not result from malicious behavior;

Thank you.

McJohnson
  • Why do you need to dump the database? A BLOB is simply stored as it is in the database files, a signature checker (i.e. a scanning tool) should be able to find malware signatures anywhere within a file including a tablespace file. – grochmal Sep 19 '16 at 23:32
  • As I mentioned earlier, signature-based detection is useless in this case. BLOBs were just an example. I don't need a solution fitted to BLOB data in particular, but one that covers all the data in the database. – McJohnson Sep 19 '16 at 23:50
  • Fair enough. But I still believe that you might focus the question a little more on what kind of "malicious code" you're looking for. Most malware does not compress well, so even if the database does some compression a scanner can find it. But "malicious code" can just as well be an XSS attempt, and that would not be found by a malware scan (and an XSS attempt would compress well too). – grochmal Sep 19 '16 at 23:55

1 Answer

Interesting question. First you have to define what kind of malware you are looking for. In this case it might be:

  • Malicious data: Modified data that has lost its integrity.
  • Malicious code: Established, embedded or modified data that is capable of running (e.g. executable binaries, stored XSS).

If you are trying to detect malicious data, you might choose one of these options:

  • Analyse the data for integrity by using regex. This might take a lot of preparation time, because you have to know exactly what your data structures should look like.
  • Compare known-good data to the current data sets to detect differences. You might create a hash value of fields, records or tables and compare them once in a while. Storing the hashes in the same database is not advisable. Use another database or even another medium for that (e.g. a file-based approach). An attacker who is able to compromise the database and tamper with its integrity might not be able to do the same on the file system.
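The hash comparison described above can be sketched roughly as follows. This is a minimal illustration, not a finished tool: it assumes the rows have already been fetched from the database as tuples with the primary key in the first column, and the function names (`row_digest`, `build_baseline`, `compare`) are made up for this example.

```python
import hashlib
import json


def row_digest(row):
    """Return a SHA-256 hex digest for one row (a sequence of column values)."""
    # BLOB values arrive as bytes; wrap them so they stay distinct from text.
    normalized = [
        {"bytes": v.hex()} if isinstance(v, (bytes, bytearray)) else v
        for v in row
    ]
    payload = json.dumps(normalized, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_baseline(rows, key_index=0):
    """Map each row's primary key (assumed at key_index) to its digest."""
    return {row[key_index]: row_digest(row) for row in rows}


def compare(baseline, current):
    """Report which primary keys were added, removed or changed."""
    return {
        "added": sorted(k for k in current if k not in baseline),
        "removed": sorted(k for k in baseline if k not in current),
        "changed": sorted(
            k for k in current if k in baseline and current[k] != baseline[k]
        ),
    }


# Hypothetical data: a trusted snapshot and the current state of one table.
trusted = [(1, "home", b"<p>hi</p>"), (2, "about", b"<p>us</p>")]
live = [(1, "home", b"<p>hi</p><script src=//evil.example/x.js></script>"),
        (3, "promo", b"")]

report = compare(build_baseline(trusted), build_baseline(live))
print(report)
```

Only the rows listed under "added" and "changed" then need a closer look, which keeps the manual review small. The baseline digests should live outside the database being checked, for the reason given above.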

This approach might also detect malicious code. However, other techniques may detect it more reliably:

  • Use a signature-based approach to identify malicious code fragments. The easiest way is to screen your fields. But this expects the data to be in a format that the scanning engine understands. If you are serializing or encoding your data, this might hurt detection.
  • You might want to create a dedicated API for scanning purposes. It must deliver the data in the form it would take in an actual attack. For example, if your data ends up in a mail attachment, the API should create such an attachment so that it can be scanned in its final form. Other examples are file-based and web-based output.
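To illustrate why encoding matters for field screening, here is a small sketch that scans each field value in its raw form and in a couple of decoded forms. The three signatures are purely illustrative assumptions and nowhere near a usable rule set; `candidate_views` and `scan_field` are hypothetical names.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Illustrative signatures only; a real rule set would be much larger.
SIGNATURES = {
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "php_eval": re.compile(r"\beval\s*\(", re.IGNORECASE),
    "js_uri": re.compile(r"javascript\s*:", re.IGNORECASE),
}


def candidate_views(value):
    """Yield (label, text) pairs: the raw value plus decoded variants."""
    if isinstance(value, (bytes, bytearray)):
        text = bytes(value).decode("utf-8", errors="replace")
    else:
        text = str(value)
    yield "raw", text

    url_decoded = unquote(text)
    if url_decoded != text:
        yield "url-decoded", url_decoded

    stripped = text.strip()
    if stripped:
        try:
            decoded = base64.b64decode(stripped, validate=True)
        except (binascii.Error, ValueError):
            pass  # not valid base64, nothing more to scan
        else:
            yield "base64-decoded", decoded.decode("utf-8", errors="replace")


def scan_field(value):
    """Return a list of (view, signature_name) hits for one field value."""
    hits = []
    for view, text in candidate_views(value):
        for name, pattern in SIGNATURES.items():
            if pattern.search(text):
                hits.append((view, name))
    return hits


encoded = base64.b64encode(b"<script>alert(1)</script>").decode("ascii")
print(scan_field(encoded))
print(scan_field("Plain product description"))
```

A scanner that looked only at the raw value would miss the base64-encoded payload here. Application-specific formats such as PHP-serialized values would need their own decoding step, which is where the dedicated API idea comes in.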

Signature-based scanning approaches are error-prone, inefficient and slow. If you have the chance to establish integrity checking at the moment data is written, this is likely to be much more robust.
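One way to check integrity at write time is to have the application compute a keyed hash (HMAC) over each record when it writes it and to verify that tag later. The following is only a sketch under assumptions: the key is held by the application and never stored in the database, and `record_tag` and `verify_record` are hypothetical helper names.

```python
import hashlib
import hmac
import json


def record_tag(key, table, primary_key, columns):
    """Compute an HMAC-SHA256 tag over a record at write time."""
    # Table name and primary key are included so a tag cannot be reused
    # for a different record with the same column values.
    message = json.dumps(
        [table, primary_key, columns], sort_keys=True, default=str
    ).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_record(key, table, primary_key, columns, stored_tag):
    """Return True if the record still matches the tag stored at write time."""
    expected = record_tag(key, table, primary_key, columns)
    return hmac.compare_digest(expected, stored_tag)


# Hypothetical usage; the key would come from the application's secret store.
key = b"demo-key-not-for-production"
tag = record_tag(key, "cms_block", 7, {"content": "<p>Footer</p>"})

print(verify_record(key, "cms_block", 7, {"content": "<p>Footer</p>"}, tag))
print(verify_record(key, "cms_block", 7,
                    {"content": "<p>Footer</p><script>"}, tag))
```

An attacker who can write to the database but does not know the key cannot produce a valid tag for modified content, so tampering shows up as a failed verification.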

Marc Ruef