0

I am learning about file formats. I have an application that performs analytics on data that includes some personal information. Only certain people should be able to access the data, and even they should get only the limited access offered by an API that I will provide. I am considering encryption together with a proprietary file format to make reverse engineering difficult. Encryption would restrict access to people who hold the key, and the file format would stop them from inspecting the data directly, so that they can only go through the platform's API. I am not very knowledgeable about this, but I have heard that it is relatively easy to reverse engineer file formats. I want to create a file format that is difficult to reverse engineer so that the application is safe. Please give me pointers on what makes a format difficult to reverse engineer. Also, are there any metrics that measure how well an algorithm resists reverse engineering attacks?

kneelb4darth
  • 105
  • 3
  • Check out [reverseengineering.SE](https://reverseengineering.stackexchange.com/) for all your RE questions. In the meantime, see the answers to this question: [Any comprehensive solutions for binary code protection and anti-reverse-engineering?](https://security.stackexchange.com/questions/1069/any-comprehensive-solutions-for-binary-code-protection-and-anti-reverse-engineer) as well as a book called [Surreptitious Software](http://dl.acm.org/citation.cfm?id=1594894) – julian May 21 '17 at 16:03

2 Answers

4

I am thinking about encryption as well as using a proprietary format to make it difficult to reverse engineer anything..... I want to create a file format that will be difficult to reverse engineer so that application is safe.

A file format that is difficult to reverse engineer is much harder to implement correctly in the application than a simple one. Testing becomes harder, and strange, hard-to-reproduce bugs will occur. Security-relevant bugs are also likely to slip in unnoticed, because code that supports a complex format is much harder to review than code that supports a simple one.
As a result, the application will be less safe, not more.

In summary: don't try to achieve security by adding unneeded complexity. Instead, keep the format simple and rely on properly implemented encryption for security, since you want to use encryption anyway.

Steffen Ullrich
  • 184,332
  • 29
  • 363
  • 424
  • Ok, I understand keeping the file format simple. The problem is that the data is sensitive. The people who are given access should also not be able to reverse engineer the format and read the data directly. How do I solve that second hurdle? – kneelb4darth May 21 '17 at 16:07
  • @kneelb4darth: you already want to use encryption and encryption is the way to protect data, not complex file formats. – Steffen Ullrich May 21 '17 at 16:08
  • @kneelb4darth If you don't want others to access data, you have to encrypt it. Obfuscation is not the right way. – Arminius May 21 '17 at 16:09
  • If a person gets access to the decrypted data using the key, they shouldn't be able to browse the real data. Instead, there is a specific API that limits what you can do with the (decrypted) data. It doesn't allow filters or other analytics that are harmful. So I thought a file format inside the encryption could help. If not that, what can I do? – kneelb4darth May 21 '17 at 16:12
  • I understand that encryption is great for stopping access, but I want to limit the access too, not just completely block or completely allow full access. That is the goal. How can I realistically achieve that? – kneelb4darth May 21 '17 at 16:15
  • 1
    @kneelb4darth Well, the solution is to not hand out restricted data to the user in the first place. Your API could provide the requested data on demand. – Arminius May 21 '17 at 16:24
  • 2
    @kneelb4darth: if you want to restrict access to specific parts only, then use multiple encryption keys for different parts. One can build a scheme out of this where different users get access to different parts, but this would be another question. Protection through format obfuscation is a bad idea because in the simplest case the attacker will not reverse engineer the format at all but will simply hook into your application to read the data, and your application understands the file format. – Steffen Ullrich May 21 '17 at 16:26
  • If the encryption key is in the app, "encryption" === "scrambling" === "obfuscation" anyway. And in that case, standard algorithms are MUCH easier to attack than non-standard ones. See also "whitebox encryption". – No-Bugs Hare Sep 23 '17 at 15:25
2

Creating a file format that is difficult to reverse-engineer is something you shouldn't aim for because it's likely to be unnecessarily time-consuming and in the end ineffective.

If a user has access to the application, it's often trivial to reverse-engineer the format by altering small amounts of data in the application and observing the changes in the file. You have to expect that this is easier than reverse-engineering obfuscated executables.
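
To illustrate the point, here is a minimal Python sketch of that kind of differential analysis. The "proprietary" format, the `SECRET_KEY` value, and the `save_record` function are all invented for this example and stand in for whatever the real application does; the attacker is assumed to use only the application's save feature and a byte-by-byte comparison.

```python
import struct

# Hypothetical "proprietary" format, invented for this illustration:
# each field is a 2-byte big-endian length followed by the UTF-8 value,
# and the whole record is XORed with a fixed repeating key.
SECRET_KEY = b"\x5a\xc3\x99"


def save_record(name: str, age: int) -> bytes:
    """Stand-in for the application's 'save' feature."""
    body = b""
    for value in (name, str(age)):
        raw = value.encode("utf-8")
        body += struct.pack(">H", len(raw)) + raw
    return bytes(b ^ SECRET_KEY[i % len(SECRET_KEY)] for i, b in enumerate(body))


def changed_offsets(a: bytes, b: bytes) -> list:
    """Byte positions where two equal-length files differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# The attacker only uses the application: save, change one field, save again.
baseline = save_record("alice", 30)
name_changed = save_record("alicf", 30)  # last letter of the name changed
age_changed = save_record("alice", 31)   # last digit of the age changed

name_diff = changed_offsets(baseline, name_changed)  # where the name ends
age_diff = changed_offsets(baseline, age_changed)    # where the age ends

# XORing the two files cancels the fixed key and leaks the XOR of the
# plaintext bytes, without the attacker ever learning SECRET_KEY.
leak = baseline[name_diff[0]] ^ name_changed[name_diff[0]]

print("name field touches offsets:", name_diff)
print("age field touches offsets:", age_diff)
```

Each changed field shows up at a distinct offset, so a handful of such experiments is enough to map the layout, even though the attacker never looked at the executable.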

If you're concerned that users may extract sensitive data from the files you gave them access to, you have to rethink the design of your application. This is not something that can be solved by obfuscation.
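
One possible redesign, in line with the comment above about not handing out restricted data in the first place, is to keep the records on a server and expose only whitelisted aggregate queries. The Python sketch below is only an illustration under assumptions: the records, the `query` function, the whitelists, and the `MIN_GROUP_SIZE` threshold are all hypothetical, and a real service would also need authentication and transport security.

```python
from statistics import mean

# Hypothetical data that stays on the server; the client never receives it.
_RECORDS = [
    {"name": "alice", "region": "north", "spend": 120.0},
    {"name": "bob", "region": "north", "spend": 80.0},
    {"name": "carol", "region": "south", "spend": 200.0},
]

# Whitelists of what callers may ask for (assumptions for this sketch).
ALLOWED_GROUP_BY = {"region"}
ALLOWED_METRICS = {
    "count": len,
    "mean_spend": lambda rows: mean(r["spend"] for r in rows),
}
MIN_GROUP_SIZE = 2  # suppress groups small enough to single out one person


def query(group_by: str, metric: str) -> dict:
    """Return one aggregate value per group; raw rows never leave this function."""
    if group_by not in ALLOWED_GROUP_BY or metric not in ALLOWED_METRICS:
        raise PermissionError("query not permitted")
    groups = {}
    for row in _RECORDS:
        groups.setdefault(row[group_by], []).append(row)
    return {
        key: ALLOWED_METRICS[metric](rows)
        for key, rows in groups.items()
        if len(rows) >= MIN_GROUP_SIZE
    }


print(query("region", "mean_spend"))
```

Because the client only ever sees the aggregate results, there is no file on the user's machine to reverse engineer, and a request such as `query("name", "count")` is rejected rather than answered.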

Arminius
  • 43,922
  • 13
  • 140
  • 136