I'm writing a small task management program in Java (not for mobile devices, btw), and part of it is a kind of 'data service' that encapsulates and reuses database-related code, such as connecting to different DBs, CRUD actions and so on. The program itself should run offline, so I'm thinking of using SQLite for data storage. I'm aware that Hibernate does all that, but I'm still a rather inexperienced dev, and a friend who helps me with code reviews and some teaching recommended that I write such a module once, so that I get a clue of what's going on behind the scenes before using Hibernate.

But back to the topic. The data produced in this task management program is not highly sensitive; however, some protection would be good. For that reason I searched for ways to protect/encrypt my SQLite database and read up on encryption in general. The outcome was really confusing. There are some APIs that enable encryption for SQLite, but other sources stated that any offline database (with the password necessarily stored somewhere on the same machine) would be rather easy to hack if someone got control over the respective computer.

So my question is: is it generally safe/viable (at least for non-high-end apps) to use offline databases? If so, what would be the best way to apply encryption? If not, why is it safer to use online databases? What are they doing differently?

The usual apologies if this has been asked before. I did search (honestly!), but couldn't find a really satisfying answer. Sorry if these questions sound stupid, but I haven't really come across security questions in my coding "career" yet, and this is quite confusing in the beginning...

schroeder
  • You're asking about a (somewhat disguised) variant of the DRM problem. In short: They control the machine. Anything performed entirely on that machine -- automatic decryption, for example -- can be reversed by an attacker. Online databases are different because now, something is happening on a machine they (hopefully) _don't_ control. All they control is how their machine tries to interact with yours, and you, controlling your online database, control your half of the interaction. Designed properly, you can get much better security that way. – Nic Jul 31 '19 at 07:22
  • OK, I see. So basically, there are just two options: accept the lack of security, but keep it completely offline ('hybrid' variants such as toggling offline/online mode left aside here). Or sacrifice the offline feature for more safety, using online databases (which, btw, have to be maintained and I don't have a DB server)... – Oidipous_REXX Jul 31 '19 at 07:30
  • @Oidipous_REXX: online databases do not necessarily provide more security. Your application needs to interact with the database - no matter if online or offline. This means your application must have the necessary secrets (passwords, encryption keys or whatever is required) to get access. An attacker on the machine where the application is running could thus get access to these secrets too and access the database, no matter if online or offline. – Steffen Ullrich Jul 31 '19 at 07:52
  • Your question is missing a basic aspect of security: who do you want to protect the database from? From the user of the application (why, it's his own data), from some hacker on the local machine, from some remote hacker, ...? Also unclear: what should it be protected against? Theft of data, manipulation of data, ...? – Steffen Ullrich Jul 31 '19 at 07:54
  • @SteffenUllrich I think it's implied that the attacker has control of the client's machine, and they're trying to protect against that. Which, as you noted, is impossible. (And so did I, in roughly seventy bazillion more, less clear words...) – Nic Jul 31 '19 at 07:55
  • Online vs offline database here is a red herring. What's relevant for security here is whether or not the database is running on a machine controlled by the user vs on a machine that they don't control. – Lie Ryan Jul 31 '19 at 23:31

1 Answer

What you're asking is a variant of the DRM problem. In extremely generic terms, this is your scenario:

  • You have a thing which you want your software to do, on the customer's machine, and
  • You don't want the customer to be able to do that thing without your software.

This is flatly impossible. If it happens on their machine, you cannot control it. All you can control is what "it" is, at least insofar as your design requirements let you.

Let me talk through a couple of common ways to control "it" though:

  1. Storing the data on another machine, like an online database.

    This works fine, except that you're still only protecting that database. Any data sent to the client to be viewed can be stolen by malicious software, which can do anything from sniffing the incoming network traffic to capturing the display cable signal.

  2. Decrypting based on something your software can't access alone.

    The most common incarnation of this today is the password, but hardware security tokens are increasingly common alternatives. In short, by requiring some separate thing before it can access the data, your software can quite handily prevent anyone from just reverse-engineering the binary to decrypt the data.

    Note that you're still susceptible to the people who own the machine watching your process as it runs to steal the decryption keys, or the data itself, out of memory. Your code might not be able to do it entirely alone, but it's still decrypting each byte with no further user input. Hardware security tokens can help address this, but then you need to actually use the decrypted data, so... you see the point. It's an infinite regression.

  3. Minimize the amount of data you can access.

    If your broader service stores credit card numbers, addresses, usernames, and favorite colors, and your specific app only ever needs to access the last two, you can put up brick walls in front of the first two. This touches on the CIA triad, but in short, it's easier to protect data against disclosure to unauthorized parties if even authorized parties can't access it.
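Option 2 can be sketched in Java with nothing but the JDK's built-in crypto classes. This is a minimal sketch under stated assumptions, not a vetted design: the class name `PasswordVault`, the `salt || iv || ciphertext` layout, and the iteration count are all made up for illustration, and it encrypts a plain byte array rather than a SQLite file. The point it shows is that the key is derived from a password the user types, so no secret has to be stored next to the program.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: encrypts a byte blob with a key derived from a user-typed password. */
final class PasswordVault {
    private static final int SALT_LEN = 16;        // bytes
    private static final int IV_LEN = 12;          // bytes, the usual GCM nonce size
    private static final int TAG_BITS = 128;       // GCM authentication tag length
    private static final int KEY_BITS = 256;       // AES-256
    private static final int ITERATIONS = 600_000; // assumed work factor; tune for your hardware
    private static final SecureRandom RNG = new SecureRandom();

    // Derives the AES key from the password and a per-blob salt (PBKDF2 with HMAC-SHA256).
    private static SecretKeySpec deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return new SecretKeySpec(factory.generateSecret(spec).getEncoded(), "AES");
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy of the password
        }
    }

    // Returns salt || iv || ciphertext-with-tag (an illustrative layout, not a standard format).
    static byte[] encrypt(char[] password, byte[] plaintext) {
        try {
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            RNG.nextBytes(salt);
            RNG.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(SALT_LEN + IV_LEN + ct.length).put(salt).put(iv).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Reverses encrypt(); a wrong password or a modified blob fails the GCM tag check.
    static byte[] decrypt(char[] password, byte[] blob) {
        if (blob.length < SALT_LEN + IV_LEN + TAG_BITS / 8) {
            throw new IllegalArgumentException("blob too short");
        }
        try {
            byte[] salt = Arrays.copyOfRange(blob, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(blob, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(TAG_BITS, iv));
            return cipher.doFinal(blob, SALT_LEN + IV_LEN, blob.length - SALT_LEN - IV_LEN);
        } catch (AEADBadTagException e) {
            throw new IllegalArgumentException("wrong password or corrupted data", e);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        char[] password = "typed by the user at startup".toCharArray();
        byte[] blob = encrypt(password, "buy milk".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(password, blob), StandardCharsets.UTF_8));
    }
}
```

As the caveat in option 2 says, this only protects the data at rest. Once the password has been entered, the derived key and the decrypted bytes sit in the process's memory, where whoever controls the machine can still read them.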

Which one you choose depends on what you're making, what your constraints are, and how much time and money you're willing to spend. If you're planning to sell this to people for real money, even if you probably won't ever handle obviously sensitive data, you should still probably hire a professional to help you design it securely. As you can guess, it's much better to spend a little money now being proactively secure than a lot of money later when your lack of security comes back to bite you.

Nic
  • Thank you very much for clarifying! I'll probably use point 2) in this case, even though I won't sell it (it's just my wife using it ;-) However, I'll definitely let a professional dev review its design at some point. Could you recommend a good and comprehensible tutorial/introduction (either online or as a book) to this topic? I feel like I really should get a better grasp of the whole thing myself, and I need to start somewhere... – Oidipous_REXX Jul 31 '19 at 09:22
  • @Oidipous_REXX I don't really know any. I got my start by being curious and semi-accidentally writing a rootkit to investigate how the kernel worked. That's probably not a particularly typical path. – Nic Jul 31 '19 at 10:01
  • @Oidipous_REXX By topic, do you mean DRM, or information security in its broad sense? The latter is a very large field of study. Either way, I would recommend being curious and browsing security.SE, looking for the best answers in your tags of interest. Read the links in those answers. You can read Wikipedia on topics of interest, then come back here to read more on specifics. The CIA "triad" is fundamental, but be wary: for cryptographers the "A" means "authenticity" instead of "availability". Also, the "CIA" does not encompass all security needs, just the more common ones. – A. Hersean Jul 31 '19 at 12:22
  • @Oidipous_REXX You should also read about security assessment and threat modeling, and the methodologies to do it (without going into the details). You can also learn and practice your skills by solving security challenges. – A. Hersean Jul 31 '19 at 12:25