I've been thinking about this problem for some time and wanted to ask whether there are any known methods or research papers on how to prove the "authenticity" or correctness of data originating from a potentially compromised source (remote server, process, etc.). Specifically, say you have service A and service B. Service B sources data from A but is worried that A has been compromised, such that even if the data is signed by A, B can't trust that it was generated by code written by A's developers. Is it possible for B to prove to itself that data from A is authentic, meaning that it was indeed generated by the expected code and not injected or generated by an attacker who has compromised A?
One solution I've been thinking about is a sort of distributed ledger or blockchain in which multiple nodes compute the same data. This raises the bar: an attacker would have to compromise N% of the services producing the data. It also provides replication naturally, and I could use an appropriate consensus protocol. Of course, it introduces overhead and efficiency concerns, and I would need to think hard about side effects being performed more than once.
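To make the replication idea concrete, here is a rough sketch of what B might do with the answers it collects. It is only an illustration under some assumptions: the computation is deterministic, so honest replicas return byte-identical output, and B picks the `quorum` size itself. The function name `accept_by_quorum` and the sample payloads are hypothetical, and this is plain vote counting, not a real consensus protocol.

```python
from collections import Counter

def accept_by_quorum(responses, quorum):
    """Return the value reported by at least `quorum` replicas, else None.

    `responses` is a list of bytes objects, one per replica that answered.
    Assumes honest replicas produce byte-identical output.
    """
    if not responses:
        return None
    # Most frequently reported value and how many replicas reported it.
    value, votes = Counter(responses).most_common(1)[0]
    return value if votes >= quorum else None

honest = b'{"temp": 21.5}'
forged = b'{"temp": 99.9}'

# One compromised replica out of three is outvoted.
print(accept_by_quorum([honest, honest, forged], quorum=2))

# No value reaches the quorum, so B rejects the data.
print(accept_by_quorum([honest, forged, b'{"temp": 0}'], quorum=2))
```

With three replicas and a quorum of two, a single compromised replica cannot get its value accepted on its own. Two colluding replicas still could, which is the "N%" threshold mentioned above.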
If there is only one node capable of generating the data, such as a sensor node, and it is compromised, I'd imagine all hope is lost. Still, I wouldn't be surprised if some clever crypto scheme attempts to solve this problem as well.
I hope the question is clear. Thank you.
Edit: After some research I stumbled upon two cryptoschemes that seem to attempt to address the problem:
- Secure Multiparty Computation (SMC). I found a thesis paper, Implementation of a Secure Multiparty Computation Protocol, in which the author says:
"In typical use cases of SMC, parties involved are mutually distrustful, but one can also imagine the case of multiple machines owned by a single party, performing SMC to collectively decrypt and process confidential data. No single machine would have the key, and no single machine would see the plaintext. Now it would not be enough for the APT to compromise a single machine holding the decryption key, but every single one of the machines would have to be compromised."
This seems almost what I was looking for.
- Homomorphic Encryption: This seems to be another cryptoscheme that might be able to achieve a similar goal, except that, if I understand correctly, an attacker could still perform arbitrary operations on encrypted data while not knowing exactly what the data is.
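As far as I understand it, the "no single machine would have the key" property from the quote can be illustrated with simple XOR secret sharing. The sketch below is only that building block, not an SMC protocol: a real SMC scheme would compute on the shares without ever reassembling the key in one place. The helper names `split_key` and `combine` are my own and hypothetical.

```python
import secrets
from functools import reduce

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key, n):
    """Split `key` into n shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one is chosen so that
    # the XOR of all n shares equals the key.
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    shares.append(reduce(xor_bytes, shares, key))
    return shares

def combine(shares):
    """Recover the key by XOR-ing every share together."""
    return reduce(xor_bytes, shares)

key = secrets.token_bytes(16)
shares = split_key(key, 3)       # e.g. one share per machine
assert combine(shares) == key    # all three together recover the key
```

Any subset of fewer than all the shares is just random bytes, so an attacker who compromises a single machine learns nothing about the key. That matches the quoted claim that every machine would have to be compromised.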
I don't know enough about cryptography to know if these two schemes might one day be a practical option to solve the problem of not trusting service A as described earlier. Any insight?
Thanks again.