I am assuming you are making a toy protocol for the fun of it -- to learn about crypto and have fun implementing something -- but understand that this toy scheme is probably flawed, and that if you have sensitive data you should use well-known, vetted protocols, not toy schemes you just came up with. (If not, see the reasons not to roll your own.)
First, RSA just describes the scheme c = m^e mod N, where you encrypt a message m with a public key (N, e) and decrypt with the private key (N, d) via m = c^d mod N. You should not encrypt your OTP with RSA: it will be very inefficient, and there are many attacks against RSA without proper padding. You really should always use a padding scheme like OAEP (specified in PKCS #1 v2). RSA is expensive for long messages, so in practice you always use hybrid encryption. That is, use RSA to encrypt a randomly generated key for a symmetric cipher like AES (other ciphers are fine; I just chose AES for concreteness). So Alice first sends E_RSA(RSA-PubKey, OAEP(Random-AES-Key)), where RSA-PubKey = (N, e) and E_RSA(RSA-PubKey, m) = m^e mod N. Then she sends the AES-encrypted secret data E_AES(Random-AES-Key, Secret-Data), which just means Secret-Data is encrypted with AES under the freshly generated Random-AES-Key. The receiver first decrypts the AES key using RSA with the private key (N, d), undoes the OAEP padding, and then decrypts the message using that AES key to get back the secret data.
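To make the RSA arithmetic above concrete, here is a minimal sketch of textbook (unpadded) RSA using plain Python integers. The tiny primes, the function names, and the sample messages are assumptions chosen for illustration only; real keys are thousands of bits long. It also hints at why padding matters: unpadded RSA is multiplicative, so ciphertexts can be combined by anyone without the private key.

```python
# Textbook RSA with toy parameters (illustration only, never use in practice).
p, q = 61, 53                 # assumed tiny primes; real primes are far larger
N = p * q                     # public modulus
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent: e*d = 1 (mod phi), Python 3.8+

def rsa_encrypt(m: int) -> int:
    return pow(m, e, N)       # c = m^e mod N

def rsa_decrypt(c: int) -> int:
    return pow(c, d, N)       # m = c^d mod N

m1, m2 = 65, 7
c1, c2 = rsa_encrypt(m1), rsa_encrypt(m2)
assert rsa_decrypt(c1) == m1

# Malleability: the product of two ciphertexts decrypts to the product of the
# plaintexts. The forged ciphertext is built without ever using d.
forged = (c1 * c2) % N
assert rsa_decrypt(forged) == (m1 * m2) % N
```

A padding scheme such as OAEP randomizes and structures m before the exponentiation, which is what is meant to rule out this kind of manipulation.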
That said, checksums should not be used to prevent tampering with a message in transit. If an attacker can guess a message, they can calculate its checksum and then tamper with the encrypted message to alter both the message and the checksum. For example, suppose the message was encrypted with AES in CTR mode, the message is m = "Transfer $1000 from Alice to Bob's account.", and the checksum is the MD5 hash h = b2a26c14a029b0a2aadba4fa2ecd32d2. Eve could compute the XOR of that message with m' = "Transfer $9999 from Alice to Eve's account.", which has the MD5 checksum h' = 308b23cb47b0efff365c2593e0a005d7, and then XOR the encrypted message with m XOR m' and the encrypted checksum with h XOR h'. The receiver would then decrypt m' together with a checksum h' that matches it.
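The attack above can be sketched with only the standard library. Python has no built-in AES, so a random keystream stands in for the AES-CTR keystream here; that substitution is an assumption, but it preserves the property the attack relies on (ciphertext = plaintext XOR keystream). The variable names are hypothetical.

```python
import hashlib
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

m = b"Transfer $1000 from Alice to Bob's account."
m_forged = b"Transfer $9999 from Alice to Eve's account."
assert len(m) == len(m_forged)          # the XOR trick needs equal lengths

h = hashlib.md5(m).digest()
h_forged = hashlib.md5(m_forged).digest()

# Alice encrypts message || checksum. The random keystream is a stand-in for
# what AES-CTR would generate; Eve never sees it.
keystream = os.urandom(len(m) + len(h))
ciphertext = xor(m + h, keystream)

# Eve guesses m, so she can compute both differences and apply them blindly.
delta = xor(m, m_forged) + xor(h, h_forged)
tampered = xor(ciphertext, delta)

# Bob decrypts and checks the checksum: everything looks fine.
plain = xor(tampered, keystream)
received_msg, received_sum = plain[:-16], plain[-16:]
assert received_msg == m_forged
assert hashlib.md5(received_msg).digest() == received_sum
```

Eve only needed a correct guess of m; the key and keystream stayed secret the whole time.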
Message Authentication Codes (MACs) are the way to ensure that your message wasn't tampered with. These are essentially checksums that intrinsically rely on a shared secret key. There's always the question of how to MAC: should you Encrypt-then-MAC (send E(data) ++ MAC(E(data))), MAC-then-Encrypt (send E(data ++ MAC(data))), or Encrypt-and-MAC (send E(data) ++ MAC(data))? Encrypt-then-MAC is the consensus best option, though the other schemes can be secure for certain ciphers.
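As a rough sketch of Encrypt-then-MAC, the following uses HMAC-SHA256 from the standard library. The cipher is a hypothetical stand-in (a SHA-256 based keystream called `toy_encrypt`), used only because the standard library has no AES; the function names, key sizes, and nonce length are all assumptions. The point is the ordering: the tag is computed over the ciphertext and is checked before anything is decrypted.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32  # HMAC-SHA256 output size

def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Hypothetical stand-in for a real cipher such as AES-CTR.
    # XOR with a keystream, so the same call also decrypts.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = key + nonce + counter.to_bytes(8, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(enc_key: bytes, mac_key: bytes, data: bytes) -> bytes:
    nonce = os.urandom(NONCE_LEN)
    body = nonce + toy_encrypt(enc_key, nonce, data)
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()
    return body + tag                      # E(data) ++ MAC(E(data))

def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # constant-time comparison
        raise ValueError("MAC check failed; refusing to decrypt")
    nonce, ciphertext = body[:NONCE_LEN], body[NONCE_LEN:]
    return toy_encrypt(enc_key, nonce, ciphertext)

enc_key, mac_key = os.urandom(32), os.urandom(32)  # two independent keys
blob = seal(enc_key, mac_key, b"Transfer $1000 from Alice to Bob's account.")
assert open_sealed(enc_key, mac_key, blob).startswith(b"Transfer $1000")
```

Flipping any bit of `blob` makes `open_sealed` raise instead of returning altered plaintext, which is exactly what the plain checksum could not guarantee.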
There are other issues (e.g., generating an OTP is quite expensive and uses a lot of entropy, which could be problematic). I also don't see what the OTP step gains in security: if the RSA step is compromised, all the other data is compromised as well to any eavesdropper.
Furthermore, the one-time pad only has provable security when the pad is used exactly once. The many-time pad is notoriously weak, since c1 XOR c2 = m1 XOR m2. E.g., assume Eve can't send requests for files to Bob (that there's some undisclosed authentication function that verifies Alice's identity to Bob before Bob will respond with a file encrypted via an OTP). If she observes Alice ask for file1, sending an encrypted 7 MB OTP (OTP1), she could then request file2 (only 6 MB), sending the first 6 MB of that same encrypted OTP1. Bob then replies with file2 XOR OTP1, leaking to Eve the XOR of file1 and file2. So along with MACing the OTP, the scheme needs a random nonce (generated by the server) that is included with the encrypted OTP and verified, to make sure the encrypted pad wasn't replaced with a previously used one.
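A short sketch of why pad reuse is fatal, using made-up file contents (the names `file1`, `file2`, and `pad` are illustrative assumptions): the pad cancels out when the two ciphertexts are XORed, and knowing either plaintext then reveals the other.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    # zip stops at the shorter input, mirroring a shorter file2
    return bytes(x ^ y for x, y in zip(a, b))

file1 = b"attack at dawn, bring the maps"
file2 = b"weekly lunch menu: soup, bread"

pad = os.urandom(max(len(file1), len(file2)))   # generated once...
c1 = xor(file1, pad)
c2 = xor(file2, pad)                            # ...but used twice

# Eve sees only c1 and c2, yet the pad drops out of their XOR.
leak = xor(c1, c2)
assert leak == xor(file1, file2)

# If Eve knows (or can guess) file2, she recovers the matching part of file1.
recovered = xor(leak, file2)
assert recovered == file1[:len(file2)]
```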