Trying to understand RSA and it's terminology?

Question

I get the gist of RSA but I'm mostly self taught so sometimes the phrases I picked to describe something don't match up with the widely accepted terminology. I'm trying to remedy this.

I'm working with a NodeJS library called node-rsa. After you load the key, it basically has 4 functions: encrypt, decrypt, sign, and verify.

My understanding is that the following statements are true, although I'm not positive. I'm also assuming that the library knows and follows the proper terminology and isn't implementing things in any abnormal way.

Encrypt uses the provided (or generated from the private key) public key to encrypt the data.
Decrypt uses the provided private key to decrypt the data.
Sign uses the provided private key to encrypt the data, the output is called a "Signature".
Verify uses the provided (or generated from the private key) public key to decrypt the signature.
Encrypting the same data two different times could result in two different outputs depending on the padding. Signing uses no padding and will produce the same output for matching inputs.
If I sign something, I'll need to send the input that was signed as well as the signature to the recipient of the message.
The input for a signature is usually the hash of the message being sent to verify that the message hasn't been changed.

Am I on track or is this all nonsense?

EDIT:

These I feel are obvious, but I'd really be doing myself a disservice if I believed them so fundamental and was wrong about them.

When encrypting something I'm sending, I use the other party's public key.
When decrypting something I've received, I use my private key.
When signing something I'm sending, I use my private key.
When verifying something I've received, I use the other party's public key.

There are still some details I have questions about such as is the signature included in the message or is it transmitted unencrypted and alongside the payload? These questions probably warrant unique questions on the site however.

It sounds like you've got your ducks in a row. Just be sure to keep the private key secure. — RoraΖ, Oct 02 '14 at 15:06
@raz I added a few thoughts about ownership of keys that may or may not prove my ducks are in a row. — Corey Ogburn, Oct 02 '14 at 15:19
No you've still got a good grasp on things. You're correct in thinking that the message should be unencrypted along with the encrypted signature. The signature is generally an encrypted hash of the message. Using secure algorithms. — RoraΖ, Oct 02 '14 at 15:29
To generate a cryptographic signature, you use your private key to encrypt a "digest" of the document you want to sign. To create the digest you input the document to a hash function. The recipient will use your public key to decrypt the signature and reveal the digest you computed. The decrypted digest will be used by the recipient to verify that the document has not been altered. They do this by inputting the document into the same hash function that you used. If they compute the same digest as the one they decrypted with your public key, then the document has not been altered in transit. — , Oct 02 '14 at 16:49
@JamesR Is it safe to assume that `sign` functions of RSA libraries do the hashing for me, or do I hash and sign the hash? node-rsa is really light on the details. — Corey Ogburn, Oct 02 '14 at 16:52
I just had a dig around in the node-rsa source code on github. Looks like its just a light wrapper around the node.js crypto library. The node-rsa code seems to default to the SHA-256 hash algorithm by calling crypto.createSign('RSA-SHA256'). So it appears that it does the hashing for you, you just need to provide the data you want to be signed. — , Oct 02 '14 at 17:08
@raz: I suppose I don't know what's _actually_ generally done, but signatures definitely [_should not be_ encrypted anything](http://crypto.stackexchange.com/q/14875/991). — , Oct 02 '14 at 17:10
@RickyDemer I'm not very mathy, so I don't really know the relevance of that link. But signing data is the act of encrypting with the private key. Perhaps my wording wasn't very clear in the last comment. I did not mean to imply encrypting the signature. — RoraΖ, Oct 02 '14 at 17:14
@raz: I guess that might be true, even though poncho's answer to the question I linked to gives a fast attack against that, and pages 28 and 32 of [this pdf from EMC](http://www.emc.com/collateral/white-papers/h11300-pkcs-1v2-2-rsa-cryptography-standard-wp.pdf) describe the process as something different. — , Oct 02 '14 at 17:28
@RickyDemer that fast attack is based on a public key of 0x3 and generalized for 0x5. Most public keys are no less than 0x10001. — RoraΖ, Oct 02 '14 at 17:36
@raz: No, it's "based on a public" _exponent_ "of 0x3 and generalized for 0x5." — , Oct 02 '14 at 17:42

score 22 · Answer 1 · edited Oct 07 '21 at 06:58

The source of your confusion is an historical "explanation" of RSA signatures as being "encryption of the hash with the private key". This explanation is confusing, and flawed in many ways. It works only for RSA. Actually it does not work for RSA either.

Let's define things properly:

An asymmetric encryption system defines two functions:
- Encrypt: takes as inputs a message (a sequence of bytes, possibly limited in length) and a public key. It outputs a value in a given space of possible values (in the case of RSA, this is a big integer, no longer than the modulus).
- Decrypt: takes as inputs an encrypted message (an output of Encrypt) and a private key, and produces back the original message (provided that the private key matches the public key which was used for encryption, of course).
A signature algorithm defines two functions:
- Sign: takes as inputs a message (a sequence of bytes, usually not limited in length) and a private key. It outputs a value in a given space of possible values.
- Verify: takes as inputs a message, a signature value (an output of Sign) and a public key, and outputs a boolean value ("true" on match, "false" otherwise).

Since these four functions take different types of inputs and produce different kind of outputs, they cannot be swapped around.

In general, asymmetric encryption algorithms and signature algorithms live in separate worlds. It so happens that there is an asymmetric encryption algorithm known as "RSA", and there is a signature algorithm that is also known as "RSA". Moreover, they both use the same kind of public/private key pairs. However, still generally speaking, encryption and signature keys should be distinct.

If we look at the internals of RSA (both algorithms), as defined in PKCS#1, then we may find some common elements, and others which are distinct.

For asymmetric encryption, the message must be smaller (by at least 11 bytes) than the modulus. The message to encrypt is "padded" as follows:

0x00 0x02 rr rr ... rr 0x00 mm...

where mm... is the message, and the rr are random non-zero bytes. The number of rr bytes is adjusted so that the total length is equal to that of the modulus, and there must be at least eight such non-zero bytes (hence the restriction on the message length). The resulting sequence of bytes is interpreted as an integer modulo N (the "modulus" part of the public/private key pair), and raised to the public exponent modulo N. The result is the encrypted message: it is an integer modulo N (the convention is to encode that message in big-endian, yielding a sequence of bytes of the same length as the modulus).

To give some numbers: if you use a 1024-bit RSA key, the modulus has size 128 bytes. Messages that can be encrypted must be no longer than 117 bytes. The encryption result is a sequence of 128 bytes.

For signatures, the message that is to be signed is first hashed with a suitable hash function like SHA-256. Thus, arbitrarily long messages can be processed (SHA-256 has no practical limit on input length). The hash value is then padded, with a padding which looks like this:

0x00 0x01 0xFF 0xFF ... 0xFF 0x00 ii... hh...

where ii... is a conventional header that designates the hash function, and hh... is the hash value. The number of 0xFF bytes is again adjusted so that the total length matches that of the modulus. The resulting sequence is again interpreted as in integer, and exponentiated modulo N, but this time the private exponent is used. Moreover, most RSA implementations actually optimize that operation by using the Chinese Remainder Theorem through their knowledge of the factors of N (these factors are part of the private key).

So, while both encryption and signature generation rely on some sort of padding followed by an exponentiation, they differ in significant ways:

For encryption, the message is taken "as is" and is limited in length. For signature, the message is first hashed, and has no length limit.
The paddings differ. It is important for security that the padding used in encryption includes sufficiently many random bytes. It is equally important for security that the padding used in signatures includes sufficiently many non-random bytes. Less obviously, it is very important that the padding bytes occur "on the left" in the big-endian convention.
The exponentiations differ. For encryption, you use the public exponent, which is usually short (possibly down to e = 3, although the value e = 65537 is traditional)(as it happens, 65537 is used for no good reason at all, but hey, tradition is tradition). For signature generation, the private exponent is used, but the operation rarely use that exponent "as is"; in fact, two sub-exponentiations are done modulo p and q (the prime factors of N), for a total speed-up of about 4x.

In fact, the only really common point is that both RSA-encrypted messages and RSA signatures are integers modulo N, and are encoded with big-endian convention. Everything else differs. And, as I said earlier, you should not do signatures and encryption with the same key pairs anyway; both kinds of key pairs call for distinct management and lifecycles.

Some implementations of signature generation/verification accept as input a pre-hashed value. The hash function is still an important part of the overall process. Even if a given piece of code accepts the hash value as input, the hash must still occur at some point.

Some implementations of the RSA signature verification algorithm actually on "just" the signature and the public key; instead of the simple boolean result (match/mismatch), they produce the hash value itself; the caller is then responsible for matching that hash value with the message that is purported to be signed. The ability to recover the hashed message from the signature+public key is a specificity of RSA (RSA is said to be a signature algorithm with recovery). Not all other signature algorithms can do that. And, again, it is important that the hash occurs; don't let it make you believe that you could do "RSA signatures without hashing".

Traditionally, RSA signatures used in SSL have relied on a slightly different procedure: the padding lacks the identifying header (the ii... in my notations), and the hash value is actually the concatenation of MD5 and SHA-1, both computed over the message that is to be signed. This is a remnant from a past where the PKCS#1 standard was not as firmly entrenched as it is today.

1

POWER answer... THANKS!!!! – MarcoZen Mar 28 '20 at 11:22

score 3 · Answer 2 · answered Oct 03 '14 at 05:21

Just keep things simple. There are two main purpose of RSA 1. Key exchange You will share secret key with other and hence encrypt/wrap the secret key using public key of receiver. The receiver will decrypt/unwrap using his private key. 2. Digital signature You will provide some data to maintain integrity. The mechanism will hash the data and combined with hash algorithm it encrypt using your private key. Verifier will use your public key to decrypt and check by comparing hash.

Now think about some considerations Q: Why we are saying key exchange, not encryption/decryption? A: Encrypting and decrypting is expensive using RSA and for large data it is infeasible to do this. So, usually we do hybrid. Encrypt using session key and encrypt/wrap that session key to share. Usually session keys are symmetric block cipher algorithm's key.

Q: Which one need more secrecy, Key exchange/Digital signature? A: Anybody can decrypt signature data. But nobody should get key value.

So, PKCS 1 have some padding mechanism to keep more secret. It uses at least 8 bytes random data along with 3 indicator bytes for padding to key exchange. To keep it simple, it padd same way for digital signature and avoid random number by placing 'FF' bytes. As digital signature don't need extra security it works for same technique to unpadd for both key exchange and digital signature.

Please try to keep in mind, Using algorithm is ok. But, understanding it's purpose is more important.

Here is sample padding KE: 00 02 (indicating random number) [At least 8 bytes random number without any byte of value '00'] 00 [Data] DS: 00 01 (indicating 'FF;) [At least 8 bytes 'FF'] 00 [Data]

Now random number bytes or 'FF bytes depends on data length and key modulus length (1024/2048/.. bits)'

Simple math for 1024 bits aka 128 bytes. Say data length will be 34 bytes So length of random data or 'FF' bytes will be 128 - 2 (00 [indicator]) - 1 - 34

Hope it will help

Trying to understand RSA and it's terminology?

2 Answers2

Linked