The whole concept of trying to explain signatures with the terminology of encryption is flawed. It simply does not work. So let's unravel the whole thing, and this will require some formalism.
Formally, a cryptographic signature system consists in three algorithms:
KeyGen: takes as input a "security parameter" (say, the length of the key we want to obtain), and produces a new public/private key pair (Kp, Ks).
Sign: takes as input a message m and a private key Ks; output is a signature s.
Verify: takes as input a message m, a signature s and a public key Kp; output is a boolean (true on success, false if the signature is not valid).
The system is said to be sound if the algorithms operate as advertised (Sign produces signatures that Verify accepts, using key pairs produced by KeyGen). The system is said to be cryptographically secure if it is computationally infeasible to make forgeries: given a public key Kp, and without knowing Ks_, it should not be feasible (within the limits of existing technology) to produce a (m, s) pair such that Verify(m, s, Kp) = true. The definition implies, in particular, that the private key should not be computable from the public key alone, because otherwise forgeries would be easy.
None of the above says anything about how the algorithms work. Various systems have been invented, described and standardized.
RSA is a very well-known asymmetric algorithm, but that's wrong, because RSA is not one algorithm. RSA is the name for an internal operation called a trapdoor permutation, from which an asymmetric encryption system and a signature system have been derived. The RSA operation is, roughly, the following:
Let n be a big integer such that n = pq, where p and q are two big, distinct primes. Knowledge of p and q is the "private key". Let e be some (usually small) integer, called the "public exponent"; e must be such that it is relatively prime to both p-1 and q-1. Traditional values for e are 3 and 65537.
Given an integer x modulo n (an integer in the 0 to n-1 range), the RSA forward operation is computing xe mod n (x is raised to exponent e modulo n). This is easy enough to do. It so happens that this operation is a permutation of integers modulo n (each y modulo n is equal to xe mod m for exactly one x). The "magic" part is that, for some reason, nobody found an efficient way to compute the reverse operation (getting x from xe mod n) without knowing p and q. And that's not for lack of trying; integer factorization has been studied by the finest minds for more than 2500 years. When you know p and q, the RSA reverse operation becomes easy. The knowledge of p and q is thus called the trapdoor.
Now that we have this trapdoor permutation, we can design a signature algorithm which works the following way:
KeyGen: given a target length k, produce two random primes p and q of length about k/2 bits, such that p-1 and q-1 are both relatively prime to an a priori chosen e (e.g. e = 3), and n = pq has length k bits. The public key is (n, e), the private key is (p, q, e).
Sign: take message m, hash it with some hash function (e.g. SHA-256), and "turn" the hash output (a sequence of 256 bits in the case of SHA-256) into an integer y modulo n. That transform is what the padding is about, because the standard method (as described in PKCS#1) is writing the hash output with some extra bytes, and then interpreting the result as an integer (in big-endian convention in the case of PKCS#1). Once the hashed message has been converted through the padding into an integer y, the private key owner applies the trapdoor (the reverse RSA operation) to compute the x such that xe = y mod n (such a x exists and is unique because the RSA operation is a permutation). The signature s is the encoding into bytes of that integer x.
Verify: given a signature s, decode it back into an integer x modulo n, then compute y = xe modulo n. If this value y is equal to what would be the padding of h(m) (hash of message m), then the signature is accepted (returned value is true).
RSA encryption is another, distinct system, that also builds on the RSA trapdoor permutation. Encryption is done by raising an integer x to the exponent e modulo n; decryption is done by reversing that operation thanks to the knowledge of the private key (the p and q factors). Since such a system processes only big integers, and we want to encrypt and decrypt bytes, then there must also be some sort of conversion at some point, so a padding procedure is involved. Crucially, the security requirements for the encryption padding are quite distinct from those for the signature padding. For instance, the encryption padding MUST include a substantial amount of randomness, while the signature padding MUST include a substantial amount of determinism. In practice, the two padding systems are quite different.
When people looked at RSA signatures and RSA encryption, they found it fit to describe signatures as a kind of encryption. If you look at it, the RSA forward operation (raising to the exponent e) is done for RSA encryption, and also for RSA signature verification. Similarly, the reverse operation is done for RSA decryption, and for RSA signature generation. Furthermore, as a stroke of genius if genius was about confusing other people, some noticed that the RSA reverse operation can also be mathematically expressed as "raising an integer to some power modulo n", just like the forward operation (but with a different exponent). Thus they began to call that reverse operation "encryption". At that point, RSA encryption, RSA decryption, RSA signature generation and RSA signature verification are all called "encryption". For some weird psychological reason (I blame the deleterious effects of post-Disco pop music), many people still find it pedagogically sound to try to explain four different operations by first giving them the same name.
We described RSA; let's have a look at another, completely different algorithm called DSA. DSA does not use a trapdoor permutation. In DSA, we do computations modulo a big prime (traditionally called p) and modulo another, smaller prime (called q) which is such that p-1 is a multiple of q. p and q are known to everybody.
There is an operation-that-goes-one-way in DSA. Given an integer g modulo p (strictly speaking, in a specific subset of p called the subgroup of order q) and an integer x modulo q, everybody can compute gx mod p; however, recovering x from gx mod p is computationally infeasible.
While this somehow looks like RSA, there are crucial differences:
Here, the operation is raising g to exponent x, where the actual input is x (the exponent), because g is a fixed, conventional value.
This is not a permutation, because x is an integer modulo q and gx mod p is an integer modulo p, a quite different set.
This is certainly not a trapdoor: there is no "secret knowledge" that allows recovering x, except if you already know that exact value x.
However, a signature algorithm can be built on that operation. It looks like this:
KeyGen: the p, q and g integers are already fixed, and potentially shared by everybody. To generate a new private key, produce a random integer x between 1 and q-1. The public key is y = gx mod p.
Sign:
- Given a message m, hash it, then convert the hash value into an integer h modulo q.
- Generate a new, fresh, discard-after-use random integer k between 1 and q-1.
- Compute r = gk mod p mod q (the exponentiation is done modulo p, then the result is furthermore reduced modulo q).
- Compute s = (h + xr) / k mod q. The signature is (r, s).
Verify:
- Hash message m to recompute h.
- Compute w = 1 / s mod q.
- Compute u1 = hw mod q.
- Compute u2 = rw mod q.
- Compute v = gu1 yu2 mod p mod q.
- If v = r, the signature is valid; otherwise, it is not.
Now good luck with trying to describe that as some sort of "encryption". If you find that it is unclear what is being encrypted here, it is because nothing is encrypted here. This is not encryption.
However, there is an hand-waving conceptual description of signatures that works with both RSA, DSA, and many other signature algorithms. You can view signatures as a specific kind of authentication.
In authentication, one person (the prover) demonstrates his identity to another (the verifier). The prover does this by performing some action that only that person can do, but in such a way that the verifier can be convinced that he witnessed the genuine thing. For instance, a very basic authentication system is called "show-the-password": the prover and the verifier both know a shared secret (the "password"); the prover demonstrates his identity to the verifier by uttering the password.
For signatures, we want something a bit more complex:
- The signature is asynchronous. The signer acts once; verification is done afterwards, possibly elsewhere, and without any further active help from the signer.
- The verifier should not need to know any secret. The signature should be convincing for everybody.
- By signing, the signer shall not reveal any secret. His private key should not be consumed (yes, I know there are signature schemes that work with consumption; let's not go there).
- The signature should be specific to a given message m.
One rather generic structure for authentication schemes is based on challenges: the verifier sends to the prover a challenge, that the prover can answer to only thanks to his knowledge of his secret.
If you look at RSA, then you can see that it is a challenge-based authentication mechanism. The challenge is the hashed-and-padded message. The signer demonstrates his mastery of the private key by applying the RSA reverse operation on that challenge, something that only he can do; but everybody can apply the RSA forward operation to see that the challenge was indeed well met.
If you look at DSA, then you can again see a challenge-based authentication mechanism. The signer first commits to a secret value k by publishing r; then the challenge is (again) the message h combined with the commitment r; the signer can answer to that challenge only by using his private key x. In DSA, the signer has a permanent private key x, produces a one-shot private value k, and demonstrates his knowledge of x/k mod q. (This does not leak information on x because k is used only once.)
Summary: signature algorithms are not encryption algorithms, and explanations of signatures based on encryption can only be, at best, utterly confusing. A much better explanation is by showing that a signature algorithm is, in fact, a specific kind of authentication mechanism, by which the signer demonstrates his knowledge of the private key in response to a synthetic challenge that involves the signed message.
This authentication is convincing for bystanders as long as the said challenge is sufficiently well specified that it is provably not cooked in the advantage of the signer. In RSA, it is the result of a deterministic hashing and padding (and the padding takes care to avoid the values where the RSA reverse operation becomes easy). In DSA, the challenge is computed from a prior commitment of the signer.
Indeed, any zero-knowledge authentication system can be turned into a signature mechanism by making it non-interactive: since a ZK system works by commitments, challenges and responses to these challenges, you can make the signer compute all his commitments, hash them all along with the message to sign, and use the hash value as the challenges. This does not mean that a ZK proof lurks within all signature algorithms; however, if you find that DSA kinda looks like that, well, there are good reasons for that.