I've read a lot of articles on PKI and digital certificates, because it's rare to find a single article that covers all the aspects, and the topic is confusing at first (this beautiful question was my most recent read: How do certificates work in terms of encryption, hashing, and signing?).

I drew this diagram based on my understanding: https://i.stack.imgur.com/sI7Od.png

I have 2 questions:

  1. I don't actually understand how the public key and the ciphertext are "packaged" together (that was the best way I could think of to move on to the next step!). In the end, the *.csr file is just plain base64 text. Are they encrypted together, meaning the order of the steps in the drawing is inaccurate?

  2. In order to verify the sender's identity, the verifier (a web browser, for example) should have a copy of the CSR, but according to my understanding, the certificate only has a hash of it?

Edits:

  1. This video might help to understand the ASN.1 encoding: https://www.youtube.com/watch?v=EccHushRhWs

  2. I corrected the diagram, following Marc's answer here, and here and the specs in IETF/Basic Certificate Fields:

[Image: corrected diagram]

mshwf
  • The wiki on the subject has a breakdown for you: https://en.wikipedia.org/wiki/Certificate_signing_request – schroeder Aug 21 '20 at 14:34
  • Your second question seems to suggest that the csr is sent to the client. It doesn't. Is that what you meant to say? – schroeder Aug 21 '20 at 14:35
  • @schroeder I think the certificate has (among other data) the hashed CSR (?) – mshwf Aug 21 '20 at 14:41
  • It does not, there's no reason to. – Marc Aug 21 '20 at 14:44
  • So the CSR fields are copied (somehow) into the certificate? Because when I open a certificate, it includes all the data supplied to the CRT. – mshwf Aug 21 '20 at 14:49
  • Yes, the CA crafts a certificate using the fields in the CSR. Please read my answer for more details. – Marc Aug 21 '20 at 14:54

1 Answer

You have a few misconceptions in your diagram.

The most important is that both of your encrypt boxes are wrong: they should say sign. Following from that, the CSR sent to the CA includes the various fields (including Subject) and the subject's public key. There is no ciphertext involved, just plain data and a signature.

RFC2986: PKCS#10: Certification Request Syntax details the steps to build a CSR:

The process by which a certification request is constructed involves the following steps:

        1. A CertificationRequestInfo value containing a subject
           distinguished name, a subject public key, and optionally a
           set of attributes is constructed by an entity requesting
           certification.

        2. The CertificationRequestInfo value is signed with the subject
           entity's private key.  (See Section 4.2.)

        3. The CertificationRequestInfo value, a signature algorithm
           identifier, and the entity's signature are collected together
           into a CertificationRequest value, defined below.

The subject's (in your case, the applicant's) public key is included verbatim in the CSR, as is the subject information. All of this is signed using the subject's private key, and everything is sent over to the CA.

Not in diagram form, but here are the correct steps to build the CSR and the data included at each step:

  • Build a CertificationRequestInfo using:
    • Subject Distinguished Name
    • Subject Public Key
    • Other attributes
  • Obtain the Signature by signing the CertificationRequestInfo using the Subject Private Key and a particular Signature Algorithm.
  • Construct a CSR object by including:
    • CertificationRequestInfo
    • Signature
    • Signature Algorithm
  • Send this CSR blob to the CA.

Note that the CSR still contains plaintext CertificationRequestInfo and Subject Public Key.
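The steps above can be sketched in a few lines of Python. This is only a toy model under loud assumptions: the RSA key is built from two small, well-known Mersenne primes with no padding, JSON stands in for the real ASN.1/DER encoding, and every name here (`encode`, `sign`, `verify`, `toy-rsa-sha256`, the `example.test` subject) is made up for illustration. It is not how a real tool produces a CSR; it only shows which data is signed and that nothing is encrypted.

```python
import hashlib
import json

# Toy RSA key pair from two known Mersenne primes. No padding, far too small:
# for illustration only, never for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def encode(fields: dict) -> bytes:
    """Stand-in for DER: a deterministic, readable encoding (NOT encryption)."""
    return json.dumps(fields, sort_keys=True).encode()


def digest(data: bytes, n: int) -> int:
    """SHA-256 of the data, reduced into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, n: int, private_exponent: int) -> int:
    """Signing uses the PRIVATE key."""
    return pow(digest(data, n), private_exponent, n)


def verify(data: bytes, signature: int, public_key: dict) -> bool:
    """Verification needs only the PUBLIC key."""
    n, e = public_key["n"], public_key["e"]
    return pow(signature, e, n) == digest(data, n)


# Step 1: CertificationRequestInfo = subject DN + subject public key + attributes.
request_info = {
    "subject": "CN=example.test,O=Example Org",
    "subject_public_key": {"n": N, "e": E},
    "attributes": {},
}

# Step 2: sign the encoded CertificationRequestInfo with the subject's private key.
signature = sign(encode(request_info), N, D)

# Step 3: the CSR carries the info in the clear, plus the algorithm and signature.
csr = {
    "certification_request_info": request_info,
    "signature_algorithm": "toy-rsa-sha256",
    "signature": signature,
}
```

Anyone holding the CSR can read the subject and public key directly and check the signature with `verify(encode(csr["certification_request_info"]), csr["signature"], request_info["subject_public_key"])`; changing any field afterwards makes that check fail.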

Upon receiving the CSR, the CA will more or less do the following:

  • parse the CSR
  • verify that the signature matches the fields in the CSR by using the subject's public key
  • verify that the various fields match its requirements (eg: you can't ask for CN=google.com without proving that you own the domain)
  • craft a certificate using some fields from the CSR, some from itself
  • sign the certificate using its (the issuer) private key

The final certificate still contains the subject fields and the subject's public key.
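A rough sketch of the CA side, again as a toy model rather than a real implementation: the keys are unpadded RSA built from known Mersenne primes, JSON stands in for DER, and all names (`make_toy_key`, `issue_certificate`, `tbs_certificate`, `CN=Toy Root CA`, the `trust_store` dict) are hypothetical. The point is only to show that fields are copied from the CSR, the CA adds its own, and the result is signed with the CA's private key.

```python
import hashlib
import json

E = 65537  # public exponent used by both toy keys


def make_toy_key(p: int, q: int) -> dict:
    """Toy RSA key from two primes (no padding; illustration only)."""
    return {"n": p * q, "e": E, "d": pow(E, -1, (p - 1) * (q - 1))}


def public_part(key: dict) -> dict:
    return {"n": key["n"], "e": key["e"]}


def encode(fields: dict) -> bytes:
    """Stand-in for DER: deterministic and readable, not encrypted."""
    return json.dumps(fields, sort_keys=True).encode()


def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, key: dict) -> int:
    return pow(digest(data, key["n"]), key["d"], key["n"])


def verify(data: bytes, signature: int, pub: dict) -> bool:
    return pow(signature, pub["e"], pub["n"]) == digest(data, pub["n"])


subject_key = make_toy_key(2**61 - 1, 2**89 - 1)
ca_key = make_toy_key(2**107 - 1, 2**127 - 1)

# The CSR as it arrives at the CA: plain fields plus the subject's signature.
info = {"subject": "CN=example.test", "subject_public_key": public_part(subject_key)}
csr = {"info": info, "signature": sign(encode(info), subject_key)}


def issue_certificate(csr: dict, ca_name: str, ca_key: dict) -> dict:
    info = csr["info"]
    # Verify the CSR signature using the public key found inside the CSR.
    if not verify(encode(info), csr["signature"], info["subject_public_key"]):
        raise ValueError("CSR signature does not match its contents")
    # A real CA would also check its own requirements here (e.g. domain ownership).
    tbs = {
        "subject": info["subject"],                        # copied from the CSR
        "subject_public_key": info["subject_public_key"],  # copied from the CSR
        "issuer": ca_name,                                 # added by the CA
        "serial_number": 1,                                # added by the CA
    }
    # Sign with the issuer's PRIVATE key; the CA's public key is not embedded.
    return {"tbs_certificate": tbs, "signature": sign(encode(tbs), ca_key)}


cert = issue_certificate(csr, "CN=Toy Root CA", ca_key)

# A client looks up the issuer by name among CAs it already trusts.
trust_store = {"CN=Toy Root CA": public_part(ca_key)}
issuer_pub = trust_store[cert["tbs_certificate"]["issuer"]]
cert_is_valid = verify(encode(cert["tbs_certificate"]), cert["signature"], issuer_pub)
```

Note that the certificate in this sketch contains no hash of the CSR and no copy of the CA's public key: the client gets the latter from its own store of trusted issuers.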

To more specifically answer your two questions:

  1. the subject's public key is one of the fields in the CSR. Nothing is encrypted, just signed.
  2. The subject fields are copied into the final certificate, they are there for any client to see.

You can see the list of certificate fields in RFC5280. There is no hash of the CSR because there's no need for it: all relevant information was copied into its own fields of the certificate.

Marc
  • The second box (dashed box) is signing operation already, but detailed: hash then encrypt.. isn't that what you mean? – mshwf Aug 21 '20 at 15:13
  • Except it's not usually called encryption, and the result is not usually called ciphertext. The important part is that 1) the CSR includes the "subject fields" **in plaintext** (well, DER encoded, but definitely not encrypted) and 2) the **subject info as well as the public key** are part of the data that is signed. – Marc Aug 21 '20 at 15:15
  • I've updated my answer to show exactly what data is in the CSR. Note that **there is no ciphertext** anywhere in a CSR or Certificate. – Marc Aug 21 '20 at 15:28
  • The 3-steps process of creating a CSR says there are 3 distinct parts in the CSR: a readable value of the subject, signing value of the subject and the signature algorithm, but I only see a base64 text, this is really what confuses me – mshwf Aug 21 '20 at 15:28
  • If you look at [this question](https://security.stackexchange.com/questions/234365/help-understanding-csr-fields) and its sample CSR, the fields that are meaningful for humans are parsed and printed in plaintext. The public key and signature mean nothing to us, so they are shown in hexadecimal representation. Whether it's base64 or hex, it's just a way of encoding the data, it's not encrypted. If you look at a plain `.csr` file, that's encoded a few times, you need a tool to parse it and make it human readable. – Marc Aug 21 '20 at 15:31
  • Just to add to your answer, the CA may actually refuse to sign the cert or may overwrite certain attributes based on the certificate practice statement and the configured certificate profiles. @Marc should we have a semantic discussion about the signature being actual ciphertext? Since it is an encrypted hash, I would consider it to be ciphertext; ergo, a signed certificate does contain ciphertext. What is your take on that? – nethero Aug 21 '20 at 19:07
  • Sure, we can call it "ciphertext" (very reluctantly since a hash encrypted with a private key is a signature). But even then, the box is still missing the actual subject info, I think that's where OP was getting hung up. – Marc Aug 21 '20 at 19:10
  • Typo, I think, in "This is signed using the subject's public key and everything sent over to the CA": it should be the private key, not the public key, to match what the IETF spec says (2nd point) – mshwf Aug 22 '20 at 10:54
  • @Marc can you please check the new diagram in the question, is it more accurate now? – mshwf Aug 22 '20 at 16:37
  • Don't forget: the **`CertificationRequestInfo` also contains the subject's public key**. Otherwise, the CSR generation looks ok. For the part after (actions taken by the CA), you would need some way of showing data being copied from the CSR (perhaps with modification) and fields being added by the CA. Then the whole thing is signed (remove any mention of encryption). Note also that the CA's public key is **not** in the certificate. – Marc Aug 23 '20 at 04:51
  • @Marc If the public key of the CA is **not** included, how do clients (browsers) verify it? Don't they use the public key of the CA to look it up among the pre-installed CAs? – mshwf Aug 23 '20 at 09:01
  • The issuer certificate is found by matching the distinguished name. See https://security.stackexchange.com/questions/234227/x-509-how-is-certificate-chain-of-trust-subject-name-issuer-name-match-com – Marc Aug 23 '20 at 12:21
  • I'm sure I've read somewhere (I'll try to remember where) that browsers keep track of trusted CAs by their public key and match it with the public key of the certificate's CA. It is terrifying how inconsistent and misleading the information on this topic is! – mshwf Aug 23 '20 at 14:30
  • Yes, OSes and some browsers have a pool of trusted root certificates. The way they build a chain from leaf certificate to root is by connecting certificates using the distinguished name. It's also possible to use the authority key ID extension for matching, see https://security.stackexchange.com/questions/200295/the-difference-between-subject-key-identifier-and-sha1fingerprint-in-x509-certif – Marc Aug 23 '20 at 15:58
  • @Marc I've updated the diagram with your notes and added more details from the IETF spec. I'd be grateful if you could review it – mshwf Aug 24 '20 at 12:49
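
On the point raised in the comments that a `.csr` file "is just base64 text": a minimal sketch of why that is encoding and not encryption. The bytes below are a made-up placeholder, not real DER; only the `BEGIN/END CERTIFICATE REQUEST` armor lines follow the usual PEM convention for PKCS#10 requests (real files also wrap the base64 body at 64 characters per line).

```python
import base64

# Hypothetical stand-in for the binary DER bytes of a CSR.
der_bytes = b"CN=example.test|toy-public-key|toy-signature"

# Encoding: base64 plus armor lines. No key is involved at any point.
body = base64.b64encode(der_bytes).decode("ascii")
pem = (
    "-----BEGIN CERTIFICATE REQUEST-----\n"
    + body + "\n"
    + "-----END CERTIFICATE REQUEST-----\n"
)

# Decoding: anyone can strip the armor and reverse the base64.
lines = pem.strip().splitlines()
recovered = base64.b64decode("".join(lines[1:-1]))
```

Here `recovered` equals `der_bytes` exactly, which is why a parsing tool can always print the subject fields of a CSR or certificate without any secret.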