
I'm working on a diagram to describe the process of issuing digital certificates, with the help of the answers to my question here and some other research:

[image: diagram of the certificate issuing process]

I just read in the IETF specification (RFC 5280) that:

The signatureValue field contains a digital signature computed upon the ASN.1 DER encoded tbsCertificate. The ASN.1 DER encoded tbsCertificate is used as the input to the signature function. This signature value is encoded as a BIT STRING and included in the signature field.

From the CSR step I'm somewhat familiar with the order "signing > encoding to ASN.1 format > encoding to Base64". Here it looks different; the order seems to be "encoding to ASN.1 format > signing > encoding to Base64"?

But that doesn't make sense to me: shouldn't the three main fields of a certificate (tbsCertificate, signatureAlgorithm, signatureValue) be encoded to ASN.1 format before being encoded to Base64? So is ASN.1 encoding used twice by the CA when issuing a certificate?

So is this the correct flow: https://i.stack.imgur.com/bJd0o.png ?

  • Also, I'd like to know how accurately the diagram describes the digital certificate issuing process.
mshwf

1 Answer


TL;DR: Yes, everything in a CSR or a certificate is described using ASN.1 notation. ASN.1 is a description of the data types.

There are a few things to unpack here but let's start with some definitions:

Data formats used in PKI:

ASN.1:

This is a syntax notation: it defines how to represent various data types (integer, string, bytes, etc.). A programming-language equivalent would be a data structure with defined types, e.g.:

// Go-style sketch; the two struct types are placeholders defined elsewhere.
type Certificate struct {
  tbsCertificate     TBSCertificateType     // another struct
  signatureAlgorithm SignatureAlgorithmType // another struct
  signatureValue     []byte                 // an array of bytes
}

DER:

DER encoding is a way of serializing ASN.1 data. The output is an array of bytes that can be stored in a file, sent over the network, or used as input to a signature algorithm.
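As a minimal sketch of that serialization step, the following uses Go's standard `encoding/asn1` package to DER-encode a small structure. The `example` type and its two fields are hypothetical and chosen only to keep the output short; a real tbsCertificate has many more fields.

```go
package main

import (
	"encoding/asn1"
	"fmt"
)

// example is a hypothetical two-field structure, used only for illustration.
type example struct {
	Serial int
	Name   string `asn1:"utf8"` // encode this field as a UTF8String
}

// encodeExample serializes the structure into DER bytes.
func encodeExample() ([]byte, error) {
	return asn1.Marshal(example{Serial: 5, Name: "hi"})
}

func main() {
	der, err := encodeExample()
	if err != nil {
		panic(err)
	}
	// SEQUENCE (30 07) { INTEGER 5 (02 01 05), UTF8String "hi" (0c 02 68 69) }
	fmt.Printf("% x\n", der) // prints: 30 07 02 01 05 0c 02 68 69
}
```

Every element is written as tag, length, value, which is why the byte stream describes its own types.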

PEM:

PEM is the Base64 representation of DER data plus a header and footer. This results in ASCII data that can easily be sent over ASCII-only channels (e.g. email can't carry raw binary, so the data needs to be encoded).
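A small sketch of the PEM wrapping, using Go's standard `encoding/pem` package. The block type `EXAMPLE` and the five input bytes are placeholders for illustration; a real certificate uses the type `CERTIFICATE`.

```go
package main

import (
	"encoding/pem"
	"fmt"
)

// toPEM wraps DER bytes in Base64 with BEGIN/END lines.
func toPEM(blockType string, der []byte) []byte {
	return pem.EncodeToMemory(&pem.Block{Type: blockType, Bytes: der})
}

// fromPEM strips the header and footer and decodes the Base64 body.
// It returns nil if no PEM block is found.
func fromPEM(data []byte) []byte {
	block, _ := pem.Decode(data)
	if block == nil {
		return nil
	}
	return block.Bytes
}

func main() {
	// Placeholder DER input: SEQUENCE { INTEGER 5 }.
	der := []byte{0x30, 0x03, 0x02, 0x01, 0x05}

	encoded := toPEM("EXAMPLE", der)
	fmt.Print(string(encoded))

	// Decoding gives back exactly the same DER bytes.
	fmt.Printf("% x\n", fromPEM(encoded))
}
```

PEM adds no information of its own: decoding it returns the original DER bytes unchanged.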

I'm going to ignore PEM encoding because it's not strictly necessary here; it is only used to encapsulate binary data that is passed between parties.

How signatures work:

Signature algorithms in PKI have three parameters:

  • the signature algorithm (and optional parameters), e.g. PKCS #1 SHA-256 With RSA Encryption
  • the key: the private key of the signer
  • the input: raw bytes to sign

There is one output: the signature, which is a bit string.

Note: in ASN.1, an octet string and a bit string are two different types. Both hold binary data, but a bit string can be any number of bits long, whereas the length of an octet string in bits must be a multiple of 8.
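The difference shows up directly in the DER bytes. This sketch, using Go's standard `encoding/asn1` package with arbitrary sample bytes, encodes the same data both ways; the helper names are made up for this example.

```go
package main

import (
	"encoding/asn1"
	"fmt"
)

// encodeOctets encodes data as an ASN.1 OCTET STRING (tag 0x04).
func encodeOctets(data []byte) ([]byte, error) {
	return asn1.Marshal(data)
}

// encodeBits encodes the first bitLength bits of data as a BIT STRING (tag 0x03).
func encodeBits(data []byte, bitLength int) ([]byte, error) {
	return asn1.Marshal(asn1.BitString{Bytes: data, BitLength: bitLength})
}

func main() {
	octets, err := encodeOctets([]byte{0xAB, 0xCD})
	if err != nil {
		panic(err)
	}
	bits16, err := encodeBits([]byte{0xAB, 0xCD}, 16)
	if err != nil {
		panic(err)
	}
	bits12, err := encodeBits([]byte{0xAB, 0xC0}, 12)
	if err != nil {
		panic(err)
	}

	fmt.Printf("% x\n", octets) // prints: 04 02 ab cd
	// A BIT STRING starts with one extra byte: the number of unused
	// bits at the end of the last content byte.
	fmt.Printf("% x\n", bits16) // prints: 03 03 00 ab cd
	fmt.Printf("% x\n", bits12) // prints: 03 03 04 ab c0
}
```

Signatures are whole bytes in practice, so their unused-bits byte is 0, as in the 16-bit case.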

Signing the input:

No matter what the input is (certificationRequestInfo for CSR, or tbsCertificate for Certificate), it must first be encoded into plain binary.

We need a standard for what that binary is. Luckily, we have one.

  • For a CSR, RFC 2986 tells us: The value of the certificationRequestInfo component is DER encoded, yielding an octet string.
  • For a certificate, RFC 5280 tells us: The signatureValue field contains a digital signature computed upon the ASN.1 DER encoded tbsCertificate.

In both cases, the relevant data in ASN.1 format must be DER encoded, then fed into the signature algorithm. The output (the signature) is a bit string.
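The order "DER-encode, then sign" can be sketched with Go's standard library. This is an illustration under assumptions, not what any particular CA runs: it uses ECDSA P-256 with SHA-256 rather than the RSA algorithm mentioned above, and `toBeSigned` is a hypothetical stand-in for certificationRequestInfo or tbsCertificate.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"encoding/asn1"
	"fmt"
)

// toBeSigned is a hypothetical stand-in for the structure being signed.
type toBeSigned struct {
	Serial  int
	Subject string `asn1:"utf8"`
}

// newKey generates a throwaway signing key for the example.
func newKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// signDER DER-encodes v, signs the SHA-256 digest of those bytes, and
// returns the DER input together with the signature as a BIT STRING.
func signDER(key *ecdsa.PrivateKey, v toBeSigned) ([]byte, asn1.BitString, error) {
	der, err := asn1.Marshal(v)
	if err != nil {
		return nil, asn1.BitString{}, err
	}
	digest := sha256.Sum256(der)
	sig, err := ecdsa.SignASN1(rand.Reader, key, digest[:])
	if err != nil {
		return nil, asn1.BitString{}, err
	}
	return der, asn1.BitString{Bytes: sig, BitLength: len(sig) * 8}, nil
}

// verifyDER checks the signature against the exact DER bytes.
func verifyDER(pub *ecdsa.PublicKey, der []byte, sig asn1.BitString) bool {
	digest := sha256.Sum256(der)
	return ecdsa.VerifyASN1(pub, digest[:], sig.RightAlign())
}

func main() {
	key, err := newKey()
	if err != nil {
		panic(err)
	}
	der, sig, err := signDER(key, toBeSigned{Serial: 1, Subject: "example.test"})
	if err != nil {
		panic(err)
	}
	fmt.Println("signature verifies:", verifyDER(&key.PublicKey, der, sig))

	der[len(der)-1] ^= 0x01 // flip one bit of the signed bytes
	fmt.Println("after tampering:", verifyDER(&key.PublicKey, der, sig))
}
```

Because the signature covers the exact bytes, signer and verifier must agree on one encoding, and DER guarantees a single valid encoding for any given value.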

What happens to the output:

The output of the signature algorithm is stored in the signatureValue field of the certificate, with type bit string. Again, this is in ASN.1 notation.

Once all certificate fields have been set properly, the certificate is DER encoded.

If needed, the DER data is base64 encoded to yield the PEM data format. But this is not required.
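To see both layers in a finished certificate, the sketch below creates a self-signed certificate with Go's `crypto/x509`, splits the outer DER structure into its three fields, and checks that the signature covers the DER bytes of tbsCertificate. The local `certificate` struct, the helper names, and the ECDSA P-256 key are assumptions made for this example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// certificate mirrors the three top-level fields of an X.509 certificate.
type certificate struct {
	TBSCertificate     asn1.RawValue // kept as raw DER bytes
	SignatureAlgorithm pkix.AlgorithmIdentifier
	SignatureValue     asn1.BitString
}

// issue creates a self-signed certificate and returns its DER bytes.
func issue() ([]byte, *ecdsa.PublicKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	return der, &key.PublicKey, err
}

// split decodes the outer DER structure into its three fields.
func split(der []byte) (certificate, error) {
	var c certificate
	rest, err := asn1.Unmarshal(der, &c)
	if err == nil && len(rest) != 0 {
		err = errors.New("trailing data after certificate")
	}
	return c, err
}

// verifyParts checks signatureValue against the DER bytes of tbsCertificate.
func verifyParts(c certificate, pub *ecdsa.PublicKey) bool {
	digest := sha256.Sum256(c.TBSCertificate.FullBytes)
	return ecdsa.VerifyASN1(pub, digest[:], c.SignatureValue.RightAlign())
}

// toPEM applies the optional final Base64 step.
func toPEM(der []byte) []byte {
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}

func main() {
	der, pub, err := issue()
	if err != nil {
		panic(err)
	}
	c, err := split(der)
	if err != nil {
		panic(err)
	}
	fmt.Println("tbsCertificate bytes:", len(c.TBSCertificate.FullBytes))
	fmt.Println("signature algorithm OID:", c.SignatureAlgorithm.Algorithm)
	fmt.Println("signature covers tbsCertificate:", verifyParts(c, pub))
	fmt.Print(string(toPEM(der)))
}
```

So DER encoding happens twice: once on tbsCertificate alone to produce the signed bytes, and once on the whole three-field structure.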

A word of warning:

I think you're trying to accomplish too much. I would start with the basic components of the CSR/Certificate flow and worry about the finer details (such as data encoding) later.

Instead of a diagram that contains all the possible details, start at a high level, get a good handle on the process, then dig into the gritty details if you must.

Marc
  • I will read your answer when I'm back at work tomorrow. Regarding your last point, I've found that putting my understanding into a diagram is the best way to tie things together and see the big picture, after spending too much time reading different resources that confused me more than they helped. – mshwf Aug 24 '20 at 19:13
  • Sorry, I didn't mean to say diagrams aren't useful, quite the contrary, they're great. Instead, I would suggest focusing on high-level diagrams. Once you get the hang of those, dig down into individual boxes. For example: saying "signature of subject info and public key" is enough to start. Once everything is clear, you can focus on exactly what that entails. Worrying about data representation is probably pushing it a bit far. – Marc Aug 24 '20 at 19:29
  • No need to be sorry, I understand your point, and I know it's valid. I know that I'm trying to fit such a complex topic (at least to me, a cryptography noob) into simple shapes and relationships (but that raises questions I couldn't raise before), and with the help of the community here (like your answers) I can proceed better. – mshwf Aug 24 '20 at 20:35
  • Is there a conversion involved from the plain text (original data, like CertificationRequestInfo) to [ASN.1](https://en.wikipedia.org/wiki/ASN.1#Example) before converting to DER? I'm confused about whether ASN.1 is just a standard that's implemented by DER, CER, BER, etc., or another format that ties the actual data to the binary data. – mshwf Aug 25 '20 at 12:50
  • Sure, the actual data is first read from stdin (or a file) and saved in a data structure native to the programming language being used. It will then be converted to ASN.1 notation and serialized into DER, either at the same time or in two passes; that really depends on the library being used. ASN.1 notation is integral to BER/CER/DER: all those do is generate a byte stream that is self-descriptive (ASN.1 notation includes type IDs for everything). – Marc Aug 25 '20 at 13:06
  • And when data like `CertificationRequestInfo` or `TBSCertificate` is included together with the signature and signature algorithm, before converting to PEM (that is, DER and Base64, as you mentioned), is it included as DER, as ASN.1, or as the actual data as it is? – mshwf Aug 25 '20 at 13:19
  • In both cases, they are DER-encoded to generate an `OCTET string` (a bunch of bytes) then signed. The output of the signature algorithm is a `BIT string` (a bunch of bytes, but the length doesn't have to be a multiple of 8 bits). This `BIT string` is one of the fields of the final CSR or certificate and is therefore DER-encoded like the rest of the CSR or certificate and everything is base64 encoded to yield a PEM file. – Marc Aug 25 '20 at 13:26
  • I just realized that the public key of the CA has no role in the diagram. Is this right, or am I missing something? – mshwf Aug 25 '20 at 13:49
  • You're not missing anything. The public key of the CA is found in the CA's certificate, which is either a root (included in your system's trusted certificate pool) or an intermediate (included in the certificate chain returned by the server). The CA's public key **may** be used to generate the `Authority Key ID`, but it could be something else. As such, it might not play any part in certificate construction. – Marc Aug 25 '20 at 13:55