
I'm building a script that catalogues the use of public intermediate and root certificates, given a site's public certificate, so I need to get hold of the certificates programmatically. Sometimes these are bundled with the server's response, but at other times they need to be fetched, or extracted from a CA cert file. I often see the certs listed in the output of tools like testssl.sh, openssl s_client and https://whatsmychaincert.com, but I'm not clear how I can get at the same thing and extract them reliably, ideally in PEM format. Can anyone suggest how best to do this?

Synchro
  • The certs you see in `openssl s_client` are the ones sent by the server. If your question is how to get these then this would be a programming question, i.e. off-topic here and on-topic at stackoverflow.com. – Steffen Ullrich Nov 22 '16 at 12:19
  • It's really that I need to know more about the internal structure of certificates and the way they fit together. How does a cert store a reference to the root cert it's ultimately signed by? Does it just provide a fingerprint and you have to "just know" where to find it, or is there a URL embedded somewhere? Given this kind of info, I could set about the programming part of it, but now I need info, not code, so it would be OT for SO. – Synchro Nov 22 '16 at 14:57
  • So you are asking how the certificate chain is validated, i.e. how the potential issuer for a subject is found and how it is checked that it actually has signed the certificate? Please adjust your question to state more clearly what exactly you want to know and also how your question differs from [all the other questions](http://security.stackexchange.com/search?q=certificate+chain+verification) which have certificate chain validation as their topic. – Steffen Ullrich Nov 22 '16 at 15:08

1 Answer


You can use openssl s_client to capture the certificate chain from a given web site. It identifies the (s)ubject and (i)ssuer of each certificate in the chain, and with the -showcerts option it also prints every certificate the server sent in PEM format, not just the server's own. If the root is not included, then you should be able to find it in your local certificate store.

Here is some example output from openssl s_client, with the meat of the actual certificates trimmed (...) for brevity:

$ echo "" | openssl s_client -showcerts -connect www.google.com:443
CONNECTED(00000003)
---
Certificate chain
 0 s:/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com
   i:/C=US/O=Google Inc/CN=Google Internet Authority G2
-----BEGIN CERTIFICATE-----
MIIEgDCCA2igAwIBAgIIRLhIyqBaCIMwDQYJKoZIhvcNAQELBQAwSTELMAkGA1UE
...
segXWw==
-----END CERTIFICATE-----
 1 s:/C=US/O=Google Inc/CN=Google Internet Authority G2
   i:/C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
-----BEGIN CERTIFICATE-----
MIID8DCCAtigAwIBAgIDAjqSMA0GCSqGSIb3DQEBCwUAMEIxCzAJBgNVBAYTAlVT
...
wSHGFg==
-----END CERTIFICATE-----
 2 s:/C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
   i:/C=US/O=Equifax/OU=Equifax Secure Certificate Authority
-----BEGIN CERTIFICATE-----
MIIDfTCCAuagAwIBAgIDErvmMA0GCSqGSIb3DQEBBQUAME4xCzAJBgNVBAYTAlVT
...
b8ravHNjkOR/ez4iyz0H7V84dJzjA1BOoa+Y7mHyhD8S
-----END CERTIFICATE-----
---
Server certificate
subject=/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com
issuer=/C=US/O=Google Inc/CN=Google Internet Authority G2
---

The first certificate in the chain (0) shows you its subject and issuer. As you can see, the subject of the second certificate (1) is the same as the issuer of (0), and the issuer of (1) is the subject of (2). The issuer of (2) is the root certificate, which is not included. (It is accepted, but not required, to include the root in the chain. Because the purpose of the root is to be a locally stored "trusted" copy, software should never "trust" a root handed to it by a web server. That said, I have seen (non-browser) software that broke if the root was not included in the chain, so it does happen.)
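The subject/issuer matching described above can be checked mechanically. What follows is only a minimal sketch, assuming the s_client output has already been captured as text in the layout shown above; the names parse_showcerts and chain_is_linked are made up for illustration, and newer OpenSSL releases print names in a different style, so treat the parsing as approximate.

```python
import re

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"
PEM_END = "-----END CERTIFICATE-----"
# Matches chain lines such as " 0 s:/C=US/.../CN=www.google.com".
SUBJECT_RE = re.compile(r"^(\d+) s:(.*)$")


def parse_showcerts(text):
    """Return one dict per chain entry found in s_client -showcerts output."""
    entries, current, pem_lines = [], None, None
    for line in text.splitlines():
        stripped = line.strip()
        if pem_lines is not None:  # currently inside a PEM block
            pem_lines.append(stripped)
            if stripped == PEM_END:
                current["pem"] = "\n".join(pem_lines) + "\n"
                current, pem_lines = None, None
            continue
        match = SUBJECT_RE.match(stripped)
        if match:
            current = {"depth": int(match.group(1)), "subject": match.group(2),
                       "issuer": None, "pem": None}
            entries.append(current)
        elif current is not None and stripped.startswith("i:"):
            current["issuer"] = stripped[2:]
        elif current is not None and stripped == PEM_BEGIN:
            pem_lines = [stripped]
        elif stripped == "---":  # section separator: the chain listing is over
            current = None
    return entries


def chain_is_linked(entries):
    """True if each certificate's issuer equals the next certificate's subject."""
    return all(a["issuer"] == b["subject"] for a, b in zip(entries, entries[1:]))


# Trimmed sample in the same layout as the output above; "..." stands in for
# the elided base64 body, so these PEM blocks are not decodable certificates.
SAMPLE = """\
Certificate chain
 0 s:/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com
   i:/C=US/O=Google Inc/CN=Google Internet Authority G2
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
 1 s:/C=US/O=Google Inc/CN=Google Internet Authority G2
   i:/C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
---
"""

chain = parse_showcerts(SAMPLE)
print(len(chain), chain_is_linked(chain))
print("look up locally:", chain[-1]["issuer"])
```

The issuer of the last entry is the name to look for in the local certificate store. Note that a matching name only suggests the link; it does not prove that the signature verifies.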

On a Linux system, you can track down the root certificate in the /etc/ssl/certs directory, either as an individual file or as part of a large file such as ca-certificates.crt (Ubuntu) or ca-bundle.crt (Red Hat). You can use openssl x509 -noout -subject -in filename, where filename is a single certificate file, to get the Subject of each certificate; comparing those to the issuer of the last certificate in the chain above will allow you to track down the root certificate.

(Extracting the subjects from all those files, or exploding the bundle into a number of files and extracting subjects from them, is a medium-complexity Unix scripting challenge left to the reader.)
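As one possible take on that exercise, here is a rough Python sketch rather than a shell script. It assumes the openssl command is on the PATH and that the bundle is a plain concatenation of PEM blocks; the function names split_pem_bundle, subject_of and index_bundle are invented for this example.

```python
import re
import subprocess

PEM_BLOCK_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(text):
    """Return every PEM certificate block found in a bundle, in file order."""
    return [block + "\n" for block in PEM_BLOCK_RE.findall(text)]


def subject_of(pem):
    """Ask the openssl CLI for a certificate's subject (needs openssl on PATH)."""
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-subject"],  # reads the cert from stdin
        input=pem, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def index_bundle(path):
    """Map subject line -> PEM block for every certificate in the bundle."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return {subject_of(pem): pem for pem in split_pem_bundle(handle.read())}
```

Calling index_bundle("/etc/ssl/certs/ca-certificates.crt") on an Ubuntu-style system should give a dictionary you can search for the issuer you are missing. The way names are printed differs between OpenSSL versions, so compare strings produced by the same openssl binary.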

You may find web servers which do not provide the necessary chain. In such cases, it's usually because the intermediate certs are also commonly distributed by browser vendors, so the server operators can get away with leaving them out.

Finally, if you can't track down an issuer cert by any of these methods, try searching the web for the Subject. If it's a public certificate authority, chances are it's listed somewhere.


Reply to comment:

No, the subject is not guaranteed to be unique. Remember, the whole point of X.509 certificates is that you have a trust chain up to an implicitly trusted root; you know the Subject is good because it was signed by someone you trust. By itself, though, a Subject could be forged if you're not verifying the cert.

So certificate fingerprints are the best way to uniquely ID a certificate, and you can use openssl x509 to print them given the certificate:

$ openssl x509 -sha256 -noout -in my_cert.crt -fingerprint
SHA256 Fingerprint=DC:5F:B5:98:53:E0:FC:B2:33:7B:8A:CE:64:09:75:76:65:84:A0:8C:2F:B1:D4:01:6D:1F:70:04:C6:0E:23:69
$ 
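If the script is written in Python, the same kind of fingerprint can be computed without shelling out. This is a small sketch using only the standard library; cert_fingerprint is a name chosen for this example, and it assumes the input is a single PEM-encoded certificate.

```python
import hashlib
import ssl


def cert_fingerprint(pem, algorithm="sha256"):
    """Hash the DER encoding of a PEM certificate, formatted like openssl's output."""
    der = ssl.PEM_cert_to_DER_cert(pem.strip())  # strips the armor, base64-decodes
    digest = hashlib.new(algorithm, der).hexdigest().upper()
    # Insert a colon between every pair of hex digits: "DC:5F:B5:..."
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

The fingerprint is a hash over the whole DER-encoded certificate, so two certificates with the same Subject but different contents will produce different values. For the server's own certificate, ssl.get_server_certificate((host, port)) returns a PEM string that can be passed straight in, though it only fetches the leaf and not the rest of the chain.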
gowenfawr
  • Thanks, this is pretty much what I'm after, thanks for the effort. The main thing I'm not clear on is the primacy of the subject - should it be considered a unique identifier, and should exact-matching on it be reliable? I assume the CN alone is not enough? – Synchro Nov 28 '16 at 01:13
  • @Synchro updated answer to address your comment. – gowenfawr Nov 30 '16 at 20:48