6

I suspect that the process of building the CRL cache may cause latency in some applications.

We have several .NET applications that occasionally "act slow" with no CPU or disk access. I suspect that they are hung up on authentication when trying to validate the certificate, since the timeout is almost 20 seconds.

According to this Microsoft article:

Most applications do not specify to CryptoAPI to use a cumulative time-out. If the cumulative time-out option is not enabled, CryptoAPI uses the CryptoAPI default setting which is a time-out of 15 seconds per URL. If the cumulative time-out option is specified by the application, then CryptoAPI will use a default setting of 20 seconds as the cumulative timeout. The first URL receives a maximum timeout of 10 seconds. Each subsequent URL timeout is half of the remaining balance in the cumulative timeout value.
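The arithmetic in the quoted passage can be sketched as follows. This is only an illustration of one reading of the quote, not CryptoAPI's actual implementation: it assumes the first URL's 10 seconds is half of the 20-second default, and that every URL uses its full allowance before the next one is tried. The function names are hypothetical.

```python
def per_url_timeouts(num_urls, cumulative=20.0):
    """Worst-case allowance per URL when the cumulative time-out is in use."""
    remaining = cumulative
    timeouts = []
    for _ in range(num_urls):
        allowance = remaining / 2  # each URL gets half of the remaining balance
        timeouts.append(allowance)
        remaining -= allowance     # assumes the URL consumed its whole allowance
    return timeouts


def non_cumulative_worst_case(num_urls, per_url=15.0):
    """Worst case when no cumulative time-out is set: 15 seconds per URL."""
    return num_urls * per_url


print(per_url_timeouts(4))           # [10.0, 5.0, 2.5, 1.25]
print(non_cumulative_worst_case(2))  # 30.0
```

Under these assumptions the cumulative mode never waits longer than 20 seconds in total, while the per-URL mode grows with every unreachable URL listed in the certificate.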

Since this is a service, how can I detect and log CryptoAPI hangs, both for applications I have source code for and for third-party applications?

makerofthings7

3 Answers

4

One way to get more information on this is to enable the CAPI2 event log:

  • Open Eventvwr -> Application and Services Logs -> Microsoft -> Windows -> CAPI2 -> Operational
  • Right-click and choose Enable Log

The information that appears in the event log will assist in determining where the certificate validation process is taking a long period of time.

To enable logging

  wevtutil.exe sl Microsoft-Windows-CAPI2/Operational /e:true

To save the log to a file

  wevtutil.exe epl Microsoft-Windows-CAPI2/Operational filename.elf

To disable logging

  wevtutil.exe sl Microsoft-Windows-CAPI2/Operational /e:false

To clear logs

  wevtutil.exe cl Microsoft-Windows-CAPI2/Operational
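
Once the log has data, long pauses between consecutive events can hint at where validation stalls. Below is a minimal Python sketch, assuming the events were dumped as XML with something like `wevtutil.exe qe Microsoft-Windows-CAPI2/Operational /f:xml /e:Events > capi2.xml`. The file name, the 5-second threshold, and the function names are assumptions for illustration. The sketch ignores which process logged each event, so treat the gaps as leads rather than proof.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def parse_system_time(value):
    """Parse a SystemTime value such as 2012-06-02T20:38:00.123456700Z."""
    date_part, _, fraction = value.rstrip("Z").partition(".")
    micros = (fraction + "000000")[:6]  # keep microseconds, drop finer digits
    parsed = datetime.strptime(date_part, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=int(micros), tzinfo=timezone.utc)


def find_gaps(xml_text, threshold_seconds=5.0):
    """Return (gap_seconds, event_id_before, event_id_after) for long pauses."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.findall("e:Event", NS):
        system = event.find("e:System", NS)
        stamp = parse_system_time(system.find("e:TimeCreated", NS).get("SystemTime"))
        events.append((stamp, system.find("e:EventID", NS).text))
    events.sort()  # order by time before comparing neighbours
    gaps = []
    for (t0, id0), (t1, id1) in zip(events, events[1:]):
        delta = (t1 - t0).total_seconds()
        if delta >= threshold_seconds:
            gaps.append((delta, id0, id1))
    return gaps
```

Reading the dump with `open("capi2.xml", encoding="utf-8").read()` and passing it to `find_gaps` gives a list of pauses to look up in Event Viewer. The encoding may need adjusting depending on how the output was redirected.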
makerofthings7
4

I have several different options for you, since troubleshooting a potential PKI problem can be complex.

CRLs are slow, cumbersome beasts...

First of all, I'm going to tell you that CRL timeouts have led the largest PKI in the world not to use CRLs for the vast majority of PKI validation needs. Downloading a 50MB file when a user just needs a simple yea or nay before sending off an encrypted email is a non-starter!

How to do Validation

1 - You can test replacing Microsoft's native validation client with a third-party validation client such as Tumbleweed, then monitor the third-party validation client (as a control). Tumbleweed/Axway sells and provides trials of a popular third-party validation client, OCSP repeaters and responders. You can also use OpenSSL, EJBCA, or OpenCA as validation responders. Additionally, there is an OSS framework, PKIF, that includes specific CAPI logging functions and OCSP clients, located at http://pkif.sourceforge.net/

2 - You can control the validation data sources (CRL, OCSP, SCVP).

3 - Another source of problems may be slow DNS resolution, a lack of DNS caching, a misconfigured PKI, missing trusts, or the time required to refresh CRLs (which can cause a temporary hang in a larger PKI).
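
To check the DNS part of point 3, a rough timing test is often enough. The Python sketch below times a single name lookup. The function name is hypothetical, and the `resolver` parameter exists only so the lookup can be swapped out. Run it on the affected server, since results from another machine may not reflect that server's resolver or cache.

```python
import socket
import time


def time_dns_lookup(host, resolver=socket.getaddrinfo):
    """Return (seconds, error) for resolving host; error is None on success."""
    start = time.perf_counter()
    try:
        resolver(host, None)  # port is not needed just to resolve the name
        error = None
    except OSError as exc:    # socket.gaierror is a subclass of OSError
        error = exc
    return time.perf_counter() - start, error
```

Calling `time_dns_lookup("crl.example.com")` (a placeholder host name) twice in a row gives a crude view of caching: if the second call is no faster than the first, lookups are probably not being cached locally.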

Can you share more detail about the configuration of the applications, the PKI itself, the fields of the X.509 certificates, bandwidth, etc.? In particular I am interested in the fields relevant to validation, such as the Authority Information Access (AIA) field, and any hardcoded references to CRL or OCSP.
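
If you can copy the URLs out of those fields, grouping them by scheme and host shows at a glance what the validating machine must be able to reach. This is a minimal Python sketch. The function name and the URLs in the usage note are made up for illustration, and it does not read certificates itself.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_scheme(urls):
    """Map each URL scheme (http, ldap, file, ...) to the hosts it points at."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        groups[parts.scheme.lower()].append(parts.hostname or "")
    return dict(groups)
```

For example, `group_by_scheme(["http://crl.example.com/ca.crl", "ldap://dc01.example.com/CN=CA"])` returns `{"http": ["crl.example.com"], "ldap": ["dc01.example.com"]}`. Any LDAP or file entries that the server cannot reach are candidates for the time-outs described in the question.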

Keep in mind that there is also a well-known ambiguity in the response types. The OCSP RFCs in particular have the issue that a "good" response and an "unknown" response are equally valid for a certificate about which the validation responder has no knowledge.

A discussion of Alternatives:

There are two main methods of doing certificate validation, white-listing and black-listing.

Black-listing protocols include CRLs, OCSP and SCVP, and then variations of OCSP.

White-listing on Windows can be performed using OCSP interfacing with a CA DB, CTLs*, and SCVP.

In most cases, white-listing methods are considered superior, more secure, faster, more real time, and better alternatives to black-listing methods.

Brennan
  • Can you tell me more about what 3rd party software I should use? Good point about the DNS resolution. I don't have the certs to inspect, but what x509 fields would affect resolution time? I'm also intrigued by the ambiguity comment. Please do elaborate on this! – makerofthings7 Jun 02 '12 at 20:38
  • response updated to address your questions – Brennan Jun 02 '12 at 20:52
  • Awesome! I want to plus +++++ this. I think part of the issue is that the certificate contains an LDAP url I don't have access to, a File URL on a different network. I don't know what the AIAs or CRL.. of the CRLs.. is going to be. I suspect that may be causing it. – makerofthings7 Jun 02 '12 at 21:00
  • Along the lines of "misconfigured", we have a dev/test CA, production CA, and an offsite recovery CA. We have actually had issues where the validation was not occurring against the expected CA. That may also be a DNS issue, but the upshot is if you have a complex environment it can help to confirm the servers that are used for validating certificates are actually the servers that should be used. – Greg Askew Jun 02 '12 at 21:23
1

For what it's worth, there is a hotfix that may not address the specific issue, but contains an updated cryptnet.dll to address OCSP issues. This particular file also was not updated in SP1.

You cannot use a certificate-based logon method to authenticate requests on a computer that is running Windows Server 2008 R2

http://support.microsoft.com/kb/2666300

Greg Askew
  • +1 great tip: The bug is that the locally cached Certificate Revocation List (CRL) is expired and the OCSP responder is offline. – makerofthings7 Jun 02 '12 at 19:55