Let's Encrypt issued less secure certificates based on the bug. It's more of a race condition than anything else.
A CAA record is an optional DNS record that restricts which certificate providers are allowed to issue certificates for a domain. So if the subscriber validates a domain name at time X
and the CAA records for the domain allowed Let's Encrypt issuance during the time the subscriber validated their domain, the subscriber would have 30 days from the validation date to issue valid certs. So if someone later added CAA records that prohibit LE certs to be issued anytime between the time X + 30 days, the subscriber would be able to issue certs ignoring the CAA record.
If you look at the PR that had the bug, this is where it was first triggered when the NewAuthorizationSchema
feature flag was enabled in prod.
The problem code is a common mistake in Go, see the relevant code for boulder-ra:
// authz2ModelMapToPB converts a mapping of domain name to authz2Models into a
// protobuf authorizations map
func authz2ModelMapToPB(m map[string]authz2Model) (*sapb.Authorizations, error) {
resp := &sapb.Authorizations{}
for k, v := range m {
// Make a copy of k because it will be reassigned with each loop.
kCopy := k
authzPB, err := modelToAuthzPB(&v)
if err != nil {
return nil, err
}
resp.Authz = append(resp.Authz, &sapb.Authorizations_MapElement{Domain: &kCopy, Authz: authzPB})
}
return resp, nil
}
Which is the issue: taking a reference to a loop iterator variable
They failed to handle the second loop iterator variable v
properly.
And in turn, failed to consider two important fields in this function: IdentifierValue
and RegistrationID
func modelToAuthzPB(am *authzModel) (*corepb.Authorization, error) {
expires := am.Expires.UTC().UnixNano()
id := fmt.Sprintf("%d", am.ID)
status := uintToStatus[am.Status]
pb := &corepb.Authorization{
Id: &id,
Status: &status,
Identifier: &am.IdentifierValue,
RegistrationID: &am.RegistrationID,
Expires: &expires,
}
Boulder (specifically boulder-ra) determines that a given FQDN requires CAA rechecking and it uses the Identifier field from the authorization object, which was incorrect due to the above commit. So the way that the identifier field is handled, it would be the same for all values in a single map. So for instance, if you had multiple authorizations that required CAA rechecking, boulder-ra would recheck only one FQDN and not the others.
A bad bug indeed.