I am currently building an ASN.1 parser that is supposed to decode X.509v3 certificates and epoch files in ASN.1 DER format. The parser works well apart from one issue that I can't figure out. When I decode the DER data, I see that the public key uses the following form of BIT STRING:

 BIT STRING, **encapsulates** {

 SEQUENCE {

 INTEGER

     // public key hex string

But when I look at the signature, I see that it is represented as a BIT STRING without the encapsulates tag, containing only the hex buffer of the signature:

 SEQUENCE {

 OBJECT IDENTIFIER sha256WithRSAEncryption (1 2 840 113549 1 1 11)   

 NULL

 } 

 BIT STRING 

 // RAW signature hex buffer }

It is important for my parser to know whether the BIT STRING will contain only a buffer (as in the case of the signature) or will encapsulate other types (as in the public key case). In the DER encoding of both, I didn't find any difference that would indicate the use of encapsulation.

My question is: how can I distinguish these two scenarios in the parser code?

schroeder
Dima Shifrin
  • The encapsulated content is basically a new parser. In case of the public key the format depends on the key type, so it's not a typical asn1 choice but a higher level decision. – eckes Jun 14 '17 at 19:38

3 Answers

You really can't know that without knowing what type of data you are parsing.

If you know what type of data you are decoding, then you should use the ASN.1 modules and related documentation to adjust your parser. In addition, you may need to do some thinking of your own.

the BITSTRING will only contain a buffer (like in case of the signature)

It is not necessarily a raw buffer; it may be a nested type as well. For example, if you look at ECDSA signatures, you will find that the encoded signature is a nested structured type:

ECDSA-Sig-Value ::= SEQUENCE {
    r  INTEGER,
    s  INTEGER
   }

and here is how it looks: [image]

or will encapsulate some other types(like in the public key case)

Again, not always. An RSA key uses a nested structure inside the BIT STRING; an EC key does not: [image]

These are just examples where your assumptions fail. A signature may have a nested structure, and a public key may not. It depends on the context, and without documentation it is hard to tell what to expect, so I would suggest using the documentation, for example RFC 5912, which contains most PKIX ASN.1 modules.

If you don't know what data you are parsing, then you have to do the hard work of proactively checking whether the contents of a primitive type (one without the CONSTRUCTED bit set) are in fact a nested structure. That is, attempt to unroll the nested type if possible.
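To make the point about the CONSTRUCTED bit concrete, here is a minimal Python sketch that splits a single-byte identifier octet into its fields. The function name `describe_tag` is made up for illustration. Note that a DER BIT STRING (0x03) is always primitive, even when its payload happens to be DER, which is why the tag alone cannot answer the question.

```python
def describe_tag(first_byte: int) -> dict:
    """Split a single-byte BER/DER identifier octet into its three fields."""
    classes = ["universal", "application", "context-specific", "private"]
    return {
        "class": classes[first_byte >> 6],       # top two bits
        "constructed": bool(first_byte & 0x20),  # bit 6: CONSTRUCTED flag
        "number": first_byte & 0x1F,             # low five bits (31 = multi-byte tag follows)
    }


print(describe_tag(0x30))  # SEQUENCE: constructed
print(describe_tag(0x03))  # BIT STRING: primitive, whatever its payload is
```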

I have written a general-purpose raw parser in C# that decodes ASN.1 binary data into discrete structures: Asn1DerParser.NET, with the parser class Asn1Reader.

Crypt32

A BIT STRING is a basic type that says "value is a sequence of bits", with absolutely no extra information on how these bits are to be interpreted. There is no difference in encoding depending on whether the value bits are supposed to be themselves the DER encoding of some nested structure, or are "just bits".

The "normal" answer is the one given by @Crypt32: when you are decoding a structure (e.g. an X.509 certificate), you are supposed to know what kind of structure you are decoding, and act accordingly. This can be quite complex in practice; for instance, the SubjectPublicKeyInfo structure contains an AlgorithmIdentifier that identifies the public key type, and a BIT STRING that contains the public key value. If the public key is an RSA key, then the value of the BIT STRING is supposed to be a DER-encoded structure (a SEQUENCE of two INTEGER values); however, if this is an EC public key, then the contents of the BIT STRING will be the raw encoded public point, with no ASN.1 or DER.
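One way to express "know what you are decoding" in code is a lookup keyed on the algorithm OID that precedes the BIT STRING. The sketch below is only an illustration: the table names and the `bit_string_layout` helper are hypothetical, and the tables list just four well-known OIDs, so a real parser would need many more entries.

```python
# How the BIT STRING payload is laid out, keyed by the algorithm OID (dotted form).
SPKI_KEY_LAYOUT = {
    "1.2.840.113549.1.1.1": "der",  # rsaEncryption: SEQUENCE { modulus, publicExponent }
    "1.2.840.10045.2.1": "raw",     # id-ecPublicKey: encoded EC point, no DER inside
}
SIGNATURE_LAYOUT = {
    "1.2.840.113549.1.1.11": "raw",  # sha256WithRSAEncryption: plain signature bytes
    "1.2.840.10045.4.3.2": "der",    # ecdsa-with-SHA256: ECDSA-Sig-Value SEQUENCE { r, s }
}


def bit_string_layout(table: dict, algorithm_oid: str) -> str:
    """Return 'der', 'raw', or 'unknown' for the BIT STRING that follows the OID."""
    return table.get(algorithm_oid, "unknown")


print(bit_string_layout(SPKI_KEY_LAYOUT, "1.2.840.113549.1.1.1"))
print(bit_string_layout(SIGNATURE_LAYOUT, "1.2.840.113549.1.1.11"))
```

Returning "unknown" rather than guessing leaves the caller free to keep the bytes opaque or fall back to a heuristic.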

If your tool is for diagnostics, then you may use heuristics. You might want to have a look at DDer, which is a generic ASN.1/DER parser (actually, it supports BER). When encountering a "binary value" (BIT STRING or OCTET STRING), it tries to see if that value could itself be decoded as an ASN.1 object. Since BER/DER encoding is quite redundant, it happens quite rarely that an arbitrary randomish value like a signature would match DER encoding with no error (basically 1 in 10000 times or so).
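A rough Python sketch of such a heuristic follows. It is not DDer's actual algorithm, only an illustration of the idea; `read_tlv` and `looks_like_der` are hypothetical names. It checks tag and length framing only (single-byte tags, definite minimal lengths, children exactly filling constructed values) and does not validate the contents of primitive types.

```python
def read_tlv(data: bytes, pos: int):
    """Parse one DER header at pos; return (tag, content_start, content_end)."""
    if pos + 2 > len(data):
        raise ValueError("truncated header")
    tag = data[pos]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-byte tag numbers are not handled in this sketch")
    length = data[pos + 1]
    pos += 2
    if length == 0x80:
        raise ValueError("indefinite length is not allowed in DER")
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        if n > 4 or pos + n > len(data):
            raise ValueError("bad long-form length")
        length = int.from_bytes(data[pos:pos + n], "big")
        if length < 0x80 or data[pos] == 0:
            raise ValueError("non-minimal length encoding")
        pos += n
    end = pos + length
    if end > len(data):
        raise ValueError("content overruns buffer")
    return tag, pos, end


def looks_like_der(data: bytes) -> bool:
    """True if data is exactly one well-framed TLV, checked recursively."""
    try:
        tag, start, end = read_tlv(data, 0)
        if end != len(data):
            return False  # trailing garbage after the value
        pos = start
        while tag & 0x20 and pos < end:  # constructed: walk the children
            _, _, child_end = read_tlv(data, pos)
            if not looks_like_der(data[pos:child_end]):
                return False
            pos = child_end
    except ValueError:
        return False
    return True


# SEQUENCE { INTEGER 5, INTEGER 65537 } versus a few arbitrary bytes
print(looks_like_der(bytes.fromhex("30080201050203010001")))
print(looks_like_der(bytes.fromhex("8f3a1122")))
```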

(DDer also comes with a companion tool called MDer that does the reverse operation: from a text-based representation to DER encoding. This can be convenient for making test objects.)

Thomas Pornin

Bit strings and octet strings can both contain either data or embedded ASN.1. Though it helps to know what format something should take before parsing, it is not entirely necessary. (Those ASN.1 decoders don't know what they are decoding before completely parsing.) When parsing an octet string, you can simply look at the contents and see if it fits the format of an ASN.1 primitive (1 byte header, followed by size info, followed by the amount of data indicated by the size).

Bit strings are a bit more complex to dissect. The content of a bit string starts with a byte that indicates the number of unused bits: the bit string is held in whole bytes, but the number of bits actually stored may fall short of a multiple of 8 by anywhere from 0 to 7 bits. Only if that count is 0 can you look at the remaining bytes and evaluate whether or not the data is an ASN.1 primitive.
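As a small illustration of that leading byte, the Python sketch below splits the content octets of a BIT STRING and only hands back a payload for further inspection when the count is 0. Both function names are made up for this example, and the input is assumed to be the content octets only, with the tag and length already stripped.

```python
def split_bit_string(content: bytes):
    """Return (unused_bits, payload) for the content octets of a DER BIT STRING."""
    if not content:
        raise ValueError("BIT STRING content must start with the unused-bits byte")
    unused, payload = content[0], content[1:]
    if unused > 7 or (unused and not payload):
        raise ValueError("invalid unused-bits count")
    return unused, payload


def nested_candidate(content: bytes):
    """Return the payload if it is byte-aligned and non-empty, else None."""
    unused, payload = split_bit_string(content)
    return payload if unused == 0 and payload else None


print(nested_candidate(bytes.fromhex("003000")))  # byte-aligned: worth inspecting
print(nested_candidate(bytes.fromhex("0520")))    # 5 unused bits: not whole bytes
```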

Lampshade