A clever approach to X.509 for IOT

The idea of an “Internet of things”, of smart devices communicating with each other over wired and wireless networks, has been around for quite a while now. Over the last two decades, however, it has gone from being an idea to being a reality. Today, low-powered devices can communicate over low-powered networks using energy they harvest from their own antennas or from the power lines they monitor. For these types of devices, energy comes at a premium and is a scarce resource to say the least. With this in mind, the four authors of the paper I just finished reading have devised a clever approach for X.509 certificates to be communicated and used on these types of networks.

The paper, by Filip Forsby, Martin Furuhed, Panos Papadimitratos, and Shahid Raza, is called “Lightweight X.509 Digital Certificates for the Internet of Things”. In it, Forsby et al. describe a number of clever ways to reduce the number of bytes transmitted, the amount of data kept around in memory, and the computational complexity of the necessary operations. Their approach is based on judiciously choosing the appropriate algorithms for signatures and keys, and omitting all fields from their encoding that can be pre-determined by configuration or convention, thus drastically reducing the amount of data that needs to be exchanged between devices for the equivalent of an X.509 certificate.

All algorithms are chosen in advance, as are the version of X.509 to use and anything else that can be set in stone. Any of these choices, once made and therefore known by implementations, no longer have to be conveyed. Similarly, once you’ve decided that CA CommonNames have to be unique, it becomes the only thing you need to know about the CA to be able to locally find its certificate, so the rest of the DistinguishedName can be omitted as well. Using these techniques, the authors omit anything they can pre-configure or know by convention, scraping off large chunks of what would otherwise need to be communicated.

Obviously, in as far as the choices they make have an effect on the computational complexity of the operations that need to be run regardless of what’s communicated, their choices are equally judicious: the assymetric algorithms chosen are all elliptic curve, the signature algorithm is ecdsaWithSHA256, the curve is prime256v1, etc. Choosing ECDSA over RSA saves both run-time memory and computational complexity when generating and verifying signatures, in addition to having fewer bytes to send and receive for those signatures and the associated public keys. SHA-256 is a less obvious choice in this regard, but there doesn’t appear to be that much research into the energy efficiency of hash functions (the only paper I found on the topic that directly addresses the question was a 2012 paper by Damasevicius et al. in which the only 256-bit hash functions tested were SHA-256 and Haval256, in which case I think SHA-256 is the more judicious choice).

Whenever data needs to be conveyed, Forsby et al. do their best to convey it as tersely as possible: UNIX time stamps are preferred over ASN.1 time stamps and are conveyed in 32 or 40 bits, depending on the date encoded. ECC public keys, which consist of a point on the curve, get only their X coordinate and a one-bit hint for the Y coordinate, which is enough to find the point on the curve. The only thing they can’t really compress is the X.509 certificate extensions, though they do away with the first two bytes of the OID field because those always have the same value.

Finally, they do away with DER encoding and use a scheme that is more akin to CER encoding, but based on JSON. It’s called CBOR, for Concise Binary Object Representation, and is defined by RFC 7049. This is where they perhaps go a step further than needed: ASN.1’s Canonical Encoding Rules are an efficient encoding equivalent to the Distinguished Encoding Rules that are completely deterministic and for which the syntax can still be described in the ASN.1 schema language. I think it would have been a better choice to encode X.509 certificates with, given that X.509 itself is completely described in the ASN.1 schema language.

As you will probably have surmised, stripping all that data out means a signature applied by a CA to a standard X.509 certificate can’t survive all these transformations – except that it can! Because X.509 is defined in ASN.1 and certificates are encoded using the Distinguished Encoding Rules (DER), the encoding is entirely deterministic. That means that when a stripped-down certificate is read, it can be transformed back into a DER-encoded, complete X.509 certificate for verification and validation. This relies on the CA using DER, not BER, to encode its certificates (which the authors don’t point out in their paper), but that ought to be the rule anyway. Once the validation and verification is done, the DER-encoded certificate can be discarded and the CBOR-encoded data can be represented in memory as a structure containing only the data needed for other computations.

There are really only three other shortcomings I can see in this scheme: the first is that they appear to require devices to know their own 64-bit unique identifier and the 64-bit unique identifiers and Distinguished Names for every device they communicate with using these stripped-down certificates, as the only identifying information for the device itself is a 64-bit unique identifier (where X.509 certificates will normally contain a Distinguished Name). Devices thus need a dictionary of device names to match the EUI-64 to a Distinguished Name. Of course, some CAs may allow encoding an EUI-64 directly into the Distinguished Name. Such a CA would preclude the requirement of having a dictionary in the device.

The second is that they rely on the device firmware to “know”, or have as part of their configuration, all of the choices (algorithms etc.) the scheme prescribes. Even the X.509 version is stipped out. Having a single-byte version number included in the encoded CBOR would have allowed a bit of flexibility and future-proofing.

The third and final shortcoming, in my opinion, is that they reference the Constrained Application protocol, CoAP, for their choices of cipher suites. CoAP mandates TLS_PSK_WITH_AES_128_CCM_8. This scheme has a pre-shared symmetric key and uses 128-bit AES in Counter-with-CBC-MAC mode. Pre-shared keys are generally not a good choice for long-term security, as they are hard to keep out of the hands of humans. Provided there’s a robust key rotating scheme, this is not really a problem though.

One final, minor, issue is that they compare a base64-encoded X.509 certificate size with a binary CBOR-encoded certificate. This isn’t really a fair comparison as X.509 certificates can easily be conveyed in binary DER format, shaving a good chunk of the file size off just by keeping it in binary form. The comparison of a 1526-byte base64-encoded certificate in PEM format with a 146-byte CBOR-encoded binary object that contains the same information (but not the same data) is impressive, but I’m not sure it’s exactly apples to apples.

None of these are show-stoppers, and the scheme really is an interesting take on how you can reduce the overhead of PKI in a system where overhead is always expensive.