Until recently, all DNS software and services were apparently vulnerable to DoS attack
25-year-old vulnerability in DNSSEC design has now been patched
Last month, the DNS world was shaken by the publication of a serious vulnerability in the DNSSEC specification. Known as KeyTrap, the vulnerability meant that it was possible to mount a denial-of-service (DoS) attack on a validating resolver from a DNS server.
Because the cause was a flaw in the specification itself, all popular DNS resolvers and services were affected. What's more, the issue was so serious that it couldn't be published until it had been fixed. For several months, therefore, the researchers who discovered the vulnerability worked with software vendors and large service providers to prepare and implement corrective updates.
Validating resolver operators are now strongly advised to update their software as a matter of urgency.
The KeyTrap vulnerability [technical report] was discovered late last year by researchers at Germany's National Research Center for Applied Cybersecurity (ATHENE). However, its cause goes back more than 20 years, to RFC 2535, the first version of the modern DNSSEC security methodology, published in 1999. Via the successors to that RFC, the vulnerability found its way into various implementations, including BIND9, Unbound, PowerDNS and other resolver software.
The KeyTrap vulnerability originates from what the researchers describe as "eager validation of signatures and of keys". The DNSSEC protocol requires a validating resolver to go on evaluating all the signatures provided (in an RRSIG record) and all the associated public keys (in a DNSKEY record) until a valid signature for the relevant DNS record is found. The idea is, of course, to ensure that the DNSSEC mechanism is as robust as possible. Moreover, the simultaneous publication/overlap of multiple signatures and multiple public keys is a common phenomenon, occurring in the context of every re-signing and rollover.
However, there was a problem with the key tag, a 16-bit identifier that serves to distinguish key pairs from one another, enabling the resolver to quickly find the right public key when evaluating a signature. The key tag is not assigned but derived from the key data using the simple checksum defined in RFC 4034, Appendix B, so it is not necessarily unique. Consequently, 'key tag collisions' are technically permissible.
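To make the collision risk concrete, here is a minimal Python sketch of the key tag checksum, following the reference algorithm in RFC 4034, Appendix B (the variant for all algorithms except the obsolete algorithm 1). The two RDATA blobs are made-up bytes for illustration, not real keys. Because the checksum simply sums 16-bit words, swapping two bytes at even positions changes the key but not the tag.

```python
def key_tag(rdata: bytes) -> int:
    """Key tag of DNSKEY RDATA, following the checksum in RFC 4034, Appendix B."""
    acc = 0
    for i, byte in enumerate(rdata):
        # Even-indexed bytes are the high octet of a 16-bit word,
        # odd-indexed bytes the low octet.
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF  # fold the carry back in
    return acc & 0xFFFF


# Hypothetical RDATA: flags (0x0101), protocol (3), algorithm (14),
# followed by four dummy "public key" bytes.
key_a = bytes([0x01, 0x01, 0x03, 0x0E, 0xAA, 0x10, 0xBB, 0x20])
key_b = bytes([0x01, 0x01, 0x03, 0x0E, 0xBB, 0x10, 0xAA, 0x20])

# Different key material, identical key tag.
print(key_a != key_b, key_tag(key_a) == key_tag(key_b))
```

An attacker does not need to rely on chance: as the sketch suggests, it is straightforward to craft many distinct keys that all share one tag.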
A malicious actor could take advantage of that situation by publishing a DNS record with a string of signatures all of the same cryptographic type, in combination with a string of keys all with the same key tag. If all the signatures were invalid, the resolver would not conclude that that was the case until it had evaluated all the available signatures for all the keys provided.
Because cryptographic operations – especially the validation of ECDSA-based signatures – require considerable processing power, a record with the combination of signatures and keys described would require the resolver to do a lot of work trying all the possible combinations. Until it had finished performing all the calculations, the resolver would be able to do little or nothing else, resulting in denial of service. The researchers found that a validating resolver running on low-power hardware could be kept occupied for 16 hours by a record of the type described.
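The quadratic blow-up can be illustrated with a small Python sketch. It is a simplified model, not the code of any real resolver: `Key`, `Signature`, `fake_verify` and `eager_validate` are hypothetical names, and `fake_verify` is a cheap stand-in for the expensive cryptographic check. The loop keeps trying every signature against every key with a matching key tag until one validates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    tag: int
    material: bytes


@dataclass(frozen=True)
class Signature:
    tag: int
    blob: bytes


def fake_verify(key: Key, sig: Signature) -> bool:
    """Stand-in for a costly signature check: 'valid' only if the bytes match."""
    return sig.blob == key.material


def eager_validate(keys, sigs):
    """Try every signature against every key with the same tag.

    Returns (valid, attempts), where attempts counts the verify calls.
    """
    attempts = 0
    for sig in sigs:
        for key in keys:
            if key.tag != sig.tag:
                continue  # the key tag normally narrows the search to one key
            attempts += 1
            if fake_verify(key, sig):
                return True, attempts
    return False, attempts


# 20 keys that all share one tag, and 10 signatures that match none of them.
keys = [Key(tag=4242, material=bytes([i])) for i in range(20)]
sigs = [Signature(tag=4242, blob=b"bogus-%d" % i) for i in range(10)]

ok, attempts = eager_validate(keys, sigs)
print(ok, attempts)  # False 200: every signature was tried against every key
```

With colliding tags and only invalid signatures, the number of verification attempts is the product of the number of signatures and the number of keys.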
Because the KeyTrap issue made it easy to disable large parts of the internet, the researchers described it as the biggest DNS vulnerability ever discovered. However, their claim that this security hole went unnoticed for 25 years is open to question: in 2019, Dex Bleeker published a description of exactly the same problem. He did not optimize his keys for maximum impact, however, so he observed an increased load but did not take the resolver down. [1]
As indicated above, the source of the KeyTrap vulnerability is not the software, but the design of DNSSEC itself. In fact, it appears to be a consequence of one of the guiding principles for the development of internet standards: "Be conservative in what you send, and liberal in what you accept." That robustness principle was formulated by Jon Postel 45 years ago when the Internet Protocol was codified in RFC 791, and is now known as Postel's Law.
Of course, over time, security considerations have led to the robustness principle quite rightly being qualified in various respects. Indeed, RFC 1122, which restates the principle, already qualifies it as follows: "assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect".
Resolver software developers are in no way to blame for the KeyTrap vulnerability. The evaluation of all available signatures and all available keys and the associated key tags is a fixed feature of the DNSSEC mechanism, which requires a resolver to keep on looking as long as no valid signature has been found.
The KeyTrap vulnerability's discoverers report that, over the last few months, they have worked confidentially with 31 different actors (software developers, vendors, major service providers and people at the IETF) to resolve the issue. All the resolvers and services that they looked at were found to be vulnerable, even though a wide variety of implementations were involved. By creating a single DNS record with 340 signatures and 582 keys (based on algorithm 14: ECDSAP384SHA384), they were able to increase the workload associated with a query by a factor of 2 million. On a single-core processor, evaluating the record typically took several minutes (e.g. with PowerDNS, Stubby and Akamai DNSi CacheServe), with upper outliers of more than a quarter of an hour for Unbound and 16 hours for BIND9, and a lower outlier of 1 minute for Knot. During that time, queries from other clients were blocked, effectively disabling the resolver users' internet access.
Multi-threading systems could be attacked with almost equal ease simply by requiring them to evaluate several of the specially prepared DNS records. Only Akamai's DNSi CacheServe appeared to still be able to respond to clients from its cache, since that involves a separate thread in the software.
The researchers also found that, as well as popular resolvers, well known tools (e.g. dig, delv, DNSViz), libraries (e.g. dnspython, ldns) and the main DNS service providers (Cloudflare, Google DNS, OpenDNS, Quad9) were vulnerable to the same attack method.
After performing some experiments, the group decided to cap the number of keys with the same key tag (per algorithm) at 4, which is more than is ever encountered in practice. To also provide resilience against attacks where a single key is combined with a large number of signatures (SigJam), the number of invalid signatures was capped at 16. Finally, the number of evaluations performed for an ANY query was limited to 8.
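How such caps bound the work can be sketched in Python as follows. This is one possible reading of the limits described above, not the logic of any actual patch: the function name, the tuple representation and the decision to reject a key set outright when more than 4 keys collide are all assumptions made for illustration, and real resolvers may apply the limits differently.

```python
from collections import Counter

MAX_COLLIDING_KEYS = 4       # keys allowed to share one key tag
MAX_FAILED_SIGNATURES = 16   # failed validations tolerated before giving up


def capped_validate(keys, sigs, verify):
    """keys and sigs are lists of (key_tag, data) tuples.

    Returns (valid, attempts), where attempts counts the verify calls.
    """
    collisions = Counter(tag for tag, _ in keys)
    if any(count > MAX_COLLIDING_KEYS for count in collisions.values()):
        return False, 0  # assumed policy: reject without doing any crypto

    attempts = 0
    failures = 0
    for sig_tag, sig_data in sigs:
        for key_tag, key_data in keys:
            if key_tag != sig_tag:
                continue
            attempts += 1
            if verify(key_data, sig_data):
                return True, attempts
            failures += 1
            if failures >= MAX_FAILED_SIGNATURES:
                return False, attempts  # budget exhausted
    return False, attempts


def toy_verify(key_data, sig_data):
    """Cheap stand-in for a real signature check."""
    return key_data == sig_data


# Attack-shaped input: 582 colliding keys and 340 invalid signatures.
attack_keys = [(7, b"k%d" % i) for i in range(582)]
attack_sigs = [(7, b"s%d" % i) for i in range(340)]

print(capped_validate(attack_keys, attack_sigs, toy_verify))      # (False, 0)
print(capped_validate(attack_keys[:4], attack_sigs, toy_verify))  # (False, 16)
```

Without the caps, the same input would allow up to 340 × 582 = 197,880 verification attempts. With them, the resolver does at most 16.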
With those limitations in effect, the researchers found that a KeyTrap attack still increased the resolver's workload, but no longer to the point where denial of service (DoS) resulted.
However, the approach described is not a structural solution to the KeyTrap vulnerability. That will ultimately require revision of the DNSSEC protocol.
Over the last few weeks, all (open) resolver software has been patched as described to address the KeyTrap vulnerability, which is now formally known as CVE-2023-50387. Additional information and the relevant software version numbers are available here: Unbound, BIND9, PowerDNS, Akamai. All the major DNS services are now protected against attacks of this type as well.