Security flaws in Network Time Protocol make other (security) protocols vulnerable
Workable cryptographic security finally added to NTP
With publication of the Network Time Security (NTS) protocol in RFC 8915, workable cryptographic security has finally been added to the Network Time Protocol (NTP). And not a moment too soon: effective security is badly needed to address the multiple NTP vulnerabilities published in recent years, and their knock-on implications for various internet security protocols (including DNS and DNSSEC). This article describes the various problems with current NTP; the new NTS protocol is covered in a separate article. Details of our own NTP service TimeNL, including our experimental NTS server, can be found in an earlier article.
The current version of NTP (version 4, defined in RFC 5905) does include security provisions, but they are rarely used, with good reason. One possibility is symmetric-key authentication. However, that requires a shared secret key to be exchanged before time synchronisation takes place. What's more, the authentication is based on the MD5 hash algorithm, which is known to be insecure and whose use is therefore discouraged. The unofficial alternative to MD5 is SHA-1, which is itself no longer regarded as secure but can nevertheless be used for this type of authentication. Asymmetric authentication on the basis of Autokey, defined in RFC 5906, is also possible. However, Autokey is a problematic and insecure protocol whose use is definitely inadvisable. Until now, therefore, NTP has lacked appropriate security. With NTPv4, the usual workaround has been for the client to match each response against the address of the server it queried, together with the origin timestamp of that query, which serves as a nonce.
Because, like DNS, NTP has traditionally relied on the connectionless User Datagram Protocol (UDP), an attacker can send spoofed NTP responses to an NTP client, in the hope that the client will use the contents of the packets to set its system clock. One of the ways that NTP clients protect against such attacks is by allowing only limited adjustments relative to the current system time (at most 1000 seconds, or approximately 16.7 minutes). Another common precaution is configuring a client so that it won't adjust its time until between ten and a hundred NTP responses (samples) have been received from a server. Systems nevertheless remain vulnerable at start-up, as with IoT devices that don't have an onboard real-time clock (RTC) chip. Also, the precautions described do not protect against an attacker changing a system clock in a series of small steps ('time skimming').
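The limitation of the step-size precaution can be illustrated with a minimal sketch. It assumes a simplified client that rejects any single offset larger than the 1000-second threshold mentioned above; the function name `apply_offset` and the chosen step sizes are hypothetical and do not reflect any particular NTP implementation.

```python
PANIC_THRESHOLD_S = 1000.0  # largest single adjustment accepted (figure from the text)


def apply_offset(clock_s, offset_s, threshold_s=PANIC_THRESHOLD_S):
    """Return the adjusted clock, or the unchanged clock if the offset is too large."""
    if abs(offset_s) > threshold_s:
        return clock_s  # offset rejected as implausible
    return clock_s + offset_s


clock = 0.0

# A single jump of one day exceeds the threshold and is rejected.
after_big_jump = apply_offset(clock, 86400.0)

# Time skimming: one hundred steps of 900 seconds each are all accepted.
skimmed = clock
for _ in range(100):
    skimmed = apply_offset(skimmed, 900.0)

print(after_big_jump, skimmed)
```

In this sketch the clock ends up more than a day off even though no individual step was ever rejected, which is why a per-step limit alone does not stop a patient attacker.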
Traditional attacks on NTP systems and implementation-related problems have typically involved the abuse of (public) NTP services. Such abuse can be an effective attack strategy if, for example, an IoT device manufacturer (independently) sets up the device to use a particular server, or if an access provider does the same with modems issued to customers. Another known problem is the use of DDoS amplification attacks. In 2015, however, Malhotra's team at Boston University published a paper exposing several vulnerabilities in the NTP itself. The paper explained how, by adjusting a machine's system time, it was possible to interfere with various security protocols that rely on system time.
One of the highlighted issues is that the KSK/ZSK pairs used in DNSSEC are typically refreshed once every few months, but the digital signatures (RRSIG records) are refreshed every few weeks. By advancing the system time on a validating resolver beyond the signatures' validity period, it's therefore possible to execute a DoS attack (validation fails because the signatures appear to have expired), or to flush the cache by causing the TTLs of the records to expire prematurely. Conversely, putting a machine's clock back would allow for a replay attack using old signatures that have since expired. Protocols such as RPKI and Bitcoin are vulnerable to smaller time discrepancies of days or hours. Where the former is concerned, time manipulation would enable the RPKI manifest in a router's cache to be emptied, and then populated with outdated (incomplete) information. With Bitcoin, time manipulation on a node could cause valid blocks to be rejected, or miners to work on invalid blocks. With Kerberos-authenticated sessions and other online authentication services, putting back system time by just minutes can open the way for various types of replay attack involving the use of outdated login data. An attacker who managed to put back a system clock by years could even negate TLS by allowing the use of certificates that have been revoked after being compromised.
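The dependence of DNSSEC validation on system time can be sketched as follows. This is a simplified illustration, not a real validator: it compares plain timestamps, whereas actual RRSIG records carry 32-bit inception and expiration fields. The function name `rrsig_is_valid`, the dates and the 14-day validity period are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def rrsig_is_valid(now, inception, expiration):
    """A signature is accepted only while 'now' lies inside its validity window."""
    return inception <= now <= expiration


inception = datetime(2020, 11, 1, tzinfo=timezone.utc)
expiration = inception + timedelta(days=14)
real_now = inception + timedelta(days=3)

# With a correct clock, the current signature validates.
valid_now = rrsig_is_valid(real_now, inception, expiration)

# Clock advanced by 30 days: the same signature appears expired (DoS).
clock_forward = rrsig_is_valid(real_now + timedelta(days=30), inception, expiration)

# An old signature that expired weeks ago is normally rejected ...
old_inception = inception - timedelta(days=60)
old_expiration = old_inception + timedelta(days=14)
replay_real_clock = rrsig_is_valid(real_now, old_inception, old_expiration)

# ... but is accepted again once the clock has been put back far enough (replay).
replay_clock_back = rrsig_is_valid(
    real_now - timedelta(days=55), old_inception, old_expiration
)

print(valid_now, clock_forward, replay_real_clock, replay_clock_back)
```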
In their publication, Malhotra et al. describe various ways of abusing NTP to modify system time in practice. In addition to the techniques mentioned above (spoofed NTP responses and time skimming), it's possible to block a system's access to its upstream NTP servers, for example. The obvious way to do that is a DoS attack on the NTP servers, but it can also be achieved by sending a (spoofed) kiss-of-death (KoD) packet to the client. KoD functionality is intended to allow a server to ask a rapid-fire client to pause the flow of queries briefly. However, it can also be abused to stop clients querying their servers for prolonged periods. Protection against spoofed NTP responses involves an approach similar to the use of transaction IDs in DNS. Instead of a transaction ID, the approach relies on the origin timestamp: the server copies the transmit timestamp from the client's query into its response, and the client checks that the two match. The timestamp adds roughly 32 bits of entropy (randomness), making it difficult for an off-path attacker to compose a fake NTP response capable of fooling a client. Curiously, however, receiving clients weren't checking the timestamps on KoD responses to verify that an incoming message really matched an outstanding query. Since publication of the research, that issue has fortunately been addressed. Nevertheless, there remains nothing to stop an attacker reversing the tactic and sending a large number of spoofed client NTP queries to a real server, prompting it to send genuine KoD responses to the real client. The client may then be unable to synchronise with any of its configured NTP servers, resulting in system time drift.
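The origin timestamp check can be sketched as a small client-side filter. This is an illustrative model only: the classes `Response` and `Client` are hypothetical, the nonce is a random number standing in for the client's transmit timestamp, and the same check is applied to KoD packets as to ordinary responses.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Response:
    server: str
    origin_timestamp: int
    kiss_of_death: bool = False


class Client:
    def __init__(self):
        # Server address -> timestamp sent in the outstanding query.
        self.pending = {}

    def send_query(self, server):
        # Stand-in for the transmit timestamp; a real client uses its clock reading.
        timestamp = secrets.randbits(64)
        self.pending[server] = timestamp
        return timestamp

    def accept(self, response):
        """Accept a response (KoD or not) only if it echoes the outstanding query."""
        expected = self.pending.get(response.server)
        if expected is None or response.origin_timestamp != expected:
            return False  # unsolicited or spoofed: ignore it
        del self.pending[response.server]  # each query is answered at most once
        return True


client = Client()
sent = client.send_query("192.0.2.1")

# An off-path attacker has to guess the timestamp; a wrong guess is dropped.
spoofed_kod = Response("192.0.2.1", sent ^ 1, kiss_of_death=True)
genuine = Response("192.0.2.1", sent)
print(client.accept(spoofed_kod), client.accept(genuine))
```

Note that the check offers no protection against the reversed tactic described above, because KoD responses provoked by spoofed queries to a real server are genuine packets.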
Another approach open to an attacker is to circumvent the origin timestamp altogether by injecting fake fragments into a stream of fragmented UDP packets, with the aim of leaving the timestamp intact but changing the payload (the time). Naturally, the attacker can't provide the correct UDP checksum, but that doesn't matter, because the UDP checksum is optional over IPv4 and can therefore simply be set to zero. Thus, the attack strategy involves sending only packet fragments, rather than complete packets, in the hope that they will be accepted by the client along with other previously received fragments. For the strategy to work, the attacker has to send multiple correctly timed fragments that are also accurately dimensioned and fitted, so that the (overlapping) fragments can be inserted into the appropriate place in the UDP packet when it is reassembled.
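The effect of an overlapping fragment can be illustrated with a toy reassembly routine. This is a heavily simplified sketch: real reassembly policies differ between operating systems, and the `reassemble` function, the 'first fragment wins' policy and the byte strings used are assumptions made for the example, not actual NTP packet contents.

```python
def reassemble(fragments, total_len, policy="first"):
    """Rebuild a payload from (offset, data) fragments in arrival order.

    With policy "first", bytes that were already filled in are kept when a
    later fragment overlaps them; with policy "last", they are overwritten.
    """
    buf = bytearray(total_len)
    filled = [False] * total_len
    for offset, data in fragments:
        for i, byte in enumerate(data):
            pos = offset + i
            if pos >= total_len:
                continue  # ignore bytes beyond the expected length
            if filled[pos] and policy == "first":
                continue  # keep the byte that arrived first
            buf[pos] = byte
            filled[pos] = True
    return bytes(buf)


genuine = [(0, b"NONCE123"), (8, b"TIME=12:00")]
spoofed = (8, b"TIME=03:00")  # same offset and size as the genuine second fragment

clean = reassemble(genuine, 18)
# The spoofed fragment arrives just before the genuine second fragment.
poisoned = reassemble([genuine[0], spoofed, genuine[1]], 18)
print(clean, poisoned)
```

In the poisoned result the first eight bytes, which stand in for the origin timestamp, are untouched, so a timestamp check alone would not notice the altered time.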
Publication of the Boston University research triggered the implementation of NTS. Again, parallels can be drawn with DNS and DNSSEC: it was Kaminsky's 2008 demonstration of how fake information could be injected into a caching resolver that finally got the implementation of DNSSEC moving. SIDN Labs and various Dutch software developers played important pioneering roles in that context. Publication of the 'SAD DNS' attack earlier this month underlined the importance of DNSSEC as a structural solution.
In the summer of 2019, the same Aanchal Malhotra and the team at NLnet Labs (responsible for the development of NSD, Unbound, OpenDNSSEC, Routinator and various other network tools) published a paper focusing specifically on the importance of correct (secure) system time in the context of the DNS. They outlined several scenarios:
How long records remain in a resolver's cache is determined by the TTL. The TTL itself is expressed in relative terms (a remaining time interval in seconds), but cache implementations translate the relative expression into an absolute time on the basis of the system time. Therefore, by interfering with the system time, it's possible to bring forward or delay the time that cache entries (records) are flushed, facilitating various other types of attack, including the 'SAD DNS' attack published earlier this month. For example, a successful cache poisoning attack could be prolonged by extending the time that the fake records remain in the cache. Failover configurations could also be frustrated by similarly interfering with the short TTLs of DNS-based load balancing systems. Alternatively, the TTLs of NXDOMAIN replies (negative caching) could be extended. The advice for caching resolver developers is therefore that TTLs in administrative cache fields should always be expressed in relative terms. That principle has since been implemented in Unbound.
In DNSSEC, the validity period of a digital signature (RRSIG record) is also expressed in absolute terms. The system can really be made more robust only by adding cryptographic security to your validating resolver's NTP-based timing set-up.
Numerous other DNS-based security systems can be compromised in a similar way, including DNSBLs (DNS-based Blackhole Lists).
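The advice on relative TTLs in the first scenario above can be sketched as a cache whose expiry does not depend on the system (wall-clock) time. This is one possible approach, assuming a monotonic clock is available; it is not a description of how Unbound implements the principle, and the class name `RelativeTtlCache` is hypothetical.

```python
import time


class RelativeTtlCache:
    """Cache whose expiry is measured on a monotonic clock, not on system time."""

    def __init__(self, monotonic=time.monotonic):
        # The clock source is injectable so the behaviour can be tested.
        self._monotonic = monotonic
        self._entries = {}

    def put(self, name, value, ttl_s):
        # The deadline is relative to the monotonic clock, which is unaffected
        # by steps applied to the system time (for example via NTP).
        self._entries[name] = (value, self._monotonic() + ttl_s)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, deadline = entry
        if self._monotonic() >= deadline:
            del self._entries[name]  # TTL has run out: flush the record
            return None
        return value


cache = RelativeTtlCache()
cache.put("example.nl", "192.0.2.10", ttl_s=30)
print(cache.get("example.nl"))
```

Because the deadline never refers to the system time, moving the system clock forwards or backwards neither flushes the record early nor keeps a poisoned record alive for longer than its TTL.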
NTP-targeted MITM attacks are relatively easy to execute. All that's needed is to copy the origin timestamp from an NTP query to a spoofed NTP response. Prior to the introduction of NTS, the only way to prevent authenticity and integrity being compromised by such an attack was to use a shared (symmetric) key and a MAC (a Message Authentication Code, which has some similarity to a digital signature). Off-path attacks are harder to carry out because of the nonce (timestamp), but are nevertheless possible using the tactics described above. One of the main challenges with NTP is that the same software is used on internet-connected systems of all sizes, some of which remain in operation for very long periods, with IoT devices in particular often not updated. Where validating resolvers are concerned, the advice is to enable NTP, but ensure that your configuration is correct. The configuration should support client activity only, not server activity. In addition, the NTP clients on validating resolvers are the first systems in line for implementation of the NTS protocol described here.
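Why a shared key and a MAC stop such a MITM attack can be shown with a short sketch. It uses HMAC-SHA-256 from the Python standard library purely as a stand-in: NTP's own symmetric-key scheme uses a different digest construction and packet layout, and the key, packet contents and function names here are hypothetical.

```python
import hashlib
import hmac


def make_tag(key, packet):
    """Compute a MAC over the packet using the shared secret key."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def verify(key, packet, tag):
    """Constant-time check that the tag matches the packet."""
    return hmac.compare_digest(make_tag(key, packet), tag)


shared_key = b"hypothetical-shared-secret"

# Genuine server response: echoes the origin timestamp and carries a MAC.
genuine_packet = b"origin=1234;time=12:00:00"
genuine_tag = make_tag(shared_key, genuine_packet)

# A MITM can copy the origin timestamp and change the time ...
forged_packet = b"origin=1234;time=03:00:00"

# ... but without the key cannot produce a tag that verifies.
print(verify(shared_key, genuine_packet, genuine_tag))
print(verify(shared_key, forged_packet, genuine_tag))
```

The sketch also shows the practical drawback noted earlier: both sides must already hold `shared_key` before the first synchronisation takes place.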