Looking back at .nl's algorithm rollover
We've switched from algorithm 8 to 13
In July 2023, we switched to a different algorithm for signing records in the .nl zone: we moved from RSA (algorithm 8) to ECDSA (algorithm 13). The reasons for the change were explained in an earlier blog post. In this post, we describe the algorithm transition process. It was very important to us that the process went smoothly, because any issues could have had far-reaching consequences. We therefore took a very cautious approach and allowed ample time for preparation and testing. Our findings are shared below.
One vital criterion was that all signed zones had to remain secure throughout the algorithm transition. In other words, there should be no point in time when DNSSEC-validating resolvers were unable to validate the entire DNSSEC chain and recognise it as secure. Throughout the rollover period, we therefore monitored the integrity of the 'trust chain' using RIPE Atlas probes. With the probe network, we were able to continuously check hundreds of validating resolvers around the world.
Figure 1: Percentage of resolvers that see the .nl zone as secure (green line) for IPv6 and IPv4.
As the charts above show, the percentage of resolvers viewing the zone as secure remained stable, indicating that the chain remained intact. By monitoring the percentage throughout the transition, we would have been able to detect any problem with the chain of trust ourselves, as soon as it arose.
In order to make sure we were familiar with the entire process and therefore able to perform a smooth rollover in a production setting, we first performed a number of dry runs in our testing and acceptance environment. However, that environment differs from the production environment in one important respect: it is not linked to the public root name servers. To enable us to test for the possibility of problems arising during the rollover, we therefore created a local dummy root zone and a DNSSEC-validating resolver for integration within our testing and acceptance environment. For that arrangement to work, we had to give our dummy root zone its own KSK and ZSK. A trust anchor was used to tell the resolver that the path was secure.
While work was in progress in the testing and acceptance environment, we ran a script that repeatedly executed the following commands, so that we could answer a number of questions about the state of the environment:
Does a validating resolver's response to a DNS query about .nl include an AD bit?
dig @resolver soa nl
dig @resolver dnskey nl
Record an audit trail of all configuration changes
cat /var/lib/opendnssec/enforcer/zones.xml
cat /var/lib/opendnssec/signconf/nl.xml
cat /etc/opendnssec/kasp.xml
What is the status of the zone on the signer?
dig +short +norec +dnssec @signer dnskey nl
dig +short +norec +dnssec @signer soa nl
dig +short +norec +dnssec @signer ns nl
dig +short +norec +dnssec @signer a nonexistant.nl
What is the status of the KSKs and ZSKs in OpenDNSSEC?
ods-hsmutil list
ods-enforcer rollover list -z nl
ods-enforcer key list -v -z nl
ods-enforcer key list -d -z nl
ods-enforcer key export --ds -z nl
Logging the responses to the commands made it easy to see what the status of the environment was at any given point in time. The information thus obtained enabled us to draw up a reliable timeline for the rollover in the production environment. Moreover, our repeated use of ods-enforcer commands revealed a flaw in OpenDNSSEC, which was subsequently fixed in version 2.1.13.
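The logging approach described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the script we actually ran: the names `has_ad_flag`, `run_command` and `snapshot` are hypothetical, and the `run` and `now` parameters exist only so that the command runner and the clock can be swapped out. The sketch relies on the `;; flags:` header line that dig prints when `+short` is not used.

```python
import datetime
import re
import subprocess

# dig prints a header line such as ";; flags: qr rd ra ad; QUERY: 1, ..."
FLAGS_RE = re.compile(r"^;; flags:([^;]*);", re.MULTILINE)


def has_ad_flag(dig_output):
    """Return True if the dig response header lists the AD flag."""
    match = FLAGS_RE.search(dig_output)
    return match is not None and "ad" in match.group(1).split()


def run_command(argv):
    """Run one command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


def snapshot(commands, run=run_command, now=None):
    """Run every command once and return timestamped log entries."""
    moment = now or datetime.datetime.now(datetime.timezone.utc)
    stamp = moment.isoformat()
    return [
        {"time": stamp, "command": " ".join(argv), "output": run(argv)}
        for argv in commands
    ]
```

Calling `snapshot([["dig", "@resolver", "soa", "nl"]])` at regular intervals and appending the entries to a file yields a timeline of the environment; passing an entry's `output` to `has_ad_flag` answers the first question in the list.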
To reduce the duration of rollovers performed in our testing and acceptance environment, we defined a special OpenDNSSEC policy. The kasp.xml used for testing is reproduced at the end of this blogpost.
The key differences between the test policy and our production policy were as follows:
Keys – RetireSafety = PT360S
Keys – PublishSafety = PT360S
Because there are no caching resolvers in our environment, we were able to keep RetireSafety and PublishSafety very short.
Keys – Purge = P1D
A low value was set, enabling us to quickly see whether the old keys were successfully removed from the HSM. Ordinarily, a much higher value is used so that a rollback can be performed if necessary.
Keys – ZSK – Lifetime = P5D
This setting ensures that a ZSK rollover occurs immediately after the algorithm rollover, meaning that we can see whether a rollover to ECDSA has gone smoothly.
Zone – PropagationDelay = PT120S
Parent – PropagationDelay = PT60S
Parent – DS – TTL = PT60S
Parent – SOA – TTL = PT600S
Parent – SOA – Minimum = PT600S
Having our own dummy root server allowed us to keep these values very short.
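To get a feel for how short the test values listed above are, the ISO 8601 durations can be converted to seconds. The following is a rough sketch: `parse_duration` is a hypothetical helper, it handles only the year, day, hour, minute and second designators that appear in our kasp.xml, and it assumes that a year counts as 365 days, which should be checked against the OpenDNSSEC documentation before the result is relied upon.

```python
import datetime
import re

# Matches durations such as PT360S, P1D, P5D, PT5H or P6Y.
DURATION_RE = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a kasp.xml-style duration to a datetime.timedelta."""
    match = DURATION_RE.match(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return datetime.timedelta(
        days=parts["years"] * 365 + parts["days"],  # assumption: 1 year = 365 days
        hours=parts["hours"],
        minutes=parts["minutes"],
        seconds=parts["seconds"],
    )


# Test-policy values taken from the list above.
TEST_POLICY = {
    "Keys/RetireSafety": "PT360S",
    "Keys/PublishSafety": "PT360S",
    "Keys/Purge": "P1D",
    "Keys/ZSK/Lifetime": "P5D",
    "Zone/PropagationDelay": "PT120S",
    "Parent/PropagationDelay": "PT60S",
    "Parent/DS/TTL": "PT60S",
}

if __name__ == "__main__":
    for name, value in TEST_POLICY.items():
        print(f"{name}: {int(parse_duration(value).total_seconds())} s")
```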
Once the new policy was in force, OpenDNSSEC began activating the new algorithm. That involved publishing the ECDSA KSK and ZSK as DNSKEY records in the zone alongside the RSA key material, and adding RRSIGs made with the ECDSA keys. The following charts show how the new DNSKEYs were viewed by resolvers around the world. All the resolvers checked by our test probes had cached the new keys within 2 hours. That is twice the TTL of the DNSKEY record set, illustrating that it would be unwise to proceed hastily with the steps of the rollover. It is better to allow resolvers additional time to pick up the new keys and signatures. OpenDNSSEC takes care of that automatically.
In the following charts, the numbers in the legend boxes are the key tags for the various keys.
62920 = RSA ZSK
34112 = RSA KSK
10212 = ECDSA ZSK
17153 = ECDSA KSK
Figure 2: The DNSKEY records visible to resolvers checked by the RIPE Atlas probes. Tests performed hourly. The orange line represents the increasing visibility of the ECDSA KSK.
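For readers unfamiliar with key tags: a key tag is a 16-bit checksum computed over the DNSKEY RDATA, as specified in Appendix B of RFC 4034 (the method does not apply to the obsolete algorithm 1). The sketch below shows the calculation; the function names are our own choice for illustration, and the sample key used in the usage note is a two-byte placeholder, not a real key.

```python
import base64
import struct


def key_tag(flags, protocol, algorithm, public_key):
    """Compute the key tag of a DNSKEY as described in RFC 4034, Appendix B."""
    # RDATA wire format: 2-byte flags, 1-byte protocol, 1-byte algorithm, key.
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    acc = 0
    for index, byte in enumerate(rdata):
        # Even-indexed bytes form the high octet of each 16-bit word.
        acc += byte if index & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def key_tag_from_text(rdata_text):
    """Compute the key tag from presentation format: 'flags proto alg base64'."""
    flags, protocol, algorithm, *key_parts = rdata_text.split()
    public_key = base64.b64decode("".join(key_parts))
    return key_tag(int(flags), int(protocol), int(algorithm), public_key)
```

For example, `key_tag_from_text("257 3 13 AAE=")` evaluates the checksum for a KSK-flagged record whose (placeholder) public key is the two bytes `00 01`. Applied to the DNSKEY lines returned by `dig +short dnskey nl`, the same function should reproduce the tags shown in the chart legends.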
Activation of the new keys within the trust chain required IANA to add a DS record to the root zone. In the following charts, the blue line shows when IANA added the record. Again, some resolvers took longer to pick up on the change than might have been expected, given the TTL of the DS record set.
Figure 3: The DS records visible to resolvers checked by the RIPE Atlas probes. Tests performed daily.
Once the DS record became available in the root, we were able to use the 'ods-enforcer key ds-seen --zone=nl --keytag=17153' command to tell OpenDNSSEC to go ahead with the rollover. After a short interval, we were then able to initiate the procedure for deleting the RSA DS record from the root zone. That procedure involves IANA performing various checks. Ultimately, a new root zone was published containing only the ECDSA DS record. That was subsequently picked up by resolvers around the world, as shown by the orange line in the charts above.
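Before telling OpenDNSSEC that a DS record has been seen, it is worth confirming that the record really is visible to every resolver being monitored. The following is one possible way to automate that check, not a description of our own tooling: it assumes that the output of `dig +short ds nl` has been collected per resolver, and the helper names are hypothetical.

```python
def ds_key_tags(dig_short_ds):
    """Extract key tags from 'dig +short ds' output.

    Each line has the form: keytag algorithm digest-type digest...
    """
    tags = set()
    for line in dig_short_ds.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            tags.add(int(fields[0]))
    return tags


def ds_visible_everywhere(outputs_by_resolver, key_tag):
    """True only if every resolver returned a DS with the given key tag."""
    if not outputs_by_resolver:
        return False
    return all(
        key_tag in ds_key_tags(output)
        for output in outputs_by_resolver.values()
    )


def ds_gone_everywhere(outputs_by_resolver, key_tag):
    """True only if no resolver still returns a DS with the given key tag."""
    if not outputs_by_resolver:
        return False
    return all(
        key_tag not in ds_key_tags(output)
        for output in outputs_by_resolver.values()
    )
```

With key tag 17153, `ds_visible_everywhere` corresponds to the precondition for `ds-seen`; with key tag 34112, `ds_gone_everywhere` corresponds to the precondition for `ds-gone`.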
Once we were sure that the RSA keys were no longer in use anywhere, we gave the 'ods-enforcer key ds-gone --zone=nl --keytag=34112' command to tell OpenDNSSEC to stop using the old keys. That resulted in a big difference between two successive versions of the zone, and a delay before the new zone was available on all name servers. The following diagram shows a point in time when the new zone was not yet available on all name servers.
Figure 4: Visualisation of a point in time when the new zone was not yet available on all name servers.
The warning triangles in the diagram are due to the slow propagation of the zone. The new zone was available everywhere after a few hours.
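Slow propagation of this kind can also be spotted by comparing the SOA serial served by each name server. The sketch below assumes that the output of `dig +short +norec soa nl @<server>` (without `+dnssec`, so that only the SOA line is returned) has been collected per server; the function names and the server names in the usage note are hypothetical. Serials are compared as plain integers, which ignores the wrap-around rules of RFC 1982.

```python
def soa_serial(dig_short_soa):
    """Return the serial from one line of 'dig +short soa' output.

    The line has the form: mname rname serial refresh retry expire minimum
    """
    fields = dig_short_soa.split()
    if len(fields) < 7:
        raise ValueError(f"not a SOA record: {dig_short_soa!r}")
    return int(fields[2])


def lagging_servers(soa_by_server):
    """List the servers whose serial is behind the highest serial seen."""
    if not soa_by_server:
        return []
    serials = {
        server: soa_serial(output)
        for server, output in soa_by_server.items()
    }
    newest = max(serials.values())  # simplification: no RFC 1982 arithmetic
    return sorted(server for server, value in serials.items() if value != newest)
```

An empty result from `lagging_servers({"ns-a": ..., "ns-b": ...})` means that all the servers queried are serving the same version of the zone.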
Figure 5: Visualisation of the final situation, where the new zone is available on all name servers.
The algorithm rollover was performed without most people being aware of it, just as we intended. We can therefore say that we have successfully transitioned to algorithm 13, thus bringing the .nl zone back into line with the latest recommendations in this field. We subsequently switched to algorithm 13 for the .amsterdam, .aw and .politie domains as well. We expect that the next rollover will be a normal KSK rollover. In due course, however, when post-quantum cryptography for DNSSEC becomes available, another algorithm rollover will be required.
We were asked by the DNS community whether it might have been preferable to initially publish only the RRSIGs, and then publish the DNSKEY RRset later, as described in RFC 6781. However, our tests indicated that it was not necessary to follow RFC 6781, which is merely informational. The approach we adopted was in line with RFC 6840.
The kasp.xml used for testing
<?xml version="1.0" encoding="UTF-8"?>
<KASP>
  <Policy name="default">
    <Description>SIDN default (nl)</Description>
    <Signatures>
      <Resign>PT5H</Resign>
      <Refresh>P3D</Refresh>
      <Validity>
        <Default>P4D</Default>
        <Denial>P4D</Denial>
      </Validity>
      <Jitter>PT12H</Jitter>
      <InceptionOffset>PT600S</InceptionOffset>
      <MaxZoneTTL>PT3600S</MaxZoneTTL>
    </Signatures>
    <Denial>
      <NSEC3>
        <Resalt>P90D</Resalt>
        <Hash>
          <Algorithm>1</Algorithm>
          <Iterations>0</Iterations>
          <Salt length="0"/>
        </Hash>
      </NSEC3>
    </Denial>
    <Keys>
      <!-- Parameters for both KSK and ZSK -->
      <TTL>PT3600S</TTL>
      <RetireSafety>PT360S</RetireSafety>
      <PublishSafety>PT360S</PublishSafety>
      <Purge>P1D</Purge>
      <!-- Parameters for KSK only -->
      <KSK>
        <Algorithm length="2048">8</Algorithm>
        <Lifetime>P6Y</Lifetime>
        <Repository>HSM-HAgroup</Repository>
        <Standby>0</Standby>
        <ManualRollover/>
      </KSK>
      <!-- Parameters for ZSK only -->
      <ZSK>
        <Algorithm length="1024">8</Algorithm>
        <Lifetime>P5D</Lifetime>
        <Repository>HSM-HAgroup</Repository>
        <Standby>0</Standby>
      </ZSK>
    </Keys>
    <Zone>
      <PropagationDelay>PT120S</PropagationDelay>
      <SOA>
        <TTL>PT3600S</TTL>
        <Minimum>PT600S</Minimum>
        <Serial>keep</Serial>
      </SOA>
    </Zone>
    <Parent>
      <!-- implies good monitoring of the settings in the rootzone! -->
      <PropagationDelay>PT60S</PropagationDelay>
      <DS>
        <TTL>PT60S</TTL>
      </DS>
      <SOA>
        <TTL>PT600S</TTL>
        <Minimum>PT600S</Minimum>
      </SOA>
    </Parent>
  </Policy>
  <Policy name="policy-nl">
    <Description>SIDN default (nl) EC policy</Description>
    <Signatures>
      <Resign>PT5H</Resign>
      <Refresh>P3D</Refresh>
      <Validity>
        <Default>P4D</Default>
        <Denial>P4D</Denial>
      </Validity>
      <Jitter>PT12H</Jitter>
      <InceptionOffset>PT600S</InceptionOffset>
      <MaxZoneTTL>PT3600S</MaxZoneTTL>
    </Signatures>
    <Denial>
      <NSEC3>
        <Resalt>P90D</Resalt>
        <Hash>
          <Algorithm>1</Algorithm>
          <Iterations>0</Iterations>
          <Salt length="0"/>
        </Hash>
      </NSEC3>
    </Denial>
    <Keys>
      <!-- Parameters for both KSK and ZSK -->
      <TTL>PT3600S</TTL>
      <RetireSafety>PT360S</RetireSafety>
      <PublishSafety>PT360S</PublishSafety>
      <Purge>P1D</Purge>
      <!-- Parameters for KSK only -->
      <KSK>
        <Algorithm length="256">13</Algorithm>
        <Lifetime>P6Y</Lifetime>
        <Repository>HSM-HAgroup</Repository>
        <Standby>0</Standby>
        <ManualRollover/>
      </KSK>
      <!-- Parameters for ZSK only -->
      <ZSK>
        <Algorithm length="256">13</Algorithm>
        <Lifetime>P5D</Lifetime>
        <Repository>HSM-HAgroup</Repository>
        <Standby>0</Standby>
      </ZSK>
    </Keys>
    <Zone>
      <PropagationDelay>PT120S</PropagationDelay>
      <SOA>
        <TTL>PT3600S</TTL>
        <Minimum>PT600S</Minimum>
        <Serial>keep</Serial>
      </SOA>
    </Zone>
    <Parent>
      <!-- implies good monitoring of the settings in the rootzone! -->
      <PropagationDelay>PT60S</PropagationDelay>
      <DS>
        <TTL>PT60S</TTL>
      </DS>
      <SOA>
        <TTL>PT600S</TTL>
        <Minimum>PT600S</Minimum>
      </SOA>
    </Parent>
  </Policy>
</KASP>