[This is an updated version of a blog post from about 2 years ago to reflect current situation and fix links].
According to various surveys a large majority of recursive resolvers and authoritative servers are DNSSEC ready. This is measured by looking at the resolver that sends queries with the ENDS0 and DO bit set. Similarly, resolvers can measure this by seeing how many authoritative servers return answers with EDNS0 option.
When .GOV was signed some sites noticed that many queries for .GOV took a longer time and resulted in TCP queries. When .ORG was signed there was a large spike in TCP queries to .ORG servers. Both of these spikes can be attributed to large DNSSEC answers that did not get through “narrow” DNS links. To be fair, in both cases the volume of TCP queries was amplified by “mistakes” in signing the zones.
Both zones are signed by NSEC3 (which is fine) but NSEC3 has certain properties that make some of the answers larger than if NSEC was used. GOV originally used 2048 bit RSA zone signing key; this means each signature was 287 bytes long. In an NSEC3 signed zone each negative answer requires 4 signed RRsets, the SOA record plus 3 NSEC3 answers, resulting in answers over 1500 bytes long. ORG’s mistake was to set the TTL of all records at the zone apex to 0, which effectively killed all negative caching for that zone. Once this was fixed, the TCP traffic volume fell. At the height of the TCP flood ORG’s servers were answering about 15% of all queries over TCP.
Both mistakes are actually a good thing, as they highlight certain important issues for future DNSSEC deployment and can help us avoid them in the future. GOV’s spike of TCP connections was primarily caused by “narrow” pipes; i.e. links that cannot handle DNS packets over certain size. In this discussion we will focus on the size issue and how to address it. ORG’s mistake further amplified TCP traffic from resolvers that are sitting behind “narrow” pipes and have ill-chosen fallback mechanisms when big packets cannot get through.
Some technical background: DNS, as defined 25+ years ago, specified a maximum payload size of 512 bytes over UDP. This was appropriate at the time given the links available. RFC2671, issued in August 1999, specifies a mechanism to extend DNS packets in various ways including specifying larger messages. The RFC recommended larger message size of 4096 has become quite common, but because one of the ugly truths of the Internet is that no path is wider than the narrowest link, the large DNS answer does not get through all the time.
Question: “Why are there narrow links?”
There are number of reasons why large UDP messages cannot pass through links. Most links close to edges use Ethernet frames that are about 1500 bytes long. Thus, large TCP and UDP messages are broken up into units (fragments) smaller than 1500 bytes. A large number of links do not allow UDP fragments to pass through or only the first fragment is passed on. There are multiple reasons why this is the case:
- Some implementations in routers/NAT boxes do not know what to do with UDP fragments
- Some firewalls outlaw fragments as a security risk.
- Some firewalls “know” that all DNS are less than N bytes, thus any DNS packet larger that that is bad and is dropped.
If your site wants to use DNSSEC either as a consumer (in resolution) or as a producer (answering for signed zones) it is important that you know how large a message will get through your links, and if the path is restricted, tune your DNS systems to avoid transmission of packets that will not get through. The leading DNSSEC capable DNS resolver implementations – Bind, Unbound, CNS and MS DNS – have to set/change the advertised UDP size. When DNS resolution problems occur it is human nature to want to point fingers at the remote site, but it is just as likely that the problem is local. Fortunately, there are a number of tools that can be used to check resolvers and their paths:
- DNSFUNNEL is a sub tool in Vantage test tools
- NetAnzlr from UCB is a good network testing tool
Configuring DNS resolvers to ask for smaller answer (using 1480 as example):
in options set edns-udp-size=1480
Unbound: 1.4.0+ (older versions do not support) set parameter
Update: 18 months after the root was signed, there are no reports about big problems related to link sizes, this is a result of people checking their links and zone publishers being aware of the issue and keep their RRset’s small enough that the packet size does not become an issue.