Development of the Domain Name System*

Paul V. Mockapetris
USC Information Sciences Institute, Marina del Rey, California

Kevin J. Dunlap
Digital Equipment Corp., DECwest Engineering, Washington

(Originally published in the Proceedings of SIGCOMM '88, Computer Communication Review Vol. 18, No. 4, August 1988, pp. 123-133.)

*This research was supported by the Defense Advanced Research Projects Agency under contract MDA903-87-C-0719. Views and conclusions contained in this report are the authors' and should not be interpreted as representing the official opinion or policy of DARPA, the U.S. government, or any person or agency connected with them.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

Abstract

The Domain Name System (DNS) provides name service for the DARPA Internet. It is one of the largest name services in operation today, serves a highly diverse community of hosts, users, and networks, and uses a unique combination of hierarchies, caching, and datagram access.

This paper examines the ideas behind the initial design of the DNS in 1983, discusses the evolution of these ideas into the current implementations and usages, notes conspicuous surprises, successes and shortcomings, and attempts to predict its future evolution.

1. Introduction

The genesis of the DNS was the observation, circa 1982, that the HOSTS.TXT system for publishing the mapping between host names and addresses was encountering or headed for problems. HOSTS.TXT is the name of a simple text file, which is centrally maintained on a host at the SRI Network Information Center (SRI-NIC) and distributed to all hosts in the Internet via direct and indirect file transfers.

The problems were that the file, and hence the costs of its distribution, were becoming too large, and that the centralized control of updating did not fit the trend toward more distributed management of the Internet.

Simple growth was one cause of these problems; another was the evolution of the community using HOSTS.TXT from the NCP-based original ARPANET to the IP/TCP-based Internet. The research ARPANET's role had changed from being a single network connecting large timesharing systems to being one of the several long-haul backbone networks linking local networks which were in turn populated with workstations. The number of hosts changed from the number of timesharing systems (roughly organizations) to the number of workstations (roughly users). This increase was directly reflected in the size of HOSTS.TXT, the rate of change in HOSTS.TXT, and the number of transfers of the file, leading to a much larger than linear increase in total resource use for distributing the file. Since organizations were being forced into management of local network addresses, gateways, etc., by the technology anyway, it was quite logical to want to partition the database and allow local control of local name and address spaces. A distributed naming system seemed in order.

Existing distributed naming systems included the DARPA Internet's IEN116 [IEN 116] and the XEROX Grapevine [Birrell 82] and Clearinghouse systems [Oppen 83].
The IEN116 services seemed excessively limited and host specific, and IEN116 did not provide much benefit to justify the costs of renovation. The XEROX system was then, and may still be, the most sophisticated name service in existence, but it was not
clear that its heavy use of replication, light use of caching, and fixed number of hierarchy levels were appropriate for the heterogeneous and often chaotic style of the DARPA Internet. Importing the XEROX design would also have meant importing supporting elements of its protocol architecture. For these reasons, a new design was begun.

The initial design of the DNS was specified in [RFC 882, RFC 883]. The outward appearance is a hierarchical name space with typed data at the nodes. Control of the database is also delegated in a hierarchical fashion. The intent was that the data types be extensible, with the addition of new data types continuing indefinitely as new applications were added. Although the system has been modified and refined in several areas [RFC 973, RFC 974], the current specifications [RFC 1034, RFC 1035] and usage are quite similar to the original definitions.

Drawing an exact line between experimental use and production status is difficult, but 1985 saw some hosts use the DNS as their sole means of accessing naming information. While the DNS has not replaced the HOSTS.TXT mechanism in many older hosts, it is the standard mechanism for hosts, particularly those based on Berkeley UNIX, that track progress in network and operating system design.

2. DNS Design

The base design assumptions for the DNS were that it must:

- Provide at least all of the same information as HOSTS.TXT.
- Allow the database to be maintained in a distributed manner.
- Have no obvious size limits for names, name components, data associated with a name, etc.
- Interoperate across the DARPA Internet and in as many other environments as possible.
- Provide tolerable performance.

Derivative constraints included the following:

- The cost of implementing the system could only be justified if it provided extensible services. In particular, the system should be independent of network topology, and capable of encapsulating other name spaces.
- In order to be universally acceptable, the system should avoid trying to force a single OS, architecture, or organizational style onto its users. This idea applied all the way from concerns about case sensitivity to the idea that the system should be useful for both large timeshared hosts and isolated PCs. In general, we wanted to avoid any constraints on the system due to outside influences and permit as many different implementation structures as possible.

The HOSTS.TXT emulation requirement was not particularly severe, but it did cause an early examination of schemes for storing data other than name-to-address mappings. A hierarchical name space seemed the obvious and minimal solution for the distribution and size requirements. The interoperability and performance constraints implied that the system would have to allow database information to be buffered between the client and the source of the data, since access to the source might not be possible or timely.

The initial DNS design assumed the necessity of striking a balance between a very lean service and a completely general distributed database. A lean service was desirable because it would result in more implementation efforts and early availability. A general design would amortize the cost of introduction across more applications, provide greater functionality, and increase the number of environments in which the DNS would eventually be used. The "leanness" criterion led to a conscious decision to omit many of the functions one might expect in a state-of-the-art database.
In particular, dynamic update of the database with the related atomicity, voting, and backup considerations was omitted. The intent was to add these eventually, but it was believed that a system that included these features would be viewed as too complex to be accepted by the community.

2.1 The architecture

The active components of the DNS are of two major types: name servers and resolvers. Name servers are repositories of information, and answer queries using whatever information they possess. Resolvers interface to client programs, and embody the algorithms necessary to find a name server that has the information sought by the client.

These functions may be combined or separated to suit the needs of the environment. In many cases, it is useful to centralize the resolver function in one or more special name servers for an organization. This structure shares the use of cached information, and also allows less capable hosts, such as PCs, to rely on the resolving services of special servers without needing a resolver in the PC.
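As a rough sketch of this division of labor, the following Python fragment (ours, not from the paper; all class and field names are invented for illustration) models a name server that answers only from the information it possesses, and a shared organizational resolver whose cache is reused across clients:

```python
# Illustrative sketch only: invented classes, not part of any DNS spec.

class NameServer:
    """Repository of information; answers queries from what it possesses."""
    def __init__(self, records):
        self.records = records                      # {(name, rtype): rdata}

    def query(self, name, rtype):
        return self.records.get((name.lower(), rtype))  # None if unknown

class SharedResolver:
    """Centralized resolver for an organization; its cache is shared."""
    def __init__(self, servers):
        self.servers = servers
        self.cache = {}

    def lookup(self, name, rtype):
        key = (name.lower(), rtype)
        if key in self.cache:                       # reuse earlier answers
            return self.cache[key]
        for server in self.servers:                 # find a server that knows
            answer = server.query(name, rtype)
            if answer is not None:
                self.cache[key] = answer
                return answer
        return None

# A PC with no resolver of its own simply asks the shared resolver.
ns = NameServer({("venera.isi.edu", "A"): "128.9.0.32"})
resolver = SharedResolver([ns])
print(resolver.lookup("VENERA.ISI.EDU", "A"))       # '128.9.0.32'
```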
2.2 The name space

The DNS internal name space is a variable-depth tree where each node in the tree has an associated label. The domain name of a node is the concatenation of all labels on the path from the node to the root of the tree. Labels are variable-length strings of octets, and each octet in a label can be any 8-bit value. The zero-length label is reserved for the root. Name space searching operations (for the operations defined at present) are done in a case-insensitive manner (assuming ASCII). Thus the labels "Paul", "paul", and "PAUL" would match each other. This matching rule effectively prohibits the creation of brother nodes with labels having equivalent spelling but different case. The rationale for this system is that it allows the sources of information to specify its canonical case, but frees users from having to deal with case. Labels are limited to 63 octets and names are restricted to 256 octets total as an aid to implementation, but this limit could be easily changed if the need arose.

The DNS specification avoids defining a standard printing rule for the internal name format in order to encourage DNS use to encode existing structured names. Configuration files in the domain system represent names as character strings separated by dots, but applications are free to do otherwise. For example, host names use the internal DNS rules, so VENERA.ISI.EDU is a name with four labels (the null name of the root is usually omitted). Mailbox names, stated as USER@DOMAIN (or more generally as local-part@organization), encode the text to the left of the "@" in a single label (perhaps including ".") and use the dot-delimiting DNS configuration file rule for the part following the "@". Similar encodings could be developed for file names, etc.

The DNS also decouples the structure of the tree from any implicit semantics. This is not done to keep names free of all implicit semantics, but to leave the choices for these implicit semantics wide open for the application. Thus the name of a host might have more or fewer labels than the name of a user, and the tree is not organized by network or other grouping. Particular sections of the name space have very strong implicit semantics associated with a name, particularly when the DNS encapsulates an existing name space or is used to provide inverse mappings (e.g. IN-ADDR.ARPA, the IP addresses to host name section of the domain space), but the default assumption is that the only way to tell definitely what a name represents is to look at the data associated with the name.

The recommended name space structure for hosts, users, and other typical applications is one that mirrors the structure of the organization controlling the local domain. This is convenient since the DNS features for distributing control of the database are most efficient when the distribution parallels the tree structure. An administrative decision [RFC 920] was made to make the top levels correspond to country codes or broad organization types (for example EDU for educational, MIL for military, UK for Great Britain).
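To make the matching and length rules concrete, here is a small illustrative sketch (ours; the constants mirror the limits quoted above) of label validation, case-insensitive comparison, and the dot-separated printing convention:

```python
# Illustrative sketch; constants mirror the limits quoted in the text.
MAX_LABEL = 63      # octets per label
MAX_NAME = 256      # octets per complete name

def valid_label(label: bytes) -> bool:
    # Each octet may be any 8-bit value; only length is constrained.
    # The zero-length label is reserved for the root.
    return len(label) <= MAX_LABEL

def labels_match(a: bytes, b: bytes) -> bool:
    # Case-insensitive matching, assuming ASCII.
    return a.lower() == b.lower()

def print_name(labels):
    # Configuration files join labels with dots; the null root label
    # is usually omitted.
    name = b".".join(labels)
    assert all(valid_label(l) for l in labels) and len(name) <= MAX_NAME
    return name.decode()

print(labels_match(b"Paul", b"PAUL"))               # True
print(print_name([b"VENERA", b"ISI", b"EDU"]))      # VENERA.ISI.EDU
```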
2.3 Data attached to names

Since the DNS should not constrain the data that applications can attach to a name, it can't fix the data's format completely. Yet the DNS did need to specify some primitives for data structuring so that replies to queries could be limited to relevant information, and so the DNS could use its own services to keep track of servers, server addresses, etc. Data for each name in the DNS is organized as a set of resource records (RRs); each RR carries a well-known type and class field, followed by applications data. Multiple values of the same type are represented as separate RRs.

Types are meant to represent abstract resources or functions, for example, host addresses and mailboxes. About 15 are currently defined. The class field is meant to divide the database orthogonally from type, and specifies the protocol family or instance. The DARPA Internet has a class, and we imagined that classes might be allocated to CHAOS, ISO, XNS or similar protocol families. We also hoped to try setting up function-specific classes that would be independent of protocol (e.g. a universal mail registry). Three classes are allocated at present: DARPA Internet, CHAOS, and Hesiod.

The decision to use multiple RRs of a single type rather than including multiple values in a single RR differed from that used in the XEROX system, and was not a clear choice. The space efficiency of the single RR with multiple values was attractive, but the multiple RR option cut down the maximum RR size. This appeared to promise simpler dynamic update protocols, and also seemed suited to use in a limited-size datagram environment (i.e. a response could carry only those items that fit in a maximum size packet without regard to partial RR transport).
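A minimal sketch of this data model, with invented field names, might look as follows; note that two host addresses appear as two separate RRs rather than one RR with a list of values (the addresses are borrowed from the VENERA.ISI.EDU examples in the DNS specifications):

```python
from collections import namedtuple

# Invented field names for illustration; rdata is opaque application data.
RR = namedtuple("RR", ["name", "rclass", "rtype", "rdata"])

# Two addresses for one host: two separate RRs, not one multi-valued RR.
records = [
    RR("venera.isi.edu", "IN", "A", "10.1.0.52"),
    RR("venera.isi.edu", "IN", "A", "128.9.0.32"),
    RR("venera.isi.edu", "IN", "MX", "10 venera.isi.edu"),
]

def rrset(name, rclass, rtype):
    # A reply to a query can be limited to just the relevant records,
    # and any subset of whole RRs fits naturally in a small datagram.
    return [r for r in records
            if (r.name, r.rclass, r.rtype) == (name.lower(), rclass, rtype)]

print(rrset("VENERA.ISI.EDU", "IN", "A"))           # the two A records
```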
2.4 Database distribution

The DNS provides two major mechanisms for transferring data from its ultimate source to ultimate destination: zones and caching. Zones are sections of the system-wide database which are controlled by a specific organization. The organization controlling a zone is responsible for distributing current copies of the zones to multiple servers which make the zones available to clients throughout the Internet. Zone transfers are typically initiated by changes to the data in the zone. Caching is a mechanism whereby data acquired in response to a client's request can be locally stored against future requests by the same or other client.

Note that the intent is that both of these mechanisms be invisible to the user, who should see a single database without obvious boundaries.

Zones

A zone is a complete description of a contiguous section of the total tree name space, together with some "pointer" information to other contiguous zones. Since zone divisions can be made between any two connected nodes in the total name space, a zone could be a single node or the whole tree, but is typically a simple subtree.

From an organization's point of view, it gets control of a zone of the name space by persuading a parent organization to delegate a subzone consisting of a single node. The parent organization does this by inserting RRs in its zone which mark a zone division. The new zone can then be grown to arbitrary size and further delegated without involving the parent, although the parent always retains control of the initial delegation. For example, the ISI.EDU zone was created by persuading the owner of the EDU domain to mark a zone boundary between EDU and ISI.EDU.

The responsibilities of the organization include the maintenance of the zone's data and providing redundant servers for the zone. The typical zone is maintained in a text form called a master file by some system administrator and loaded into one master server. The redundant servers are either manually reloaded, or use an automatic zone refresh algorithm which is part of the DNS protocol. The refresh algorithm queries a serial number in the master's zone data, then copies the zone only if the serial number has increased. Zone transfers require TCP for reliability.

A particular name server can support any number of zones which may or may not be contiguous. The name server for a zone need not be part of that zone. This scheme allows almost arbitrary distribution, but is most efficient when the database is distributed in parallel with the name hierarchy. When a server answers from zone data, as opposed to cached data, it marks the answer as being authoritative.

A goal behind this scheme is that an organization should be able to have a domain, even if it lacks the communication or host resources for supporting the domain's name service. One method is that organizations with resources for a single server can form buddy systems with another organization of similar means. This can be especially desirable to clients when the organizations are far apart (in network terms), since it makes the data available from separated sites. Another way is that servers agree to provide name service for large communities such as CSNET and UUCP, and receive master files via mail or FTP from their subscribers.
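The refresh check is simple enough to sketch directly. In this illustrative Python fragment (our names; the actual TCP transfer is elided), a redundant server polls the master's serial number and copies the zone only when it has increased:

```python
# Illustrative sketch; the real transfer runs over TCP for reliability.

class MasterServer:
    def __init__(self, zone_data, serial):
        self.zone_data = dict(zone_data)
        self.serial = serial            # incremented on each zone change

    def edit(self, name, rdata):
        self.zone_data[name] = rdata
        self.serial += 1

class RedundantServer:
    def __init__(self, master):
        self.master = master
        self.serial = -1                # nothing loaded yet
        self.zone_data = {}

    def refresh(self):
        # Query the serial number first; copy the zone only if it grew.
        if self.master.serial > self.serial:
            self.zone_data = dict(self.master.zone_data)
            self.serial = self.master.serial
            return True                 # zone transferred
        return False                    # already current; nothing sent

master = MasterServer({"venera.isi.edu": "128.9.0.32"}, serial=1)
backup = RedundantServer(master)
print(backup.refresh())                 # True: initial transfer
print(backup.refresh())                 # False: serial unchanged
master.edit("vaxa.isi.edu", "10.2.0.27")
print(backup.refresh())                 # True: serial increased
```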
Caching

In addition to the planned distribution of data via zone transfers, the DNS resolvers and combined name server/resolver programs also cache responses for use by later queries. The mechanism for controlling caching is a time-to-live (TTL) field attached to each RR. This field, in units of seconds, represents the length of time that the response can be reused. A zero TTL suppresses caching. The administrator defines TTL values for each RR as part of the zone definition; a low TTL is desirable in that it minimizes periods of transient inconsistency, while a high TTL minimizes traffic and allows caching to mask periods of server unavailability due to network or host problems. Software components are required to behave as if they continuously decremented TTLs of data in caches. The recommended TTL value for host names is two days.

Our intent is that cached answers be as good as answers from an authoritative server, excepting changes made within the TTL period. However, all components of the DNS prefer authoritative information to cached information when both are available locally.
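The sketch below (ours, with invented names) captures this caching discipline: a zero TTL suppresses caching, entries behave as if their TTLs were continuously decremented, and authoritative data is preferred when available:

```python
import time

# Illustrative sketch of TTL-driven caching as described in the text.

class Cache:
    def __init__(self):
        self.entries = {}               # key -> (expiry_time, rdata)

    def store(self, key, rdata, ttl):
        if ttl == 0:
            return                      # zero TTL: do not cache at all
        self.entries[key] = (time.time() + ttl, rdata)

    def fetch(self, key):
        entry = self.entries.get(key)
        if entry and time.time() < entry[0]:
            return entry[1]
        self.entries.pop(key, None)     # expired or absent
        return None

def answer(key, zone_data, cache):
    if key in zone_data:                # authoritative answer preferred
        return zone_data[key], True
    return cache.fetch(key), False      # otherwise fall back to the cache

cache = Cache()
cache.store(("venera.isi.edu", "A"), "128.9.0.32", ttl=2 * 24 * 3600)
print(answer(("venera.isi.edu", "A"), {}, cache))   # ('128.9.0.32', False)
```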
3. Current Implementation Status

The DNS is in use throughout the DARPA Internet. [RFC 1031] catalogs a dozen implementations or ports, ranging from the ubiquitous support provided as part of Berkeley UNIX, through implementations for IBM-PCs, Macintoshes, LISP machines, and fuzzballs [Mills 88]. Although the HOSTS.TXT mechanism is still used by older hosts, the DNS is the recommended mechanism. Hosts available through HOSTS.TXT form an ever-dwindling subset of all hosts; a recent measurement [Stahl 87] showed approximately 5,500 host names in the present HOSTS.TXT, while over 20,000 host names were available via the DNS.

The current domain name space is partitioned into roughly 30 top level domains. Although a top level domain is reserved for each country (approximately 25 in use, e.g. US, UK), the majority of hosts and subdomains are named under six top level domains named for organization types (e.g. educational is EDU, commercial is COM). Some hosts claim multiple names in different domains, though usually one name is primary and others are aliases. The SRI-NIC manages the zones for all of the non-country, top-level domains, and delegates lower domains to individual universities, companies, and other organizations who wish to manage their own name space.

The delegation of subdomains by the SRI-NIC has grown steadily. In February of 1987, roughly 300 domains were delegated. As of March 1988, over 650 domains are delegated. Approximately 400 represent normal name spaces controlled by organizations other than the SRI-NIC, while 250 of these delegated domains represent network address spaces (i.e. parts of IN-ADDR.ARPA) no longer controlled by the NIC.

Two good examples of contemporary DNS use are the so-called "root servers", which are the redundant name servers that support the top levels of the domain name space, and the Berkeley subdomain, which is one of the domains delegated by the SRI-NIC in the EDU domain.

3.1 Root servers

The basic search algorithm for the DNS allows a resolver to search "downward" from domains that it can access already. Resolvers are typically configured with "hints" pointing at servers for the root node and the top of the local domain. Thus if a resolver can access any root server it can access all of the domain space, and if the resolver is in a network partitioned from the rest of the Internet, it can at least access local names. Although a resolver accesses root servers less as the resolver builds up cached information about servers for lower domains, the availability of root servers is an important robustness issue, and root server activity monitoring provides insights into DNS usage.

Since access to the root and other top level zones is so important, the root domain, together with other top-level domains managed by the SRI-NIC, is supported by seven redundant name servers. These root servers are scattered across the major long haul backbone networks of the Internet, and are also redundant in that three are TOPS-20 systems running JEEVES and four are UNIX systems running BIND. The typical traffic at each root server is on the order of a query per second, with correspondingly higher rates when other root servers are down or otherwise unavailable. While the broad trend in query rate has generally been upward, day-to-day and month-to-month comparisons of load are driven more by changes in implementation algorithms and timeout tuning than by growth in client population. For example, one bad release of popular domain software drove averages to over five times the normal load for extended periods. At present, we estimate that 50% of all root server traffic could be eliminated by improvements in various resolver implementations to use less aggressive retransmission and better caching.
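The sketch below (ours; the referral machinery is greatly simplified) illustrates this downward walk: the resolver starts from its configured hints and follows referrals to servers for lower-level zones until one of them can answer:

```python
# Illustrative sketch of the downward search from hints.

class Server:
    def __init__(self, data=None, delegations=None):
        self.data = data or {}                  # name -> rdata
        self.delegations = delegations or {}    # zone suffix -> [Server]

    def query(self, name):
        if name in self.data:
            return self.data[name], None        # an answer, no referral
        for suffix, servers in self.delegations.items():
            if name.endswith(suffix):
                return None, servers            # referral to a lower zone
        return None, None

def resolve(name, hints):
    # Start from the hints (root and local-domain servers) and walk
    # downward, following referrals until a server has the answer.
    servers = list(hints)
    while servers:
        answer, referral = servers.pop(0).query(name)
        if answer is not None:
            return answer
        if referral:
            servers = list(referral)
    return None

isi = Server(data={"venera.isi.edu": "128.9.0.32"})
edu = Server(delegations={"isi.edu": [isi]})
root = Server(delegations={"edu": [edu]})
print(resolve("venera.isi.edu", [root]))        # walks root -> EDU -> ISI
```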
The number of clients which access root servers can be estimated based on measurement tools on the TOPS-20 version. These root servers keep track of the first 200 clients after root server initialization, and the first 200 clients typically account for 90% or more of all queries at any single server. Coordinated measurements at the three TOPS-20 root servers typically show approximately 350 distinct clients in the 600 entries. The number of clients is falling as more organizations adopt strategies that concentrate queries and caching for accesses outside of the local organization. The clients appear to use static priorities for selecting which root server to use, and failure of a particular root server results in an immediate increase in traffic at other servers.

The vast majority of queries are of four types: all information (25 to 40%), host name to address mappings (30 to 40%), address to host mappings (10 to 15%), and new style mail information called MX (less than 10%). Again, these numbers vary widely as new software distributions spread. The root servers refer 10 to 15% of all queries to servers for lower level domains.

3.2 Berkeley

UNIX support for the DNS was provided by the University of California, Berkeley, partially as research in distributed systems, and partially out of necessity due to growth in the campus network [Dunlap 86a, Dunlap 86b]. The result is the Berkeley Internet Name Domain (BIND) server.