Internet Indirection Infrastructure

Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker†, Sonesh Surana
University of California, Berkeley
{istoica, dadkins, shelleyz, sonesh}@cs.berkeley.edu

ABSTRACT
Attempts to generalize the Internet's point-to-point communication abstraction to provide services like multicast, anycast, and mobility have faced challenging technical problems and deployment barriers. To ease the deployment of such services, this paper proposes an overlay-based Internet Indirection Infrastructure (i3) that offers a rendezvous-based communication abstraction. Instead of explicitly sending a packet to a destination, each packet is associated with an identifier; this identifier is then used by the receiver to obtain delivery of the packet. This level of indirection decouples the act of sending from the act of receiving, and allows i3 to efficiently support a wide variety of fundamental communication services. To demonstrate the feasibility of this approach, we have designed and built a prototype based on the Chord lookup protocol.

Categories and Subject Descriptors
H.4.3 [Information Systems]: Communication

General Terms
Design

Keywords
Indirection, Abstraction, Scalable, Internet, Architecture

This research was sponsored by NSF under grant numbers Career Award ANI-0133811, and ITR Award ANI-0085879. Views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of NSF, or the U.S. government.
† ICSI Center for Internet Research (ICIR), Berkeley, shenker@icsi.berkeley.edu

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCOMM '02 Pittsburgh, Pennsylvania USA. Copyright 2002 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION
The original Internet architecture was designed to provide unicast point-to-point communication between fixed locations. In this basic service, the sending host knows the IP address of the receiver and the job of IP routing and forwarding is simply to deliver packets to the (fixed) location of the desired IP address. The simplicity of this point-to-point communication abstraction contributed greatly to the scalability and efficiency of the Internet.

However, many applications would benefit from more general communication abstractions, such as multicast, anycast, and host mobility. In these abstractions, the sending host no longer knows the identity of the receiving hosts (multicast and anycast) and the location of the receiving host need not be fixed (mobility). Thus, there is a significant and fundamental mismatch between the original point-to-point abstraction and these more general ones. All attempts to implement these more general abstractions have relied on a layer of indirection that decouples the sending hosts from the receiving hosts; for example, senders send to a group address (multicast or anycast) or a home agent (mobility), and the IP layer of the network is responsible for delivering the packet to the appropriate location(s).

Although these more general abstractions would undoubtedly bring significant benefit to end-users, it remains unclear how to achieve them. These abstractions have proven difficult to implement scalably at the IP layer [4, 13, 27]. Moreover, deploying additional functionality at the IP layer requires a level of community-wide consensus and commitment that is hard to achieve.
In short, implementing these more general abstractions at the IP layer poses difficult technical problems and major deployment barriers.

In response, many researchers have turned to application-layer solutions (either end-host or overlay mechanisms) to support these abstractions [4, 15, 24]. While these proposals achieve the desired functionality, they do so in a very disjointed fashion in that solutions for one service are not solutions for other services; e.g., proposals for application-layer multicast don't address mobility, and vice-versa. As a result, many similar and largely redundant mechanisms are required to achieve these various goals. In addition, if overlay solutions are used, adding a new abstraction requires the deployment of an entirely new overlay infrastructure.

In this paper, we propose a single new overlay network that serves as a general-purpose Internet Indirection Infrastructure (i3). i3 offers a powerful and flexible rendezvous-based communication abstraction; applications can easily implement a variety of communication services, such as multicast, anycast, and mobility, on top of this communication abstraction. Our approach provides a general overlay service that avoids both the technical and deployment challenges inherent in IP-layer solutions and the redundancy and lack of synergy in more traditional application-layer approaches. We thus hope to combine the generality of IP-layer solutions with the deployability of overlay solutions.

The paper is organized as follows. In Sections 2 and 3 we provide an overview of the i3 architecture and then a general discussion on how i3 might be used in applications. Section 4 covers additional aspects of the design such as scalability and efficient routing. Section 5 describes some simulation results on i3 performance along with a discussion on an initial implementation. Related work is discussed in Section 6, followed by a discussion on future work in Section 7. We conclude with a summary in Section 8.

i3's Application Programming Interface (API):
  sendPacket(p)       send packet
  insertTrigger(t)    insert trigger
  removeTrigger(t)    remove trigger

Figure 1: (a) i3's API. Example illustrating communication between two nodes. (b) The receiver R inserts trigger (id, R). (c) The sender sends packet (id, data).

2. i3 OVERVIEW
In this section we present an overview of i3. We start with the basic service model and communication abstraction, then briefly describe i3's design.

2.1 Service Model
The purpose of i3 is to provide indirection; that is, it decouples the act of sending from the act of receiving. The i3 service model is simple: sources send packets to a logical identifier, and receivers express interest in packets sent to an identifier. Delivery is best-effort like in today's Internet, with no guarantees about packet delivery.

This service model is similar to that of IP multicast. The crucial difference is that the i3 equivalent of an IP multicast join is more flexible. IP multicast offers a receiver a binary decision of whether or not to receive packets sent to that group (this can be indicated on a per-source basis). It is up to the multicast infrastructure to build efficient delivery trees. The i3 equivalent of a join is inserting a trigger. This operation is more flexible than an IP multicast join as it allows receivers to control the routing of the packet. This provides two advantages.
First, it allows them to create, at the application level, services such as mobility, anycast, and service composition out of this basic service model. Thus, this one simple service model can be used to support a wide variety of application-level communication abstractions, alleviating the need for many parallel and redundant overlay infrastructures. Second, the infrastructure can give responsibility for efficient tree construction to the end-hosts. This allows the infrastructure to remain simple, robust, and scalable.

2.2 Rendezvous-Based Communication
The service model is instantiated as a rendezvous-based communication abstraction. In their simplest form, packets are pairs (id, data) where id is an m-bit identifier and data consists of a payload (typically a normal IP packet payload). Receivers use triggers to indicate their interest in packets. In the simplest form, triggers are pairs (id, addr), where id represents the trigger identifier, and addr represents a node's address which consists of an IP address and a port number. A trigger (id, addr) indicates that all packets with an identifier id should be forwarded (at the IP level) by the i3 infrastructure to the node identified by addr. More specifically, the rendezvous-based communication abstraction exports three basic primitives as shown in Figure 1(a).

Figure 1(b) illustrates the communication between two nodes, where receiver R wants to receive packets sent to id. The receiver inserts the trigger (id, R) into the network. When a packet is sent to identifier id, the trigger causes it to be forwarded via IP to R. Thus, much as in IP multicast, the identifier id represents a logical rendezvous between the sender's packets and the receiver's trigger. This level of indirection decouples the sender from the receiver. The senders need neither be aware of the number of receivers nor their location.
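The three primitives of Figure 1(a) can be illustrated with a small in-memory model. The sketch below is only a toy under stated assumptions: a single Python object stands in for the whole i3 overlay, matching is exact, and the class name `RendezvousPoint` and the method names are hypothetical, not the prototype's API.

```python
from collections import defaultdict


class RendezvousPoint:
    """Toy single-node stand-in for i3 (hypothetical; exact matching only)."""

    def __init__(self):
        # identifier -> set of receiver addresses, each an (IP, port) pair
        self.triggers = defaultdict(set)

    def insert_trigger(self, ident, addr):
        """Register interest of `addr` in packets sent to `ident`."""
        self.triggers[ident].add(addr)

    def remove_trigger(self, ident, addr):
        """Withdraw a previously inserted trigger, if present."""
        self.triggers[ident].discard(addr)
        if not self.triggers[ident]:
            del self.triggers[ident]

    def send_packet(self, ident, data):
        """Return the (addr, data) pairs that would be handed to IP."""
        return [(addr, data) for addr in sorted(self.triggers.get(ident, ()))]


# The exchange of Figure 1(b) and 1(c): R inserts (id, R), S sends (id, data).
i3 = RendezvousPoint()
receiver = ("10.0.0.7", 5000)
i3.insert_trigger(0x2A, receiver)
print(i3.send_packet(0x2A, b"hello"))
i3.remove_trigger(0x2A, receiver)
print(i3.send_packet(0x2A, b"hello"))
```

The sender only ever names the identifier `0x2A`; the receiver's address appears only in the trigger, which is the decoupling the text describes.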
Similarly, receivers need not be aware of the number or location of senders.

The above description is the simplest form of the abstraction. We now describe a generalization that allows inexact matching between identifiers. (A second generalization that replaces identifiers with a stack of identifiers is described in Section 2.5.) We assume identifiers are m bits long and that there is some exact-match threshold k with k < m. We then say that an identifier id_t in a trigger matches an identifier id in a packet if and only if

(a) id and id_t have a prefix match of at least k bits, and
(b) there is no trigger with an identifier that has a longer prefix match with id.

In other words, a trigger identifier id_t matches a packet identifier id if and only if id_t is a longest prefix match (among all other trigger identifiers) and this prefix match is at least as long as the exact-match threshold k. The value k is chosen to be large enough so that the probability that two randomly chosen identifiers match is negligible (see footnote 1). This allows end-hosts to choose the identifiers independently with negligible chance of collision.

2.3 Overview of the Design
We now briefly describe the infrastructure that supports this rendezvous communication abstraction; a more in-depth description follows in Section 4. i3 is an overlay network which consists of a set of servers that store triggers and forward packets (using IP) between i3 nodes and to end-hosts. Identifiers and triggers have meaning only in this i3 overlay.

One of the main challenges in implementing i3 is to efficiently match the identifiers in the packets to those in triggers. This is done by mapping each identifier to a unique i3 node (server); at any given time there is a single i3 node responsible for a given id. When a trigger (id, addr) is inserted, it is stored on the i3 node responsible for id.
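The inexact-matching rule of Section 2.2, as it might be evaluated at the node that stores the triggers, can be sketched as follows. This is an illustration only: identifiers are modeled as Python integers of m bits, the function names are hypothetical, and the defaults m = 256 and k = 128 are the values the paper reports for its implementation.

```python
def common_prefix_len(a: int, b: int, m: int) -> int:
    """Number of leading bits (out of m) on which identifiers a and b agree."""
    diff = a ^ b
    return m if diff == 0 else m - diff.bit_length()


def matching_triggers(packet_id, trigger_ids, m=256, k=128):
    """Trigger identifiers that match packet_id under conditions (a) and (b)."""
    best = max((common_prefix_len(packet_id, t, m) for t in trigger_ids), default=0)
    if best < k:
        # Condition (a) fails: no trigger shares at least k leading bits.
        return []
    # Condition (b): keep only the triggers with the longest prefix match.
    return [t for t in trigger_ids if common_prefix_len(packet_id, t, m) == best]


# Small example with m = 8 and k = 4 so the bit patterns are readable.
triggers = [0b10110000, 0b10110100, 0b10100110]
print([bin(t) for t in matching_triggers(0b10110110, triggers, m=8, k=4)])
```

Several triggers that carry the same identifier tie for the longest prefix and are all returned, so the same rule covers the case where more than one receiver registers one identifier.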
When a packet is sent to id, it is routed by i3 to the node responsible for id; there it is matched against any triggers for that id and forwarded (using IP) to all hosts interested in packets sent to that identifier. To facilitate inexact matching, we require that all ids that agree in the first k bits be stored on the same i3 server. The longest prefix match required for inexact matching can then be executed at a single node (where it can be done efficiently).

Note that packets are not stored in i3; they are only forwarded. i3 provides a best-effort service like today's Internet. i3 implements neither reliability nor ordered delivery on top of IP. End-hosts use periodic refreshing to maintain their triggers in i3. Hosts contact an i3 node when sending i3 packets or inserting triggers. This i3 node then forwards these packets or triggers to the i3 node responsible for the associated identifiers. Thus, hosts need only know one i3 node in order to use the i3 infrastructure.

Footnote 1: In our implementation we choose m = 256 and k = 128.

Figure 2: Communication abstractions provided by i3. (a) Mobility: The change of the receiver's address from R to R' is transparent to the sender. (b) Multicast: Every packet (id, data) is forwarded to each receiver R_i that inserts the trigger (id, R_i). (c) Anycast: The packet matches the trigger of receiver R2. id_p | id_s denotes an identifier of size m, where id_p represents the prefix of the k most significant bits, and id_s represents the suffix of the m - k least significant bits.

2.4 Communication Primitives Provided by i3
We now describe how i3 can be used by applications to achieve the more general communication abstractions of mobility, multicast, and anycast.

2.4.1 Mobility
The form of mobility addressed here is when a host (e.g., a laptop) is assigned a new address when it moves from one location to another.
A mobile host that changes its address from R to R' as a result of moving from one subnet to another can preserve the end-to-end connectivity by simply updating each of its existing triggers from (id, R) to (id, R'), as shown in Figure 2(a). The sending host need not be aware of the mobile host's current location or address. Furthermore, since each packet is routed based on its identifier to the server that stores its trigger, no additional operation needs to be invoked when the sender moves. Thus, i3 can maintain end-to-end connectivity even when both end-points move simultaneously.

With any scheme that supports mobility, efficiency is a major concern [25]. With i3, applications can use two techniques to achieve efficiency. First, the address of the server storing the trigger is cached at the sender, and thus subsequent packets are forwarded directly to that server via IP. This way, most packets are forwarded through only one i3 server in the overlay network. Second, to alleviate the triangle routing problem due to the trigger being stored at a server far away, end-hosts can use off-line heuristics to choose triggers that are stored at i3 servers close to themselves (see Section 4.5 for details).

2.4.2 Multicast
Creating a multicast group is equivalent to having all members of the group register triggers with the same identifier id. As a result, any packet that matches id is forwarded to all members of the group as shown in Figure 2(b). We discuss how to make this approach scalable in Section 3.4.

Note that unlike IP multicast, with i3 there is no difference between unicast or multicast packets, in either sending or receiving. Such an interface gives maximum flexibility to the application. An application can switch on-the-fly from unicast to multicast by simply having more hosts maintain triggers with the same identifier. For example, in a telephony application this would allow multiple parties to seamlessly join a two-party conversation.
In contrast, with IP, an application has to at least change the IP destination address in order to switch from unicast to multicast.

2.4.3 Anycast
Anycast ensures that a packet is delivered to exactly one receiver in a group, if any. Anycast enables server selection, a basic building block for many of today's applications. To achieve this with i3, all hosts in an anycast group maintain triggers which are identical in the k most significant bits. These k bits play the role of the anycast group identifier. To send a packet to an anycast group, a sender uses an identifier whose k-bit prefix matches the anycast group identifier. The packet is then delivered to the member of the group whose trigger identifier best matches the packet identifier according to the longest prefix matching rule (see Figure 2(c)). Section 3.3 gives two examples of how end-hosts can use the last m - k bits of the identifier to encode their preferences.

2.5 Stack of Identifiers
In this section, we describe a second generalization of i3, which replaces identifiers with identifier stacks. An identifier stack is a list of identifiers that takes the form (id_1, id_2, id_3, ..., id_k) where id_i is either an identifier or an address. Packets p and triggers t are thus of the form:

• Packet p = (id_stack, data)
• Trigger t = (id, id_stack)
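The generalized packet and trigger forms can be written down as plain data structures. The sketch below also shows one receive step in the spirit of the pseudo-code of Figure 3, under simplifying assumptions that are not the paper's: all triggers sit in one list on one node, matching is exact, and the names `Packet`, `Trigger`, and `receive` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Identifier = int
Address = Tuple[str, int]               # (IP address, port)
StackEntry = Union[Identifier, Address]


@dataclass
class Packet:
    id_stack: List[StackEntry]
    data: bytes


@dataclass
class Trigger:
    id: Identifier
    id_stack: List[StackEntry]


def receive(packet: Packet, triggers: List[Trigger]) -> List[Packet]:
    """One exact-match receive step; returns the packets to forward next."""
    if not packet.id_stack:
        return []
    head, rest = packet.id_stack[0], packet.id_stack[1:]
    if isinstance(head, tuple):
        # Head is an address: the packet would be handed to IP unchanged.
        return [packet]
    matches = [t for t in triggers if t.id == head]
    if not matches:
        # No trigger for the head: continue with the rest of the stack, or drop.
        return [Packet(rest, packet.data)] if rest else []
    # Replace the head by each matching trigger's identifier stack.
    return [Packet(list(t.id_stack) + rest, packet.data) for t in matches]


out = receive(Packet([1, 2, 3], b"payload"), [Trigger(1, [10, 11])])
print(out[0].id_stack)
```

With a trigger stack of (10, 11) and a packet stack of (1, 2, 3), the head identifier 1 is replaced by the trigger's stack, which gives (10, 11, 2, 3).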
    i3_recv(p)  // upon receiving packet p
      id = head(p.id_stack);  // get head of p's stack
      // is local server responsible for id's best match?
      if (isMatchLocal(id) = FALSE)
        i3_forward(p);  // matching trigger stored elsewhere
        return;
      pop(p.id_stack);  // pop id from p's stack...
      set_t = get_matches(id);  // get all triggers matching id
      if (set_t = ∅)
        if (p.id_stack = ∅)
          drop(p)  // nowhere else to forward
        else
          i3_forward(p);
      while (set_t ≠ ∅)  // forward packet to each matching trigger
        t = get_trigger(set_t);
        p1 = copy(p);  // create new packet to send
        // ... add t's stack at head of p1's stack
        prepend(t.id_stack, p1.id_stack);
        i3_forward(p1);

    i3_forward(p)  // send/forward packet p
      id = head(p.id_stack);  // get head of p's stack
      if (type(id) = IP_ADDR_TYPE)
        IP_send(id, p);  // id is an IP address; send p to id via IP
      else
        forward(p);  // forward p via overlay network

Figure 3: Pseudo-code of the receiving and forward operations executed by an i3 server.

The generalized form of packets allows a source to send a packet to a series of identifiers, much as in source routing. The generalized form of triggers allows a trigger to send a packet to another identifier rather than to an address. This extension allows for much greater flexibility. To illustrate this point, in Sections 3.1, 3.2, and 4.3, we discuss how identifier stacks can be used to provide service composition, implement heterogeneous multicast, and increase i3's robustness, respectively.

A packet p is always forwarded based on the first identifier id in its identifier stack until it reaches the server that is responsible for storing the matching trigger(s) for p. Consider a packet p with an identifier stack (id_1, id_2, id_3). If there is no trigger in i3 whose identifier matches id_1, id_1 is popped from the stack.
The process is repeated until an identifier in p's identifier stack matches a trigger t. If no such trigger is found, packet p is dropped. If, on the other hand, there is a trigger t whose identifier matches id_1, then id_1 is replaced by t's identifier stack. In particular, if t's identifier stack is (x, y), then p's identifier stack becomes (x, y, id_2, id_3). If id_1 is an IP address, p is sent via IP to that address, and the rest of p's identifier stack, i.e., (id_2, id_3), is forwarded to the application. The semantics of id_2 and id_3 are in general application-specific. However, in this paper we consider only examples in which the application is expected to use these identifiers to forward the packet after it has processed it. Thus, an application that receives a packet with identifier stack (id_2, id_3) is expected to send another packet with the same identifier stack (id_2, id_3). As shown in the next section, this allows i3 to provide support for service composition.

Figure 3 shows the pseudo-code of the receiving and forwarding operations executed by an i3 node. Upon receiving a packet p, a server first checks whether it is responsible for storing the trigger matching packet p. If not, the server forwards the packet at the i3 level. If yes, the code returns the set of triggers that match p. For each matching trigger t, the identifier stack of the trigger is prepended to p's identifier stack. The packet p is then forwarded based on the first identifier in its stack.

Figure 4: (a) Service composition: The sender (S) specifies that packets should be transcoded at server T before being delivered to the destination (R). (b) Heterogeneous multicast: Receiver R_1 specifies that it wants to receive H.263 data, while R_2 specifies that it wants to receive MPEG data. The sender sends MPEG data.

3. USING i3

In this section we present a few examples of how i3 can be used. We discuss service composition, heterogeneous multicast, server selection, and large scale multicast. In the remainder of the paper, we say that packet p matches trigger t if the first identifier of p's identifier stack matches t's identifier.

3.1 Service Composition

Some applications may require third parties to process the data before it reaches the destination [10]. An example is a wireless application protocol (WAP) gateway translating HTML web pages to WML for wireless devices [35]. WML is a lightweight version of HTML designed to run on wireless devices with small screens and limited capabilities. In this case, the server can forward the web page to a third-party server T that implements the HTML-WML transcoding, which in turn processes the data and sends it to the destination via WAP.

In general, data might need to be transformed by a series of third-party servers before it reaches the destination. In today's Internet, the application needs to know the set of servers that perform transcoding and then explicitly forward data packets via these servers.

With i3, this functionality can be easily implemented by using a stack of identifiers. Figure 4(a) shows how data packets containing HTML information can be redirected to the transcoder, and thus arrive at the receiver containing WML information. The sender associates with each data packet the stack (id_HTML-WML, id), where id represents the flow identifier. As a result, the data packet is routed first to the server which performs the transcoding.
Next, the server inserts packet (id, data) into i3, which delivers it to the receiver R.
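The receive and forward operations of Figure 3, applied to the service composition scenario of Figure 4(a), can be sketched as a small Python simulation. This is a toy model under stated assumptions, not the prototype: a single hypothetical `I3Server` stores every trigger (so the locality check always succeeds), identifiers match only when equal, IP delivery is modeled as a callback lookup, and the "transcoding" is a string replacement. The names `I3Server`, `Addr`, `transcoder`, and `receiver` are invented for the illustration.

```python
class Addr(str):
    """Marks a stack entry as an address rather than an i3 identifier."""


class I3Server:
    """Toy single-node stand-in for the i3 overlay (assumption: all
    triggers are stored locally and identifiers match only if equal)."""

    def __init__(self):
        self.triggers = {}  # identifier -> list of trigger identifier stacks
        self.hosts = {}     # address -> application callback(rest_of_stack, data)

    def insert_trigger(self, ident, id_stack):
        self.triggers.setdefault(ident, []).append(list(id_stack))

    def attach_host(self, addr, app):
        self.hosts[addr] = app

    def forward(self, stack, data):
        """Counterpart of i3_forward: deliver via 'IP' or route in the overlay."""
        if not stack:
            return  # nothing left to route on: drop
        head = stack[0]
        if isinstance(head, Addr):
            # Head is an address: hand the rest of the stack to the application.
            self.hosts[head](stack[1:], data)
        else:
            self.recv(stack, data)  # overlay hop (local in this toy model)

    def recv(self, stack, data):
        """Counterpart of i3_recv: pop the head and expand matching triggers."""
        head, rest = stack[0], stack[1:]
        matches = self.triggers.get(head, [])
        if not matches:
            self.forward(rest, data)  # try the next identifier, or drop
            return
        for t_stack in matches:
            # Prepend the trigger's stack to a copy of the packet's stack.
            self.forward(t_stack + rest, data)


i3 = I3Server()
received = []
T, R = Addr("T"), Addr("R")


def transcoder(rest, data):
    # Stand-in for HTML-to-WML transcoding; the application re-inserts the
    # processed packet using the remaining identifier stack.
    i3.forward(rest, data.replace("html", "wml"))


def receiver(rest, data):
    received.append(data)


i3.attach_host(T, transcoder)
i3.attach_host(R, receiver)
i3.insert_trigger("id_HTML-WML", [T])  # trigger inserted by the transcoder
i3.insert_trigger("id", [R])           # trigger inserted by the receiver

# The sender addresses the packet to the stack (id_HTML-WML, id).
i3.forward(["id_HTML-WML", "id"], "<html>hello</html>")
```

After the last call, `received` holds the transcoded payload: the packet first reaches T through the trigger for id_HTML-WML, and T's re-inserted packet then reaches R through the trigger for id. The sender never names T or R directly.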
3.2 Heterogeneous Multicast

Figure 4(b) shows a more complex scenario in which an MPEG video stream is played back by one H.263 receiver and one MPEG receiver.

To provide this functionality, we use the ability of the receiver, instead of the sender (see Section 2.5), to control the transformations performed on data packets. In particular, the H.263 receiver R_1 inserts trigger (id, (id_MPEG-H.263, R_1)), and the sender sends packets (id, data). Each packet matches R_1's trigger, and as a result the packet's identifier id is replaced by the trigger's stack (id_MPEG-H.263, R_1). Next, the packet is forwarded to the MPEG-H.263 transcoder, and then directly to receiver R_1. In contrast, an MPEG receiver R_2 only needs to maintain a trigger (id, R_2) in i3. This way, receivers with different display capabilities can subscribe to the same multicast group. Another useful application is to have the receiver insist that all data go through a firewall first before reaching it.

3.3 Server Selection

i3 provides good support for basic server selection through the use of the last m - k bits of the identifiers to encode application preferences (footnote 2). To illustrate this point, consider two examples.

In the first example, assume that there are several web servers and the goal is to balance the client requests among these servers. This goal can be achieved by setting the m - k least significant bits of both trigger and packet identifiers to random values. If servers have different capacities, then each server can insert a number of triggers proportional to its capacity. Finally, one can devise an adaptive algorithm in which each server varies the number of triggers as a function of its current load.

In the second example, consider the goal of selecting a server that is close to the client in terms of latency.
To achieve this goal, each server can use the last m - k bits of its trigger identifiers to encode its location, and the client can use the last m - k bits in the packets' identifier to encode its own location. In the simplest case, the location of an end-host (i.e., server or client) can be the zip code of the place where the end-host is located; the longest prefix matching procedure used by i3 would then result in the packet being forwarded to a server that is relatively close to the client (footnote 3).

3.4 Large Scale Multicast

The multicast abstraction presented in Section 2.4.2 assumes that all members of a multicast group insert triggers with identical identifiers. Since triggers with identical identifiers are stored at the same i3 server, that server is responsible for forwarding each multicast packet to every member of the multicast group. This solution obviously does not scale to large multicast groups.

One approach to address this problem is to build a hierarchy of triggers, where each member R_i of a multicast group id_g replaces its trigger (id_g, R_i) by a chain of triggers (id_g, x_1), (x_1, x_2), ..., (x_i, R_i). This substitution is transparent to the sender: a packet (id_g, data) will still reach R_i via the chain of triggers. Figure 5 shows an example of a multicast tree with seven receivers in which no more than three triggers have the same identifier. This hierarchy of triggers can be constructed and maintained either cooperatively by the members of the multicast group, or by a third party provider. In [18], we present an efficient distributed algorithm in which the receivers of the multicast group construct and maintain the hierarchy of triggers.

Figure 5: Example of a scalable multicast tree with bounded degree by using chains of triggers.

Footnote 2: Recall that identifiers are m bits long and that k is the exact-matching threshold.
Footnote 3: Here we assume that nodes that are geographically close to each other are also close in terms of network distances, which is not always true. One could instead use latency based encoding, much as in [20].

4. ADDITIONAL DESIGN AND PERFORMANCE ISSUES

In this section we discuss some additional i3 design and performance issues. The i3 design was intended to be (among other properties) robust, self-organizing, efficient, secure, scalable, incrementally deployable, and compatible with legacy applications. In this section we discuss these issues and some details of the design that are relevant to them.

Before addressing these issues, we first review our basic design. i3 is organized as an overlay network in which every node (server) stores a subset of triggers. In the basic design, at any moment of time, a trigger is stored at only one server. Each end-host knows about one or more i3 servers. When a host wants to send a packet (id, data), it forwards the packet to one of the servers it knows. If the contacted server doesn't store the trigger matching (id, data), the packet is forwarded via IP to another server. This process continues until the packet reaches the server that stores the matching trigger. The packet is then sent to the destination via IP.

4.1 Properties of the Overlay

The performance of i3 depends greatly on the nature of the underlying overlay network. In particular, we need an overlay network that exhibits the following desirable properties:

• Robustness: With a high probability, the overlay network remains connected even in the face of massive server and communication failures.
• Scalability: The overlay network can handle the traffic generated by millions of end-hosts and applications.
• Efficiency: Routing a packet to the server that stores the packet's best matching trigger involves a small number of servers.
• Stability: The mapping between triggers and servers is relatively stable over time, that is, it is unlikely to change during