Analysis of the Increase and decrease Algorithms for Congestion Avoidance in Computer Networks Dah-Ming CHIU and Raj jAl 1. Introduction Digital Equipment Corporarion, 550 King Siree(LKGI-2/4A19 Littleton, MA O1460-1289, U.S.A 1. background Congestion in computer networks is becoming an important issue due to the increasing mismatch in link new technology, Recent technological advances that allow the network to recover from the congested state of high delay and low throughput. Both con gestion avoidance and congestion control mechanisms are basi ally resource management problems. They can be formulated algorithm(or control function) used by ease or decrease their load (window or rate). We abstractly aFro 979 to 1980. he was wi de class of such increase/decrease algorithms orked on applying queuing theory to and compare them using several different performance metrics. Thcy kcy metrics are cfficicncy, fairncao, convcrgence time, been with Digital Equipment Corpo- and size of oscillations. formance modeling and analysis of com It is shown that a simple additive increase and multiplicative decrease algorithm satisfies the sufficient conditions for con- tems Archi to an efficient and fair state regardless of the starting rithms. His research interests include operation systems, dis- ation in the congestion avoidance scheme recom- and optimization theories. Digital Networking Architecture and OSI Trans- Dr Chiu is a member of the ACm and the IEEE ort Class 4 Networks Raj Jain received the B.E. degree eywords, Computer Network, Congestion ctor oL, Congestion AvO dance. flow Control fairness. cambridgeoma. in19⊥ His Ph D. dissertation their"Outstanding dissertations in systems and networks including VAX Clusters, DECnet, and rchitecture and perfe a sabbatical at the ma cal area systems. For North-Holland Computer Networks and ISDN Systems 17(1989)1-14 er of the Association for Computing Machinery. and a senior member of IEEE 0169-7552/89/S3.500 1989, Elsevier Science Publishers B V(North-Holland)
2 D,M Chiu R Jain/Congestion Aooidance in Computer Networks Cliff ueues start overflowing, the response time in- The point at which the packets start getting lost is called a cliff due to the fact that the throughput falls off rapidly after this point. We use the term knee to descnbe the point alter which the increase increase in the response time results a scheme that allows the network to operate at Load the knee is called as distinguished from a congestion control scheme that tries to keep the network operating in the zone to the left of the cliff. a properly designed congestion avoidance scheme will ensure that the users are encouraged to increase their traffic load as long as this does not significantly affect ul response time, and are rcquircd to dcercase them if that happens. Thus, the network load oscillates around the knee Both congestion avoidance and congestion con- Fig. 1. Network performance as a funetion of the load. Broken trol mechanisms are dynamic resource manage curves inuiuaie perfulInance will detcatiliMliy ct nterarnval times. ment problems that can be formulated as system control problems in which the system senses its state and feeds this back to its users who adjust their controls. For the congestion problem, the sucl as lucal auea networks(LANs) and fibcr statc consists of the load on the network and the optic LANs have resulted in a significant increase control is the number of packets put into the in the bandwidths of computer network links. network by the users. Often a window mechanism However, these new technologies must coexist with is used in the transport layer protocols to limit the the old low bandwidth media such as the twisted number of packets put into the network. An alter pair. This heterogeneity has resulted in a mis- native mechanism consists of setting a limit on the match of arrival and service rates in the inter ate(packets per second or bits per second)that mediate nodes in the network causing increased can be sent by a user. In either case, the control queuing and cor (window or rate)can be dynamically adjusted as Traditional congestion control schemes help the total load on the system changes. This control, improve performance after congestion has oc- which we call the increase/decrease algorithm, is rred. Figure 1 shows general patterns of re- at the heart of all congestion avoidance mecha sponse time and throughput of a network as the nism. network load increases. If the load is small, We have investigated a number of congestion throughput generally keeps up with the load. As avoidance mechanisms, reported in a series of the load increases, throughput increases. After the papers, and this paper is a part of that serie load reaches the network capacity, throughput [1,8 U, Il]. The concept of congestion avoidance stops increasing. If the load is increased any fur- and several alternatives are described in [7]. We ther, the queues start building, potentially result hose a particular alternative called the"binary ing in packets being dropped. Throughput ma feedback scheme"which is described in detail in suddenly drop when the lo creases heyond [11]. This scheme is later extended in (101 this point and the network is said to be congested. include a "selective feedback "mechanism in which The response-time curve follows a similar pattern. the routers monitor different users and permit At first the response time increases little with some users to increase load while requesting others load. When the queues start building up, the re- to decrease load. All of our work on congestion sponse time increases linearly until finally, as the avoidance is summarized in [8]
D.M. Chiu, R Jain/ Congestion Avoidance in Camputer Network This papel contenlaLes un a detailed analys At the other end of the spectrum, we have de of the increase/decrease algorithms. This analysis centralized decision-making. In this case the deci- resulted in the selection of the increase/decrease sions are made by the users while the resources algorithms used in the binary feedback scheme feed formation regi garding current resource usage. proposed in [1ll and [10]. However, the analysis Algorithms studied by Jaffe [5] and later exten presented here is general and applies to many sions by Gaini [2] and Mosely [91 are all good other applications besides congestion avoidance examples of this approach. Briefly, the binary feedback scheme for conges- In this paper we analyze a class of decentral- tion avoidance operates as follows. The resources ized decision-making algorithms that are based on in the network monitor their usage and determine a special form of feedback, namely the feedback if they are loaded below or above an optimal from the resource is a binary signal. This binary level Depending upon the load level, the resource signal indicates whether the resource is currentI sends a binary feedback (1-overloaded, 0 erloaded or underutilized. A very good reason nderloaded) to the users who then adjust their for considering a binary form of feedback is the using an increase/ decrease algorithm. Thi motivation of making the controller/ manager of binary feedback is sent by setting a bit in the the resource as simple and efficient as possible packet header. The use of a bit in the packet The requirement of a binary feedback often min header as a feedback mechanism has been incor- mizes the work at the resource in generating the porated into the OSI connectionless networking feedback protocol standards [4]. The bit is called a"conges- n experienced bit " and is a part of a field called 1.3. Notations and Definition The abstract model assumes that all the users Figure 2 shows the assumed model of the net sharing the same bottleneck will receive the same work with n users sharing it. The congestion state feedback, Based on this feedback, the users try to of the system is determined by the number of adJust their load so that the bottleneck is effi- packets in the system. We assume a discrete time iently used as well as equally shared by all users. operation with time divided into small slots.'These In this abstracted context. we assume that the slots basically represent intervals at the beginning feedback and control loop for all users is synchro- of which the users set their load level based on the nous, that is, all users receive the same feedback network feedback received during the previous and react to it: the next feedback is then gener ted after all users have reacted to the feedback x(1), then the total load at the bottleneck re and so on. Also, we concentrate on one bottleneck source would be Ex, (t),and the state of the resource and the users that share it. Because of system is denoted by the n-dimensional vector these abstractions, we are able to demonstrate x(0) some of the subtle behavior of this type of al erating at or near the knee, all resources de- gorith. The results piescuted licle wele verified Mandel by the users are granted (this is not true by detailed simulations of real networks [7, 10. 111. at the cliff). Thus, x, (()denotes the ith user s 1.2 Past Work User J he algorithms studied here belong to a class of distributed algorithms for managing distributed pectrum of such distributed al- User 2 Xxi> goni gorithms have been studied in the literature. At onc cnd of thc spcctrum, we havc centralized decision-making. In this paradigm, information (about user demands) flows to the resource managers, and the decision of how to allocate the User n resource is made at the resource. The analysis by Sanders [12] is a good example of this approach Fig 2. A control system model of a users sharing a network
D-M. Chi,R Jain/Congestion Acoidance in Computer Networks demands by adding a constant amount to their sources. During th previous demands. The decrease is also additive. ermines its load level and sends a binary feed (3)Additive Increase/ Multiplicative Decrease back y(u), which is interpreted by the users as x (+1 follows a,+x, (t) if y(a)=0=Increase, y()-{3 1→ Decrease load i, (r) if y(t)=1=Decrease. The users cooperate with the system and change The increase is by a constant amount but the (increase of decrease)their demands by an amount decrease is by (4)Multiplicative increase /additive decrease x,(t+1) The change u () represents ith users control. It 6 x, (1 if y(1)=0=Increase, is a function of the users previous demand and ap+x( if y(o)=1- Decrease In order to evaluate the effectiveness of these u1()=f(x,(t),y(t) (2) controls, we next define a set of criteria explicitly In other words in the next section x(+1)=x,(t)+f(x,(t),y(1) 1. 4. Criteria for Selecting Controls Notice that the users are not aware of other user individual demands and, thus, cannot make u, (4) utedness, and convergence. We define them for- a function of x, (t)./*i In general, the control mally as follows function /()can be any linear or nonlinear func- (1) Eficiency: The efficiency of a resource usage on. However, we will focus first on linear con- is defined by the closeness of the total load on the trols. ihe state equations (1)reduce to resource to its knee. If X denotes the desired x(r+1) load level at the knee, then the resource is operat- ing efficiently as long as the total allocation X(r) (a1+hx, (1) if y(n)=0=Increase 2x, (4)is close to Xgoal. Overload ( X(1)> xgoal) ap+bpx,(r if y(1)=1= Decrease nderlvau(X(I)<Sos) aIe both and are considered inefficient, We consider both Here, a1, b,, an, bn are constants. The following as equally undesirable are some exam ples of the control functions Notice, that efficiency relates only to the total (1)Multiplicative Increase/Multiplicative De- allocations and thus two different allocations can crease both be efficient as long as the total allocation is x, ((+U)s/b1 x,()if y(r)=0=Increase. close to the goal. The distribution of the total allocation among individual users is measured by (, (1)if y(r)=1=Decrease the fairness criterion (2) Fairness: The fairness criterion has been their demands by multiplying their previous de- widely studied in the litcrature. When multiplc mands by a constant factor. The decrease is also users share multiple resources, the maxmin fair- multiplicative ess criterion has been widely adopted [2, 3, 5, 10 Additive Increase /Additive Decrease Essentially, the set of users are partitioned into equivalent classes according to which resource is x(+1 their primary bottleneck. The maxmin criterion ar+x, (o if y(r)=0= Increase, hen asserts that the users in the same equivalent ap+x( if y(o=l= Decrease i It is assumed that truncation is applied when ap +x(o)is Here, a>0 and ap <O. All users increase their less than zero, so that x, (r) will never become negative
D.-M. Chiu, R. Jain/Congestion Aeoidance in Computer Networks class ought to have the equal sliaie uf tlie but- Responsiveness tleneck. Thus, a system in which x, (t)=x()vi,J sharing the same bottleneck is operating fairly. If all users do not get exactly equal allocations, the system is less fair and we need an index or a Goal Smoothness function that quantifies the fairness. One such Total load on Fairness: F(r) n(∑x2) the network This index has the following properties (a) The fairness is bounded between 0 and 1(or 0% and 100%). A totally fair allocation( with Fig 3. Responsiveness and smoothness all x, 's equal) has a fairness of I and a totally unfair allocation(with all resources (4)Convergence: Finally we require the contro given to only one user)has a fairness of 1/n scheme to converge. Convergence is generally which is 0 in the limit as n tends to oo measured by the speed with which(or time taken (b)The fairness is independent of scale, i.e., till) the system approaches the goal state from any (c)The fairness is a continuous function, Any of the feedback, the system does not generally slight change in allocation shows up in the converge to a single steady state. Rather, the sys- fairness tem reaches an"equilibrium"in which it oscillates (d )If only k of n users share the resource around the optimal state. The time taken to reach equally with the remaining n-k users not this"equilibrium" and the size of the oscillations receiving any resource, then the faimess jointly deteRmine tie convergene. Te Lille dv termines the responsiveness, and the size of the For other properties of this fairness function, see oscillations determine the smoothness of the con- [6] trol. Ideally we would like the time as well as ( Distributedness: The next requirement that oscillations to be small. Thus, the controls with we put on the control scheme is that it be distrib- smaller time and smaller amplitude of oscillations uted. A centralized scheme requires complete are called more responsive and more smooth, re- knowledge of the state of the system. For example, pectively, as shown in Fig. 3. we may want to know each individual users de- Imand or their sulll. Tlis inforMation nlay be 1,5 Outline uf this Puper available at the resourcc. However, conveying this information to each and every user causes consid In this paper, we develop a simple and intuitive erable overhea d, especially since a user may be methodology to explain when and why a control g several resources at the same time. We are converges. We address the following questions thus primarily interested in control schemes that What are all the possible solutions that converge to an be implemented in real networks and, there efficient and fair states? How do we compare those fore, we assume that the system does the mini- controls that converge? mum amount of feedback. It only tells whether it The paper is organized as follows In Section is underloaded or overloaded via the binary feed e will characterize the set of all linear controls back bits. Other information such as X and thc and, thus, identify the set of feasibl number of users sharing the resource are assumed ntrols. Then we narrow down the feasible set to to be unknown by the users. This restricts the set a subset that satisfies our distributedness criterion of feasible schemes. We therefore, describe the set These subset still includes controls that have un- of feasible schemes with and without this restric- acceptable magnitudes of oscillation or those that tion converge too slowly. Then in Section 3, we discuss