IEEE/ACM TRANSACTIONS ON NETWORKING VOL I7.NO. 1 FEBRUARY 2009 Virus spread in Networks Piet Van Mieghem, Member, IEEE, Jasmina Omic, and robert Kooij Abstract-The ce of the network characteristics on the Many authors(see, e.g., 3], [6], [11], and [12])mention the read is in a new-the N-intertwined Markov existence of an epidemic threshold Tc. If the effective spreading chain-model nly approximation lies in the application rate T=(B/6)>Te, the virus persists and a nonzero fraction in detail. The -intertwined model has been compared with of the nodes are infected, whereas for. s Te the epidemic dies the exact 2-state Markov model and with previously proposed out. However, when the same model is exactly described via "homogeneous"or"local"models. The sharp epidemic threshold Markov theory as shown in Section Ill, the observation that this To, which is a consequence of mean field theory, is rigorously Markov chain(with a finite number of states) possesses an ab- shown to be equal to Tc=1/(Amax(A),where Amax(A)is the sorbing state, contradicting the existence of any threshold. Be largest eigenvalue--the spectral radius--of the adjacency matrix cause. in an irreducible Markov chain--all states are reachable A. A continued fraction expansion of the steady-state infection from each other-the existence of an absorbing state implies probability at node is presented as well as several upper bounds that all other states are transient states and that the steady state is Index Terms-Epidemic threshold, Markov theory, mean field the absorbing state. Moreover, the probability that the process is eory, spectral radius, virus spread. in a transient state exponentially tends to zero with time. How- ever, the convergence time T to the steady state can be very large, as shown in Section Ill. Ganesh et al. [9] give estimates L. INTRODUCTION complications arise. An infinite-state Markov process is consid- W甲 ding of a virus in a network that was earlier con- trated by, e. g, branching pro(1+12则ph sidered by Ganesh et al. [9] and by Wang et al.[15]in dis- bility of extinction is a characteristic feature that is not presented crete time. The model belongs to the class of susceptible-in- in a finite-state Markov chain. Although there is an absorbing fected-susceptible(SIS)models that, together with the suscep- state, in an infinite-state Markov process, there is a nonzero tible-infected-removed(SIR)models, are the standard models chance that the process never dies out. Since the exact Markov for computer virus infections. Each node in the network is ei- chain(see Section III)consists of 2--states in a network of N ther infected or healthy. An infected node can infect its neigh- nodes, features of the infinite-state Markov process rapidly pop bors with an infection rate B, but it is cured with curing rate 8. up. The apparent steady-state connected with the observation However,once cured and healthy, the node is again prone to the of an epidemic threshold is often termed the"metastable state" virus. Both infection and curing processes are independent. Re- SInce, on a sufficiently long time-scale for finite-state systems, finements like the existence of an incubation period, an infection Our major motivation is to understand the influence of graph rate that depends on the number of neighbors, a curing process characteristics on epidemic spreading. Earlier, Wang et al. [15 that takes a certain amount of time, and other sophistications presented an approximate analysis from which they concluded are not considered here, but we refer to, eg-[2],[6], [10] and that the threshold of the effective infection rate Tc equals can be applied to the spread of e-mail worms and other com- adjacency matrix A of the network. This result relates--for puter viruses, the propagation of faults or failures, and, more the first time to the best of our knowledge-the epidemiolog generally,the spread of information(e.g, news, rumors, brand ical spreading to a specific characteristic, the spectral radius awareness, and marketing of new products)and epidemic dis- Amax(A), of the network. When using mean field theory(or re- semination or/and routing in ad hoc and peer-to-peer networks. lated averaging techniques), we rigorously show in Section IV that, in the steady state, there is indeed a well-defined threshold Tc= 1/(max(A). This result relativizes the belief of the IEEE/ACM TRANSACTIONS ON NETWORKING Editor L Massoulie. First pub- physics society(see, e.g., [1] and [ 12]) that scale-free networks lished June 24. 2008: current on published February 19, 2009. This work like the Internet possess a vanishingly small epidemic threshold as supported in part by the Netherlands Organization for Scientific Research NwO) under Project 643.000.503 and by ation Infrastructures and, hence vulnerable to viruses This announcement has provok P. Van Mieghem and J Omic are with the Faculty of Electrical Engi- for scale-free complex networks, which is somehow question Mathematics and Computer Science, Delft Ur 前出抽甲mNW 600 GA Delft, The Netherlands (e-mail: P Ift. l: able In fact, since Amax(A)is smaller than the mean Omic@@ewi tudelft.nl) degree of the network, the class of connected Erdos-Renyi (e-mail: R E Kooij @ewi tudelft. nl) Digital Object Identifier 101109/TNET. 2008.925623 average degree scales with the number of nodes N, wh 1063-66922500@2008IEEE
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 1, FEBRUARY 2009 1 Virus Spread in Networks Piet Van Mieghem, Member, IEEE, Jasmina Omic, and Robert Kooij Abstract—The influence of the network characteristics on the virus spread is analyzed in a new—the -intertwined Markov chain—model, whose only approximation lies in the application of mean field theory. The mean field approximation is quantified in detail. The -intertwined model has been compared with the exact -state Markov model and with previously proposed “homogeneous” or “local” models. The sharp epidemic threshold , which is a consequence of mean field theory, is rigorously shown to be equal to , where is the largest eigenvalue—the spectral radius—of the adjacency matrix . A continued fraction expansion of the steady-state infection probability at node is presented as well as several upper bounds. Index Terms—Epidemic threshold, Markov theory, mean field theory, spectral radius, virus spread. I. INTRODUCTION WE FOCUS ON A simple continuous-time model for the spreading of a virus in a network that was earlier considered by Ganesh et al. [9] and by Wang et al. [15] in discrete time. The model belongs to the class of susceptible-infected-susceptible (SIS) models that, together with the susceptible-infected-removed (SIR) models, are the standard models for computer virus infections. Each node in the network is either infected or healthy. An infected node can infect its neighbors with an infection rate , but it is cured with curing rate . However, once cured and healthy, the node is again prone to the virus. Both infection and curing processes are independent. Re- finements like the existence of an incubation period, an infection rate that depends on the number of neighbors, a curing process that takes a certain amount of time, and other sophistications are not considered here, but we refer to, e.g., [2], [6], [10], and [16]. The theory of the spreads of epidemics through a network can be applied to the spread of e-mail worms and other computer viruses, the propagation of faults or failures, and, more generally, the spread of information (e.g., news, rumors, brand awareness, and marketing of new products) and epidemic dissemination or/and routing in ad hoc and peer-to-peer networks. Manuscript received March 23, 2007; revised July 20, 2007; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor L. Massoulie. First published June 24, 2008; current version published February 19, 2009. This work was supported in part by the Netherlands Organization for Scientific Research (NWO) under Project 643.000.503 and by Next Generation Infrastructures (Bsik). P. Van Mieghem and J. Omic are with the Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2600 GA Delft, The Netherlands (e-mail: P.VanMieghem@ewi.tudelft.nl; J.S.Omic@ewi.tudelft.nl). R. Kooij is with the Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2600 GA Delft, The Netherlands, and also with TNO Information Communication Technology, P.2600 GB Delft (e-mail: R.E.Kooij@ewi.tudelft.nl). Digital Object Identifier 10.1109/TNET.2008.925623 Many authors (see, e.g., [3], [6], [11], and [12]) mention the existence of an epidemic threshold . If the effective spreading rate , the virus persists and a nonzero fraction of the nodes are infected, whereas for the epidemic dies out. However, when the same model is exactly described via Markov theory as shown in Section III, the observation that this Markov chain (with a finite number of states) possesses an absorbing state, contradicting the existence of any threshold. Because, in an irreducible Markov chain—all states are reachable from each other—the existence of an absorbing state implies that all other states are transient states and that the steady state is the absorbing state. Moreover, the probability that the process is in a transient state exponentially tends to zero with time. However, the convergence time to the steady state can be very large, as shown in Section III. Ganesh et al. [9] give estimates of . When the number of states grows unboundedly, major complications arise. An infinite-state Markov process is considerably more complex than a finite-state Markov chain as illustrated by, e.g., a branching process [14, Ch. 12] where the probability of extinction is a characteristic feature that is not presented in a finite-state Markov chain. Although there is an absorbing state, in an infinite-state Markov process, there is a nonzero chance that the process never dies out. Since the exact Markov chain (see Section III) consists of —states in a network of nodes, features of the infinite-state Markov process rapidly pop up. The apparent steady-state connected with the observation of an epidemic threshold is often termed the “metastable state” since, on a sufficiently long time-scale for finite-state systems, it disappears. Our major motivation is to understand the influence of graph characteristics on epidemic spreading. Earlier, Wang et al. [15] presented an approximate analysis from which they concluded that the threshold of the effective infection rate equals , where is the largest eigenvalue of the adjacency matrix of the network. This result relates—for the first time to the best of our knowledge—the epidemiological spreading to a specific characteristic, the spectral radius , of the network. When using mean field theory (or related averaging techniques), we rigorously show in Section IV that, in the steady state, there is indeed a well-defined threshold . This result relativizes the belief of the physics society (see, e.g., [1] and [12]) that scale-free networks like the Internet possess a vanishingly small epidemic threshold and, hence, are vulnerable to viruses. This announcement has provoked a rush of investigations on immunization strategies for scale-free complex networks, which is somehow questionable. In fact, since is never smaller than the mean degree of the network, the class of connected Erdös–Rényi random graphs [14] possesses a far larger spectral radius than any scale-free graph with a same number of nodes . Most complex networks are not small-world networks such that their average degree scales with the number of nodes , which 1063-6692/$25.00 © 2008 IEEE
IEEE/ACM TRANSACTIONS ON NETWORKING. VOL 17. NO. 1 FEBRUARY 2009 neans that, for sufficiently large N, all of these complex net- variations, possess the same type of solution, specified by a works, and not only scale-free graphs, seem prone to potential"steady-state"epidemic threshold infections After a review of basic models for epidemics in Section Il, we study the matrix structure of the infinitesimal generator Q of the exact 2-state Markov chain in Section Ill and give rather pre- Since each node has(on average)the same degree, the Kephart cise fitting results for the convergence time T' in two limiting and White model is also termed a"homogeneous"model. Many raphs: the complete graph and the line graph. The major part variations on and extensions of the Kephart and White model is devoted to our new N-intertwined Markov model: Section I have been proposed(see, e.g.[13]). The Kephart and White derives the model, assesses the influence of the mean field ap- model has already appeared in earlier work(see, e.g.[3]).The proximation, and derives precise relations and upper bounds for logistic model of population growth that was first introduced by the steady-state Sections V and Vi characterize the exponential verhulst in 1838 as mentioned by Daley and Gani [6, p 20] is, dying out for T Te and the role of the spectrum of A, respec- in fact, the same as the simple kephart and White model. More tively. The accuracy of the Kephart and White model is eval- over, the simplest stochastic analogon [6, p. 56-63]-a pure uated in Section VIl, while Section vIll compares our model birth process with transition rate An, n+1 with exact computations. Section IX concludes the paper mathematically identical to the shortest path problem [14, ch. I. REVIEW OF SOME BASIC MODELS 16] in the complete graph with i.i. d. exponential link weights This observation and relation to the complete graph shows tha Here, we review basic models that may help to understand these earlier models do not take the confining way of actual virus the finer details of our N-intertwined model. All models are transport into account. The central role of the network structure phrased in our notation used in [14]. Other more ger in the spread of viruses is the focal point of this pa models for virus d in networks based on Markov theory are found in [2] and [10] B. Model of Wang et al. A. Kephart and White Model The major merit of the model of Wang et al.[15] is the in- corporation of an arbitrary network characterized by the adja Kephart and White [11] considered a connected regular cency matrix A, which generalizes the homogeneous Kephart raph' on N nodes where each node has degree k. The number and White model. where the only network characteristic of infected nodes in the population at time t is denoted by the(average)degree. The discrete-time model of Wang et I(t). If the population N is sufficiently large, we can convert belongs to the class of mean field models. Their major and in- I(t)to yt)=I(t)/N, a continuous quantity representing triguing result is that the epidemic threshold is specified by the fraction of infected nodes. Hence, the implicit assumption is that the number of states is sufficiently large such that the asymptotic regime for an infinite number of states is reached C WCWF determined by two processes: 1)infected nodes are being cured Unfortunately, this result is proved in an approximate manner and 2)susceptible nodes are infected. For process 1), the cure which questions to what extent this remarkable result holds in rate of a fraction y of infected nodes is &y. The rate at which general. In the sequel, we show that the Wang et al. model is the fraction y grows in process 2) is proportional to the fraction only accurate when the effective spreading rate T is below the of susceptible nodes, i.e., 1-y. For every susceptible node, "steady-state "epidemic threshold Te the rate of infection is the product of the infection rate B per link, the number of infected neighbors (i.e, the degree k:)of the II. EXACt 2-STATE MARKOV CHAIN node, which is ky Combining all contributions yields the time We consider the virus spread in an undirected graph G(N, L) evolution of y(t)in the Kephart and White model, described characterized by a symmetric adjacency matrix A. We assume by the differential equation that the arrival of an infection on a link and the curing process Bk(1-3)-0y of an infected node are independent Poisson processes with rate dt )B and with rate 8, respectively. As soon as a node i receives an infection at time t. it is considered to be infected and infectious whose solution is and in state Xi(t)=l. Similarly, an infected node i is cured y(t)= with rate 8, and in the healthy state Xi (t)=0at timet. At each time t. a node is in one of these two states where yo is the initial fraction of infected node The state Y(t) of the network at time t is defined by all the steady-state fraction is 3ho limy - u(t)obeying possible combinations of states in which the N nodes can be doo/dt 0. The Kephart and White differential equation(1)is the of a large class of mean field models that, apart from Y(t)=[0(t)Yi1(t) White have modeled an Erdos-Renyi random tends. for large a. to a Hence to first order in n. the of virus spread in Erdos-Renyi random graphs and regular graphs are the same. 1(0=10,2=1(2-1 人=1
2 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 1, FEBRUARY 2009 means that, for sufficiently large , all of these complex networks, and not only scale-free graphs, seem prone to potential infections. After a review of basic models for epidemics in Section II, we study the matrix structure of the infinitesimal generator of the exact -state Markov chain in Section III and give rather precise fitting results for the convergence time in two limiting graphs: the complete graph and the line graph. The major part is devoted to our new -intertwined Markov model: Section IV derives the model, assesses the influence of the mean field approximation, and derives precise relations and upper bounds for the steady-state. Sections V and VI characterize the exponential dying out for and the role of the spectrum of , respectively. The accuracy of the Kephart and White model is evaluated in Section VII, while Section VIII compares our model with exact computations. Section IX concludes the paper. II. REVIEW OF SOME BASIC MODELS Here, we review basic models that may help to understand the finer details of our -intertwined model. All models are rephrased in our notation used in [14]. Other more general models for virus spread in networks based on Markov theory are found in [2] and [10]. A. Kephart and White Model Kephart and White [11] considered a connected regular graph1 on nodes where each node has degree . The number of infected nodes in the population at time is denoted by . If the population is sufficiently large, we can convert to , a continuous quantity representing the fraction of infected nodes. Hence, the implicit assumption is that the number of states is sufficiently large such that the asymptotic regime for an infinite number of states is reached. The rate at which the fraction of infected nodes changes, is determined by two processes: 1) infected nodes are being cured and 2) susceptible nodes are infected. For process 1), the cure rate of a fraction of infected nodes is . The rate at which the fraction grows in process 2) is proportional to the fraction of susceptible nodes, i.e., . For every susceptible node, the rate of infection is the product of the infection rate per link, the number of infected neighbors (i.e., the degree ) of the node, which is . Combining all contributions yields the time evolution of in the Kephart and White model, described by the differential equation (1) whose solution is (2) where is the initial fraction of infected nodes whereas the steady-state fraction is obeying . The Kephart and White differential equation (1) is the basis of a large class of mean field models that, apart from some 1Kephart and White have modeled an Erdös–Rényi random graph with average degree , which tends, for large , to a regular graph. Hence, to first order in , the properties of virus spread in Erdös–Rényi random graphs and regular graphs are the same. variations, possess the same type of solution, specified by a “steady-state” epidemic threshold (3) Since each node has (on average) the same degree, the Kephart and White model is also termed a “homogeneous” model. Many variations on and extensions of the Kephart and White model have been proposed (see, e.g., [13]). The Kephart and White model has already appeared in earlier work (see, e.g., [3]). The logistic model of population growth that was first introduced by Verhulst in 1838 as mentioned by Daley and Gani [6, p. 20] is, in fact, the same as the simple Kephart and White model. Moreover, the simplest stochastic analogon [6, p. 56–63]—a pure birth process with transition rate —is mathematically identical to the shortest path problem [14, ch. 16] in the complete graph with i.i.d. exponential link weights. This observation and relation to the complete graph shows that these earlier models do not take the confining way of actual virus transport into account. The central role of the network structure in the spread of viruses is the focal point of this paper. B. Model of Wang et al. The major merit of the model of Wang et al. [15] is the incorporation of an arbitrary network characterized by the adjacency matrix , which generalizes the homogeneous Kephart and White model, where the only network characteristic was the (average) degree. The discrete-time model of Wang et al. belongs to the class of mean field models. Their major and intriguing result is that the epidemic threshold is specified by Unfortunately, this result is proved in an approximate manner which questions to what extent this remarkable result holds in general. In the sequel, we show that the Wang et al. model is only accurate when the effective spreading rate is below the “steady-state” epidemic threshold . III. EXACT -STATE MARKOV CHAIN We consider the virus spread in an undirected graph characterized by a symmetric adjacency matrix . We assume that the arrival of an infection on a link and the curing process of an infected node are independent Poisson processes with rate and with rate , respectively. As soon as a node receives an infection at time , it is considered to be infected and infectious and in state . Similarly, an infected node is cured with rate , and in the healthy state at time . At each time , a node is in one of these two states. The state of the network at time is defined by all possible combinations of states in which the nodes can be at time and
VAN MIEGHEM et al: VIRUS SPREAD IN NETWORKS 0000 Fig. 1. State diagram in a graph with N G Nodes and the binary numbering Hence, the state space of the Markov chain is organized with Fig. 2. Lower triangular part Qs of the infinitesimal generator Q {0,1 State number i node j, we obtain the probability that a node i is either healthy 00..000 0123 i=Oor infected Ci=l 00.00 010 Pr[Xi(t=j si(t) 11..,11 whe, in the index i=∑A1xk2k-1, every ak with k≠j takes both values from the set 0, 11, while for k The number of states with j infected nodes is (). Fig. 1 either 0(healthy)or 1(infected). Defining U; (t)=Pr[X,(t) shows an example of the Markov state diagram in a graph with 1], then the relation between the vectors s(t) and v(t)is N= 4 nodes The defined virus infection process is a contin u(t)=s7() Markov chain with 2: states specified by the infi generator Q with elements where the 2X N matrix M contains the states in binary nota tion but bit-reversed. as if i=j+2m- 2.N;xm=1 100 ;mn=0(4) 010 110 O<k=1≠hj,i讠= otherwise 001 Tk 2k-. The time dependence of the probabil 111 state vector s(t), with components s(t)=Pr[()=引] The binary representation of the network determines the structure of the Q matrix The upper tria Pr[i(t=[1, X2(t)= 2, .,Xn(t=inl denoted by QA, depends on the adjacency mat while the lower triangular part Qs does not. The dia and normalization 2i-0si(t)= l, obeys [14, p. 182] the ements of any Q matrix are the negative sum of the row ele- differential equation ments, such that Diag diag(goo, q11,..., q2N-1, 2N-1)with dst(t-s(t)Q 93=-Ck=1 k+i qkj as in(4). It is thus instructive to write Q Q=Q5+QA+ whose solution is ture of the matrix Q& is shown in Fig. 2, where the block matrix B(=812 x2 and the nondefined elements are zeros. This s()=s2(0)e nested structure is the consequence of the binary representation. The matrix QA is shown in Fig 3. The block matrices C( ition of si (t) as a joint probability distribution n QA are diagonal matrices of size 2x 23 with diagonal el- that, if we sum over all of the states of all nodes except fc ements depending on the adjacency matrix A. The first row of
VAN MIEGHEM et al.: VIRUS SPREAD IN NETWORKS 3 Fig. 1. State diagram in a graph with nodes and the binary numbering of the states. Hence, the state space of the Markov chain is organized with The number of states with infected nodes is . Fig. 1 shows an example of the Markov state diagram in a graph with nodes. The defined virus infection process is a continuous-time Markov chain with states specified by the infinitesimal generator with elements if if if otherwise (4) and . The time dependence of the probability state vector , with components and normalization , obeys [14, p. 182] the differential equation whose solution is The definition of as a joint probability distribution shows that, if we sum over all of the states of all nodes except for the Fig. 2. Lower triangular part of the infinitesimal generator . node , we obtain the probability that a node is either healthy or infected where, in the index , every with takes both values from the set {0,1}, while for is either 0 (healthy) or 1 (infected). Defining , then the relation between the vectors and is where the matrix contains the states in binary notation, but bit-reversed, as . . . . . . . . . . . . . . . The binary representation of the network states determines the structure of the matrix. The upper triangular part of , denoted by , depends on the adjacency matrix elements , while the lower triangular part does not. The diagonal elements of any matrix are the negative sum of the row elements, such that with as in (4). It is thus instructive to write as a sum of three matrices . The structure of the matrix is shown in Fig. 2, where the block matrix and the nondefined elements are zeros. This nested structure is the consequence of the binary representation. The matrix is shown in Fig. 3. The block matrices in are diagonal matrices of size with diagonal elements depending on the adjacency matrix . The first row of
IEEE/ACM TRANSACTIONS ON NETWORKING. VOL 17. NO. 1 FEBRUARY 2009 olor)Histogram eigenvalues )of( of the in the complete graph es of T gives the number of times an eis insert shows the spectrum of 1 11 for an extremely high T N 0 Fig 3. Upper triangular part( A of Theorem 1: For B=0, the eigenvalues of the matrix Q, de fined by (4), are A(QB-0)=-hS with multiplicity (%),where the matrix Q is zero, and, as a consequence, the largest block is 0<k<N 2-2<k, m aabsorhin av; m =0. The exact 2-state manzi y or B C(N-1). The elements of QA depend on the indexes i and 0, the infinitesimal generator ak 2k-1 as QA(i, =B 1 amkCk where Q Qdiag Qa reduces to the lower-triangular 26 Diag, whose eigenvalues are identical to the Markov chain has an absorbing state because the first row in Q diagonal elements of Diag, which are multiples of 8. In fact, is a zero row and the absorbing state is the zero state in which all the structure of @s shows that each block row j has a row su nodes are healthy. The steady-state is just this absorbing state, equal to ko for 1 s sN whose value appears(kI1times with steady-state vector soo=T 0). The proba- Hence, Q8=0 has an eigenvalue at A ko with multiplicity bility state vector requires the insights in the eigenstructure of 2=0(2 )=() These contain all of the nonzero eigen- Q because [7] values of QB-o because >k=I()=2-1 For small values of T, Q tends thus to a discrete binomial 5(t)=s(0)=丌+ spectrum. Fig. 4 illustrates that, also for larger T, the spectrum of Q for the complete graph KN is still discrete, 2 containing many eigenvalues with high multiplici where nk denotes the multiplicity of the eigenvalue Ak(with Proposition 2: For constant 8 and increasing B(and T Re Ak <0) and the vector rk m is related to the left- and right B/6), the eigenvalues of Q shift, on average to more negative eigenvector belonging to Ak and the initial conditions. Since values than those of @p=0 U(t)=(s(t)M);=2k-0 Sk Mki is a sum of certain rows Proof: We apply Gershgorin,s Theorem to Q=Q5+ of s(t), we may write Diag+QA, where QA= BTA and TA only contains(nonzero) integer elements related to the adjacency matrix A as observed from(4). Hence, gi <0 decreases with B which implies that u(t) both the center position and the possible range of each eigen- =o iEMj value Mi(Q)increases with B where M, denotes the jth column in the matrix M. Let ui be Corollary 3: The eige nvalues of Q for the complete graph the largest eigenvalue Au of the set where (Tk, m)i# 0, then smallest)possible range among all connected graphs. The Ui(t)is dominated(for not too small t) by of the real part of eigenvalues of Q for any connected graph is(-((BN+8)/2B),0 (t) Proof: From QA=BTA, defined in the proof of Theorem 2, it follows that the maximum possible sum of row elements occul urs for KN(all ai except for e min- which shows that a"bell-shape"distribution of ug(t) can only imum one for line graph (only one1-element on each row in the occur if that largest eigenvalue uu than 1 has a multiplicity larger adjacency matrix A). Gershgorin's Theorem then provides the -Random matrices of this size exhibit an almost continuous spectrum of Q For all infinitesimal generators, it holds that det Q=0 and, ( Gershgorin's Theorem shows that t wi for any infinitesimal centers bjj and radii R,N @i: Cand that the ence, the largest eigenvalue is A possible interval for real eigenvalues of( is [0, 2 maxi Cii gi
4 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 17, NO. 1, FEBRUARY 2009 Fig. 3. Upper triangular part of . the matrix is zero, and, as a consequence, the largest block is . The elements of depend on the indexes and , where as where ; ; . The exact -state Markov chain has an absorbing state because the first row in is a zero row and the absorbing state is the zero state in which all nodes are healthy. The steady-state is just this absorbing state, with steady-state vector . The probability state vector requires the insights in the eigenstructure of because [7] where denotes the multiplicity of the eigenvalue (with ) and the vector is related to the left- and right eigenvector belonging to and the initial conditions. Since is a sum of certain rows of , we may write where denotes the th column in the matrix . Let be the largest eigenvalue of the set where , then is dominated (for not too small ) by (5) which shows that a “bell-shape” distribution of can only occur if that largest eigenvalue has a multiplicity larger than 1. A. Spectrum of For all infinitesimal generators, it holds that and, hence, the largest eigenvalue is . Fig. 4. (in color) Histogram eigenvalues of of the in the complete graph for three values of gives the number of times an eigenvalue occurs. The insert shows the spectrum of for an extremely high . Theorem 1: For , the eigenvalues of the matrix , de- fined by (4), are with multiplicity , where . Proof: For , the infinitesimal generator reduces to the lower-triangular matrix , whose eigenvalues are identical to the diagonal elements of , which are multiples of . In fact, the structure of shows that each block row has a row sum equal to for whose value appears times. Hence, has an eigenvalue at with multiplicity . These contain all of the nonzero eigenvalues of because . For small values of , tends thus to a discrete, binomial spectrum. Fig. 4 illustrates that, also for larger , the spectrum of for the complete graph is still discrete,2 containing many eigenvalues with high multiplicity. Proposition 2: For constant and increasing (and ), the eigenvalues of shift, on average, to more negative values than those of . Proof: We apply Gershgorin’s Theorem3 to , where and only contains (nonzero) integer elements related to the adjacency matrix as observed from (4). Hence, decreases with which implies that both the center position and the possible range of each eigenvalue increases with . Corollary 3: The eigenvalues of for the complete graph and line graph spread over the largest (respectively, smallest) possible range among all connected graphs. The maximum possible range of the real part of eigenvalues of for any connected graph is Proof: From , defined in the proof of Theorem 2, it follows that the maximum possible sum of row elements occurs for (all except for ) and the minimum one for line graph (only one 1-element on each row in the adjacency matrix ). Gershgorin’s Theorem then provides the 2Random matrices of this size exhibit an almost continuous spectrum. 3Every eigenvalue of a matrix lies in at least one of the circular discs with centers and radii . For any infinitesimal generator , Gershgorin’s Theorem shows that and that the maximum possible interval for real eigenvalues of is .
VAN MIEGHEM et al: VIRUS SPREAD IN NETWORKS Fig. 6. Logarithm of )22 versus the number links for 0 G Fig. 5. Four largest eigenvalues of the infinitesimal generator N for the com- 00-yoxpl0xy.033 andI GNAo- plete graph with size @G N8 and 10 as a function of 0 with[ G Npd he second largest eigenvalues are increasing with 0 as 22(N)()[Im tiplicity larger than 1, creating a bell-shape. This observation agrees with the figures in Section VIll In Fig. 6, the eigenvalues of Q for all computable complete first statement. Since the maximum eigenvalue range thus oc- graphs(up to N= 13) have been numerically calculated. The curs for a complete graph, we consider in the Q-matrix for KN second largest eigenvalue seems well fitted(for T>0.05)by the ith row with h one-bits in the binary representation. The row elements, except from the diagonal element, represents the transitions from and to a state with N-k healthy and k in- fected nodes. The row sum of these positive elements equals where L=() denotes the number of links in the complete Bk (N-h)+ ho, and, hence, ii=-Bk(n-k)-ha Opti- graph KN. The dependence on T is approximately given by mizing with respect to k proves the corollary. b(r)0.177(1+2T). Assuming that the scaling law(6)of A2 As shown in the Appendix, also for the line graph, the max- holds for any N, the convergence time T' of the virus spread be in KN towards the steady state(the zero state), Yet, there are open questions regarding the spectrum of Q. r2e-IAzIT= 10-E is found as T=O(eb()L)=O(eb(T)2N) 1)Although Q is not symmetric, computations reveal that all In other words, for large size N and T>0, the convergence eigenvalues of Q are real (and negative) time T is so large that convergence towards the zero state is 2)Perturbation theory of Q for small B(or T) expresses the in reality never reached, which explains the appearance of the eigenvalues in terms of those of QB-0 and of the corre- so-called"metastable state. ponding right and left-eigenvectors of QB=0. However, Ganesh et al. [9] show that, for T Tc [a regime that is the multiplicity of the eigenvalues of QB=0 further com- not covered by(6)], the mean epidemic lifetime El] scales as plicates the perturbation analysis O(og N)while, for T>T*> T where T* is the generalized 3)The recursive block-structure( due to the binary represen- isoperimetric constant, ET]=O(e ) for some constant a tation)of Q needs to be exploited If we may extrapolate(6)to large N, it shows that the In the sequel of this section, we confine to explicit computa- a=2 for K, tion of the Q2 matrix for two extreme types of graphs: the com- 2) Line Graph: Fig. 7 plots the second largest eigen- plete graph which has the smallest average hopcount (or the value A2 of Q for the line graph. The largest eigenvalue of fastest virus penetration)and the line graph that possesses the the adjacency matrix A of the line graph, where each row largest possible average hopcount has precisely one nonzero element in the upper triangular 1)Complete Graph KN: Fig. 5 shows the four largest part of A, is Ama(A)= 2cos(T/(N+ 1))<2. Fig. 7 eigenvalues of Q for the complete graph KN for N=5.8(axis on the right) also shows the epidemic threshold of and 10. The second largest eigenvalue seems like the only the line graph Tc =(1/Amax(A))>(1/2)versus N.As eigenvalue that increases--contrary to the expectations of Ger- observed from Fig. 7, the curves A2 increase very slowly shgorin's Theorem--roughly exponentially in T and with rate with N. Via curve fitting in the range N E [ 8, 13,we increasing for increasing size N. This second largest eigenvalue found that A2(, N)N -Se-r(1.184+0.0-413N), which shows determines the speed of convergence towards the steady state. the exponential dependence on T(accurate)and the less ig. 5 also shows that, initially for small T, the third and fourth accurate dependence on N. If extrapolation to large M eigenvalue are the same and bifurcate(see dots)into distinct allowed, the convergence time T of the virus spread in values roughly around Tc 1/(Amax(A)) 1/(N-1). the line graph towards the steady-state(the zero state) Hence, (5)indicates that, below Tc, the dominant eigenval =O(1/)2)=O(er(1.181+0.0113)), which is considerably simply causing exponential decay, while above Te it has smaller than in KN, which is the other extreme case
VAN MIEGHEM et al.: VIRUS SPREAD IN NETWORKS 5 Fig. 5. Four largest eigenvalues of the infinitesimal generator for the complete graph with size , 8 and 10 as a function of with . The second largest eigenvalues are increasing with as , and . first statement. Since the maximum eigenvalue range thus occurs for a complete graph, we consider in the -matrix for the th row with one-bits in the binary representation. The row elements, except from the diagonal element, represents the transitions from and to a state with healthy and infected nodes. The row sum of these positive elements equals , and, hence, . Optimizing with respect to proves the corollary. As shown in the Appendix, also for the line graph, the maximum of the diagonal elements can be computed. Yet, there are open questions regarding the spectrum of . 1) Although is not symmetric, computations reveal that all eigenvalues of are real (and negative). 2) Perturbation theory of for small (or ) expresses the eigenvalues in terms of those of and of the corresponding right- and left-eigenvectors of . However, the multiplicity of the eigenvalues of further complicates the perturbation analysis. 3) The recursive block-structure (due to the binary representation) of needs to be exploited. In the sequel of this section, we confine to explicit computation of the matrix for two extreme types of graphs: the complete graph which has the smallest average hopcount (or the fastest virus penetration) and the line graph that possesses the largest possible average hopcount. 1) Complete Graph : Fig. 5 shows the four largest eigenvalues of for the complete graph for , 8 and 10. The second largest eigenvalue seems like the only eigenvalue that increases—contrary to the expectations of Gershgorin’s Theorem—roughly exponentially in and with rate increasing for increasing size . This second largest eigenvalue determines the speed of convergence towards the steady state. Fig. 5 also shows that, initially for small , the third and fourth eigenvalue are the same and bifurcate (see dots) into distinct values roughly around . Hence, (5) indicates that, below , the dominant eigenvalue is simply causing exponential decay, while above it has mulFig. 6. Logarithm of versus the number links in for and . tiplicity larger than 1, creating a bell-shape. This observation agrees with the figures in Section VIII. In Fig. 6, the eigenvalues of for all computable complete graphs (up to ) have been numerically calculated. The second largest eigenvalue seems well fitted (for ) by (6) where denotes the number of links in the complete graph . The dependence on is approximately given by . Assuming that the scaling law (6) of holds for any , the convergence time of the virus spread in towards the steady state (the zero state), defined by is found as . In other words, for large size and , the convergence time is so large that convergence towards the zero state is in reality never reached, which explains the appearance of the so-called “metastable state.” Ganesh et al. [9] show that, for [a regime that is not covered by (6)], the mean epidemic lifetime scales as while, for where is the generalized isoperimetric constant, , for some constant . If we may extrapolate (6) to large , it shows that the constant for . 2) Line Graph: Fig. 7 plots the second largest eigenvalue of for the line graph. The largest eigenvalue of the adjacency matrix of the line graph, where each row has precisely one nonzero element in the upper triangular part of , is . Fig. 7 (axis on the right) also shows the epidemic threshold of the line graph versus . As observed from Fig. 7, the curves increase very slowly with . Via curve fitting in the range , we found that , which shows the exponential dependence on (accurate) and the less accurate dependence on . If extrapolation to large is allowed, the convergence time of the virus spread in the line graph towards the steady-state (the zero state) is , which is considerably smaller than in , which is the other extreme case.