BRANCHING PROCESSES 1.GALTON-WATSON PROCESSES Galton-Watson processes were introduced by Francis Galton in 1889 as a simple mathemat- ical model for the propagation of family names.They were reinvented by Leo Szilard in the late 1930s as models for the proliferation of free neutrons in a nuclear fission reaction.General- izations of the extinction probability formulas that we shall derive below played a role in the calculation of the critical mass of fissionable material needed for a sustained chain reaction. Galton-Watson processes continue to play a fundamental role in both the theory and applica- tions of stochastic processes. First,an informal desription:A population of individuals(which may represent people,or- ganisms,free neutrons,etc.,depending on the context)evolves in discrete time n=0,1,2,... according to the following rules.Each nth generation individual produces a random number (possibly 0)of individuals,called offspring,in the(n+1)st generation.The offspring counts a,5B,,..for distinct individuals a,B,r,...are mutually independent,and also indepen- dent of the offspring counts of individuals from earlier generations.Furthermore,they are identically distributed,with common distribution {pk}.The state Zn of the Galton-Watson process at time n is the number of individuals in the nth generation. More formally, Definition 1.A Galton-Watson process {Zn}nzo with offspring distribution F ={pklkzo is a discrete-time Markov chain taking values in the set Z+of nonnegative integers whose transition probabilities are as follows: (1) P(Zn+1 =klZn m}=pim. Here (p)denotes the m-th convolution power of the distributionp.In other words,the conditional distribution of Z given thatZn=m is the distribution of the sum of m i.i.d. random variables each with distribution (p}.The default initial state is Zo=1. Construction:A Galton-Watson process with offspring distribution F={p&}kz can be built on any probability space that supports an infinite sequence of i.i.d.random variables all with distribution F.Assume that these are arranged in a doubly infinite array,as follows: 引2动… 吊昆务… 品… etc
BRANCHING PROCESSES 1. GALTON-WATSON PROCESSES Galton-Watson processes were introduced by Francis Galton in 1889 as a simple mathematical model for the propagation of family names. They were reinvented by Leo Szilard in the late 1930s as models for the proliferation of free neutrons in a nuclear fission reaction. Generalizations of the extinction probability formulas that we shall derive below played a role in the calculation of the critical mass of fissionable material needed for a sustained chain reaction. Galton-Watson processes continue to play a fundamental role in both the theory and applications of stochastic processes. First, an informal desription: A population of individuals (which may represent people, organisms, free neutrons, etc., depending on the context) evolves in discrete time n = 0, 1, 2,... according to the following rules. Each nth generation individual produces a random number (possibly 0) of individuals, called offspring, in the (n + 1)st generation. The offspring counts ξα,ξβ ,ξγ,... for distinct individuals α,β,γ,... are mutually independent, and also independent of the offspring counts of individuals from earlier generations. Furthermore, they are identically distributed, with common distribution {pk }k≥0 . The state Zn of the Galton-Watson process at time n is the number of individuals in the nth generation. More formally, Definition 1. A Galton-Watson process {Zn }n≥0 with offspring distribution F = {pk }k≥0 is a discrete-time Markov chain taking values in the setZ+ of nonnegative integers whose transition probabilities are as follows: (1) P {Zn+1 = k |Zn = m} = p ∗m k . Here {p ∗m k } denotes the m−th convolution power of the distribution {pk }. In other words, the conditional distribution of Zn+1 given that Zn = m is the distribution of the sum of m i.i.d. random variables each with distribution {pk }. The default initial state is Z0 = 1. Construction: A Galton-Watson process with offspring distribution F = {pk }k≥0 can be built on any probability space that supports an infinite sequence of i.i.d. random variables all with distribution F . Assume that these are arranged in a doubly infinite array, as follows: ξ 1 1 ,ξ 1 2 ,ξ 1 3 ,··· ξ 2 1 ,ξ 2 2 ,ξ 2 3 ,··· ξ 3 1 ,ξ 3 2 ,ξ 3 3 ,··· etc. 1
2 BRANCHING PROCESSES Set Zo=1,and inductively define (2) Zn+1= rn+l The independence of the random variables"guarantees that the sequence(Zn)nzo has the Markov property,and that the conditional distributions satisfy equation(1). 0 For certain choices of the offspring distribution F,the Galton-Watson process isn't very in- teresting.For example,if F is the probability distribution that puts mass 1 on the integer 17, then the evolution of the process is purely deterministic: Zn =(17)"for every n20. Another uninteresting case is when F has the form popi>0 and po+p=1. In this case the population remains at its initial size Zo=1 for a random number of steps with a geometric distribution,then jumps to 0,after which it remains stuck at 0 forever afterwards. (Observe that for any Galton-Watson process,with any offspring distribution,the state 0 is an absorbing state.)To avoid having to consider these uninteresting cases separately in every result to follow,we make the following standing assumption: Assumption 1.The offspring distribution is not a point mass(that is,there is no k>0 such that p=1),and it places positive probability on some integer k >2.Furthermore,the offspring distribution has finite mean u>0 and finite variance o2>0. 1.1.First Moment Calculation.The inductive definition(2)allows a painless calculation of the means EZ Since the random variables are independent ofZ 1(Zn =m} i=1 P(Zn =m} =∑muPiZn=-ml 司 =uEZn. Since EZo=1,it follows that (3) EZn=u". Corollary 1.Ifu<1 then with probability one the Galton-Watson process dies out eventually, i.e.,Zn=0 for all but finitely many n.Furthermore,if=min{n:Zn=0}is the extinction time, then P{r>n}≤un
2 BRANCHING PROCESSES Set Z0 = 1, and inductively define (2) Zn+1 = X Zn i=1 ξ n+1 i . The independence of the random variables ξ n i guarantees that the sequence (Zn )n≥0 has the Markov property, and that the conditional distributions satisfy equation (1). For certain choices of the offspring distribution F , the Galton-Watson process isn’t very interesting. For example, if F is the probability distribution that puts mass 1 on the integer 17, then the evolution of the process is purely deterministic: Zn = (17) n for every n ≥ 0. Another uninteresting case is when F has the form p0p1 > 0 and p0 + p1 = 1. In this case the population remains at its initial size Z0 = 1 for a random number of steps with a geometric distribution, then jumps to 0, after which it remains stuck at 0 forever afterwards. (Observe that for any Galton-Watson process, with any offspring distribution, the state 0 is an absorbing state.) To avoid having to consider these uninteresting cases separately in every result to follow, we make the following standing assumption: Assumption 1. The offspring distribution is not a point mass (that is, there is no k ≥ 0 such that pk = 1), and it places positive probability on some integer k ≥ 2. Furthermore, the offspring distribution has finite mean µ > 0 and finite variance σ2 > 0. 1.1. First Moment Calculation. The inductive definition (2) allows a painless calculation of the means E Zn . Since the random variables ξ n+1 i are independent of Zn , E Zn+1 = X∞ k=0 E Xm i=1 ξ n+1 i 1{Zn = m} = X∞ k=0 E Xm i=1 ξ n+1 i P {Zn = m} = X∞ k=0 mµP {Zn = m} = µE Zn . Since E Z0 = 1, it follows that (3) E Zn = µ n . Corollary 1. If µ < 1 then with probability one the Galton-Watson process dies out eventually, i.e., Zn = 0 for all but finitely many n. Furthermore, if τ = min{n : Zn = 0} is the extinction time, then P {τ > n} ≤ µ n .
BRANCHING PROCESSES Proof.The event {>n}coincides with the event {n>1).By Markov's inequality, P{Zm21}≤EZn=u". 口 1.2.Recursive Structure and Generating Functions.The Galton-Watson process Zn has a sim- ple recursive structure that makes it amenable to analysis by generating function methods. Each of the first-generation individuals a,B,r,...behaves independently of the others;more- over,all of its descendants (the offspring of the offspring,etc.)behaves independently of the descendants of the other first-generation individuals.Thus,each of the first-generation indi- viduals engenders an independent copy of the Galton-Watson process.It follows that a Galton- Watson process is gotten by conjoining to the single individual in the Oth generation Z1(con- ditionally)independent copies of the Galton-Watson process.The recursive structure leads to a simple set of relations among the probability generating functions of the random variables Zn: Proposition 2.Denote by n(t)=EtZn the probability generating function of the random vari- ableZn and by(t)=>ptk the probability generating function of the offspring distribu- tion.Then on is the n-fold composition of p by itself,that is, (4) po(t)=t and (5) n+(t)=(on(t))=on(o(t))Ynz0. Proof.There are two ways to proceed,both simple.The first uses the recursive structure di- rectly to deduce that Zn+1 is the sum of Zi conditionally independent copies of Zn.Thus, n+i(t)=Et2nti=Eon(t)2 (on(t)). The second argument relies on the fact the generating function of the mth convolution power p}is the mth power of the generating function (t)of {pk}.Thus, Panl=E=E2-1Z=kpZ,=利 k=0 00 g(tPZn= =on((t)). By induction on n,this is the(n+1)st iterate of the function (t). ▣ Problem 1.(A)Show that if the mean offspring numberμ:=∑kkpk<o∞then the expected size of the nth generation is EZn=u".(B)Show that if the variance o2=>(k-u)2p&<oo then the variance of Zn is finite,and give a formula for it
BRANCHING PROCESSES 3 Proof. The event {τ > n} coincides with the event {Zn ≥ 1}. By Markov’s inequality, P {Zn ≥ 1} ≤ E Zn = µ n . 1.2. Recursive Structure and Generating Functions. The Galton-Watson processZn has a simple recursive structure that makes it amenable to analysis by generating function methods. Each of the first-generation individuals α,β,γ,... behaves independently of the others; moreover, all of its descendants (the offspring of the offspring, etc.) behaves independently of the descendants of the other first-generation individuals. Thus, each of the first-generation individuals engenders an independent copy of the Galton-Watson process. It follows that a GaltonWatson process is gotten by conjoining to the single individual in the 0th generation Z1 (conditionally) independent copies of the Galton-Watson process. The recursive structure leads to a simple set of relations among the probability generating functions of the random variables Zn : Proposition 2. Denote by ϕn (t ) = E t Zn the probability generating function of the random variable Zn , and by ϕ(t ) = P∞ k=0 pk t k the probability generating function of the offspring distribution. Then ϕn is the n−fold composition of ϕ by itself, that is, ϕ0 (4) (t ) = t and ϕn+1 (t ) = ϕ(ϕn (t )) = ϕn (ϕ(t )) ∀n ≥ 0.(5) Proof. There are two ways to proceed, both simple. The first uses the recursive structure directly to deduce that Zn+1 is the sum of Z1 conditionally independent copies of Zn . Thus, ϕn+1 (t ) = E t Zn+1 = E ϕn (t ) Z1 = ϕ(ϕn (t )). The second argument relies on the fact the generating function of themth convolution power {p ∗m k } is the mth power of the generating function ϕ(t ) of {pk }. Thus, ϕn+1 (t ) = E t Zn+1 = X∞ k=0 E (t Zn+1 |Zn = k)P (Zn = k) = X∞ k=0 ϕ(t ) m P (Zn = k) = ϕn (ϕ(t )). By induction on n, this is the (n + 1)st iterate of the function ϕ(t ). Problem 1. (A) Show that if the mean offspring number µ := P k k pk < ∞ then the expected size of the nth generation is E Zn = µ n . (B) Show that if the variance σ2 = P k (k − µ) 2pk < ∞ then the variance of Zn is finite, and give a formula for it.
BRANCHING PROCESSES Properties of the Generating Function (t):Assumption 1 guarantees that (t)is not a linear function,because the offspring distribution puts mass on some integer k >2.Thus,(t)has the following properties: (A)p(t)is strictly increasing for0≤t≤l. (B)(t)is strictly convex,with strictly increasing first derivative. (C)p(1)=1. 1.3.Extinction Probability.If for some n the population size Zn=0 then the population size is 0 in all subsequent generations.In such an event,the population is said to be extinct.The first time that the population size is 0(formally,=min{n:Zn=0},or =oo if there is no such n)is called the extinction time.The most obvious and natural question concerning the behavior of a Galton-Watson process is:What is the probability P{<oo}of extinction? Proposition 3.The probability ofextinction is the smallest nonnegative root t=of the equa- tion (6) (t)=t. Proof.The key idea is recursion.Consider what must happen in order for the event t<oo of extinction to occur:Either(a)the single individual alive at time 0 has no offspring;or(b)each of its offspring must engender a Galton-Watson process that reaches extinction.Possibility(a) occurs with probability po.Conditional on the event that Z1=k,possibility (b)occurs with probability.Therefore, =%+∑pgt=pg, k=1 that is,the extinction probability is a root of the Fixed-Point Equation(6). There is an alternative proof that =(that uses the iteration formula(5)for the prob- ability generating function of Zn.Observe that the probability of the eventZn=0 is easily recovered from the generating function n(t): P{Zm=0}=pn(0). By the nature of the Galton-Watson process,these probabilities are nondecreasing in n,be- cause if Zn=0 then Zn+1=0.Therefore,the limit:=lim(0)exists,and its value is the extinction probability for the Galton-Watson process.The limit must be a root of the Fixed-Point Equation,because by the continuity of p(ξ)=p(1impn(0) n+00 =lim (n(0)) n+0∞ =lim n+1(0) = Finally,it remains to show that is the smallest nonnegative root of the Fixed-Point Equa- tion.This follows from the monotonicity of the probability generating functions n:Since
4 BRANCHING PROCESSES Properties of the Generating Function ϕ(t ): Assumption 1 guarantees that ϕ(t ) is not a linear function, because the offspring distribution puts mass on some integer k ≥ 2. Thus, ϕ(t ) has the following properties: (A) ϕ(t ) is strictly increasing for 0 ≤ t ≤ 1. (B) ϕ(t ) is strictly convex, with strictly increasing first derivative. (C) ϕ(1) = 1. 1.3. Extinction Probability. If for some n the population size Zn = 0 then the population size is 0 in all subsequent generations. In such an event, the population is said to be extinct. The first time that the population size is 0 (formally, τ = min{n : Zn = 0}, or τ = ∞ if there is no such n) is called the extinction time. The most obvious and natural question concerning the behavior of a Galton-Watson process is: What is the probability P {τ <∞} of extinction? Proposition 3. The probability ζ of extinction is the smallest nonnegative root t = ζ of the equation (6) ϕ(t ) = t . Proof. The key idea is recursion. Consider what must happen in order for the event τ < ∞ of extinction to occur: Either (a) the single individual alive at time 0 has no offspring; or (b) each of its offspring must engender a Galton-Watson process that reaches extinction. Possibility (a) occurs with probability p0 . Conditional on the event that Z1 = k, possibility (b) occurs with probability ζ k . Therefore, ζ = p0 + X∞ k=1 p k ζ k = ϕ(ζ), that is, the extinction probability ζ is a root of the Fixed-Point Equation (6). There is an alternative proof that ζ = ϕ(ζ) that uses the iteration formula (5) for the probability generating function of Zn . Observe that the probability of the event Zn = 0 is easily recovered from the generating function ϕn (t ): P {Zn = 0} = ϕn (0). By the nature of the Galton-Watson process, these probabilities are nondecreasing in n, because if Zn = 0 then Zn+1 = 0. Therefore, the limit ξ := limn→∞ ϕn (0) exists, and its value is the extinction probability for the Galton-Watson process. The limit ξ must be a root of the Fixed-Point Equation, because by the continuity of ϕ, ϕ(ξ) = ϕ( limn→∞ ϕn (0)) = limn→∞ ϕ(ϕn (0)) = limn→∞ ϕn+1 (0) = ξ. Finally, it remains to show that ξ is the smallest nonnegative root ζ of the Fixed-Point Equation. This follows from the monotonicity of the probability generating functions ϕn : Since
BRANCHING PROCESSES 3≥0, 9n(0)≤pn(3)=3 Taking the limit of each side as n→o reveals thatξ≤y. 0 It now behooves us to find out what we can about the roots of the Fixed-Point Equation(6). First,observe that there is always at least one nonnegative root,to wit,t=1,this because o(t) is a probability generating function.Furthermore,since Assumption 1 guarantees that (t)is strictly convex,roots of equation 6 must be isolated.The next proposition asserts that there are either one or two roots,depending on whether the mean number of offspring u:=>kkpk is greater than one. Definition 2.A Galton-Watson process with mean offspring number u is said to be supercritical if u>1,critical ifu=1,or subcritical if u<1. Proposition 4.Unless the offspring distribution is the degenerate distribution that puts mass 1 at k =1,the Fixed-Point Equation (6)has either one or two roots.In the supercritical case,the Fixed-Point Equation has a unique root t=<1 less than one.In the critical and subcritical cases,the only root is t =1. Together with Proposition 3 this implies that extinction is certain(that is,has probability one)if and only if the Galton-Watson process is critical or subcritical.If,on the other hand,it is supercritical then the probability of extinction is<1. Proof.By assumption,the generating function (t)is strictly convex,with strictly increasing first derivative and positive second derivative.Hence,if u=(1)<1 then there cannot be a root0s<1 of the Fixed-Point Equation=().This follows from the Mean Value theorem, whichimplies that if=()and 1=(1)then there would be apoint<<1where ()=1. Next,consider the case u>1.If po =0 then the Fixed-Point Equation has roots t=0 and t =1,and because (t)is strictly convex,there are no other positive roots.So suppose that po>0,so that (0)=po >0.Since (1)=u>1,Taylor's formula implies that o(t)<t for values of t 1 sufficiently near 1.Thus,(0)-0>0 and (t,)-t,<0 for some 0<t,1.By the Intermediate Value Theorem,there must exist(0,t)such that ()=0. 口 1.4.Tail of the Extinction Time Distribution.For both critical and subcritical Galton-Watson processes extinction is certain.However,critical and subcritical Galton-Watson processes dif- fer dramatically in certain respects,most notably in the distribution of the time to extinction. This is defined as follows: (7) t=min{n≥1:Zn=0}. Proposition 5.Let{Zn}nzo be a Galton-Watson process whose offspring distribution F has mean u<1 and variance o2<oo.Denote byt the extinction time.Then A)Ifμ<1 then there exists C=Cr∈(0,o)such that P{r>n}~Cμ"asn→oo. (B)Ifu=1 then P(t>n)~2/(o2n)as noo
BRANCHING PROCESSES 5 ζ ≥ 0, ϕn (0) ≤ ϕn (ζ) = ζ. Taking the limit of each side as n →∞ reveals that ξ ≤ ζ. It now behooves us to find out what we can about the roots of the Fixed-Point Equation (6). First, observe that there is always at least one nonnegative root, to wit, t = 1, this because ϕ(t ) is a probability generating function. Furthermore, since Assumption 1 guarantees that ϕ(t ) is strictly convex, roots of equation 6 must be isolated. The next proposition asserts that there are either one or two roots, depending on whether the mean number of offspring µ := P k k pk is greater than one. Definition 2. A Galton-Watson process with mean offspring numberµis said to be supercritical if µ > 1, critical if µ = 1, or subcritical if µ < 1. Proposition 4. Unless the offspring distribution is the degenerate distribution that puts mass 1 at k = 1, the Fixed-Point Equation (6) has either one or two roots. In the supercritical case, the Fixed-Point Equation has a unique root t = ζ < 1 less than one. In the critical and subcritical cases, the only root is t = 1. Together with Proposition 3 this implies that extinction is certain (that is, has probability one) if and only if the Galton-Watson process is critical or subcritical. If, on the other hand, it is supercritical then the probability of extinction is ζ < 1. Proof. By assumption, the generating function ϕ(t ) is strictly convex, with strictly increasing first derivative and positive second derivative. Hence, if µ = ϕ 0 (1) ≤ 1 then there cannot be a root 0 ≤ ζ < 1 of the Fixed-Point Equation ζ = ϕ(ζ). This follows from the Mean Value theorem, which implies that if ζ = ϕ(ζ) and 1 = ϕ(1)then there would be a point ζ < θ < 1 where ϕ 0 (θ ) = 1. Next, consider the case µ > 1. If p0 = 0 then the Fixed-Point Equation has roots t = 0 and t = 1, and because ϕ(t ) is strictly convex, there are no other positive roots. So suppose that p0 > 0, so that ϕ(0) = p0 > 0. Since ϕ 0 (1) = µ > 1, Taylor’s formula implies that ϕ(t ) < t for values of t < 1 sufficiently near 1. Thus, ϕ(0) − 0 > 0 and ϕ(t∗ ) − t∗ < 0 for some 0 < t∗ < 1. By the Intermediate Value Theorem, there must exist ζ ∈ (0,t∗ ) such that ϕ(ζ) − ζ = 0. 1.4. Tail of the Extinction Time Distribution. For both critical and subcritical Galton-Watson processes extinction is certain. However, critical and subcritical Galton-Watson processes differ dramatically in certain respects, most notably in the distribution of the time to extinction. This is defined as follows: (7) τ = min{n ≥ 1 : Zn = 0}. Proposition 5. Let {Zn }n≥0 be a Galton-Watson process whose offspring distribution F has mean µ ≤ 1 and variance σ2 <∞. Denote by τ the extinction time. Then (A) If µ < 1 then there exists C = CF ∈ (0,∞) such that P {τ > n} ∼ C µ n as n →∞. (B) If µ = 1 then P {τ > n} ∼ 2/(σ2n) as n →∞.