Example: Consider the random experiment E of tossing a fair coin twice and observing the faces turning up. The sample space of E is S = {(HT),(TH),(HH),(TT)}, with (HT),(TH),(HH),(TT) being the elementary events belonging to S. The second ingredient of E relates to (b) and in particular to the various forms events can take. A moment's reflection suggests that there is no particular reason why we should be interested in elementary outcomes only. We might be interested in such events as A1–'at least one H', A2–'at most one H', and these are not elementary events; in particular A1 = {(HT),(TH),(HH)} and A2 = {(HT),(TH),(TT)} are combinations of elementary events. All such outcomes are called events associated with the same sample space S, and they are defined by combining elementary events. Understanding the concept of an event is crucial for the discussion which follows. Intuitively, an event is any proposition associated with E which may or may not occur at each trial. We say that event A1 occurs when any one of the elementary events it comprises occurs. Thus, when a trial is made only one elementary event is observed, but a large number of events may have occurred. For example, if the elementary event (HT) occurs in a particular trial, A1 and A2 have occurred as well. Given that S is a set whose members are the elementary events, this takes us immediately into the realm of set theory, and events can be formally defined to be subsets of S formed by set-theoretic operations ("∩"-intersection, "∪"-union, "−"-complementation) on the elementary events. For example, A1 = {(HT)} ∪ {(TH)} ∪ {(HH)} = S − {(TT)} ⊂ S, A2 = {(HT)} ∪ {(TH)} ∪ {(TT)} = S − {(HH)} ⊂ S.
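The set-theoretic view of events above can be sketched directly in code. In this minimal sketch the outcome labels ("HT", "TH", etc.) are illustrative choices, not notation from the text:

```python
# Sample space of two coin tosses, with elementary events as strings.
S = frozenset({"HT", "TH", "HH", "TT"})

# Events built by set-theoretic operations on elementary events:
A1 = frozenset({"HT"}) | frozenset({"TH"}) | frozenset({"HH"})  # 'at least one H'
A2 = frozenset({"HT"}) | frozenset({"TH"}) | frozenset({"TT"})  # 'at most one H'

# A1 is the complement of {TT} in S, and A2 the complement of {HH}:
assert A1 == S - frozenset({"TT"})
assert A2 == S - frozenset({"HH"})

# If the elementary event (HT) occurs in a trial, both A1 and A2 occur:
outcome = "HT"
assert outcome in A1 and outcome in A2
```

Note how a single observed elementary event makes several events occur at once, exactly as described in the text.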
Two special events are S itself, called the sure event, and the impossible event ∅, defined to contain no elements of S, i.e. ∅ = { }; the latter is defined for completeness. A third ingredient of E associated with (b) which Kolmogorov had to formalize was the idea of uncertainty related to the outcome of any particular trial of E. This he formalized in the notion of probabilities attributed to the various events associated with E, such as P(A1), P(A2), expressing the "likelihood" of occurrence of these events. Although attributing probabilities to the elementary events presents no particular mathematical problem, doing the same for events in general is not as straightforward. The difficulty arises because if A1 and A2 are events, Ā1 = S − A1, Ā2 = S − A2, A1 ∩ A2, A1 ∪ A2, etc., are also events, because the occurrence or non-occurrence of A1 and A2 implies the occurrence or not of these events. This implies that for the attribution of probabilities to make sense we have to impose some mathematical structure on the set of all events, say F, which reflects the fact that whichever way we combine these events, the end result is always an event. The temptation at this stage is to define F to be the set of all subsets of S, called the power set; surely, this covers all possibilities! In the above example, the power set of S takes the form
F = {S, ∅, {(HT)}, {(TH)}, {(HH)}, {(TT)}, {(HT),(TH)}, {(HT),(HH)}, {(HT),(TT)}, {(TH),(HH)}, {(TH),(TT)}, {(HH),(TT)}, {(HT),(TH),(HH)}, {(HT),(TH),(TT)}, {(TH),(HH),(TT)}, {(HT),(HH),(TT)}}.
Sometimes we are not interested in all the subsets of S; we then need to define such a set independently of the power set by endowing it with a mathematical structure which ensures that no inconsistencies arise. This is achieved by requiring that F in what follows has a special mathematical structure: it is a σ-field related to S.
Definition 3: Let F be a set of subsets of S.
F is called a σ-field if:
(a) if A ∈ F, then Ā ∈ F – closure under complementation;
(b) if A_i ∈ F, i = 1, 2, ..., then (∪_{i=1}^{∞} A_i) ∈ F – closure under countable unions.
Note that (a) and (b) taken together imply the following:
(c) S ∈ F, because A ∪ Ā = S;
(d) ∅ ∈ F (from (c), S̄ = ∅ ∈ F); and
(e) if A_i ∈ F, i = 1, 2, ..., then (∩_{i=1}^{∞} A_i) ∈ F.
These suggest that a σ-field is a set of subsets of S which is closed under complementation, countable unions and intersections. That is, any of these operations on the elements of F will give rise to an element of F.
Example: If we are interested only in events such as 'one of each of H and T', there is no point in defining the σ-field to be the power set; the collection F_c below does as well, with fewer events to attribute probabilities to:
F_c = {{(HT),(TH)}, {(HH),(TT)}, S, ∅}.
Exercise: Check whether the set F_1 = {{(HT)}, {(TH),(HH),(TT)}, S, ∅} is a σ-field or not.
Let us turn our attention to the various collections of events (σ-fields) that are relevant for econometrics.
Definition 4: The Borel σ-field B is the smallest collection of sets (called the Borel sets) that includes:
(a) all open sets of R;
(b) the complement B̄ of any B in B;
(c) the union ∪_{i=1}^{∞} B_i of any sequence {B_i} of sets in B.
The Borel sets of R just defined are said to be generated by the open sets of R. The same Borel sets would be generated by all the open half-lines of R, all the closed half-lines of R, all the open intervals of R, or all the closed intervals of R. The Borel sets are a "rich" collection of events for which probabilities can be defined. To see how the Borel σ-field contains almost every conceivable subset of R
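Property (e) above follows from (a) and (b) via De Morgan's law, which expresses an intersection through complements and a union. A small sketch on the coin-tossing sample space (the particular sets A and B are illustrative):

```python
S = frozenset({"HT", "TH", "HH", "TT"})
A = frozenset({"HT", "TH"})
B = frozenset({"TH", "HH"})

# De Morgan: the intersection of A and B equals the complement of the
# union of their complements, so closure under (a) and (b) gives (e).
assert A & B == S - ((S - A) | (S - B))
print(A & B)   # frozenset({'TH'})
```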
from the closed half-lines, consider the following example.
Example: Let S be the real line R = {x : −∞ < x < ∞} and let the set of events of interest be J = {B_x : x ∈ R}, where B_x = {z : z ≤ x} = (−∞, x]. How can we construct a σ-field σ(J) on R from the events B_x? By definition B_x ∈ σ(J); then:
(1) Taking complements of B_x: B̄_x = {z : z ∈ R, z > x} = (x, ∞) ∈ σ(J);
(2) Taking countable unions of B_x: ∪_{n=1}^{∞} (−∞, x − (1/n)] = (−∞, x) ∈ σ(J);
(3) Taking complements of (2): R − (−∞, x) = [x, ∞) ∈ σ(J);
(4) From (3), for y > x, [y, ∞) ∈ σ(J);
(5) From (4), R − ((−∞, x] ∪ [y, ∞)) = (x, y) ∈ σ(J);
(6) ∩_{n=1}^{∞} (x − (1/n), x] = {x} ∈ σ(J).
This shows not only that σ(J) is a σ-field but that it includes almost every conceivable subset of R; that is, it coincides with the σ-field generated by the open sets of R, which we denote by B, i.e. σ(J) = B, the Borel field on R.
Having solved the technical problem of attributing probabilities to events by postulating the existence of a σ-field F associated with the sample space S, Kolmogorov went on to formalize the concept of probability itself.
Definition 5: A mapping P : F → [0, 1] is a probability measure on {S, F} provided that:
(a) P(∅) = 0;
(b) for any A ∈ F, P(Ā) = 1 − P(A);
(c) for any disjoint sequence {A_i} of sets in F (i.e., A_i ∩ A_j = ∅ for all i ≠ j), P(∪_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i).
Example:
Since {(HT)} ∩ {(HH)} = ∅,
P({(HT)} ∪ {(HH)}) = P({(HT)}) + P({(HH)}) = 1/4 + 1/4 = 1/2.
To summarize the argument so far, Kolmogorov formalized conditions (a) and (b) of the random experiment E in the form of the trinity (S, F, P(·)) comprising the set of all outcomes S – the sample space, a σ-field F of events related to S, and a probability function P(·) assigning probabilities to events in F. For the coin example, if we choose the σ-field generated by the event 'the first is H and the second is T', F = {{(HT)}, {(TH),(HH),(TT)}, ∅, S}, to be the σ-field of interest, P(·) is defined by
P(S) = 1, P(∅) = 0, P({(HT)}) = 1/4, P({(TH),(HH),(TT)}) = 3/4.
Because of its importance the trinity (S, F, P(·)) is given a name.
Definition 6: A sample space S endowed with a σ-field F and a probability measure P(·) is called a probability space. That is, we call the triple (S, F, P) a probability space.
As far as condition (c) of E is concerned, yet to be formalized, it will prove of paramount importance in the context of the limit theorems in Chapter 4.
2.2 Conditional Probability
So far we have considered probabilities of events on the assumption that no information is available relating to the outcome of a particular trial. Sometimes, however, additional information is available in the form of the known occurrence of some event A. For example, in the case of tossing a fair coin twice we might know that in the first trial it was heads. What difference does this information make to the original triple (S, F, P)? Firstly, knowing that the first trial was a head, the set of all possible outcomes now becomes S_A = {(HT),(HH)},