Ch. 5 Hypothesis Testing

The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s and early 1930s, complementing Fisher's work on estimation. As in estimation, we begin by postulating a statistical model, but instead of seeking an estimator of θ in Θ we ask whether θ ∈ Θ0 ⊂ Θ or θ ∈ Θ1 = Θ − Θ0 is better supported by the observed data. The discussion which follows will proceed in a similar way to the discussion of estimation, though less systematically and formally. This is due to the complexity of the topic, which arises mainly because one is asked to assimilate too many concepts too quickly just to be able to define the problem properly. This difficulty, however, is inherent in testing if any proper understanding of the topic is to be attempted, and is thus unavoidable.

1 Testing: Definition and Concepts

1.1 The Decision Rule

Let X be a random variable defined on the probability space (S, F, P(·)) and consider the statistical model associated with X:
(a) Φ = {f(x; θ), θ ∈ Θ};
(b) x = (X1, X2, ..., Xn)' is a random sample from f(x; θ).

The problem of hypothesis testing is one of deciding whether or not some conjecture about θ of the form "θ belongs to some subset Θ0 of Θ" is supported by the data x = (x1, x2, ..., xn)'. We call such a conjecture the null hypothesis and denote it by H0: θ ∈ Θ0, where if the sample realization x ∈ C0 we accept H0, and if x ∈ C1 we reject it. Since the observation space 𝒳 ⊆ R^n, but both the acceptance region C0 ⊆ R^1 and the rejection region C1 ⊆ R^1, we need a mapping from R^n to R^1. The mapping which enables us to define C0 and C1 we call a test statistic τ(x): 𝒳 → R^1.
Example: Let X be the random variable representing the marks achieved by students in an econometric theory paper and let the statistical model be:
(a)
$$\Phi = \left\{ f(x;\theta) = \frac{1}{8\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{x-\theta}{8}\right)^{2} \right],\ \theta \in \Theta \equiv [0, 100] \right\};$$
(b) x = (X1, X2, ..., Xn)', n = 40, is a random sample from Φ.

The hypothesis to be tested is H0: θ = 60 (i.e. X ∼ N(60, 64)), Θ0 = {60}, against H1: θ ≠ 60 (i.e. X ∼ N(µ, 64), µ ≠ 60), Θ1 = [0, 100] − {60}. Common sense suggests that if some 'good' estimator of θ, say $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$, takes a value 'around' 60 for the sample realization x, then we will be inclined to accept H0. Let us formalise this argument. The acceptance region takes the form 60 − ε < X̄n < 60 + ε, ε > 0, or C0 = {x : |X̄n − 60| < ε}, and C1 = {x : |X̄n − 60| ≥ ε} is the rejection region, so that the two regions do not overlap.

Formally, if x ∈ C1 (reject H0) and θ ∈ Θ0 (H0 is true) we commit a type I error; if x ∈ C0 (accept H0) and θ ∈ Θ1 (H0 is false) we commit a type II error.

The hypothesis to be tested is formally stated as follows: H0: θ ∈ Θ0, Θ0 ⊆ Θ. Against the null hypothesis H0 we postulate the alternative H1, which takes the form: H1: θ ∈ Θ1 ≡ Θ − Θ0.
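The acceptance rule of the example can be written as a small function. This is only a minimal sketch: the function name `decide`, the marks in `marks` and the value of ε passed to it are all hypothetical, chosen for illustration, and no claim is made here about how ε should be chosen.

```python
from statistics import fmean

THETA_0 = 60.0  # value of theta under H0, taken from the example


def decide(sample, eps):
    """Apply the rule: accept H0 when the sample mean lies within eps of 60.

    `eps` is a hypothetical input; the boundary case |mean - 60| == eps is
    assigned to the rejection region (it has probability zero under a
    continuous model).
    """
    xbar = fmean(sample)
    return "accept H0" if abs(xbar - THETA_0) < eps else "reject H0"


# Hypothetical marks, not data from the text.
marks = [55, 62, 58, 64, 61, 59, 66, 57]
print(decide(marks, eps=3.0))  # sample mean is 60.25, so this prints "accept H0"
```

The function maps the whole sample to a single number (the sample mean) before comparing it with a region on the real line, which is the role played by the test statistic τ(x).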
It is important to note at the outset that H0 and H1 are in effect hypotheses about the distribution of the sample f(x; θ), i.e.

H0: f(x; θ), θ ∈ Θ0;  H1: f(x; θ), θ ∈ Θ1.

In testing a null hypothesis H0 against an alternative H1, the issue is to decide whether the sample realization x 'supports' H0 or H1. In the former case we say that H0 is accepted, in the latter that H0 is rejected. In order to be able to make such a decision we need to formulate a mapping which relates Θ0 to some subset of the observation space 𝒳, say C0, which we call an acceptance region, and its complement C1 (C0 ∪ C1 = 𝒳, C0 ∩ C1 = ∅), which we call the rejection region.
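Because H0 and H1 are statements about the distribution of the sample, each value of θ assigns a probability to the rejection region C1. The sketch below computes that probability for the marks example. It assumes the standard result that the mean of n = 40 independent N(θ, 64) draws is distributed N(θ, 64/40); the function name `prob_reject`, the values of ε and the alternative θ = 63 are hypothetical choices made for illustration.

```python
from statistics import NormalDist

N = 40         # sample size from the example
SIGMA2 = 64.0  # known variance from the example


def prob_reject(theta, eps, theta0=60.0):
    """Pr(|Xbar - theta0| >= eps) when Xbar ~ N(theta, SIGMA2 / N).

    Assumes the sample mean of a normal random sample is itself normal
    with variance SIGMA2 / N.
    """
    xbar = NormalDist(mu=theta, sigma=(SIGMA2 / N) ** 0.5)
    prob_accept = xbar.cdf(theta0 + eps) - xbar.cdf(theta0 - eps)
    return 1.0 - prob_accept


# Hypothetical half-widths of the acceptance region.
for eps in (1.0, 2.0, 3.0):
    under_h0 = prob_reject(60.0, eps)   # probability of rejecting when theta = 60
    under_alt = prob_reject(63.0, eps)  # probability of rejecting when theta = 63
    print(f"eps={eps}: reject under theta=60: {under_h0:.3f}, under theta=63: {under_alt:.3f}")
```

Widening the acceptance region lowers the probability of landing in C1 under every θ, both when H0 is true and when it is false.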
1.2 Type I and Type II Errors

The next question is "how do we choose ε?" If ε is too small we run the risk of rejecting H0 when it is true; we call this a type I error. On the other hand, if ε is too large we run the risk of accepting H0 when it is false; we call this a type II error. That is, if we were to choose ε too small we would run a higher risk of committing a type I error than of committing a type II error, and vice versa. There is thus a trade-off between the probability α of a type I error, i.e.

Pr(x ∈ C1; θ ∈ Θ0) = α,

and the probability β of a type II error, i.e.

Pr(x ∈ C0; θ ∈ Θ1) = β.

Ideally we would like α = β = 0 for all θ ∈ Θ, which is not possible for a fixed n. Moreover, we cannot control both simultaneously because of the trade-off between them. The strategy adopted in hypothesis testing is to choose a small value of α and, for the given α, to minimize β. Formally, this amounts to choosing α* such that

Pr(x ∈ C1; θ ∈ Θ0) = α(θ) ≤ α* for θ ∈ Θ0,

and

Pr(x ∈ C0; θ ∈ Θ1) = β(θ) is minimized for θ ∈ Θ1

by choosing C1 or C0 appropriately. In the case of the above example, if we were to choose α* = 0.05, then

Pr(|X̄n − 60| ≥ ε; θ = 60) = 0.05.

"How do we determine ε, then?" The only random variable involved in the statement is X̄n, and hence the answer has to come from its sampling distribution. For the above
probabilistic statement to have any operational meaning, enabling us to determine ε, the distribution of X̄n must be known. In the present case we know that

$$\bar{X}_n \sim N\left(\theta, \frac{\sigma^2}{n}\right), \quad \text{where } \frac{\sigma^2}{n} = \frac{64}{40} = 1.6,$$

which implies that for θ = 60 (i.e. when H0 is true) we can 'construct' a test statistic τ(x) from the sample x such that

$$\tau(\mathbf{x}) = \frac{\bar{X}_n - \theta}{\sqrt{1.6}} = \frac{\bar{X}_n - 60}{\sqrt{1.6}} = \frac{\bar{X}_n - 60}{1.265} \sim N(0, 1),$$

and thus the distribution of τ(·) is known completely (no unknown parameters). When this is the case, this distribution can be used in conjunction with the above probabilistic statement to determine ε. In order to do this we need to relate |X̄n − 60| to τ(x), a statistic for which the distribution is known. The obvious way is to standardize the former. This suggests changing the above probabilistic statement to the equivalent statement

$$\Pr\left( \frac{|\bar{X}_n - 60|}{1.265} \geq c_\alpha;\ \theta = 60 \right) = 0.05, \quad \text{where } c_\alpha = \frac{\varepsilon}{1.265}.$$

The value of cα given by the N(0, 1) table is cα = 1.96. This in turn implies that the rejection region for the test is

$$C_1 = \left\{ \mathbf{x} : \frac{|\bar{X}_n - 60|}{1.265} \geq 1.96 \right\} = \{ \mathbf{x} : |\tau(\mathbf{x})| \geq 1.96 \}$$

or

$$C_1 = \{ \mathbf{x} : |\bar{X}_n - 60| \geq 2.48 \}.$$

That is, for sample realizations x which give rise to X̄n falling outside the interval (57.52, 62.48) we reject H0.

Let us summarize the argument so far. We set out to construct a test for H0: θ = 60 against H1: θ ≠ 60, and intuition suggested the rejection region (|X̄n − 60| ≥ ε). In order to determine ε we have to