BOOTSTRAP CONFIDENCE INTERVALS 199

BCa method uses Monte Carlo bootstrapping to find $\tilde z$, as in (3.3) and (3.5), and then maps $\tilde z$ into an appropriate hypothesis-testing value $\hat z$ via formula (3.7). The ABC method also uses formula (3.7) [or, equivalently, (2.3)], but in order to avoid Monte Carlo computations it makes one further analytic approximation: $\tilde z$ itself, the point on the horizontal axis in Figure 3, is estimated from an Edgeworth expansion. The information needed for the Edgeworth expansion is obtained from the second derivatives (4.9)-(4.11).

5. BOOTSTRAP-t INTERVALS

The BCa formula strikes some people as complicated, and also "unbootstraplike," since the estimate $\hat a$ is not obtained directly from bootstrap replications. The bootstrap-t method, another bootstrap algorithm for setting confidence intervals, is conceptually simpler than BCa. The method was suggested in Efron (1979), but some poor numerical results reduced its appeal. Hall's (1988) paper showing the bootstrap-t's good second-order properties has revived interest in its use. Babu and Singh (1983) gave the first proof of second-order accuracy for the bootstrap-t.

Suppose that a data set $\mathbf{x}$ gives an estimate $\hat\theta(\mathbf{x})$ for a parameter of interest $\theta$, and also an estimate $\hat\sigma(\mathbf{x})$ for the standard deviation of $\hat\theta$. By analogy with Student's t-statistic, we define

(5.1)   $T = (\hat\theta - \theta)/\hat\sigma$

and let $T^{(\alpha)}$ indicate the $100\alpha$th percentile of $T$. The upper endpoint of an $\alpha$-level one-sided confidence interval for $\theta$ is

(5.2)   $\hat\theta - \hat\sigma T^{(1-\alpha)}$.

This assumes we know the $T$-percentiles, as in the usual Student's-t case where $T^{(\alpha)}$ is the percentile of a t-distribution. However, the $T$-percentiles are unknown in most situations.

The idea of the bootstrap-t is to estimate the percentiles of $T$ by bootstrapping. First, the distribution governing $\mathbf{x}$ is estimated, and the bootstrap data sets $\mathbf{x}^*$ are drawn from the estimated distribution, as in (2.1).
Each $\mathbf{x}^*$ gives both a $\hat\theta^*$ and a $\hat\sigma^*$, yielding

(5.3)   $T^* = (\hat\theta^* - \hat\theta)/\hat\sigma^*$,

a bootstrap replication of (5.1). A large number $B$ of independent replications gives estimated percentiles

(5.4)   $\hat T^{(\alpha)} = (B \cdot \alpha)$th ordered value of $\{T^*(b),\ b = 1, 2, \ldots, B\}$.

[So if $B = 2{,}000$ and $\alpha = 0.95$, then $\hat T^{(\alpha)}$ is the 1,900th ordered $T^*(b)$.] The $100\alpha$th bootstrap-t confidence endpoint $\hat\theta_T[\alpha]$ is defined to be

(5.5)   $\hat\theta_T[\alpha] = \hat\theta - \hat\sigma \hat T^{(1-\alpha)}$,

following (5.2).

Figure 4 relates to the correlation coefficient for the cd4 data. The left panel shows 2,000 normal-theory bootstrap replications of

(5.6)   $T = (\hat\theta - \theta)/\hat\sigma, \qquad \hat\sigma = (1 - \hat\theta^2)/\sqrt{20}$.

Each replication required drawing $((B_1^*, A_1^*), \ldots, (B_{20}^*, A_{20}^*))$ as in (2.1), computing $\hat\theta^*$ and $\hat\sigma^*$, and then calculating the bootstrap-t replication $T^* = (\hat\theta^* - \hat\theta)/\hat\sigma^*$. The percentiles $(\hat T^{(0.05)}, \hat T^{(0.95)})$ equalled $(-1.38, 2.62)$, giving a 0.90 central bootstrap-t interval of $(0.45, 0.87)$. This compares nicely with the exact interval $(0.47, 0.86)$ in Table 2.

Hall (1988) showed that the bootstrap-t limits are second-order accurate, as in (2.10). DiCiccio and Efron (1992) showed that they are also second-order correct (see Section 8).

Definition (2.17) uses the fact that $(1 - \hat\theta^2)/\sqrt{n}$ is a reasonable normal-theory estimate of standard error for $\hat\theta$. In most situations $\hat\sigma^*$ must be numerically computed for each bootstrap data set $\mathbf{x}^*$, perhaps using the delta method. This multiplies the bootstrap computations by a factor of at least $p + 1$, where $p$ is the number of parameters in the probability model for $\mathbf{x}$. The nonparametric bootstrap-t distribution on the right side of Figure 4 used $\hat\sigma^*$ equal to the nonparametric delta-method estimate.

The main disadvantage of both BCa and bootstrap-t is the large computational burden. This does not make much difference for the correlation coefficient, but it can become crucial for more complicated situations. The ABC method is particularly useful in complicated problems.
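The recipe (5.3)-(5.5) is short enough to sketch in code. Below is a minimal illustration for the sample mean, with $\hat\sigma = s/\sqrt{n}$; the data, seed, and function names are arbitrary choices of ours, not values from the paper.

```python
import math
import random
import statistics

random.seed(0)

def bootstrap_t_interval(x, alpha=0.90, B=2000):
    """Central bootstrap-t interval, following (5.3)-(5.5), for the mean of x."""
    n = len(x)
    theta_hat = statistics.fmean(x)
    sigma_hat = statistics.stdev(x) / math.sqrt(n)   # standard error of the mean
    t_star = []
    for _ in range(B):
        xs = random.choices(x, k=n)                  # bootstrap data set x*
        th = statistics.fmean(xs)
        se = statistics.stdev(xs) / math.sqrt(n)     # sigma-hat* recomputed for each x*
        t_star.append((th - theta_hat) / se)         # T* as in (5.3)
    t_star.sort()
    lo_i = int(B * (1 - alpha) / 2) - 1              # B = 2,000: the 100th ordered value
    hi_i = int(B * (1 + alpha) / 2) - 1              # B = 2,000: the 1,900th, as in (5.4)
    # (5.5): the lower endpoint uses the upper T percentile, and vice versa.
    return theta_hat - sigma_hat * t_star[hi_i], theta_hat - sigma_hat * t_star[lo_i]

data = [random.gauss(5.0, 1.0) for _ in range(20)]
lo, hi = bootstrap_t_interval(data)
```

Note the reversal of percentiles in the return statement: a large upper percentile of $T^*$ pulls the lower endpoint down, mirroring (5.5).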
More seriously, the bootstrap-t algorithm can be numerically unstable, resulting in very long confidence intervals. This is a particular danger in nonparametric situations. As a rough rule of thumb, the BCa intervals are more conservative than bootstrap-t, tending to stay, if anything, too close to the standard intervals as opposed to deviating too much.

Bootstrap-t intervals are not transformation invariant. The method seems to work better if $\theta$ is a translation parameter, such as a median or an expectation. A successful application of the type appears in Efron (1981, Section 9). Tibshirani (1988) proposed an algorithm for transforming $\theta$ to a more translation-like parameter $\phi = m(\theta)$, before applying the bootstrap-t method. Then the resulting interval is transformed back to the $\theta$ scale via $\theta = m^{-1}(\phi)$. See DiCiccio and Romano (1995, Section 2.b) or Efron and Tibshirani (1993, Section 12.6).

200 T. J. DICICCIO AND B. EFRON

[Fig. 4. Bootstrap-t distributions relating to $\theta$, the cd4 data correlation: (left) 2,000 normal-theory bootstrap replications of $T$ using $\hat\sigma^* = (1 - \hat\theta^{*2})/\sqrt{20}$; (right) 2,000 nonparametric bootstrap replications of $T$ using $\hat\sigma^*$ given by the nonparametric delta method; dashed lines show 5th and 95th percentiles.]

The bootstrap-t and BCa methods look completely different. However, surprisingly, the ABC method connects them. The ABC method was introduced as a non-Monte Carlo approximation to BCa, but it can also be thought of as an approximation to the bootstrap-t method. The relationships in (4.13) can be reversed to give the attained significance level (ASL) $\alpha$ for any observed data set. That is, we can find $\alpha$ such that $\hat\theta_{\mathrm{ABCq}}[\alpha]$ equals a hypothesized value $\theta$ for the parameter of interest:

(5.7)
$\theta \;\to\; \xi = (\theta - \hat\theta)/\hat\sigma$
$\;\to\; \lambda = 2\xi\big/\big[1 + (1 + 4\hat c_q \xi)^{1/2}\big]$
$\;\to\; w = 2\lambda\big/\big[1 + 2\hat a\lambda + (1 + 4\hat a\lambda)^{1/2}\big]$
$\;\to\; \alpha = \Phi(w - \hat z_0)$.

If the ABCq method works perfectly, then the ASL as defined by (5.7) will be uniformly distributed over [0, 1], so

(5.8)   $Z = \Phi^{-1}(\alpha)$

will be distributed as a $N(0,1)$ variate.

Notice that $T$ in (5.1) equals $-\xi$ in (5.7). The ABCq method amounts to assuming that

(5.9)   $h_{\hat a, \hat z_0, \hat c_q}(T) \sim N(0,1)$

for the transformation $h_{\hat a, \hat z_0, \hat c_q}$ defined by (5.7)-(5.8). In other words, ABCq uses an estimated transformation of $T$ to get a pivotal quantity. The bootstrap-t method assumes that $T$ itself is pivotal, but then finds the pivotal distribution by bootstrapping. The calibration method discussed in Section 7 uses both an estimated transformation and bootstrapping, with the result being still more accurate intervals.

6. NONPARAMETRIC CONFIDENCE INTERVALS

The BCa, bootstrap-t, and ABC methods can be applied to the construction of nonparametric confidence intervals.
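Viewed as an algorithm, (5.7) is just a chain of four scalar maps. A sketch follows; the constants supplied for $\hat a$, $\hat z_0$ and $\hat c_q$ are illustrative made-up values (in practice these come from the ABC calculations of Section 4), and the function names are ours.

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def abcq_asl(theta, theta_hat, sigma_hat, a_hat, z0_hat, cq_hat):
    """Attained significance level for a hypothesized theta, via the chain (5.7).

    Requires 1 + 4*cq_hat*xi >= 0 and 1 + 4*a_hat*lam >= 0
    (true for the small |xi| and small constants typical in practice)."""
    xi = (theta - theta_hat) / sigma_hat                  # note xi = -T of (5.1)
    lam = 2.0 * xi / (1.0 + math.sqrt(1.0 + 4.0 * cq_hat * xi))
    w = 2.0 * lam / (1.0 + 2.0 * a_hat * lam + math.sqrt(1.0 + 4.0 * a_hat * lam))
    return Phi(w - z0_hat)

# Illustrative constants, not values from the paper:
alpha = abcq_asl(theta=0.5, theta_hat=0.7, sigma_hat=0.1,
                 a_hat=0.05, z0_hat=0.05, cq_hat=0.05)
```

As a sanity check, when $\theta = \hat\theta$ and $\hat z_0 = 0$ the chain gives $\xi = \lambda = w = 0$ and hence $\alpha = \Phi(0) = 0.5$, as it should.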
Here we will discuss the one-sample nonparametric situation where the observed data $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ are a random sample from an arbitrary probability distribution $F$,

(6.1)   $x_1, x_2, \ldots, x_n \sim_{\mathrm{i.i.d.}} F$.

The sample space $\mathcal{X}$ of the distribution can be anything at all; $\mathcal{X}$ is the two-dimensional Euclidean space $\mathcal{R}^2$ in (1.7) and on the right side of Table 1, and is an extended version of $\mathcal{R}^5$ in the missing-value example below. Multisample nonparametric problems are mentioned briefly at the end of this section.

The empirical distribution $\hat F$ puts probability $1/n$ on each sample point $x_i$ in $\mathbf{x}$. A real-valued parameter of interest $\theta = t(F)$ has the nonparametric estimate

(6.2)   $\hat\theta = t(\hat F)$,

also called the nonparametric maximum likelihood estimate. A nonparametric bootstrap sample $\mathbf{x}^* = (x_1^*, x_2^*, \ldots, x_n^*)$ is a random sample of size $n$ drawn from $\hat F$,

(6.3)   $x_1^*, x_2^*, \ldots, x_n^* \sim_{\mathrm{i.i.d.}} \hat F$.

In other words, $\mathbf{x}^*$ equals $(x_{j_1}, x_{j_2}, \ldots, x_{j_n})$, where $j_1, j_2, \ldots, j_n$ is a random sample drawn with replacement from $\{1, 2, \ldots, n\}$. Each bootstrap sample gives a nonparametric bootstrap replication of $\hat\theta$,

(6.4)   $\hat\theta^* = t(\hat F^*)$,

where $\hat F^*$ is the empirical distribution of $\mathbf{x}^*$.

Nonparametric BCa confidence intervals for $\theta$ are constructed the same way as the parametric intervals of Section 2. A large number of independent bootstrap replications $\hat\theta^*(1), \hat\theta^*(2), \ldots, \hat\theta^*(B)$ are drawn according to (4.3)-(4.4), $B \approx 2{,}000$, giving a bootstrap cumulative distribution function $\hat G(c) = \#\{\hat\theta^*(b) < c\}/B$. The BCa endpoints $\hat\theta_{\mathrm{BC}_a}[\alpha]$ are then calculated from formula (2.3), plugging in nonparametric estimates of $z_0$ and $a$.

Formula (2.8) gives $\hat z_0$, which can also be obtained from a nonparametric version of (4.12). The acceleration $a$ is estimated using the empirical influence function of the statistic $\hat\theta = t(\hat F)$,

(6.5)   $U_i = \lim_{\varepsilon \to 0} \dfrac{t((1-\varepsilon)\hat F + \varepsilon\delta_i) - t(\hat F)}{\varepsilon}, \qquad i = 1, 2, \ldots, n$.

Here $\delta_i$ is a point mass on $x_i$, so $(1-\varepsilon)\hat F + \varepsilon\delta_i$ is a version of $\hat F$ putting extra weight on $x_i$ and less weight on the other points. The usual nonparametric delta-method estimate of standard error is $[\sum_{i=1}^n U_i^2/n^2]^{1/2}$, this being the value used in our examples of the standard interval (1.1).

The estimate of $a$ is

(6.6)   $\hat a = \dfrac{1}{6}\,\dfrac{\sum_{i=1}^n U_i^3}{\big(\sum_{i=1}^n U_i^2\big)^{3/2}}$.

This looks completely different than (4.9), but in fact it is the same formula, applied here in a multinomial framework appropriate to the nonparametric situation. The similarity of (6.6) to a skewness reflects the relationship of $\hat a$ to the skewness of the score function, (3.10).
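For a concrete instance of (6.5)-(6.6), take $t(F)$ to be the mean. There the limit in (6.5) works out exactly to $U_i = x_i - \bar x$ (a standard closed form; the derivation is ours, not spelled out in the paper), so the influence values, the delta-method standard error, and $\hat a$ can all be computed directly. The toy data below are placeholders.

```python
import math
import statistics

def influence_values_mean(x):
    """Empirical influence components (6.5) for the sample mean.

    For t(F) = E_F[X], t((1-eps)F-hat + eps*delta_i) = (1-eps)*xbar + eps*x_i,
    so differentiating at eps = 0 gives U_i = x_i - xbar exactly."""
    xbar = statistics.fmean(x)
    return [xi - xbar for xi in x]

def delta_method_se(U, n):
    """Nonparametric delta-method standard error [sum U_i^2 / n^2]^(1/2)."""
    return math.sqrt(sum(u * u for u in U)) / n

def a_hat(U):
    """Acceleration estimate (6.6): one-sixth of a skewness-like ratio."""
    return sum(u ** 3 for u in U) / (6.0 * sum(u * u for u in U) ** 1.5)

x = [3.1, 1.4, 4.1, 5.9, 2.6, 5.3, 5.8, 9.7, 9.3, 2.3]   # toy data
U = influence_values_mean(x)
se = delta_method_se(U, len(x))
a = a_hat(U)
```

For the mean, the delta-method value reduces to the familiar plug-in standard error of the mean, and $\hat a$ is one-sixth the sample skewness ratio, matching the remark after (6.6).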
The connection of nonparametric confidence intervals with multinomial estimation problems appears in Efron (1987, Sections 7 and 8).

There is a simpler way to calculate the $U_i$ and $\hat a$. Instead of (6.5) we can use the jackknife influence function

(6.7)   $U_i = (n-1)\big(\hat\theta_{(\cdot)} - \hat\theta_{(i)}\big)$

in (6.6), where $\hat\theta_{(i)}$ is the estimate of $\theta$ based on the reduced data set $\mathbf{x}_{(i)} = (x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$ and $\hat\theta_{(\cdot)} = \sum_{i=1}^n \hat\theta_{(i)}/n$. This makes it a little easier to calculate the BCa limits, since the statistic $\hat\theta(\mathbf{x})$ does not have to be reprogrammed in the functional form $\hat\theta = t(\hat F)$.

The nonparametric BCa method is unfazed by complicated sample spaces. Table 5 shows an artificial missing-data example discussed in Efron (1994). Twenty-two students have each taken five exams, labelled A, B, C, D, E, but some of the A and E scores (marked "?") are missing. If there were no missing data, we would consider the rows of the matrix to be a random sample of size $n = 22$ from an unknown five-dimensional distribution $F$. Our goal is to estimate

(6.8)   $\theta$ = maximum eigenvalue of $\Sigma$,

where $\Sigma$ is the covariance matrix of $F$.

An easy way, though not necessarily the best way, to fill in Table 5 is to fit a standard two-way additive model $\nu + \alpha_i + \beta_j$ to the non-missing scores by least squares, and then to replace the missing values

Table 5. Twenty-two students have each taken five exams, labelled A, B, C, D, E. Some of the scores for A and E (indicated by "?") are missing. Original data set from Mardia, Kent and Bibby (1979)

Student   A   B   C   D   E
   1      ?  63  65  70  63
   2     53  61  72  64  73
   3     51  67  65  65   ?
   4      ?  69  53  53  53
   5      ?  69  61  55  45
   6      ?  49  62  63  62
   7     44  61  52  62   ?
   8     49  41  61  49   ?
   9     30  69  50  52  45
  10      ?  59  51  45  51
  11      ?  40  56  54   ?
  12     42  60  54  49   ?
  13      ?  63  53  54   ?
  14      ?  55  59  53   ?
  15      ?  49  45  48   ?
  16     17  53  57  43  51
  17     39  46  46  32   ?
  18     48  38  41  44  33
  19     46  40  47  29   ?
  20     30  34  43  46  18
  21      ?  30  32  35  21
  22      ?  26  15  20   ?
$x_{ij}$ by

(6.9)   $\hat x_{ij} = \hat\nu + \hat\alpha_i + \hat\beta_j$.

The filled-in $22 \times 5$ data matrix has rows $\hat{\mathbf{x}}_i$, $i = 1, 2, \ldots, 22$, from which we can calculate an empirical covariance matrix

(6.10)   $\hat\Sigma = \dfrac{1}{22}\sum_{i=1}^{22} (\hat{\mathbf{x}}_i - \hat{\boldsymbol\mu})(\hat{\mathbf{x}}_i - \hat{\boldsymbol\mu})', \qquad \hat{\boldsymbol\mu} = \dfrac{1}{22}\sum_{i=1}^{22} \hat{\mathbf{x}}_i$,

giving the point estimate

(6.11)   $\hat\theta$ = maximum eigenvalue of $\hat\Sigma$ = 633.2.

How accurate is $\hat\theta$? It is easy to carry out a nonparametric BCa analysis. The "points" $x_i$ in the data set $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, $n = 22$, are the rows of Table 5, for instance $x_{22} = (?, 26, 15, 20, ?)$. A bootstrap data set $\mathbf{x}^* = (x_1^*, x_2^*, \ldots, x_n^*)$ is a $22 \times 5$ data matrix, each row of which has been randomly selected from the rows of Table 5. Having selected $\mathbf{x}^*$, the bootstrap replication $\hat\theta^*$ is computed by following the same steps (6.9)-(6.11) that gave $\hat\theta$. Figure 5 is a histogram of 2,200 bootstrap replications $\hat\theta^*$, the histogram being noticeably long-tailed toward the right. The 0.90 BCa confidence interval for $\theta$ is

(6.12)   $\big(\hat\theta_{\mathrm{BC}_a}[0.05],\ \hat\theta_{\mathrm{BC}_a}[0.95]\big) = (379,\ 1{,}164)$,

extending twice as far to the right of $\hat\theta = 633.2$ as to the left.

[Fig. 5. Histogram of 2,200 nonparametric bootstrap replications of the maximum eigenvalue statistic for the student score data; bootstrap standard error estimate $\hat\sigma = 212.0$. The histogram is long-tailed to the right, and so is the BCa confidence interval (6.12).]

It is easy to extend the ABC method of Section 4 to nonparametric problems, greatly reducing the computational burden of the BCa intervals. The formulas are basically the same as in (4.9)-(4.14), but they simplify somewhat in the nonparametric-multinomial framework. The statistic is expressed in the functional form $\hat\theta = t(\hat F)$ and then reevaluated for values of $F$ very near $\hat F$, as in (6.5). The ABC limits require only $2n + 4$ reevaluations of the statistic. By comparison, the BCa method requires some 2,000 evaluations $\hat\theta^* = t(\hat F^*)$, where $\hat F^*$ is a bootstrap empirical distribution.
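The ad hoc recipe (6.9)-(6.11) can be sketched as follows. The least-squares fit of $\nu + \alpha_i + \beta_j$ uses a plain design-matrix solve with one level of each factor absorbed into the intercept; the small score matrix below is made-up toy data, not the Table 5 data, and the function names are ours.

```python
import numpy as np

def fill_additive(scores):
    """Fit nu + alpha_i + beta_j to the non-missing entries by least squares,
    then fill each missing entry (NaN) with its fitted value, as in (6.9)."""
    n, p = scores.shape
    rows, cols, y = [], [], []
    for i in range(n):
        for j in range(p):
            if not np.isnan(scores[i, j]):
                rows.append(i); cols.append(j); y.append(scores[i, j])
    # Design matrix: intercept + row effects + column effects, dropping the
    # last level of each factor to avoid aliasing with the intercept.
    X = np.zeros((len(y), 1 + (n - 1) + (p - 1)))
    X[:, 0] = 1.0
    for k, (i, j) in enumerate(zip(rows, cols)):
        if i < n - 1:
            X[k, 1 + i] = 1.0
        if j < p - 1:
            X[k, n + j] = 1.0
    coef, *_ = np.linalg.lstsq(X, np.array(y), rcond=None)
    filled = scores.copy()
    for i in range(n):
        for j in range(p):
            if np.isnan(filled[i, j]):
                fit = coef[0]                      # nu-hat
                if i < n - 1:
                    fit += coef[1 + i]             # alpha-hat_i
                if j < p - 1:
                    fit += coef[n + j]             # beta-hat_j
                filled[i, j] = fit
    return filled

def max_eigenvalue(filled):
    """theta-hat of (6.10)-(6.11): top eigenvalue of the empirical covariance."""
    n = filled.shape[0]
    centered = filled - filled.mean(axis=0)
    sigma = centered.T @ centered / n              # divide by n, as in (6.10)
    return np.linalg.eigvalsh(sigma)[-1]           # eigvalsh sorts ascending

# Toy 5-student x 3-exam matrix with two missing scores (NaN):
S = np.array([[63., 65., 70.],
              [53., np.nan, 72.],
              [51., 67., np.nan],
              [44., 69., 53.],
              [49., 61., 55.]])
theta_hat = max_eigenvalue(fill_additive(S))
```

The choice of dropped levels does not matter: any parameterization spanning the same column space produces identical fitted values, and only the fitted values enter (6.10).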
The nonparametric ABC algorithm "abcnon" was applied to the maximum eigenvalue statistic for the student score data. After 46 reevaluations of the statistic defined by (6.9)-(6.11), it gave the 0.90 central confidence interval

(6.13)   $\big(\hat\theta_{\mathrm{ABC}}[0.05],\ \hat\theta_{\mathrm{ABC}}[0.95]\big) = (379,\ 1{,}172)$,

nearly the same as (6.12). The Statlib program abcnon used here appears in the appendix to Efron (1994); Efron (1994) also applied abcnon to the full normal-theory MLE of $\theta$, (6.8), rather than to the ad hoc estimator (6.9)-(6.11). The resulting ABC interval (353, 1,307) was 20% longer than (6.13), perhaps undermining belief in the data's normality.

So far we have only discussed one-sample nonparametric problems. The $K$-sample nonparametric problem has data

(6.14)   $x_{k1}, x_{k2}, \ldots, x_{kn_k} \sim_{\mathrm{i.i.d.}} F_k \qquad \text{for } k = 1, 2, \ldots, K,$

for arbitrary probability distributions $F_k$ on possibly different sample spaces $\mathcal{X}_k$. The nonparametric MLE of a real-valued parameter of interest $\theta = t(F_1, F_2, \ldots, F_K)$ is

(6.15)   $\hat\theta = t(\hat F_1, \hat F_2, \ldots, \hat F_K)$,

where $\hat F_k$ is the empirical distribution corresponding to $\mathbf{x}_k = (x_{k1}, x_{k2}, \ldots, x_{kn_k})$.

It turns out that $K$-sample nonparametric confidence intervals can easily be obtained from either abcnon or bcanon, its nonparametric BCa counterpart. How to do so is explained in Remarks C and H of Efron (1994).

7. CALIBRATION

Calibration is a bootstrap technique for improving the coverage accuracy of any system of approximate confidence intervals. Here we will apply it to the nonparametric ABC intervals in Tables 2 and 3. The general theory is reviewed in Efron and Tibshirani (1993, Sections 18.3 and 25.6), following ideas of