Jackknife resampling

In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation. The jackknife predates other common resampling methods such as the bootstrap. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset, calculating the estimate on the remaining observations, and then averaging these calculations. Given a sample of size $n$, the jackknife estimate is found by aggregating the estimates from each of the $n$ subsamples of size $n-1$.

The jackknife technique was developed by Maurice Quenouille (1924–1973) in 1949 and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jackknife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems, even though specific problems may be solved more efficiently with a purpose-designed tool.^[1]

The jackknife is a linear approximation of the bootstrap.^[1]

Estimation

The jackknife estimate of a parameter can be found by estimating the parameter for each subsample omitting the $i$-th observation.^[2] For example, if the parameter to be estimated is the population mean of $x$, we compute the mean ${\bar {x}}_{i}$ for each subsample consisting of all but the $i$-th data point:

{\bar {x}}_{i}={\frac {1}{n-1}}\sum _{j=1,j\neq i}^{n}x_{j},\quad \quad i=1,\dots ,n.

These $n$ estimates approximate the sampling distribution of the statistic, that is, the distribution it would have if it were computed over a large number of samples. In particular, the mean of this sampling distribution is estimated by the average of these $n$ estimates:

{\bar {x}}={\frac {1}{n}}\sum _{i=1}^{n}{\bar {x}}_{i}.

A jackknife estimate of the variance of the estimator can be calculated from the variance of this distribution of ${\bar {x}}_{i}$:^[3]^[4]

\operatorname {Var} ({\bar {x}})={\frac {n-1}{n}}\sum _{i=1}^{n}({\bar {x}}_{i}-{\bar {x}})^{2}.
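The three formulas above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `jackknife_mean_variance` and the sample values are assumptions made up for the example.

```python
import numpy as np

def jackknife_mean_variance(x):
    """Return the leave-one-out means, their average, and the jackknife variance."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Mean of each subsample that omits the i-th observation.
    loo_means = (x.sum() - x) / (n - 1)
    # Average of the n leave-one-out means.
    center = loo_means.mean()
    # Jackknife variance estimate: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * np.sum((loo_means - center) ** 2)
    return loo_means, center, variance

# Made-up illustrative data (not from the article).
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
loo_means, center, variance = jackknife_mean_variance(data)
```

For the sample mean specifically, the average of the leave-one-out means coincides with the ordinary sample mean, and the jackknife variance reduces to the familiar $s^{2}/n$, where $s^{2}$ is the sample variance with divisor $n-1$. This makes the mean a convenient sanity check for an implementation.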

Bias estimation and correction

The jackknife technique can be used to estimate the bias of an estimator calculated over the entire sample. Let $\hat{\theta}$ be the estimate of the parameter of interest based on all $n$ observations, and let

{\hat {\theta }}_{{\mathrm {(.)}}}={\frac {1}{n}}\sum _{{i=1}}^{n}{\hat {\theta }}_{{\mathrm {(i)}}}

where ${\hat {\theta }}_{{\mathrm {(i)}}}$ is the estimate of interest based on the sample with the i-th observation removed, and ${\hat {\theta }}_{{\mathrm {(.)}}}$ is the average of these "leave-one-out" estimates. The jackknife estimate of the bias of $\hat{\theta}$ is given by:

{\widehat {\text{Bias}}}_{\mathrm {(\theta )} }=(n-1)({\hat {\theta }}_{\mathrm {(.)} }-{\hat {\theta }})

and the resulting bias-corrected jackknife estimate of $\theta$ is given by:

{\hat {\theta }}_{\text{Jack}}=n{\hat {\theta }}-(n-1){\hat {\theta }}_{\mathrm {(.)} }

This removes the bias entirely in the special case where the bias is exactly of order $n^{-1}$, and reduces it to $O(n^{-2})$ in other cases.^[1]
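As a hedged illustration of the bias formulas, the sketch below applies them to the plug-in variance estimator (divisor $n$), whose bias is exactly of order $n^{-1}$. The helper names `jackknife_bias_correct` and `plug_in_variance`, and the sample values, are assumptions introduced for this example only.

```python
import numpy as np

def jackknife_bias_correct(x, statistic):
    """Return the jackknife bias estimate and the bias-corrected estimate."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Estimate based on all n observations.
    theta_hat = statistic(x)
    # Leave-one-out estimates; np.delete returns a copy without the i-th value.
    theta_loo = np.array([statistic(np.delete(x, i)) for i in range(n)])
    # Average of the leave-one-out estimates.
    theta_dot = theta_loo.mean()
    bias = (n - 1) * (theta_dot - theta_hat)
    theta_jack = n * theta_hat - (n - 1) * theta_dot
    return bias, theta_jack

def plug_in_variance(x):
    """Biased variance estimator that divides by n instead of n - 1."""
    return np.mean((x - np.mean(x)) ** 2)

# Made-up illustrative data (not from the article).
data = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])
bias, theta_jack = jackknife_bias_correct(data, plug_in_variance)
```

In this case the corrected estimate coincides with the usual unbiased sample variance (divisor $n-1$), which is an instance of the bias being removed entirely. For estimators with higher-order bias terms, the correction would only be approximate.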

Notes

1. Cameron & Trivedi 2005, p. 375.
2. Efron 1982, p. 2.
3. Efron 1982, p. 14.
4. McIntosh, Avery I. "The Jackknife Estimation Method" (PDF). Boston University. Retrieved 2016-04-30. p. 3.

References

Cameron, Adrian; Trivedi, Pravin K. (2005). Microeconometrics: Methods and Applications. Cambridge; New York: Cambridge University Press. ISBN 9780521848053.
Efron, Bradley; Stein, Charles (May 1981). "The Jackknife Estimate of Variance". The Annals of Statistics. 9 (3): 586–596. doi:10.1214/aos/1176345462. JSTOR 2240822.
Efron, Bradley (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia, PA: Society for Industrial and Applied Mathematics. ISBN 9781611970319.
Quenouille, Maurice H. (September 1949). "Problems in Plane Sampling". The Annals of Mathematical Statistics. 20 (3): 355–375. doi:10.1214/aoms/1177729989. JSTOR 2236533.
Quenouille, Maurice H. (1956). "Notes on Bias in Estimation". Biometrika. 43 (3–4): 353–360. doi:10.1093/biomet/43.3-4.353. JSTOR 2332914.
Tukey, John W. (1958). "Bias and confidence in not quite large samples (abstract)". The Annals of Mathematical Statistics. 29 (2): 614. doi:10.1214/aoms/1177706647.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
