The bootstrap is the most recently developed method for estimating errors and other statistics. It requires far more computation, which modern computers can provide.
The term ``bootstrap'' derives from the phrase ``to pull oneself up by one's bootstrap'' (Adventures of Baron Munchausen, by Rudolph Erich Raspe).
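Example. Consider a sample (x_{1}, x_{2}, ..., x_{N}) in which each x_{i} is drawn, with replacement, from the empirical distribution of the original data. There are N^{N} possible samples (counting order), called the ideal bootstrap samples.
Consider a simple case when N = 2. The original sample (x_{1}, x_{2}) yields 2^{2} = 4 ideal bootstrap samples: (x_{1}, x_{1}), (x_{1}, x_{2}), (x_{2}, x_{1}), (x_{2}, x_{2}).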
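The enumeration can be illustrated with a short Python sketch. The function name `ideal_bootstrap_samples` is a hypothetical helper introduced here, and the sketch assumes that ordered samples are counted separately, which is what the N^{N} count implies.

```python
from itertools import product

def ideal_bootstrap_samples(sample):
    """Return every ordered resample of len(sample) points drawn with replacement."""
    return list(product(sample, repeat=len(sample)))

# N = 2: the four ideal bootstrap samples of (x1, x2)
samples = ideal_bootstrap_samples(["x1", "x2"])
print(samples)

# The count grows as N**N, so full enumeration is only feasible for small N
for n in range(1, 7):
    print(n, n ** n)
```

Already at N = 10 there are 10^{10} ordered samples, which is why full enumeration is rarely attempted.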
However, enumerating all ideal bootstrap samples becomes unrealistic as N grows, because the number of samples, and with it the computational load, increases as N^{N}. Therefore we normally use the Monte Carlo approach and evaluate only a random selection of bootstrap samples.
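The bootstrap estimate of standard error is the standard deviation of the bootstrap replications:
\[
\widehat{\mathrm{se}}_{B} = \left\{ \frac{1}{B-1} \sum_{b=1}^{B} \left[ \hat{\theta}^{*}(b) - \hat{\theta}^{*}(\cdot) \right]^{2} \right\}^{1/2}, \qquad (5)
\]
where B is the number of bootstrap samples, $\hat{\theta}^{*}(b)$ is the statistic evaluated on the b-th bootstrap sample, and $\hat{\theta}^{*}(\cdot) = \sum_{b=1}^{B} \hat{\theta}^{*}(b) / B$.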
Comparing (3) with (5), one finds that the factor (N-1)/N in the jackknife's s.e. formula is roughly N times larger than the 1/(B-1) factor of an ordinary standard deviation, as used in (5), for a comparable number of replications. This is called the inflation factor. It is needed because, unlike bootstrap samples, jackknife samples are very similar to the original sample, and therefore the differences between jackknife replications are small. One can consider the special case in which the statistic is the sample mean and verify (3).
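The special case can be worked out explicitly. The derivation below is a sketch that assumes (3) is the standard jackknife formula with the factor (N-1)/N, and writes $\hat{\theta}_{(i)}$ for the statistic computed with the i-th point left out and $\hat{\theta}_{(\cdot)}$ for the average of these values; this notation is an assumption, since (3) is not reproduced here.

```latex
% Jackknife replications of the sample mean
\hat{\theta}_{(i)} = \frac{N\bar{x} - x_{i}}{N-1}, \qquad
\hat{\theta}_{(\cdot)} = \frac{1}{N}\sum_{i=1}^{N} \hat{\theta}_{(i)} = \bar{x}, \qquad
\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} = \frac{\bar{x} - x_{i}}{N-1}

% Substituting into the jackknife s.e. formula
\widehat{\mathrm{se}}_{\mathrm{jack}}
  = \left\{ \frac{N-1}{N} \sum_{i=1}^{N} \left[ \hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)} \right]^{2} \right\}^{1/2}
  = \left\{ \frac{1}{N(N-1)} \sum_{i=1}^{N} (x_{i} - \bar{x})^{2} \right\}^{1/2}
```

With the factor (N-1)/N the jackknife therefore reproduces the usual standard error of the mean; each replication differs from the average by only a fraction 1/(N-1) of the original deviation, and the inflation factor compensates for this.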
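Suppose $\hat{\theta}$ is the mean $\bar{x}$. In this case, standard probability theory tells us that as B gets very large, formula (5) approaches
\[
\left\{ \frac{1}{N^{2}} \sum_{i=1}^{N} (x_{i} - \bar{x})^{2} \right\}^{1/2}.
\]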
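This limit can be checked numerically for a small sample. The sketch below assumes the limiting value is the plug-in expression {sum of (x_{i} - mean)^2 / N^2}^{1/2}, and compares it with the standard deviation of the mean over all N^{N} ideal bootstrap samples, which is what an infinitely large B would sample uniformly. The function names and the test data are hypothetical.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

def ideal_bootstrap_se_of_mean(sample):
    """Standard deviation of the mean over all N**N equally likely resamples."""
    n = len(sample)
    means = [fmean(s) for s in product(sample, repeat=n)]
    return pstdev(means)  # population form: every ideal sample is included once

def limiting_se_of_mean(sample):
    """Plug-in value sqrt(sum (x_i - xbar)^2 / N^2)."""
    n = len(sample)
    xbar = fmean(sample)
    return sqrt(sum((x - xbar) ** 2 for x in sample)) / n

x = [2.0, 3.0, 5.0, 7.0, 11.0]  # arbitrary illustrative data
print(ideal_bootstrap_se_of_mean(x), limiting_se_of_mean(x))
```

The two numbers should agree to rounding error. Note that the limit divides by N^{2} rather than N(N-1), so it is slightly smaller than the usual standard error of the mean.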
The bootstrap algorithm for estimating standard errors (see Figure 2):
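A minimal Python sketch of the Monte Carlo procedure follows, under the assumption that it consists of three steps: draw B bootstrap samples with replacement, evaluate the statistic on each, and take the standard deviation of the replications. The function name `bootstrap_se`, its parameters, and the default B = 200 are choices made for this illustration only.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_boot=200, seed=0):
    """Monte Carlo bootstrap estimate of the standard error of `statistic`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replications = []
    for _ in range(n_boot):
        # Step 1: draw a bootstrap sample of size n with replacement
        resample = rng.choices(data, k=n)
        # Step 2: evaluate the statistic on the bootstrap sample
        replications.append(statistic(resample))
    # Step 3: standard deviation of the replications (divisor n_boot - 1)
    return statistics.stdev(replications)

data = [float(v) for v in range(10)]  # arbitrary illustrative data
print(bootstrap_se(data, statistics.fmean, n_boot=2000))
print(bootstrap_se(data, statistics.median, n_boot=2000))
```

Because the statistic is passed in as a function, the same routine applies to the mean, the median, or any other estimator for which no closed-form error formula is at hand.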
Other properties of bootstrap:
The payoff for heavy computation:
Example 1: Diurnal variation of TIPP. Figure 3 shows the diurnal variation of trans-ionospheric pulse pairs (TIPPs) detected by the Blackbeard instrument aboard the ALEXIS spacecraft. The data are six-hour running averages centered on each hour. The s.e. of the mean calculated by the bootstrap method shows the statistical significance of diurnal variation.
Figure 3. Diurnal variation of TIPP detection. Data are shown for (a) central Africa, (b) Indonesia, and (c) North America (From Zuelsdorf et al. [1998]).
Example 2: Error estimation in the minimum variance analysis. Table 1 presents the errors of n_{x}, n_{y} and B_{n} in the minimum variance analysis problem, where n is the minimum variance unit vector.
The magnetic field data set is artificially generated, and the corresponding error that follows from the formulation is denoted as the ``true'' error.
Next to the true errors are the error estimates determined by the bootstrap method. The two columns on the right are the error estimates from the method of Sonnerup [1971] and from its modified version given by Kawano and Higuchi [1995].
Of the three sets of estimates, the bootstrap error estimates agree best with the ``true'' errors.
Bootstrap software in S or S-PLUS is available via ftp from lib.stat.cmu.edu; log in with the username statlib.