
## Bootstrap

The bootstrap is the most recently developed method for estimating errors and other statistics, and it requires the much greater computing power that modern computers provide.

The term "bootstrap" derives from the phrase "to pull oneself up by one's bootstraps" (The Adventures of Baron Munchausen, by Rudolph Erich Raspe).

Example. Consider a sample $\mathbf{x} = (x_1, x_2, \dots, x_N)$, in which each $x_i$ is drawn from an empirical distribution $\hat{F}$. There are $N^N$ possible samples, called the ideal bootstrap samples.

Consider the simple case when N = 2. The original sample $(x_1, x_2)$ yields $2^2 = 4$ ideal bootstrap samples: $(x_1, x_1)$, $(x_1, x_2)$, $(x_2, x_1)$, $(x_2, x_2)$.

However, generating all ideal bootstrap samples becomes impractical as N grows, since the number of samples, $N^N$, makes the computation prohibitively heavy. Therefore we normally use the Monte Carlo approach.
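The contrast between the two approaches can be sketched in plain Python (the sample values below are arbitrary illustrations): enumerating all $N^N$ ideal bootstrap samples for a tiny N, versus drawing a modest number B of Monte Carlo samples with replacement for a larger N.

```python
import itertools
import random

# Ideal bootstrap samples: all N^N resamples of the original sample.
# For N = 2 the sample (x1, x2) yields 2^2 = 4 ideal bootstrap samples.
x = [1.0, 3.0]
ideal = list(itertools.product(x, repeat=len(x)))
print(ideal)  # [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]

# For large N, enumerating N^N samples is infeasible (here 100^100),
# so we draw a manageable number B of bootstrap samples at random,
# each of size N, with replacement.
random.seed(0)
x_large = list(range(100))
B = 200
mc_samples = [random.choices(x_large, k=len(x_large)) for _ in range(B)]
print(len(mc_samples), len(mc_samples[0]))  # 200 100
```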

The bootstrap estimate of standard error is the standard deviation of the bootstrap replications:

$$\widehat{\mathrm{se}}_B = \left[ \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right)^2 \right]^{1/2} \qquad (5)$$

where $\hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b)$.

Comparing (3) with (5), one finds that the factor in the jackknife s.e. formula is roughly N times larger than the corresponding factor in (5). This is called the inflation factor. The reason is that, unlike bootstrap samples, jackknife samples are very similar to the original sample, so the differences between jackknife replications are small and must be inflated. One can consider the special case when $\hat{\theta}$ is the mean $\bar{x}$ and verify (3).

Suppose $\hat{\theta}$ is the mean $\bar{x}$. In this case, standard probability theory tells us that as B gets very large, formula (5) approaches

$$\left[ \frac{1}{N^2} \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^{1/2},$$

the plug-in estimate of the standard error of the mean.

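This limiting behavior can be checked numerically: for the mean, a bootstrap estimate with large B should agree closely with the plug-in standard error computed directly from the sample. A minimal sketch, using synthetic Gaussian data (an assumption for illustration):

```python
import random
import math

random.seed(42)
N = 50
x = [random.gauss(0.0, 1.0) for _ in range(N)]
xbar = sum(x) / N

# Plug-in standard error of the mean: [sum (x_i - xbar)^2 / N^2]^(1/2)
se_plugin = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / N**2)

# Bootstrap estimate, formula (5), with a large B.
B = 20000
reps = []
for _ in range(B):
    xs = random.choices(x, k=N)        # resample with replacement
    reps.append(sum(xs) / N)           # replication: the resampled mean
rbar = sum(reps) / B
se_boot = math.sqrt(sum((r - rbar) ** 2 for r in reps) / (B - 1))

print(se_plugin, se_boot)  # the two values agree to within a few percent
```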
The bootstrap algorithm for estimating standard errors (see Figure 2):

1. Select B independent bootstrap samples $\mathbf{x}^{*1}, \mathbf{x}^{*2}, \dots, \mathbf{x}^{*B}$, each consisting of n data values drawn with replacement from $\mathbf{x}$.
2. Evaluate the bootstrap replication corresponding to each bootstrap sample,
$$\hat{\theta}^*(b) = s(\mathbf{x}^{*b}), \qquad b = 1, 2, \dots, B.$$
3. Estimate the s.e. by the sample standard deviation of the B replications,
$$\widehat{\mathrm{se}}_B = \left[ \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right)^2 \right]^{1/2},$$
where $\hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b)$.

 Figure 2. The bootstrap algorithm.
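The three steps above translate directly into a short function; the statistic is passed in as a callable so the same routine works for the mean, the median, or any other estimator (the example data below are arbitrary):

```python
import random
import math

def bootstrap_se(x, stat, B=200, seed=0):
    """Estimate the standard error of stat(x) by the bootstrap.

    1. Draw B bootstrap samples with replacement from x.
    2. Evaluate the replication stat(x*) for each sample.
    3. Return the sample standard deviation of the B replications.
    """
    rng = random.Random(seed)
    reps = [stat(rng.choices(x, k=len(x))) for _ in range(B)]
    mean_rep = sum(reps) / B
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (B - 1))

data = [2.1, 3.4, 1.9, 5.0, 2.8, 4.2, 3.3, 2.5]
se_mean = bootstrap_se(data, lambda s: sum(s) / len(s), B=500)
print(round(se_mean, 3))
```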

Other properties of bootstrap:

• Typical values of B, the number of bootstrap samples, are on the order of 50 to 200 for standard error estimation.
• Bootstrap methods can also assess more complicated accuracy measures, like biases, prediction errors, and confidence intervals.
• Bootstrap confidence intervals add another factor of 10 to the computational burden.
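As a sketch of why intervals cost more, the simple percentile method (one of several bootstrap confidence interval constructions) reads quantiles off the sorted replications, which requires B on the order of a couple of thousand rather than a couple of hundred. The data below are synthetic, for illustration only:

```python
import random

random.seed(1)
x = [random.gauss(10.0, 2.0) for _ in range(40)]

# Percentile interval: empirical 2.5% and 97.5% quantiles of the
# bootstrap replications of the mean.  Interval estimation typically
# needs B ~ 1000-2000, roughly 10x the B used for a standard error.
B = 2000
reps = sorted(sum(random.choices(x, k=len(x))) / len(x) for _ in range(B))
lo, hi = reps[int(0.025 * B)], reps[int(0.975 * B)]
print(lo, hi)  # an approximate 95% interval for the mean
```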

The payoff for heavy computation:

• an increase in the range of statistical problems that can be analyzed;
• a reduction in the assumptions required by the analysis;
• the elimination of the routine but tedious theoretical calculations usually associated with accuracy assessment.

Example 1: Diurnal variation of TIPP. Figure 3 shows the diurnal variation of trans-ionospheric pulse pairs (TIPPs) detected by the Blackbeard instrument aboard the ALEXIS spacecraft. The data are six-hour running averages centered on each hour. The s.e. of the mean, calculated by the bootstrap method, demonstrates the statistical significance of the diurnal variation.

 Figure 3. Diurnal variation of TIPP detection. Data are shown for (a) central Africa, (b) Indonesia, and (c) North America (From Zuelsdorf et al. [1998]).

Example 2: Error estimation in the minimum variance analysis. Table 1 presents the errors of $n_x$, $n_y$, and $B_n$ in the minimum variance analysis problem, where $\hat{\mathbf{n}}$ is the minimum variance unit vector. The magnetic field data set is artificially generated, and the corresponding error according to the formulation is denoted as the "true" error. Next to the true errors are the error estimates determined by the bootstrap method. The two columns on the right are the error estimates by Sonnerup [1971] and the modified version given by Kawano and Higuchi [1995]. The bootstrap error estimates are in best agreement with the "true" errors.

|         | "true" errors | bootstrap error estimates | Sonnerup's error estimates [1971] | modified version of Sonnerup [1971] |
|---------|---------------|---------------------------|-----------------------------------|-------------------------------------|
| $n_x$   | 0.013         | 0.014                     | 0.030                             | 0.035                               |
| $n_y$   | 0.047         | 0.047                     | 0.101                             | 0.119                               |
| $B_n$   | 1.9           | 2.0                       | 4.1                               | 4.8                                 |

Table 1. Error estimates in the minimum variance analysis.
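Minimum variance analysis takes the eigenvector of the magnetic field covariance matrix with the smallest eigenvalue as the normal direction. A sketch of how bootstrap errors for the normal components could be obtained, using a synthetic field time series (the variances and sample size below are assumptions for illustration, not the data behind Table 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic magnetic field time series: large variance along x and y,
# small variance along z, so the minimum variance direction should be
# close to (0, 0, 1).
n_pts = 400
B_field = np.column_stack([
    rng.normal(0.0, 5.0, n_pts),   # Bx: high variance
    rng.normal(0.0, 3.0, n_pts),   # By: medium variance
    rng.normal(0.0, 0.5, n_pts),   # Bz: low variance
])

def min_variance_normal(b):
    """Unit eigenvector of the covariance matrix with smallest eigenvalue."""
    cov = np.cov(b, rowvar=False)
    w, v = np.linalg.eigh(cov)      # eigenvalues in ascending order
    n = v[:, 0]
    return n if n[2] >= 0 else -n   # fix the sign convention

n_hat = min_variance_normal(B_field)

# Bootstrap: resample the time samples with replacement, recompute the
# normal each time, and take the std. dev. of each component.
B_reps = 500
reps = np.empty((B_reps, 3))
for b in range(B_reps):
    idx = rng.integers(0, n_pts, n_pts)
    reps[b] = min_variance_normal(B_field[idx])
se_nx, se_ny, se_nz = reps.std(axis=0, ddof=1)
print(n_hat, se_nx, se_ny)
```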

Bootstrap software in S or S-PLUS is available via ftp at lib.stat.cmu.edu if the username statlib is given.
