
lab07

October 17, 2018

**0.1 Lab 7: Resampling and the Bootstrap**

The British Royal Air Force wanted to know how many warplanes the Germans had (some number N, which is a *population parameter*), and they needed to estimate that quantity knowing only a random sample of the planes’ serial numbers (from 1 to N). We know that the Germans’ warplanes are labeled consecutively from 1 to N, so N would be the total number of warplanes they have.

We normally investigate the random variation amongst our estimates by simulating a sampling procedure from the population many times and computing estimates from each sample that we generate. In real life, if the RAF had known what the population looked like, they would have known N and would not have had any reason to think about random sampling. However, they didn’t know what the population looked like, so they couldn’t have run the simulations that we normally do.
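To make the "simulate from the population" procedure concrete, here is a minimal sketch of what it looks like when N is known. Everything specific in it is an assumption for illustration: the population size `N = 300`, the sample size of 20, sampling without replacement, and the choice of estimator (twice the sample mean). It uses plain NumPy rather than the `datascience` tables used in the lab.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

N = 300            # hypothetical population size; the RAF would not have known this
sample_size = 20   # hypothetical number of observed serial numbers
population = np.arange(1, N + 1)  # serial numbers 1, 2, ..., N


def twice_mean(serials):
    """One possible estimator of N (an assumption here): twice the sample mean."""
    return 2 * np.mean(serials)


# Draw many samples from the known population and estimate N from each one.
estimates = np.array([
    twice_mean(rng.choice(population, size=sample_size, replace=False))
    for _ in range(5000)
])

# The spread of these estimates is the random variation we want to understand.
print("average estimate:", estimates.mean())
print("spread (SD) of estimates:", estimates.std())
```

Note that the very first step, building `population`, requires knowing N, which is exactly what the RAF lacked.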

Simulating a sampling procedure many times was a useful exercise in *understanding random variation* for an estimate, but it’s not as useful as a tool for practical data analysis.

Let’s flip that sampling idea on its head to make it practical. Given *just* a random sample of serial numbers, we’ll estimate N, and then we’ll use simulation to find out how accurate our estimate probably is, without ever looking at the whole population. This is an example of *statistical inference*.
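As a rough preview of that idea, the sketch below resamples from the sample itself instead of from the population. It is only an illustration under stated assumptions: the serial numbers in `observed` are made up, the estimator (twice the sample mean) is one possible choice rather than the lab's prescribed one, and the middle 95% of the resampled estimates is used as an informal accuracy range. It uses plain NumPy rather than the `datascience` tables used in the lab.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

# Made-up serial numbers standing in for the one sample we actually observed.
observed = np.array([47, 42, 57, 79, 26, 23, 36, 64, 83,
                     115, 108, 91, 12, 120, 5, 98, 71])


def twice_mean(serials):
    """One possible estimator of N (an assumption here): twice the sample mean."""
    return 2 * np.mean(serials)


# Our single estimate of N from the observed sample.
point_estimate = twice_mean(observed)

# Resample the observed sample with replacement, keeping the same size,
# and recompute the estimate each time. The population is never used.
resampled_estimates = np.array([
    twice_mean(rng.choice(observed, size=len(observed), replace=True))
    for _ in range(5000)
])

# Middle 95% of the resampled estimates, as a rough gauge of accuracy.
lower, upper = np.percentile(resampled_estimates, [2.5, 97.5])
print("estimate of N:", point_estimate)
print("rough 95% range:", lower, "to", upper)
```

Each resample has the same size as the original sample, so the variation among `resampled_estimates` mimics the variation we would expect from drawing new samples of that size.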

As usual, **run the cell below** to prepare the lab and the automatic tests.

In [1]:

    # Run this cell to set up the notebook, but please don't change it.

    # These lines import the Numpy and Datascience modules.
    import numpy as np
    from datascience import *

    # These lines do some fancy plotting magic.
    import matplotlib
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.style.use('fivethirtyeight')
    import warnings
    warnings.simplefilter('ignore', FutureWarning)

    # These lines load the tests.
    from client.api.notebook import Notebook
    ok = Notebook('lab07.ok')
    _ = ok.auth(inline=True)
