R demos

Sampling and parametric inference
You can try out the functions I used in lecture 6 to visualise sampling from a large finite population, and the advantage of making parametric assumptions in inference even if the assumptions do not hold perfectly.
  1. Download the file popsamp.R and save it to your computer (as popsamp.R).
  2. Start up R
  3. In R, choose the Source R code command (on the File menu), navigate to the appropriate folder, and select the file popsamp.R.
  4. Run the demonstration by typing popsamp().
  5. The function first displays an artificial population of 10000 data values as a QQ plot (on the left) and histogram (on the right). As you can see, the shape of the distribution is quite far from normal.
  6. The objective of the inference we will do is to estimate the probability that a randomly drawn member of the population is less than 9. The true answer to this is tau, the proportion of the population that is less than 9, whose value is printed by the histogram.
  7. Successive clicks of the mouse (above the x-axis of the right-hand plot) cause repeated samples of size 25 to be drawn (without replacement) from this population. After 1, 2, 3, 4, 5, 10, 20, 50, 75, 100, 150, 200, 500 and 1000 samples, the display is updated. On the left, the current sample is shown as a histogram, with the fitted normal distribution (estimated by the method of moments) superimposed; the numerical values of the two estimates of tau for that sample are displayed above this histogram. On the right, boxplots show the distribution, accumulated over all the samples drawn so far, of the two estimates of tau: on the left ('normal') is the estimate obtained from the fitted normal distribution, and on the right ('nonparametric') is the sample proportion of values less than 9 in each sample. The numbers below the boxplots are the mean squared errors of the estimates about the true value.
  8. To stop the sampling early, click below the x-axis of the right hand plot.
  9. Which estimate is better? Was it a good idea to make these parametric assumptions?
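
The experiment in the steps above can also be run without the mouse-driven display. What follows is a minimal sketch, not the code in popsamp.R: the population (a skewed gamma sample), the random seed and all variable names are assumptions made for illustration. Only the population size of 10000, the sample size of 25, the cutoff of 9, the 1000 repetitions and the two estimators are taken from the description above.

```r
# Sketch only: this population is an assumption (a skewed gamma sample),
# not the artificial population built into popsamp.R.
set.seed(1)
population <- rgamma(10000, shape = 4, scale = 3)
cutoff <- 9
tau <- mean(population < cutoff)   # true proportion of the population below the cutoff

n_samples <- 1000                  # number of repeated samples
n <- 25                            # size of each sample

# One row per sample, one column per estimator of tau.
estimates <- t(replicate(n_samples, {
  x <- sample(population, n)       # sample() draws without replacement by default
  m <- mean(x)
  s <- sqrt(mean((x - m)^2))       # method-of-moments standard deviation (divisor n)
  c(normal = pnorm(cutoff, mean = m, sd = s),   # estimate from the fitted normal
    nonparametric = mean(x < cutoff))           # sample proportion below the cutoff
}))

# Mean squared error of each estimator about the true value tau.
mse <- colMeans((estimates - tau)^2)
print(tau)
print(mse)
```

Calling boxplot(estimates) afterwards gives a rough counterpart of the right-hand plot in the demonstration. Because the population here is assumed, the mean squared errors will not match those shown by popsamp().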