Statistics 1


5. Sampling variation: (a) Simulation based methods, (b) Central Limit Theorem

Aims

In the previous sections we have seen several different ways of estimating a population parameter, or population quantity of interest, from a given set of sample data. However, the sample data is just one of many possible samples that could be drawn from the population. Each sample would contain different values, and so would give a different value for the estimate. In this section we use simulation-based methods to investigate how the value of an estimate would vary if we took repeated independent random samples, and hence to evaluate and compare the performance of different estimators.
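The idea of repeated sampling can be sketched in a few lines of code. The example below is only an illustration, written in Python rather than R, and every name in it (`simulate_estimates`, `summarise`, the sample size of 20, the 2000 repetitions, the seed) is an assumption made for the sketch rather than anything taken from the course materials. It draws many independent samples from the Uniform(0, 1) distribution, applies two estimators of the centre of the distribution (the sample mean and the sample median) to each sample, and summarises the resulting estimates by their bias and variance.

```python
import random
import statistics


def simulate_estimates(estimator, n=20, reps=2000, seed=1):
    """Apply `estimator` to `reps` independent Uniform(0, 1) samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [estimator([rng.random() for _ in range(n)]) for _ in range(reps)]


def summarise(estimates, true_value):
    """Return (bias, variance) of the simulated estimates."""
    bias = statistics.fmean(estimates) - true_value  # systematic error
    variance = statistics.variance(estimates)        # random error
    return bias, variance


TRUE_CENTRE = 0.5  # the mean and median of Uniform(0, 1) coincide

# The same seed is used for both calls, so both estimators see the same samples.
mean_bias, mean_var = summarise(simulate_estimates(statistics.fmean), TRUE_CENTRE)
median_bias, median_var = summarise(simulate_estimates(statistics.median), TRUE_CENTRE)

print(f"sample mean:   bias={mean_bias:+.4f}, variance={mean_var:.5f}")
print(f"sample median: bias={median_bias:+.4f}, variance={median_var:.5f}")
```

Under these assumptions both estimators should show a bias close to zero, while the sample mean should show the smaller variance; a boxplot or histogram of the two sets of estimates would give the corresponding qualitative comparison.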

Many estimators are based on the sum - or the mean value - of the observations in a random sample from an underlying population distribution. The exact distribution of these quantities may be difficult to compute, and will usually vary with the underlying population distribution. The Central Limit Theorem gives a simple way of approximating the distribution of the sum or mean that depends only on the population mean and the population variance. It also provides a plausible explanation for the fact that the distributions of many random variables studied in physical experiments are approximately Normal, in that their values may represent the overall addition of a number of individual random factors.
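For reference, one standard form of the theorem is written out below. The notation is an assumption introduced here and is not taken from the course handout: $X_1, \dots, X_n$ denote a random sample of independent, identically distributed observations with population mean $\mu$ and finite population variance $\sigma^2 > 0$, with sum $S_n$ and mean $\bar{X}_n$.

```latex
% Notation (assumed): X_1, ..., X_n i.i.d. with mean \mu and variance \sigma^2 > 0
\[
  S_n = \sum_{i=1}^{n} X_i , \qquad \bar{X}_n = \frac{S_n}{n} ,
\]
\[
  \frac{S_n - n\mu}{\sigma\sqrt{n}}
  \;=\;
  \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}
  \;\xrightarrow{\;d\;}\; N(0, 1)
  \qquad \text{as } n \to \infty .
\]
% Hence, for large n, approximately:
\[
  S_n \approx N\!\left(n\mu,\; n\sigma^2\right) ,
  \qquad
  \bar{X}_n \approx N\!\left(\mu,\; \frac{\sigma^2}{n}\right) .
\]
```

The approximating Normal distributions involve only $\mu$, $\sigma^2$ and $n$, which is the sense in which the approximation depends only on the population mean and variance.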

Objectives

The following objectives will help you to assess how well you have mastered the relevant material. By the end of this section you should be able to:

Generate random samples from a given standard distribution using the random number generator for that distribution in a statistical package such as R.

Understand how the performance of an estimator can be related to systematic and random error through the bias and variance of the estimator.

Evaluate the performance of an estimator for a single quantity of interest, both qualitatively from a boxplot or histogram of estimates from repeated samples and quantitatively or numerically from summary statistics derived from the repeated samples.

Recall the statement and the implications of the Central Limit Theorem.

Apply the Central Limit Theorem to find the approximate distribution of the sum or mean of a random sample from a population distribution with known mean and variance.

Apply a continuity correction to improve the approximation given by the Central Limit Theorem when the underlying variable is an integer valued random variable representing counts.
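The last two objectives can be illustrated with a short numerical sketch. It is written in Python rather than R, and the function names and the values n = 30, p = 0.4, k = 10 are hypothetical choices made for illustration only. A Binomial(n, p) count is the sum of n independent Bernoulli(p) variables, so the Central Limit Theorem suggests approximating it by a Normal distribution with mean np and variance np(1 - p); the continuity correction replaces P(X <= k) by the Normal probability of being below k + 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, continuity_correction=True):
    """CLT approximation to P(X <= k), matching the Binomial mean and variance."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    # With the correction, the integer k is treated as the interval (k - 0.5, k + 0.5).
    return approx.cdf(k + 0.5 if continuity_correction else k)


n, p, k = 30, 0.4, 10  # hypothetical values chosen for illustration

exact = binomial_cdf(k, n, p)
plain = normal_approx_cdf(k, n, p, continuity_correction=False)
corrected = normal_approx_cdf(k, n, p)

print(f"exact Binomial probability:       {exact:.4f}")
print(f"Normal approx without correction: {plain:.4f}")
print(f"Normal approx with correction:    {corrected:.4f}")
```

For these values the corrected approximation should lie noticeably closer to the exact probability than the uncorrected one.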

Handouts and Problem Sheets

Copies of Handouts, Problem Sheets and Solution Sheets for the unit will be made available each week here.

Handout for Section 5 | Problem sheet 6 | Solution sheet 6

Copyright notice

All material in these pages is copyright of the University unless explicitly stated otherwise. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only, and not for distribution to anyone else.

Please also note that material from previous years' delivery of this unit is not necessarily a reliable indicator of what will be covered or examined this year.

Questions - set this week

PROBLEM SHEET 6 -- Questions 1, 2, 4

Interesting links

R demo 1 is the simple function I used in lecture 10 to visualise estimation of the population median given a sample from the Uniform distribution. R demo 2 is the function I used in lecture 11 to visualise the distribution of the sum of i.i.d. random variables, suggesting the Central Limit Theorem.

Rice Virtual Lab in Statistics
This site was introduced in the web page for section 1, and contains a very nice collection of applets, simulations, demonstrations and information on many aspects of statistics. Particularly worth visiting this week is an applet which explores various aspects of sampling distributions. When the applet begins, a histogram of a normal distribution is displayed at the top of the screen, and you can see how sample quantities relate to population quantities for different distributions, different statistics, different sample sizes, different numbers of repeated samples and so on.

Also worth visiting this week is an applet which illustrates the Central Limit Theorem by exploring how the Normal distribution can be used to approximate the Binomial distribution.

Vestac
The Vestac site, also introduced in section 1, has some simple applets for visualising the distribution of a sample mean and the distribution of a sample variance. First, select the Basics link; then select the appropriate picture icon (continuous pdf or discrete pmf) above the required distribution; then choose the type of distribution (Normal, Uniform, Binomial, Poisson etc.).

Note that I have no control over the content or availability of these external web pages. The links may be slow to load, or may sometimes fail altogether - please email me to report if a link goes down. Similarly, applets may be slow to load or run; beware that you may experience problems if you try to exit them before they have finished loading.

Professor Peter Green, School of Mathematics, University of Bristol, Bristol, BS8 1TW, UK.
Telephone: +44 (0)117 928 7967; Fax: +44 (0)117 928 7999