
1. Exploratory Data Analysis & Computing in R

Aims

This section introduces a selection of simple graphical and numerical methods for exploring and summarizing single data sets. These methods generally form part of an approach called Exploratory Data Analysis. Such analysis can be informative in its own right, but it is also an essential first step before any detailed statistical analysis of the data.

The section also introduces the statistical package R, using it to produce simple plots and to compute summary statistics.


Objectives

The following objectives will help you to assess how well you have mastered the relevant material. By the end of this section you should be able to:

  • Construct simple graphical plots of data sets (stem-and-leaf plot, histogram, boxplot and (if appropriate) time-plot).
  • Use simple graphical plots to comment on the overall pattern of data in a data set, and identify and comment on any striking deviations from this pattern.
  • Calculate simple measures of location for a data set (median, mean and trimmed mean).
  • Calculate simple measures of spread for a data set (variance, standard deviation, hinges, quartiles and inter-quartile range).
  • Use the statistical package R to produce simple graphical plots and compute simple measures of location and spread for a given set of real-valued data.
  • Compute the order statistics for a given set of real-valued data.
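
As a rough illustration of the objectives above, the following R sketch computes the order statistics and the listed measures of location and spread, and then draws the simple plots. The data vector `x` and the trimming fraction are made up for illustration and are not taken from the course handouts; only functions from base R are used.

```r
# Hypothetical data, made up for illustration only (not course data)
x <- c(12, 15, 11, 19, 14, 13, 30, 16)

# Order statistics: the observations sorted into increasing order
x_sorted <- sort(x)

# Measures of location
x_mean   <- mean(x)
x_median <- median(x)
x_trim   <- mean(x, trim = 0.125)  # drops floor(8 * 0.125) = 1 value from each end

# Measures of spread
x_var   <- var(x)                      # sample variance (divisor n - 1)
x_sd    <- sd(x)                       # sample standard deviation
x_five  <- fivenum(x)                  # min, lower hinge, median, upper hinge, max
x_quart <- quantile(x, c(0.25, 0.75))  # quartiles, using R's default definition
x_iqr   <- IQR(x)                      # inter-quartile range, same default definition

# Simple graphical summaries
stem(x)     # stem-and-leaf plot, printed to the console
hist(x)     # histogram
boxplot(x)  # boxplot
```

Note that the hinges returned by `fivenum()` and the quartiles returned by `quantile()` are defined by different conventions, so the two need not agree for a given data set.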

Suggested Reading

Rice, Chapter 10, Sections 10.1-10.6

Handouts and Problem Sheets

Handout for Section 1 | Problem sheet 1 | Problem sheet 2 | Solution sheet 1 | Solution sheet 2


Copyright notice

© University of Bristol 2011

All material in these pages is copyright of the University unless explicitly stated otherwise. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only, and not for distribution to anyone else.

Please also note that material from previous years' delivery of this unit is not necessarily a reliable indicator of what will be covered or examined this year.


Questions set

Week 1: PROBLEM SHEET 1 -- Questions 1 and 2
Week 2: PROBLEM SHEET 2 -- Questions 1 and 4


Interesting links

Weighing the Earth
The story of Henry Cavendish's 1797-98 experiment to 'weigh the earth', as he put it, using experimental apparatus in his laboratory. I may use his data in the lectures.

Rice Virtual Lab in Statistics
This is a very nice collection of applets, simulations, demonstrations and information on many aspects of statistics, and a site I will encourage you to visit at several points in the course. Particularly worth visiting for material relating to this week's lectures is the page on histograms.

MathWorld
Eric Weisstein's World of Mathematics is a comprehensive and interactive mathematics encyclopedia, made up of an interlinked framework of mathematical exposition and illustrative examples, which claims to be the web's most complete mathematics resource. There are sections on the various branches of mathematics, including Probability and Statistics, but it may take you a while to find your way round the site. For material related to this week's lectures you might start by looking at the Boxplot entry (which the encyclopedia calls a Box and Whiskers Plot), and take it from there.

Ask Dr Math
Ask Dr. Math is a question and answer service for math students and their teachers. It operates at various levels - the link above is to a list of college level topics, or you can go directly to the statistics questions. For this week you might find the discussion of quartiles particularly interesting.

A searchable archive is available by level and topic, as well as summaries of Frequently Asked Questions (the Dr. Math FAQ), and there is a history of how the site came about at The History of Dr. Math.

Rossman-Chance
This Rossman-Chance site has one or two nice applets. Particularly interesting for this week is one that illustrates the effect of bin width on the resulting histogram.
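
The same effect can be reproduced in R. The sketch below is only an illustration and is not the applet's own method: it uses R's built-in `faithful` data set (Old Faithful eruption durations) as a stand-in, and the three bin widths are arbitrary choices.

```r
# Stand-in data: eruption durations from R's built-in 'faithful' data set
# (an assumption for illustration; not the data used by the applet)
x <- faithful$eruptions

# Three arbitrary bin widths, from coarse to fine
bin_widths <- c(1, 0.5, 0.1)

# Draw the three histograms side by side, then restore the plot settings
op <- par(mfrow = c(1, 3))
for (w in bin_widths) {
  hist(x,
       breaks = seq(1.5, 5.5, by = w),  # explicit break points spanning the data
       main   = paste("Bin width", w),
       xlab   = "Eruption duration (minutes)")
}
par(op)
```

Supplying explicit break points through `breaks` fixes the bin width exactly, whereas a single number passed to `breaks` is treated by `hist()` only as a suggestion.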

Note that I have no control over the content or availability of these external web pages. The links may be slow to load, or may sometimes fail altogether; please email me to report if a link goes down. Similarly, applets may be slow to load or run, and you may experience problems if you try to exit them before they have finished loading.

Professor Peter Green, School of Mathematics, University of Bristol, Bristol, BS8 1TW, UK.
Telephone: +44 (0)117 928 7967; Fax: +44 (0)117 928 7999