Bayesian Modelling B: MATH34920, MATHM4920

If you are in any doubt about the practical value of learning applied statistics, you should read 'Statistician projected as top 10 fastest-growing job'. What is even more gratifying is that, from a mathematical point of view, the theory behind modern applied statistics is elegant and powerful.

Lecturer: Jonathan Rougier, j.c.rougier@bristol.ac.uk
Level: H/6 and M/7, 10cp
Weeks: TB2, TW13 to TW18 (23 Jan to 3 Mar)
Official unit pages: level H/6, level M/7
Timetable: 1300-1350 Thu, MATH SM3
1700-1750 Thu, MATH SM2
1400-1450 Fri, MATH SM4
All lectures start promptly on the hour and last for fifty minutes. Modifications to the timetable appear below, under announcements.
Office Hour: 1300-1350 Fri, MATH PC2 (Portacabin 2).
Please come at the start of the hour. One or two people cannot make this time; if that includes you, please talk to me about alternative arrangements.


Announcements

Course information

Outline

The course is divided into three parts:
  1. Statistical modelling,
  2. MCMC Theory, and
  3. Practice and computing.
Each of these parts takes about six lectures. There is a detailed lecture plan.

Reading

The course book is The BUGS Book (Lunn et al., 2012).

You can buy this book from wordery.co.uk, and it has its own webpage. Lectures will be supplemented with additional readings from it.

You should also consult Hoff (2009), A First Course in Bayesian Statistical Methods.

This book is available from the Library as an electronic download, and has its own webpage.

Additionally, you will find it helpful to dip into

You can buy this book from wordery.co.uk. There is a webpage for this book. Andrew Gelman has a well-known and interesting blog.

For statistical theory, at the appropriate level of difficulty, see

When you are ready to produce beautiful visualisations of your data and your results, you will need

I don't currently use ggplot2 myself, but I am a dinosaur; hoping to upgrade to a mammal in due course.

Finally, there will be handouts to cover the more technical material in the second part of the course. The homeworks will contain code snippets which you can adapt.

Software

The computing software for this course is JAGS, which you should download and install on your own computer. Here is the JAGS v4.0.0 user manual.

We will run JAGS from within R. Make sure your version of R is up-to-date, e.g. by visiting CRAN. To run JAGS from within R you will need to install (from inside R) the rjags package, either using the GUI or by using the install.packages function. Here is the rjags reference manual.
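
As a rough illustration of the workflow, here is a minimal sketch: a toy JAGS model for IID normal observations, run from R via rjags. The model, data, and variable names are placeholders for the example, not the course examples.

    library(rjags)   # install.packages("rjags") first, if necessary

    # A toy JAGS model: IID normal observations with unknown mean and precision
    model.string <- "
    model {
      for (i in 1:N) {
        y[i] ~ dnorm(mu, tau)
      }
      mu  ~ dnorm(0, 1.0E-4)     # vague prior on the mean
      tau ~ dgamma(0.001, 0.001) # vague prior on the precision
    }"

    y <- rnorm(20, mean = 3, sd = 1)   # made-up data, purely for illustration

    m <- jags.model(textConnection(model.string),
                    data = list(y = y, N = length(y)),
                    n.chains = 3, n.adapt = 1000)
    update(m, 1000)                                    # burn-in
    samp <- coda.samples(m, c("mu", "tau"), n.iter = 5000)
    summary(samp)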

If you prefer, you can use the computers in room G9 of the Maths Dept main building. Login and select 'All Programs/R/R x64 3.2.3'. JAGS and rjags are already installed.

You should brush up on your basic data wrangling skills in R. Read this excellent paper by Hadley Wickham. I find the dplyr package in R very useful.
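
For example, a typical dplyr pipeline looks like the sketch below; the data frame is hypothetical, invented just to show the idiom.

    library(dplyr)

    # Hypothetical data frame of measurements, grouped by site
    dat <- data.frame(site  = rep(c("A", "B"), each = 5),
                      value = rnorm(10))

    dat %>%
      filter(!is.na(value)) %>%       # drop missing values
      group_by(site) %>%              # one group per site
      summarise(n = n(),              # group size
                mean = mean(value),   # group mean
                sd = sd(value))       # group standard deviation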

Comment on the exam

Previous exam papers are available on Blackboard. You should be aware that the course continues to evolve, and these questions cannot be taken as a reliable guide to the questions that will be set this year.

Answers to previous exam papers will not be made available. The exam is designed to assess whether you have attended the lectures, read and thought about your lecture notes and the handouts, done the homework, and read a bit more widely in the textbooks. Diligent students who have done the above will gain no relative benefit from studying the answers to previous exam questions. On the other hand, less diligent students may suffer the illusion that they will do well in the exam, when probably they will not.

Instead, I will supply 'exam-style' questions in the homeworks for revision purposes.

Finally, please note that in the exam ALL questions will be used for assessment. The number of marks will not necessarily be the same for each question.

Course details

Here is a summary of the course, looking as far ahead as seems prudent. This plan is subject to revision. There will be some time at the end for revision of the major themes.

For background, have a read through Statistics: Another short introduction during your first couple of weeks.

    Statistical modelling

    Handout: Statistical modelling. Reading from the BUGS book: Preface, chs 1 and 3. If you are struggling to follow the handout or the reading, have a look at chs 1-4 of Hoff (2009).

  1. 26 Jan, 1pm. Introduction, 'Bayesian conditionalization'. We learn about predictands X using observations y_obs by computing p*(x) = p(x | y_obs) ∝ p(x, y_obs); for the time being we will not worry about the missing normalizing constant. (Preliminary reading material, secs 2.1 and 2.2)
  2. 26 Jan, 5pm. Conditional probability for propositions: definition, the Extension theorem, the Telescope theorem. (Sec 2.3)
  3. 27 Jan. Conditional independence of random quantities: definition, notation, equivalence theorem. Mutual conditional independence (conditionally IID is a special case); 'unconditional' independence. (Sec 2.4)
  4. 2 Feb, 1pm. The Maxim of statistical modelling: introduce parameters to simplify the specification of your model using conditional independence. Directed Acyclic Graphs (DAGs) as a visual representation of the conditional independence described in the Telescope representation of the joint distribution. (Sec 2.1 again, sec 2.5)
  5. 2 Feb, 5pm. Marginalizing over parameters creates a DAG with lots of extra edges. Thus introducing parameters Θ, modelling (Θ, X) jointly with conditional independence, and then marginalizing over Θ is a good way to specify an interesting probability distribution over X. (Sec 2.6)
  6. 3 Feb. Volcanoes! Example of a hierarchical model for the eruptions of a set of volcanoes which are similar but not identical. Plates for DAGs. Extension to groups of volcanoes. (A generic sketch of this kind of hierarchical model, in JAGS notation, appears after the further reading below.)

    Further reading: Sec 2.7. Make sure you understand the difference between a DAG and a CIG, and understand how the Moralization Theorem converts a DAG into a CIG.
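
    To make the hierarchical idea concrete, here is a minimal sketch in JAGS notation (see lecture 6 above). It is not the volcano model from the lectures: the variable names and distributions are placeholder assumptions, chosen only to show how 'similar but not identical' units share common parent parameters, with the plate corresponding to the loop over j.

        model {
          for (j in 1:J) {                 # plate over the J units
            for (i in 1:n[j]) {
              y[i, j] ~ dpois(lambda[j])   # observations for unit j
            }
            lambda[j] ~ dgamma(a, b)       # unit-level rates: similar, but not identical
          }
          a ~ dexp(1)                      # population-level ('hyper') parameters
          b ~ dexp(1)
        }

    Marginalizing over lambda, a, and b induces dependence between the units' observations, which is the point made in lecture 5.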

    MCMC theory

    Handout: Markov Chain Monte Carlo (some material still to come in section 3.5). Reading from the BUGS book: Chapter 4, sections 1 and 2. You might also find sections 10.4 and 10.5 of Hoff (2009) helpful. My handout is more rigorous than either of these books.

  7. 9 Feb, 1pm. Markov chains, in finite state spaces and discrete time. Transition matrices, stationary distributions, Cesàro averages. Read more about Brouwer's Fixed Point Theorem.
  8. 9 Feb, 5pm. Convergence in Mean Square—a strong form of convergence. Proof of Theorem 3.2. You need to know Theorem 3.2, but the proof is not examinable. This is a 'magical' proof.
  9. 10 Feb. Proof of Theorem 3.3. You need to know Theorem 3.3, but the proof is not examinable. This is a tedious proof. Actually, I take that back: it has an interesting plot. Read more about the Big-O notation.

    Weekend reading: complete section 3.2 by going through the proof of Theorem 3.4. You need to know this proof but it is not examinable.

  10. 16 Feb, 1pm. The world-famous Metropolis-Hastings (MH) algorithm, which allows us to turn a proposal h(i → j) into a transition matrix with our target distribution as its stationary distribution. It does not care about normalizing constants, which makes it very attractive for Bayesians. (A small R sketch of MH on a finite state space appears after this list.)
  11. 16 Feb, 5pm. Gibbs sampling is a special case of MH, in which the choice of h(i → j) is taken out of our hands, and the acceptance ratio is certain to be 1. It requires full conditional distributions, but these can often be deduced symbolically, if the model is specified symbolically.
  12. 17 Feb. Review of where we have got to. The three steps of a Bayesian approach: (i) Maxim of statistical modelling (introduce parameters, use conditional independence); (ii) update beliefs by conditionalization; (iii) express beliefs as values for expectations (the 'no telepathic ferret' condition). The two main challenges arising from finite n: convergence to the target distribution, and errors in the estimates.
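
    By way of illustration, here is a minimal R sketch of MH on a small finite state space, using a uniform (hence symmetric) proposal. The target weights are made up for the example and the code is not taken from the course handouts; note that only unnormalized weights are needed, because the normalizing constant cancels in the acceptance ratio.

        # Metropolis-Hastings on a finite state space (illustrative sketch only)
        w <- c(1, 4, 9, 16, 25)          # unnormalized target weights: pi(i) proportional to w[i]
        K <- length(w)
        n <- 1e5
        x <- integer(n)
        x[1] <- 1
        for (t in 2:n) {
          i <- x[t - 1]
          j <- sample.int(K, 1)          # uniform proposal, h(i -> j) = 1/K
          alpha <- min(1, w[j] / w[i])   # acceptance probability; constants cancel
          x[t] <- if (runif(1) < alpha) j else i
        }
        freq <- tabulate(x, nbins = K) / n
        rbind(empirical = freq, target = w / sum(w))   # should agree closely for large n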

    Practice and computing

    Reading: BUGS Book, sections 4.3, 4.4, and 4.5. There are more handouts on the way … Here's one: practical issues, the end of the MCMC handout.

    JAGS for rats handout. Here is the data: Rats.csv, Rats.xls, Rats.xlsx.

  13. 23 Feb, 10am. Compute-along lecture, JAGS for rats!
  14. 23 Feb, 1pm. Computing workshop, come along if you'd like support with the JAGS for rats worksheet.
  15. 24 Feb. Another compute-along lecture, covering convergence diagnostics.
  16. 2 Mar, 1pm. Monte Carlo Standard Errors (MCSEs). More compute-along, plus an MCMC revision worksheet on simulating Benford's Law. (A small R sketch of an MCSE calculation appears after this list.)
  17. 2 Mar, 5pm. Finishing Benford's Law, general wrap-up.
  18. 3 Mar. Review lecture of the unit, reminding ourselves in particular about the first two stages: Statistical Modelling and MCMC Theory.
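
    As a rough illustration of what an MCSE is (see lecture 16 above), here is a batch-means sketch in R. The 'draws' are autocorrelated noise generated for the example, standing in for the output of an MCMC run; this is not the course worksheet.

        set.seed(1)
        draws <- as.numeric(arima.sim(list(ar = 0.8), n = 1e4))   # stand-in for MCMC output

        # Batch-means estimate of the Monte Carlo standard error of the sample mean
        mcse.batch <- function(x, nbatch = 50) {
          b <- floor(length(x) / nbatch)                        # batch length
          bm <- colMeans(matrix(x[1:(b * nbatch)], nrow = b))   # one mean per batch
          sd(bm) / sqrt(nbatch)                                 # standard error of the overall mean
        }

        mean(draws)         # the Monte Carlo estimate
        mcse.batch(draws)   # its estimated standard error

    The coda package, which is loaded alongside rjags, provides related tools such as effectiveSize.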

Homework

There will be a homework every week. Hand-in is usually by 2pm on Thursday, either in the box in the Maths Dept foyer or at the lecture; marked work is usually handed back in the Friday lecture.

You are strongly encouraged to do the homeworks and to hand in your efforts, to be commented on and marked.

  1. 26 Jan. Homework 1.

  2. 3 Feb. Homework 2. Due on Thu 9 Feb.

    Here are the answers. (Minor update Fri 10 Feb).

    Feedback. Q4 and Q5 were poorly done, with many answers showing a lack of basic mathematical technique. By this stage, all students should know how to prove an equivalence, and how to inspect a proof to see that it is sound. The homework you hand in is not the scratch-pad, where the proof is worked out, but the shop window, where the proof is displayed. The same is true in an exam.

    Basic mathematical techniques (not just for exams; these are also used by professional mathematicians):

    1. Give definitions of all concepts.
    2. State what you need to show in order to answer the question.
    3. Write sentences throughout, explaining the non-symbolic steps.
    4. Finish properly with an "as was to be shown" or a "which completes the proof".

  3. 9 Feb. Homework 3. Due on Thu 16 Feb.

    Here are the answers. (Minor update Fri 17 Feb).

    Feedback. There are tips on drawing DAGs in the answer to Q2a. In Q2c, remember that a DAG can only tell you about the conditional independence of X_i and earlier X's in the ordering, so the only one that is inferable from the DAG is (iii).

    We covered Q4 in the Office Hour. We'll do another example in a homework or revision sheet.

  4. 16 Feb. Homework 4. Due on 23 Feb.

    Here are the answers. (Minor update Fri 24 Feb).

    Feedback. Not really a large enough sample to provide feedback.

  5. 23 Feb. Homework 5. Due on 2 Mar.

    Here are the answers.

    Feedback. Not really a large enough sample to provide feedback, but no one attempted Q3 and Q4, and these are really important for your understanding of what goes on 'under the hood' of a Gibbs sampler.

  6. 23 Feb. Homework 6. Just for fun.

    Here are the answers.

Bayesian Modelling B 3/4 assignment

The assignment is now available.