Stochastic Optimisation

Lecturers:

Ayalvadi Ganesh (a.ganesh@bristol.ac.uk) and Vladislav Tadic (v.b.tadic@bristol.ac.uk)

Office hours: Tuesdays 12-1pm, Weeks 2-5, in Fry Building, Room 1.40.

Texts (first half of unit):

Lecture notes will be provided. In addition, the following texts are recommended:

· S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, 2012.

· T. Lattimore and C. Szepesvári, Bandit Algorithms, Cambridge University Press, 2020.

Homework policy: Homework is an important part of learning the material on this course and you are strongly encouraged to attempt all the homework problems. You may discuss the problems, but you should write out the solutions on your own.

There will be two items of assessed coursework, each counting for 5% of the final mark.

Lecture videos

Lectures will be held in person. In addition, you can find recorded video lectures here covering the first half of the unit. These may not correspond one-to-one with the in-person lectures.

Lecture notes

Introduction

The UCB algorithm

Thompson sampling

Homework problems

Problem Sheet 1 (Solutions)

Problem Sheet 2 (Solutions)

Problem Sheet 3 (Solutions)