Stochastic Optimisation

Lecturers:

Ayalvadi Ganesh (a.ganesh@bristol.ac.uk) and Vladislav Tadic (v.b.tadic@bristol.ac.uk)

Office hours (second half of unit): Thursday 11am-1pm, Fry Building, Room 1.40.

Texts (second half of unit):

Lecture notes will be provided. In addition, the following texts are recommended:

· S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, 2012.

· T. Lattimore and C. Szepesvari, Bandit Algorithms, Cambridge University Press.

Homework policy: Homework is an important part of learning the material on this course and you are strongly encouraged to attempt all the homework problems. You may discuss the problems, but you should write out the solutions on your own.

You will have two items of assessed coursework, which will each count for 10% of the final mark.

Lecture videos

Lectures will be in person. Recorded video lectures covering the second half of the unit are also available; these may not correspond one-to-one with the in-person lectures.

Lecture notes

Introduction

The UCB algorithm

Thompson sampling

Homework problems

Problem Sheet 1    Due Fri, 17 Nov    Solutions

Problem Sheet 2    Due Wed, 29 Nov    Solutions

Problem Sheet 3    Due Mon, 11 Dec    Solutions