During terms 2 and 3 of their first year, students are required to take three elective courses from a selection provided by Oxford and Imperial.

List of elective courses (Term 2: Jan-March 2021)

Oxford Mathematical Institute

Course Overview: 

This course will serve as an introduction to optimal transportation theory, its application in the analysis of PDE, and its connections to the macroscopic description of interacting particle systems.

Learning Outcomes: 

Getting familiar with the Monge-Kantorovich problem and transport distances. Derivation of macroscopic models via the mean-field limit and their analysis based on contractivity of transport distances. Dynamic interpretation and geodesic convexity. A brief introduction to gradient flows and examples.

Course Synopsis: 
  1. Interacting Particle Systems & PDE (2 hours)
    • Granular Flow Models and McKean-Vlasov Equations.
    • Nonlinear Diffusion and Aggregation-Diffusion Equations.
  2. Optimal Transportation: The metric side (4 hours)
    • Functional Analysis tools: weak convergence of measures. Prokhorov’s Theorem. Direct Method of Calculus of Variations. (1 hour)
    • Monge Problem. Kantorovich Duality. (1.5 hours)
    • Transport distances between measures: properties. The real line. Probabilistic Interpretation: couplings. (1.5 hours)
  3. Mean Field Limit & Couplings (4 hours)
    • Dobrushin approach: derivation of the Aggregation Equation. (1.5 hours)
    • Sznitman Coupling Method for the McKean-Vlasov equation. (1.5 hours)
    • Boltzmann Equation for Maxwellian molecules: Tanaka Theorem. (1 hour)
  4. Gradient Flows: Aggregation-Diffusion Equations (6 hours)
    • Brenier’s Theorem and Dynamic Interpretation of optimal transport. Otto’s calculus. (2 hours)
    • McCann’s Displacement Convexity: Internal, Interaction and Confinement Energies. (2 hours)
  5. Gradient Flow approach: Minimizing movements for the (McKean)-Vlasov equation. Properties of the variational scheme. Connection to mean-field limits. (2 hours)
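The synopsis singles out transport distances on the real line. As a minimal sketch, assuming two empirical measures with the same number of equally weighted atoms, the optimal coupling on the line matches sorted samples, so the p-transport distance reduces to a sum over order statistics. The function name `wasserstein_1d` is illustrative and not part of the course material.

```python
def wasserstein_1d(xs, ys, p=2):
    """p-transport distance between two empirical measures on the real line.

    Assumes both samples have the same size and uniform weights, in which
    case the monotone (sorted) matching is an optimal coupling.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)


# A pure translation by 3 moves every atom a distance 3.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [5.0, 3.0, 4.0])
print(shifted)
```

The order of the samples does not matter, since only the sorted values enter the formula.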
Reading List: 
  1. F. Golse, On the Dynamics of Large Particle Systems in the Mean Field Limit, Lecture Notes in Applied Mathematics and Mechanics 3. Springer, 2016.
  2. L. C. Evans, Weak convergence methods for nonlinear partial differential equations. CBMS Regional Conference Series in Mathematics 74, AMS, 1990.
  3. F. Santambrogio, Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, Progress in Nonlinear Differential Equations and Their Applications, Birkhäuser, 2015.
  4. C. Villani, Topics in Optimal Transportation, AMS Graduate Studies in Mathematics, 2003

    Please note that e-book versions of many books in the reading lists can be found on SOLO

    Further Reading: 
    1. L. Ambrosio, G. Savare, Handbook of Differential Equations: Evolutionary Equations, Volume 3-1, 2007.
    2. C. Villani, Optimal Transport: Old and New, Springer 2009

    General Prerequisites: Basic linear algebra (such as eigenvalues and eigenvectors of real matrices), multivariate real analysis (such as norms, inner products, multivariate linear and quadratic functions, basis) and multivariable calculus (such as Taylor expansions, multivariate differentiation, gradients).

    Course Overview: The solution of optimal decision-making and engineering design problems in which the objective and constraints are nonlinear functions of potentially (very) many variables is required on an everyday basis in the commercial and academic worlds. A closely-related subject is the solution of nonlinear systems of equations, also referred to as least-squares or data fitting problems that occur in almost every instance where observations or measurements are available for modelling a continuous process or phenomenon, such as in weather forecasting. The mathematical analysis of such optimization problems and of classical and modern methods for their solution are fundamental for understanding existing software and for developing new techniques for practical optimization problems at hand.

    more details: https://courses.maths.ox.ac.uk/node/42762

    Learning Outcomes:

    Students will learn how some of the various different ensembles of random matrices are defined. They will encounter some examples of the applications these have in Data Science, modelling Complex Quantum Systems, Mathematical Finance, Network Models, Numerical Linear Algebra, and Population Dynamics. They will learn how to analyse eigenvalue statistics, and see connections with other areas of mathematics and physics, including combinatorics, number theory, and statistical mechanics.

    Course Synopsis: 

    Introduction to matrix ensembles, including Wigner and Wishart random matrices, and the Gaussian and Circular Ensembles. Overview of connections with Data Science, Complex Quantum Systems, Mathematical Finance, Network Models, Numerical Linear Algebra, and Population Dynamics (1 Lecture)

    Statement and proof of Wigner’s Semicircle Law; statement of Girko’s Circular Law; applications to Population Dynamics (May’s model). (3 lectures)

    Statement and proof of the Marchenko-Pastur Law using the Stieltjes and R-transforms; applications to Data Science and Mathematical Finance. (3 lectures)

    Derivation of the Joint Eigenvalue Probability Density for the Gaussian and Circular Ensembles;
    method of orthogonal polynomials; applications to eigenvalue statistics in the large-matrix limit;
    behaviour in the bulk and at the edge of the spectrum; universality; applications to Numerical Linear
    Algebra and Complex Quantum Systems (5 lectures)

    Dyson Brownian Motion (2 lectures)

    Connections to other problems in mathematics, including the longest increasing subsequence
    problem; distribution of zeros of the Riemann zeta-function; topological genus expansions. (2 lectures)
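Wigner's Semicircle Law from the synopsis can be checked numerically through moments. The sketch below assumes a real symmetric matrix with independent Gaussian entries of variance 1/n (a simplified Wigner ensemble chosen for illustration) and compares the normalised traces of H^2 and H^4 with the second and fourth moments of the semicircle density on [-2, 2], which are 1 and 2. The function names and the size n = 120 are assumptions made for this example.

```python
import random


def wigner_matrix(n, seed=0):
    """Symmetric n x n matrix with Gaussian entries of variance 1/n."""
    rng = random.Random(seed)
    scale = n ** -0.5
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            h[i][j] = h[j][i] = rng.gauss(0.0, 1.0) * scale
    return h


def spectral_moments(h):
    """Return (tr H^2 / n, tr H^4 / n), the 2nd and 4th empirical spectral moments."""
    n = len(h)
    h2 = [[sum(h[i][k] * h[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    m2 = sum(h2[i][i] for i in range(n)) / n
    # H^2 is symmetric, so tr(H^4) equals the sum of its squared entries.
    m4 = sum(h2[i][j] ** 2 for i in range(n) for j in range(n)) / n
    return m2, m4


m2, m4 = spectral_moments(wigner_matrix(120))
print(m2, m4)  # close to the semicircle moments 1 and 2, up to O(1/n) effects
```

The moment comparison avoids an eigenvalue solver, so the sketch runs with the standard library alone.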

    Reading List: 
    1. ML Mehta, Random Matrices (Elsevier, Pure and Applied Mathematics Series)
    2. GW Anderson, A Guionnet, O Zeitouni, An Introduction to Random Matrices (Cambridge Studies in Advanced Mathematics)
    3. ES Meckes, The Random Matrix Theory of the Classical Compact Groups (Cambridge University Press)
    4. G. Akemann, J. Baik & P. Di Francesco, The Oxford Handbook of Random Matrix Theory (Oxford University Press)
    5. G. Livan, M. Novaes & P. Vivo, Introduction to Random Matrices (Springer Briefs in Mathematical Physics)

    Please note that e-book versions of many books in the reading lists can be found on SOLO

    Further Reading: 
    1. T. Tao, Topics in Random Matrix Theory (AMS Graduate Studies in Mathematics)

    General Prerequisites: Integration and measure theory, martingales in discrete and continuous time, stochastic calculus. Functional analysis is useful but not essential.

    Course Overview: Stochastic analysis and partial differential equations are intricately connected. This is exemplified by the celebrated deep connections between Brownian motion and the classical heat equation, but this is only a very special case of a general phenomenon. We explore some of these connections, illustrating the benefits to both analysis and probability.

    Course Synopsis: Feller processes and semigroups. Resolvents and generators. Hille-Yosida Theorem (without proof). Diffusions and elliptic operators, convergence and approximation. Stochastic differential equations and martingale problems. Duality. Speed and scale for one dimensional diffusions. Green's functions as occupation densities. The Dirichlet and Poisson problems. Feynman-Kac formula.

    More details: https://courses.maths.ox.ac.uk/node/42876

    General Prerequisites: Part B Graph Theory and Part A Probability. C8.3 Combinatorics is not an essential prerequisite for this course, though it is a natural companion for it.

    Course Overview: Probabilistic combinatorics is a very active field of mathematics, with connections to other areas such as computer science and statistical physics. Probabilistic methods are essential for the study of random discrete structures and for the analysis of algorithms, but they can also provide a powerful and beautiful approach for answering deterministic questions. The aim of this course is to introduce some fundamental probabilistic tools and present a few applications.

    Course Synopsis: First-moment method, with applications to Ramsey numbers, and to graphs of high girth and high chromatic number. Second-moment method, threshold functions for random graphs. Lovász Local Lemma, with applications to two-colourings of hypergraphs, and to Ramsey numbers. Chernoff bounds, concentration of measure, Janson's inequality. Branching processes and the phase transition in random graphs. Clique and chromatic numbers of random graphs.
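As an illustration of the first-moment method applied to Ramsey numbers: colour the edges of the complete graph on n vertices uniformly at random with two colours. The expected number of monochromatic k-cliques is C(n, k) * 2^(1 - C(k, 2)); if this is below 1, some colouring has no monochromatic k-clique, so R(k, k) > n. The sketch below searches for the largest n certified this way, using exact integer arithmetic. The function name is hypothetical.

```python
from math import comb


def ramsey_first_moment_bound(k):
    """Largest n for which C(n, k) * 2**(1 - C(k, 2)) < 1, so that R(k, k) > n."""
    threshold = 2 ** comb(k, 2)
    n = k - 1  # with fewer than k vertices there are no k-cliques at all
    # Compare 2 * C(n + 1, k) with 2**C(k, 2) to stay in integers.
    while 2 * comb(n + 1, k) < threshold:
        n += 1
    return n


for k in (3, 4, 10):
    print(k, ramsey_first_moment_bound(k))
```

For k = 4 the inequality holds up to n = 6, so this argument certifies only R(4, 4) > 6, well below the true value 18; the strength of the method lies in the exponential growth of the bound in k.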

    More details: https://courses.maths.ox.ac.uk/node/42891

    General Prerequisites: Analysis: Basic knowledge of differential equations, measure and integration, basic complex analysis, conformal map theory (you might consider taking the C4.8 course in the first term or reading the lecture notes). Probability: Martingales, Itô formula. Some knowledge about lattice models such as percolation, loop-erased random walk, Ising model etc. will be beneficial but not required. All the necessary parts will be covered in the lectures.

    Course Overview: The Schramm-Loewner Evolution (SLE) was introduced in 1998 in order to describe all possible conformally invariant scaling limits that appear in many lattice models of statistical physics. Since then the subject has received a lot of attention and developed into a thriving area of research in its own right which has a lot of interesting connections with other areas of mathematics and physics. Beyond the aforementioned lattice models it is now related to many other areas including the theory of `loop soups', the Gaussian Free Field, and Liouville Quantum Gravity. The emphasis of the course will be on the basic properties of SLE and how SLE can be used to prove the existence of a conformally invariant scaling limit for lattice models.

    Course Synopsis:

    1) (2 lectures) A quick recap of the necessary background from complex analysis. We will go through the Riemann mapping theorem and basic properties of univalent functions both in the unit disc and in the upper half plane. In these lectures I will give the main results and connections between them but not the proofs.

    2) (4 lectures) Half-plane capacity, Beurling estimates and (deterministic) Loewner Evolution. We will show that any 'nice' curve can be described by a Loewner Evolution and will study the main properties of a Loewner Evolution driven by a measure.

    3) (6 hours) Definition of Schramm-Loewner Evolution and Schramm's principle stating that SLEs are the only conformally invariant random curves satisfying the so-called 'domain Markov property'. We will study the main properties of SLE. In particular, we will study its phase transitions and two special cases when SLE has the 'locality' property and the 'restriction' property.

    4) (4 lectures) Show the crossing probability for the critical percolation on the triangular lattice has a conformally invariant scaling limit (Cardy's formula). Prove that this implies that the percolation interfaces converge to SLE curves.

    More details: https://courses.maths.ox.ac.uk/node/44462

    General Prerequisites: Part A Probability and Part A Integration are required. B8.1 (Measure, Probability and Martingales), B8.2 (Continuous Martingales and Stochastic Calculus) and C8.1 (Stochastic Differential Equations) are desirable, but not essential.

    Course Overview: The convergence theory of probability distributions on path space is an essential part of modern probability and stochastic analysis allowing the development of diffusion approximations and the study of scaling limits in many settings. The theory of large deviations is an important aspect of limit theory in probability as it enables a description of the probabilities of rare events. The emphasis of the course will be on the development of the necessary tools for proving various limit results and the analysis of large deviations which have universal value. These topics are fundamental within probability and stochastic analysis and have extensive applications in current research in the study of random systems, statistical mechanics, functional analysis, PDEs, quantum mechanics, quantitative finance and other applications.

    Course Synopsis:

    1) (2 lectures) We will recall metric spaces, and introduce Polish spaces, and probability measures on metric spaces. Weak convergence of probability measures and tightness, Prohorov's theorem on tightness of probability measures, Skorohod's representation theorem for weak convergence.

    2) (2 lectures) The criterion of pre-compactness for distributions on continuous path spaces, martingales and compactness.

    3) (4 hours) Skorohod's topology and metric on the space D[0,∞) of right-continuous paths with left limits, basic properties such as completeness and separability, weak convergence and pre-compactness of distributions on D[0,∞). D. Aldous' pre-compactness criterion via stopping times.

    4) (4 lectures) First examples - Cramér's theorem for finite dimensional distributions, Sanov's theorem. Schilder's theorem for the large deviation principle for Brownian motion in small time, law of the iterated logarithm for Brownian motion.

    5) (4 lectures) General tools in large deviations. Rate functions, good rate functions, large deviation principles, weak large deviation principles and exponential tightness. Varadhan's contraction principle, functional limit theorems.

    More details: https://courses.maths.ox.ac.uk/node/44461


    Lecture 1: Modelling: least squares, matrix completion, sparse inverse covariance estimation, sparse principal components, sparse plus low rank matrix decomposition, support vector machines.
    Lecture 2: Further modelling: logistic regression, deep learning. Mathematical preliminaries: Global and local optimisers, convexity, subgradients, optimality conditions.
    Lecture 3: Preliminaries: Proximal operators, convergence rates.
    Lecture 4: Steepest descent method and its convergence analysis in the general case, the convex case and the strongly convex case.
    Lecture 5: Prox-gradient methods.
    Lecture 6: Accelerating gradient methods: heavy ball method, Nesterov acceleration.
    Lecture 7: Oracle complexity and the stochastic gradient descent algorithm.
    Lecture 8: Variance reduced stochastic gradient descent.
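To make the prox-gradient idea from Lecture 5 concrete, here is a minimal sketch for the lasso problem, minimise 0.5 * ||Ax - b||^2 + lam * ||x||_1. Each iteration takes a gradient step on the smooth least-squares term and then applies the proximal operator of the l1 term, which is soft-thresholding. The step size uses the squared Frobenius norm of A as a conservative upper bound on the Lipschitz constant of the gradient; this choice, the function names and the toy data are assumptions made for illustration only.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_gradient_lasso(A, b, lam, iters=500):
    """Proximal gradient iterations for 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    m, n = len(A), len(A[0])
    # ||A||_F^2 is an upper bound on the largest eigenvalue of A^T A.
    lipschitz_bound = sum(a * a for row in A for a in row)
    step = 1.0 / lipschitz_bound
    x = [0.0] * n
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x


# With A equal to the identity, the minimiser is soft_threshold(b_j, lam) per coordinate.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.5, -3.0]
x_hat = prox_gradient_lasso(A, b, lam=1.0)
print(x_hat)
```

The identity-matrix example is chosen because its solution is known in closed form, which makes the sparsity effect of the l1 penalty easy to see: the small middle coefficient is set exactly to zero.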


    • S.J.Wright. “Optimization Algorithms for Data Analysis”, http://www.optimization-online.org/DB_FILE/2016/12/5748.pdf
    • L. Bottou, F.E. Curtis, and J. Nocedal. “Optimization methods for large-scale machine learning”. SIAM Review, 59(1): 65-98, 2017.
    • Z. Allen-Zhu. Katyusha: The first direct acceleration of stochastic gradient methods. The Journal of Machine Learning Research, 18(1): 8194-8244, 2017.

    Department of Statistics, University of Oxford

    The aim of the lectures is to introduce modern stochastic models in mathematical population genetics and give examples of real world applications of these models.

    Stochastic and graph theoretic properties of coalescent and genealogical trees are studied in the first eight lectures.

    Diffusion processes and extensions to model additional key biological phenomena are studied in the second eight lectures.

    More Details

    Aims and Objectives: Many data come in the form of networks, for example friendship data and protein-protein interaction data. As the data usually cannot be modelled using simple independence assumptions, their statistical analysis provides many challenges. The course will give an introduction to the main problems and the main statistical techniques used in this field. The techniques are applicable to a wide range of complex problems. The statistical analysis benefits from insights which stem from probabilistic modelling, and the course will combine both aspects.


    Exploratory analysis of networks. The need for network summaries. Degree distribution, clustering coefficient, shortest path length. Motifs.

    Probabilistic models: Bernoulli random graphs, geometric random graphs, preferential attachment models, small world networks, inhomogeneous random graphs, exponential random graphs.

    Small subgraphs: Stein’s method for normal and Poisson approximation. Branching process approximations, threshold behaviour, shortest path between two vertices.

    Statistical analysis of networks: Sampling from networks. Parameter estimation for models. Inference from networks: vertex characteristics and missing edges. Nonparametric graph comparison: subgraph counts, subsampling schemes, MCMC methods. A brief look at community detection.

    Examples: protein interaction networks, social ego-networks.
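The network summaries and the simplest probabilistic model listed above can be sketched in a few lines. The example below samples a Bernoulli random graph, in which each possible edge is present independently with probability p, and computes its degree sequence and global clustering coefficient (the fraction of connected triples that close into triangles). All function names and parameter values are illustrative assumptions.

```python
import random
from itertools import combinations


def bernoulli_random_graph(n, p, seed=0):
    """Adjacency sets of a graph on n vertices with independent edges of probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_sequence(adj):
    """Degrees listed in vertex order."""
    return [len(adj[v]) for v in sorted(adj)]


def global_clustering(adj):
    """Closed connected triples divided by all connected triples."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        d = len(nbrs)
        triples += d * (d - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


g = bernoulli_random_graph(200, 0.05, seed=1)
mean_degree = sum(degree_sequence(g)) / len(g)
print(mean_degree, global_clustering(g))
```

In this model the expected degree is (n - 1) * p and the clustering coefficient is close to p, which is one reason the model fits poorly to social networks with high clustering.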

    More details:

    Recommended Prerequisites: The course requires a good level of mathematical maturity. Students are expected to be familiar with core concepts in statistics (regression models, bias-variance tradeoff, Bayesian inference), probability (multivariate distributions, conditioning) and linear algebra (matrix-vector operations, eigenvalues and eigenvectors). Previous exposure to machine learning (empirical risk minimisation, dimensionality reduction, overfitting, regularisation) is highly recommended. Students would also benefit from being familiar with the material covered in the following courses offered in the Statistics department: SB2.1 (formerly SB2a) Foundations of Statistical Inference and in SB2.2 (formerly SB2b) Statistical Machine Learning.

    Aims and Objectives: Machine learning is widely used across many scientific and engineering disciplines, to construct methods to find interesting patterns and to predict accurately in large datasets. This course introduces several widely used machine learning techniques and describes their underpinning statistical principles and properties. The course studies both unsupervised and supervised learning and several advanced topics are covered in detail, including some state-of-the-art machine learning techniques. The course will also cover computational considerations of machine learning algorithms and how they can scale to large datasets.


    Convex optimisation and support vector machines. Loss functions. Empirical risk minimisation.

    Kernel methods and reproducing kernel Hilbert spaces. Representer theorem. Representation of probabilities in RKHS.

    Nonlinear dimensionality reduction: kernel PCA, spectral clustering.

    Probabilistic and Bayesian machine learning: mixture modelling, information theoretic fundamentals, EM algorithm, Probabilistic PCA. Variational Bayes. Laplace Approximation.

    Collaborative filtering models, probabilistic matrix factorisation.

    Gaussian processes for regression and classification. Bayesian optimisation.

    + Latent Dirichlet allocation [if time allows]


    More details: SC4 Advanced Topics in Statistical Machine Learning

    Imperial College London

    The goal of the module is to develop a thorough understanding of how trades occur in financial markets. The main market types will be described, as well as traders’ main motives for trading.

    Market manipulation and high-frequency trading strategies have received a lot of attention in the press recently, so the module will illustrate them and examine recent developments in regulations that aim to limit them. Liquidity is a key theme in market microstructure, and the students will learn how to measure it and to recognise the recent increase in liquidity fragmentation and hidden, “dark” liquidity.

    The Flash Crash of 6 May 2010 will be analysed as a case study of sudden loss of liquidity.

    The remaining part of the module focuses on statistical analysis of market microstructure, concentrating on statistical modelling of tick-by-tick data, measurement of price impact and volatility estimation using high-frequency data.


    • Electronic Markets and the Limit Order Book
    • Stochastic Optimal Control (a review)
    • Optimal Execution with Continuous Trading
    • Optimal Execution with Limit and Market Orders
    • Market Making
    • Statistical Arbitrage in High-Frequency Settings (if time permits)

    This is an introductory course on the theory and applications of random dynamical systems and ergodic theory. Random dynamical systems are (deterministic) dynamical systems driven by a random input. The goal will be to present a solid introduction and, time permitting, touch upon several more advanced developments in this field. The contents of the module are:

    • Random dynamical systems; definition in terms of skew products and elementary examples (including iterated function systems, discrete time dynamical systems with bounded noise and stochastic differential equations).
    • Introduction to random dynamical systems theory in iterated function systems context.
    • Background on measure theory and probability theory.
    • Introduction to Ergodic Theory: Birkhoff Ergodic Theorem and Oseledets Ergodic Theorem.
    • Dynamics of random circle maps: synchronisation.
    • Chaos in random dynamical systems.

    This course is in two halves: machine learning and complex networks. We will begin with an introduction to the R language and to visualisation and exploratory data analysis. We will describe the mathematical challenges and ideas in learning from data. We will introduce unsupervised and supervised learning through theory and through application of commonly used methods (such as principal components analysis, k-nearest neighbours, support vector machines and others). Moving to complex networks, we will introduce key concepts of graph theory and discuss model graphs used to describe social and biological phenomena (including Erdős-Rényi graphs, small-world and scale-free networks). We will define basic metrics to characterise data-derived networks, and illustrate how networks can be a useful way to interpret data.

    Malliavin Calculus is an extremely powerful tool in stochastic analysis, extending the classical notion of derivative to the space of stochastic processes.

    A number of results arising from this theory turn out to provide the right framework to analyse several problems in mathematical finance.

    The module will be divided into two parts:

    the first one will concentrate on developing the theoretical tools of Malliavin Calculus, including analysis on Wiener space, the Wiener chaos decomposition, the Ornstein-Uhlenbeck semigroup and hypercontractivity, the Malliavin derivative operator and the divergence operator, and Sobolev spaces and equivalence of norms.

    The second part of the module will focus on understanding how these tools can be used to price and hedge financial derivatives, and to compute their sensitivities.

    Rough path theory was developed in the 1990s in order to understand the structure and information content of a given path (be it a financial time series, a hand-drawn character or the route taken by a vehicle).

    It turned out to be one of the key developments in stochastic analysis over the past 20 years and has allowed for a better understanding (and new proofs) of many problems in this field.

    The goal of this module is to provide students with a flavour of this powerful theory and to understand how it can efficiently be applied in machine learning, one of the fast-developing techniques in the financial industry nowadays. One of the key elements in this exploration is the so-called signature of a path, of which we shall study the algebraic properties, the faithfulness, as well as the inversion and asymptotic properties. We shall further see how this signature is in fact a feature set in machine learning and illustrate these results in mathematical finance (in particular to predict financial time series), as well as in other areas (handwriting recognition, computer vision, classification problems in medical data).


    This course develops the analysis of boundary value problems for elliptic and parabolic PDEs using the variational approach. It is a follow-up of ‘Function spaces and applications’ but is open to other students as well provided they have sufficient command of analysis. An introductory Partial Differential Equation course is not needed either, although certainly useful. The course consists of three parts. The first part (divided into two chapters) develops further tools needed for the study of boundary value problems, namely distributions and Sobolev spaces. The following two parts are devoted to elliptic and parabolic equations on bounded domains. They present the variational approach and spectral theory of elliptic operators as well as their use in the existence theory for parabolic problems. The aim of the course is to expose students to some important aspects of Partial Differential Equation theory, aspects that will be most useful to those who will work further with Partial Differential Equations, whether on the theoretical side or the numerical one.

    The syllabus of the course is as follows:

    1. Distributions. The space of test functions. Definition and examples of distributions. Differentiation. Convolution. Convergence of distributions.

    2. Sobolev spaces: The space H^1. Density of smooth functions. Extension lemma. Trace theorem. The space H^1_0. Poincaré inequality. The Rellich-Kondrachov compactness theorem (without proof). Sobolev imbedding (in the simple case of an interval of R). The space H^m. Compactness and Sobolev imbedding for arbitrary dimension (statement without proof).

    3. Linear elliptic boundary value problems: Dirichlet and Neumann boundary value problems via the Lax-Milgram theorem. The maximum principle. Regularity (stated without proofs). Classical examples: elasticity system, Stokes system.

    4. Spectral theory: Compact operators in Hilbert spaces. The Fredholm alternative. Spectral decomposition of compact self-adjoint operators in Hilbert spaces. Spectral theory of linear elliptic boundary value problems.

    5. Linear parabolic initial-boundary value problems. Existence and uniqueness by spectral decomposition on the eigenbasis of the associated elliptic operator. Classical examples (Navier-Stokes equation).

    The module will introduce a variety of computational approaches for solving partial differential equations, focusing mostly on finite difference methods, but also touching on finite volume and spectral methods. Students will gain experience implementing the methods and writing/modifying short programs in Matlab or other programming language of their choice. Applications will be drawn from problems arising in Mathematical Biology, Fluid Dynamics, etc. At the end of the module, students should be able to solve research-level problems by combining various techniques. Assessment will be by projects, probably 3 in total. The first project will only count for 10-20% and will be returned quickly with comments, before students become committed to completing the module. Typically, the projects will build upon each other, so that by the end of the module a research level problem may be tackled. Codes will be provided to illustrate similar problems and techniques, but these will require modification before they can be applied to the projects. The use of any reasonable computer language is permitted.

    Topics (as time permits).

    - Finite difference methods for linear problems: order of accuracy, consistency, stability and convergence, CFL condition, von Neumann stability analysis, stability regions; multi-step formula and multi-stage techniques.

    - Solvers for elliptic problems: direct and iterative solvers, Jacobi and Gauss-Seidel method and convergence analysis; geometric multigrid method.

    - Methods for the heat equation: explicit versus implicit schemes; stiffness.

    - Techniques for the wave equation: finite-difference solution, characteristic formulation, non-reflecting boundary conditions, one-way wave equations, perfectly matched layers. Lax-Friedrichs, Lax-Wendroff, upwind and semi-Lagrangian advection schemes.

    - Domain decomposition for elliptic equations: overlapping alternating Schwarz method and convergence analysis, non-overlapping methods.
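As a small illustration of the finite difference topics above, the sketch below applies the explicit (forward-time, centred-space) scheme to the heat equation u_t = u_xx on [0, 1] with zero Dirichlet boundary values and initial data sin(pi x), for which the exact solution is exp(-pi^2 t) sin(pi x). The ratio r = dt / dx^2 must satisfy r <= 1/2 for von Neumann stability of this scheme. The grid size, the value r = 0.4 and the function name are assumptions chosen for the example; the module itself permits any reasonable language.

```python
import math


def heat_explicit(nx=21, r=0.4, t_end=0.1):
    """Explicit finite differences for u_t = u_xx with u(0, t) = u(1, t) = 0.

    r = dt / dx**2 should not exceed 0.5, otherwise the scheme is unstable.
    Returns the grid values and the time actually reached.
    """
    dx = 1.0 / (nx - 1)
    dt = r * dx * dx
    steps = round(t_end / dt)
    u = [math.sin(math.pi * i * dx) for i in range(nx)]
    u[0] = u[-1] = 0.0
    for _ in range(steps):
        new = u[:]
        for i in range(1, nx - 1):
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u, steps * dt


u, t = heat_explicit()
exact_mid = math.exp(-math.pi ** 2 * t)  # exact solution at x = 0.5
print(u[len(u) // 2], exact_mid)
```

An implicit scheme would remove the restriction on r at the cost of solving a tridiagonal system at each step, which is the trade-off behind the "explicit versus implicit" item in the topic list.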

    Scientific computing is an important skill for any mathematician. It requires both knowledge of algorithms and proficiency in a scientific programming language. The aim of this module is to expose students from a varied mathematical background to efficient algorithms to solve mathematical problems using computation.

    The objectives are that by the end of the module all students should have a good familiarity with the essential elements of the Python programming language, and be able to undertake programming tasks in a range of common areas (see below).

    There will be four sub-modules:

    1. A PDE module covering elementary methods for the solution of time-dependent problems.

    2. An optimization-module covering discrete and derivative-free algorithms.

    3. A pattern-recognition module covering searching and matching methods.

    4. A statistics module covering, e.g., Monte Carlo techniques.

    Each sub-module will consist of a brief introduction to the underlying algorithm, its implementation in the Python programming language, and an application to real-life situations.

    The module covers both the theoretical underpinnings of convex optimisation and its applications to important problems in mathematical finance. A brief outline of the course reads as follows:

    • Fundamental properties of convex sets and convex functions
    • The basics of convex optimisation with special emphasis on duality theory
    • Markowitz portfolio theory and the CAPM model
    • Expected utility maximisation and no arbitrage
    • Convexity in continuous time hedging
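Markowitz portfolio theory in the outline above is a convex quadratic programme, and in the two-asset case the minimum-variance portfolio has a closed form. The sketch below assumes fully invested weights (w1 + w2 = 1) with short positions allowed, and minimises w1^2 var1 + w2^2 var2 + 2 w1 w2 cov12. The function names and the numerical inputs are hypothetical.

```python
def min_variance_weights(var1, var2, cov12):
    """Weights (w1, w2) with w1 + w2 = 1 minimising the portfolio variance."""
    denom = var1 + var2 - 2.0 * cov12
    if denom <= 0.0:
        raise ValueError("variance of the return difference must be positive")
    w1 = (var2 - cov12) / denom
    return w1, 1.0 - w1


def portfolio_variance(w1, w2, var1, var2, cov12):
    """Variance of the return of the portfolio (w1, w2)."""
    return w1 * w1 * var1 + w2 * w2 * var2 + 2.0 * w1 * w2 * cov12


var1, var2, cov12 = 0.04, 0.09, 0.0
w1, w2 = min_variance_weights(var1, var2, cov12)
print(w1, w2, portfolio_variance(w1, w2, var1, var2, cov12))
```

Because the objective is strictly convex in w1 whenever the denominator is positive, the stationary point found by setting the derivative to zero is the unique global minimiser.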