Prof Jared Tanner (Oxford)
Deep learning is the dominant method by which machines perform classification tasks with reliability exceeding that of humans, and it has outperformed world champions in games such as Go. Alongside the proliferating application of these techniques, practitioners have developed a good understanding of the properties that make deep nets effective: initial layers learn weights similar to those found in dictionary learning, while deeper layers instantiate invariance to transforms such as dilation, rotation, and modest diffeomorphisms. A number of theories are now being developed to give these observations a mathematical foundation; this course will explore these varying perspectives.
Students will become familiar with the variety of architectures for deep nets, including the scattering transform, and with their ingredients, such as types of nonlinear transforms, pooling, and convolutional structure, as well as how nets are trained. Students will then focus on a variety of theoretical perspectives on why deep networks perform as observed, with examples such as: dictionary learning and transferability of early layers, energy decay with depth, Lipschitz continuity of the net, how depth overcomes the curse of dimensionality, constructing adversarial examples, geometry of nets viewed through random matrix theory, and learning of invariance.