Learning of narrow neural networks in high dimensions
Hugo Chao Cui (Doctoral assistant, Laboratoire de physique statistique des systèmes computationnels)
Abstract: This talk explores the interplay between neural network architectures and data structure through the lens of high-dimensional asymptotics. We introduce the sequence multi-index model, which encompasses as special instances several previously studied models of feed-forward fully-connected networks, but also auto-encoder and attention architectures. In the limit of large data dimension and a comparably large number of samples, but a finite number of hidden units, we derive a tight asymptotic characterization of the learning of this generic model. As an illustration, we discuss how this characterization enables the analysis of the learning of a dot-product attention layer. We show how the latter can learn to implement either a positional attention mechanism (with tokens attending to each other based on their respective positions) or a semantic attention mechanism (with tokens attending to each other based on their meaning), and evidence a phase transition with sample complexity from positional to semantic learning.
Website: https://people.epfl.ch/hugo.cui
In Room M7 101, 1st floor, Monod campus, ENSL.