UMR 5672


Deep learning theory through the lens of diagonal linear networks

Scott Pesme (post-doc in the Thoth team at Inria Grenoble)

When?	10 December 2024, 12:45 to 13:45
Participants	Scott Pesme


Title: Deep learning theory through the lens of diagonal linear networks

Abstract: Surprisingly, many optimisation phenomena observed in complex neural networks also appear in so-called 2-layer diagonal linear networks. This rudimentary architecture—a two-layer feedforward linear network with a diagonal inner weight matrix—has the advantage of revealing key training characteristics while keeping the theoretical analysis clean and insightful.
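As a rough illustration of the architecture described in the abstract, here is a minimal NumPy sketch. The names `u`, `v`, `diagonal_linear_network` and `effective_regressor` are assumptions made for this example, not notation taken from the talk. It shows that a two-layer linear network with a diagonal inner weight matrix computes a plain linear function of the input, with weight vector given by the elementwise product of the two layers.

```python
import numpy as np

def diagonal_linear_network(x, u, v):
    """Two-layer linear network with a diagonal inner weight matrix.

    hidden = diag(u) @ x   (first layer, diagonal weights)
    output = v @ hidden    (second layer, linear read-out)
    """
    hidden = u * x  # elementwise product, same as np.diag(u) @ x
    return v @ hidden

def effective_regressor(u, v):
    # The network is linear in x, with weight vector beta = u * v (elementwise).
    return u * v

rng = np.random.default_rng(0)
d = 5
x = rng.normal(size=d)
u = rng.normal(size=d)
v = rng.normal(size=d)

out_network = diagonal_linear_network(x, u, v)
out_linear = effective_regressor(u, v) @ x
print(np.isclose(out_network, out_linear))
```

The set of functions is the same as in linear regression; what changes is the parametrisation, and therefore the path that gradient-based training follows.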

In this talk, I’ll provide an overview of various theoretical results for this architecture, while drawing connections to experimental observations from practical neural networks. Specifically, we’ll examine how hyperparameters such as the initialisation scale, step size, and batch size impact the optimisation trajectory and influence the generalisation performance of the recovered solution.
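The effect of one of these hyperparameters, the initialisation scale, can be explored with a small experiment. The sketch below is an assumption-laden toy setup, not the speaker's code. It uses noiseless sparse regression with more features than samples, full-batch gradient descent, the parametrisation `beta = u * v`, and a hypothetical initialisation `u = alpha`, `v = 0`. It trains the same network from a small and a large `alpha` and compares the l1 norms of the recovered regressors. Step size and batch size are held fixed here.

```python
import numpy as np

def train_diagonal_network(X, y, alpha, lr=0.01, steps=20000):
    """Full-batch gradient descent on 0.5 * mean((X @ (u * v) - y) ** 2).

    Assumed initialisation: u = alpha, v = 0, so training starts at beta = 0
    and alpha alone sets the initialisation scale.
    """
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.zeros(d)
    for _ in range(steps):
        # Gradient of the loss with respect to beta = u * v.
        grad_beta = X.T @ (X @ (u * v) - y) / n
        # Chain rule; the tuple assignment updates u and v simultaneously.
        u, v = u - lr * v * grad_beta, v - lr * u * grad_beta
    return u * v

rng = np.random.default_rng(0)
n, d = 20, 40                      # fewer samples than features
X = rng.normal(size=(n, d))
beta_star = np.zeros(d)
beta_star[:3] = 1.0                # sparse ground truth (illustrative choice)
y = X @ beta_star

beta_small = train_diagonal_network(X, y, alpha=0.01)
beta_large = train_diagonal_network(X, y, alpha=2.0)

l1_small = np.abs(beta_small).sum()
l1_large = np.abs(beta_large).sum()
print("l1 norm, small initialisation:", l1_small)
print("l1 norm, large initialisation:", l1_large)
```

In this toy setting, the small initialisation is expected to yield a sparser regressor with a lower l1 norm than the large one, even though both runs minimise the same training loss. The values of `lr`, `steps`, `alpha` and the problem sizes are arbitrary and may need adjusting for other data.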

Website: https://scottpesme.github.io/

In Room M7 101, 1st floor, Monod campus, ENSL.
