Agenda de l'ENS de Lyon

Laplacian Powers for Graph-Based Semi-Supervised Learning

mer 27 nov 2019



Soutenance de thèse de M. Esteban BAUTISTA RUIZ du LIP, sous la direction de M. Paulo GONCALVEES

Langue(s) des interventions
Description générale

Graph-Based Semi-Supervised Learning (G-SSL) learns from labelled and unlabelled data to build better classifiers. Despite successes, its performance can still be improved, particularly in challenging data settings or unbalanced scenarios. To address such limitations, this thesis proposes a novel G-SSL method based on the positive $\gamma$-th powers of the graph Laplacian matrix, referred to as the $L^\gamma$-PageRank method. Its theoretical study shows that (i) for $\gamma < 1$, it corresponds to an extension of the standard PageRank algorithm to L\'evy processes: where random walkers can now perform far-distant jumps in a single step; and (ii) for $\gamma > 1$, it operates on signed graphs: where nodes belonging to one same class are more likely to share positive edges while nodes from different classes are more likely to be connected with negative edges. We show the existence of an optimal $\gamma$ value that maximizes performance, for which a method for its automatic estimation is devised and assessed. Experiments on several datasets demonstrate that the Lévy flight random walkers can enhance the detection of classes with complex local structures and that the signed graphs can significantly improve classification performance and also override the issue of unbalanced labelled data. In addition, we study efficient implementations of $L^\gamma$-PageRank. Extensions of Power iteration and Gauss-Southwell, successful algorithms to efficiently compute standard PageRank, are derived for $L^\gamma$-PageRank. Moreover, the dynamic versions of such algorithms, which can update the solution of the standard PageRank algorithm in sublinear time when the graph evolves or new data arrive, are also extended to $L^\gamma$-PageRank. Lastly, we apply $L^\gamma$-PageRank in the context of internet routing under BGP protocol to successfully solve the issue of identifying the autonomous system (AS) of inter-AS links from the network of IP addresses and AS public registers.


Mots clés