Machine Learning

Machine Learning

Course offered in the second semester of the M1.

This course gives a general introduction to Machine Learning, from algorithms to theoretical aspects in
Statistical Learning Theory.

Topics covered:
• General introduction to Machine Learning: learning settings, curse of dimensionality, overfitting/underfitting, etc.
• Overview of Supervised Learning Theory: True risk versus empirical risk, loss functions, regularization,
bias/variance trade-off, complexity measures, generalization bounds.
• Linear/Logistic/Polynomial Regression: batch/stochastic gradient descent, closed-form solution.
• Sparsity in Convex Optimization.
• Support Vector Machines: large margin, primal problem, dual problem, kernelization, etc.
• Neural Networks, Deep Learning.
• Theory of boosting: Ensemble methods, Adaboost, theoretical guarantees.
• Non-parametric Methods (K-Nearest-Neighbors)
• Domain Adaptation
• Metric Learning
• Optimal Transport

Teaching methods: Lectures and Lab sessions.
Form(s) of Assessment: written exam (50%) and project (50%)

References:
– Statistical Learning Theory, V. Vapnik, Wiley, 1998
– Machine Learning, Tom Mitchell, MacGraw Hill, 1997
– Pattern Recognition and Machine Learning, M. Bishop, 2013
– Convex Optimization, Stephen Boyd & Lieven Vandenberghe, Cambridge University Press, 2012.
– On-line Machine Learning courses: https://www.coursera.org/

Expected prior knowledge: basic mathematics and statistics – convex optimization

Data Bases and Data Mining

Databases and Data Mining

Course offered in the second semester of the M1.

This course gives a general introduction of data management through relational databases and data mining from algorithms to theoretical aspects.

Topics covered:

Databases:

  • Relational model, relational calculus, SQL
  • Relational algebra, equivalence between relational calculus and algebra, indexation and optimisation
  • Functional dependencies, axiomatisation, Armstrong’s relations
  • Inclusion dependencies, data exchange, chase, query rewriting
  • Openness to current issues

Data Mining:

  • Introduction to data mining (itemset/rule mining, enumeration algorithms)
  • constraint-based pattern mining: study of constraint properties and pruning
  • Assessment of results (statistical significance, relevancy w.r.t. background knowledge, user prior)
  • Advanced pattern mining: formal concept analysis, sequences, graphs, dynamics graphs, attributed graphs
  • Clustering: general aims, distances, similarities, algorithms (kmeans, EM, density-based clustering, spectral clustering, hierarchical clustering), graph clustering
  • Anomaly detection
  • Current issues and challenges

Teaching methods: Lectures and Lab sessions (TP).
Form(s) of Assessment: written exam (50%) and project(s) (50%)

References:

  • M. Levene, G. Loizou. A Guided Tour of Relational Databases and Beyond. Springer, 1999.
  • I S. Abiteboul, R. Hull et V. Vianu. Fondements des bases de données. Addison-Wesley, 1995.
  • I R. Ramakrishnam et J. Gehrke. Database Management Systems. 2003 (third edition).

    Material available on pages.cs.wisc.edu/~dbbook/)

  • On line course: https://cs.stanford.edu/people/widom/DB-mooc.html
  • Charu C. Aggarwal, Data Mining, Springer, 2015.
  • Mohammed J. Zaki, Wagner Meira, Jr. Data Mining and Analysis Fundamental Concepts and Algorithms. Cambridge University Press, 2014.

Expected prior knowledge:

Link:

https://perso.liris.cnrs.fr/ecoquery/dokuwiki/doku.php?id=enseignement:dbdm:start

Distributed Systems

Distributed Systems

Course offered in the second semester of M1.

Course contents:

Abstract: In this class, through many examples and training we will discover the main distributed algorithms. From the modelization to the implementation. Wave Algorithms, Transversal Algorithms, Leader Election, Mutual Exclusion are introduced.

In a second part of the class we will focus on an advanced distributed systems approach (this advanced part change over the years). We built distributed system using Erlang during the practical sessions. A distributed middleware will be introduced too.

Prerequisites:

Basic knowledge in algorithmics.

Teaching material:

Slides will be published in a private room due to the fact than updates are usual.

Proofs and Programs

Proofs and Programs

Contents of the course.

“Reasoning and programming are the two sides of the same coin”.

The core of this relation relies on the discovery of a very deep correspondence between Gentzen presentation for proofs of elementary statements from propositional calculus and typing Church’s λ-terms, named Curry-Howard correspondence (1980). Since then, the correspondence has been widely extended, which led toward powerful proof assistants and finer understanding of mathematical proofs; and correctness of programs as well.

Yet, until recently, equality was not properly understood, leaving on one side major mathematical techniques (congruences, quotients, …). In 2006 [Vladimir Voevodsky](Vladimir Voevodsky) envisioned a connection between equality proofs and homotopy. Since then, the relation between proofs and programs has regained a huge amount of attention. Recently (2015), a first contribution to providing a computational interpretation to the extended Type Theory HoTT has been proposed. We get closer to be able to do maths by programming!

References.

  • Lectures on C.-H. isomorphim, by M. H. Sørensen and P. Urzyczyn.
  • Homotopy Type Theory, collaborative book (http://homotopytypetheory.org/)

Prerequisites.

  • Logic (mainly Proof Theory, as opposed to Model Theory)
  • Basics in Lambda-calculus

Webpage of the course: https://perso.ens-lyon.fr/philippe.audebaud/PnP/

Cryptography and Security

Cryptography and Security

Course offered in the second semester of M1.

Course contents:

This course is an introduction to the different facets of modern cryptography.

The following topics will be addressed:

  • Symmetric cryptography
    Pseudo-random number generators
    Pseudo-random functions
    Message Authentication Codes
    Stream ciphers and block ciphers
    Security against chosen plaintext/ciphertext attacks
    Hash functions
  • Asymmetric cryptography
    Discrete logarithm, decision Diffie-Hellman
    Factoring, RSA problem
    Key exchange
    Digital signatures
    Public-key encryption
    The random oracle methodology
  • Other topics possibly covered:
    Zero-knowledge proofs
    Secret sharing
    Yao’s garbling circuits

Computer Algebra

This class shall survey some of the central topics of computer algebra
: arithmetics, effective linear algebra, polynomial system solving.
The following topics will be studied :

I. Arithmetic(s) — mostly polynomial.
– Representation of polynomials, linear operations (addition, evaluation).
– Multiplication : schoolbook method, Karatsuba method, Toom-Cook method
Fast Fourier Transform.
– Division : schoolbook method, Newton method, recursive division
– GCD : Euclidean algorithm, extended GCD, application to CRT; introduction
to fast GCD.
– Fast multiple point evaluation, fast interpolation, fast CRT.

II. Linear algebra over a field
– Matrix/vector product, naive matrix/matrix product
– Row echelon form & applications : kernel, image, rank, determinant,
linear system solving, etc.
– Fast matrix/matrix product : Strassen ; fast inversion/determinant.
– Structured matrices: circulant, Hankel, Toeplitz, Cauchy, and related
algorithms.

Information Theory

Information Theory

Course objectives:

This course will cover the basic questions of information theory as
well as applications of information theory in computer science. It
will cover the following topics:

  • Shannon theory:

    • – Data compression
    • – Channel coding
    • – Applications of entropies in computer science
    • Error correcting codes:

      • – Linear codes
      • – Reed-Solomon codes
      • – Capacity achieving codes with efficient encoding and decoding

    Pre-requisites

    Familiarity with basic probability and linear algebra. The necessary background for finite fields will be introduced.

    References

    1. Cover and Thomas, Elements of Information Theory
    2. Guruswami, Rudra and Sudan, Essential Coding Theory

    See the course website, and the webpage for the TDs.

Optimisation and Approximation

Optimisation and Approximation

The goal of this course is to present optimization techniques with a main focus on linear relaxation. Applications to approximation algorithms will be highlighted.

Contents of the course.

  1. Linear Programming.
    • Polytopes.
    • Simplex algorithm.
    • Duality and complementary slackness.
    • Sensitivity analysis.
    • Integer polytopes (flows, transportation, totally unimodular matrices).
    • Integrality gap, Erdos-Posa property.
    • VC-dimension.
  2. Approximation via linear relaxation.
    • Primal-Dual algorithms.
    • Branch and bound.
  3. Non linear optimization.
    • Semidefinite programming.
    • Convex optimization.

References.

  • Understanding and Using Linear Programming, Matousek and Gartner (2006).
  • Linear Programming, Chvatal (1983).

  • Applied Mathematical Programming, Bradley, Hax, and Magnanti (1977).
  • The Design of Approximation Algorithms David P. Williamson and David B. Shmoys (2010).

Prerequisites for the course.
Students should know the basics from Calculus and Linear Algebra.

The webpage of the course.

Compilation and Program Analysis

Compilation and Program Analysis

Contents of the course.

This course covers basic topics in compilation and program analysis: the differents steps of machine language production, as well as program analysis for optimisation and verification:

  • lexical and syntactic analysis;
  • typing, semantics;
  • code optimisation and code generation;
  • imperative languages, functional langages;
  • abstract interpretation.

Prerequisites : familiarity with formal languages (regular expressions, grammars), and induction reasoning, as well as basic notions of computer architecture and in Python programming.

The course has 1 course session (2hours) and 1 lab session (2 hours) per week during the first semester of M1. Students will develop their own compiler as well as a basic abstract interpretation tool.

Page for 2018-19 : http://laure.gonnord.org/pro/teaching/compilM1.html

Parallel and Distributed Algorithms and Programs (PDAP/APPD)

Contents of the course.

Parallelism is everywhere in computer science, since nowadays, each processor
has multiple cores. In this course, we will teach how to design efficient
parallel and distributed algorithms, and how to implement
them through lab sessions. Classes discuss theoretical models
used for the design and analysis and study of algorithms (complexity,
algorithm definition and analysis, approximation algorithms, …).
There will also be tutorials and lab sessions to understand the concepts
seen in class, and in particular there will be an initiation to
parallel and distributed programs through MPI (Message Passing Interface).
The evaluation is done through on-table exams and a programming homework.

The covered topics include sorting networks, PRAMs, algorithms on processor
rings or grids, distributed algorithms, and task graph scheduling.

References.

– Parallel Algorithms. H. Casanova, A. Legrand, Y. Robert. Chapman and
Hall/CRC Press, 2008.

– Introduction to Distributed Algorithms. Gerard Tel. Second Edition,
Cambridge University Press, 2000.

– Distributed Algorithms: An Intuitive Approach. Wan Fokkink. MIT
Press, 2013.

– Scheduling and Automatic Parallelization. A. Darte, Y. Robert, F.
Vivien. Birkhäuser, 2000.

Prerequisites.

Basic knowledge in algorithm design and NP-completeness, and programming skills.

Performance Evaluation and Networks

Performance Evaluation and Networks

Course focused on the performance evaluation in communication networks. “Performance evaluation” is a label that covers many topics, including the design of mathematical models (very often with probabilistic assumptions), the analysis of such models (mathematically or using simulation), the design of experimental protocols (measuring tools, data collection, validation of theoretical models), the analysis of data (statistics on the real experiments or in silico experiments).
This course will address these different items in the framework of communication networks. Some illustrations will be also picked from related systems like transport networks, logistics or automatic control.

References :

  • Performance Evaluation of Computer and Communication Systems. J.-Y. Le
    Boudec. EPFL Press, 2011.
  • Markov Chains, Gibbs Fields, Monte Carlo Simulation and Queues. P.
    Brémaud. Springer-Verlag, 1999.

Prerequisites : Requires basic knowledge in probability. Some knowledge about networks may be useful but not essential.

Current webpage : http://perso.ens-lyon.fr/eric.thierry/Perf2018/

Computational Geometry and Digital Images

Contents of the course

The objective of this course is to introduce fundamental notions of image processing, digital geometry, computational geometry and computer graphics. The first lectures will be dedicated to image processing (filtering, smoothing, morphological mathematics, …) and shape representation. Then, we will focus on theoretical and algorithmic issues involved in the analysis of digital shapes. During this analysis of digital geometry processing tools, we will have to present and integrate some tools from various fields: discrete mathematics, combinatorics, arithmetic, computational geometry… The lecture ends with some basic knowledge on computer graphics, geometry processing and rendering.

Table of contents

  • Image/Shape representation
  • Image processing (filtering, segmentation)
  • Digital Geometry for shape analysis
  • Computational geometry and data structures
  • Computer Graphics / Rendering

Prerequisites

basic notions of algorithms

Books

  • « Géométrie discrète et images numériques », D. Coeurjolly, A. Montanvert and J.-M. Chassery, Ouvrage collectif, Traité IC2, Hermès, 416 pages, 2007
  • « Digital Geometry: Geometric Methods for Digital Picture Analysis », Reihnard Klette, Azriel Rosenfeld, Morgan Kaufmann, 2004
  • « Computational Geometry: Algorithms and Applications », Mark de Berg, TU Eindhoven (the Netherlands) Otfried Cheong, KAIST (Korea),Marc van Kreveld, Mark Overmars, Utrecht University (the Netherlands), Springer-Verlag

Integrated Project

Integrated project

Course offered in M1.

Course contents:

The main goal of this class is to give to you the keys to manage an ambitious software project. Based on existing methods in software engineering and management, students will write a proposal and prove the feasibility of the project. Students will defend all their choices (software architecture, technical choice, etc.) and implement the project.

Students will be familiar with project roadmap. Work assessment come from reports during the project and a public final demonstration of the software.

Team management will be a key of success.

Only original and ambitious project with a maximum of students are selected.

A final and public demo is given.

It’s more than a class, it’s a great human and scientific adventure.

Prerequisites:

No limit. Come with all your background.

Webpages:

To discover the previous Integrated Projects, see
https://graal.ens-lyon.fr/~ecaron/Teaching/ensproject.html

Computational Complexity

Contents of the course

What computing resources do we need to solve a computational problem? The computational complexity theory studies this question from a structural point of view. Building up on the Turing machine model, we will discover and study deterministic and nondeterministic complexity classes, space and time complexity and hierarchy, completeness, randomized algorithms classes, boolean circuits and counting classes.

 

Bibliography:

Computational Complexity: A Modern Approach, Sanjeev Arora and Boaz Barak (mostly chapters 1 to 7)

more about computational complexity

 

De quelles ressources avons nous besoins pour résoudre un problème, faire un calcul ? La théorie de la complexité algorithmique étudie ce problème d’un point de vue structurel. En nous appuyant sur le modèle des machines de Turing, nous étudierons les classes de complexité déterministes et non déterministes, en temps et en espace, les hiérarchies entre ces classes, les problèmes complets, les algorithmes et classes probabilistes, les classes définies par les circuits ainsi que quelques notions sur les classes de comptage.

 

Bibliographie :

Computational Complexity: A Modern Approach, Sanjeev Arora and Boaz Barak, principalement les chapitres 1 à 7

Pour en savoir plus sur la théorie de la complexité algorithmique

https://fr.wikipedia.org/wiki/Théorie_de_la_complexité_(informatique_théorique)

Page du cours (for 2013-2014)

Teaching personnel