Databases and Data Mining
Course offered in the second semester of the M1.
This course gives a general introduction to data management through relational databases, and to data mining, from algorithms to theoretical aspects.
Topics covered:
Databases:
- Relational model, relational calculus, SQL
- Relational algebra, equivalence between relational calculus and algebra, indexing and optimisation
- Functional dependencies, axiomatisation, Armstrong relations
- Inclusion dependencies, data exchange, chase, query rewriting
- Openness to current issues
Data Mining:
- Introduction to data mining (itemset/rule mining, enumeration algorithms)
- Constraint-based pattern mining: study of constraint properties and pruning
- Assessment of results (statistical significance, relevance w.r.t. background knowledge, user prior)
- Advanced pattern mining: formal concept analysis, sequences, graphs, dynamic graphs, attributed graphs
- Clustering: general aims, distances, similarities, algorithms (k-means, EM, density-based clustering, spectral clustering, hierarchical clustering), graph clustering
- Anomaly detection
- Current issues and challenges
Teaching methods: lectures and lab sessions (TP).
Form(s) of Assessment: written exam (50%) and project(s) (50%)
References:
- M. Levene, G. Loizou. A Guided Tour of Relational Databases and Beyond. Springer, 1999.
- S. Abiteboul, R. Hull and V. Vianu. Fondements des bases de données. Addison-Wesley, 1995.
- R. Ramakrishnan and J. Gehrke. Database Management Systems, third edition. 2003. (Material available at pages.cs.wisc.edu/~dbbook/)
- Online course: https://cs.stanford.edu/people/widom/DB-mooc.html
- Charu C. Aggarwal. Data Mining: The Textbook. Springer, 2015.
- Mohammed J. Zaki, Wagner Meira, Jr. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press, 2014.
Expected prior knowledge:
Link:
https://perso.liris.cnrs.fr/ecoquery/dokuwiki/doku.php?id=enseignement:dbdm:start