STATISTICAL MACHINE LEARNING

[573EC]
Academic year 2025/2026

2nd Year - Second Semester

Mandatory attendance

  • 6 credits (CFU)
  • 48 hours
  • English
  • Trieste campus
  • Compulsory
  • Conventional
  • Oral
  • SSD INF/01
Curriculum: Data Science for Insurance and Finance
Syllabus

In this course you will learn how to deal with complex datasets, build predictors and classifiers using state-of-the-art machine learning approaches, and combine different methods to improve results. We will take the probabilistic perspective, modelling machine learning problems in a probabilistic framework and deriving solutions and algorithms consistent with probability theory.

Knowledge and understanding: basic and some advanced topics in graphical models, exact and approximate inference, Bayesian methods, kernel-based methods, deep generative modelling, and probabilistic programming languages.

Applying knowledge and understanding: being capable of dealing with a complex dataset and building effective predictors. Combining several supervised and unsupervised learning methods to improve predictions. Being able to use state-of-the-art tools, including Python machine learning libraries and probabilistic programming languages, to model and solve learning problems and apply approximate inference techniques.

Making judgements: being capable of applying and combining machine learning methodologies in a critical way, identifying the most effective approaches to solve a given problem. Being able to critically compare different methods and evaluate their effectiveness.

Communication skills: being able to explain the basic ideas of probabilistic machine learning methods and communicate the results to a literate public.

Learning skills: being capable of exploring the machine learning literature to find alternative approaches and combining them to solve complex problems.

Basic knowledge of Python and scientific Python. Knowledge of statistics and machine learning at the level of introductory courses in those subjects.

Probabilistic and Bayesian linear regression and classification. Kernel-based methods and Gaussian Processes. Graphical models and exact inference. Sampling methods. Approximate inference for models with latent variables. Generative modelling.

Recommended
1. C. M. Bishop, Pattern recognition and machine learning. New York, NY: Springer, 2009.
2. K. P. Murphy, Machine learning: a probabilistic perspective. Cambridge, MA: MIT Press, 2012.

Other good textbooks
3. J. Friedman, T. Hastie, and R. Tibshirani, The elements of statistical learning, vol. 1. Springer Series in Statistics. Berlin: Springer, 2001.

0. Empirical risk minimisation and PAC learning (hints).
1. Probabilistic linear regression and classification.
2. Bayesian regression and classification.
3. Kernel-based methods and Gaussian Processes for regression and classification.
4. Graphical models and exact inference (Bayesian Networks, Markov Random Fields, Hidden Markov Models).
5. Sampling methods (basic sampling, Markov Chain Monte Carlo).
6. Approximate inference for models with latent variables (EM, Variational Inference).
7. Generative modelling (VAE, diffusion-based generative models).

Frontal lectures and hands-on sessions, both individual and in groups. The balance will be roughly 70-80% frontal lectures and 20-30% hands-on sessions. Hands-on activities typically involve experimenting with Python libraries for machine learning and developing, using or testing tools that implement the methodologies seen during lectures. During the lectures, homework exercises, both theoretical and coding-based, will be assigned.

Bring your own laptop.

The exam will consist of two parts:
1. A group project, in groups of 2 to 3 students. Each group will work on a well-defined set of tasks, typically analysing a complex dataset or investigating and experimenting with a methodology not seen in detail during the lectures. The group will give a brief presentation (10-12 minutes), with supporting slides, explaining the work done, and will provide commented code upon request. The topic of the project has to be proposed by the group of students and validated by the lecturer. The main points of evaluation are clarity and comprehensiveness of the presentation, understanding of the topic, and depth and originality of the analyses performed.
2. An oral interview, in which a few questions will be asked to assess each student's individual contribution to the project and level of understanding of the course topics. The main points of evaluation are clarity and precision of the answers, technical understanding of the methods, and understanding of their conditions of applicability.

The two parts can be done in the same session or in separate sessions, but the group presentation requires all group members to be present. The final mark is obtained by averaging the scores for the project and the oral part. Laude can be given for an exceptional exam, typically in the presence of successfully submitted homework.

The exam will be evaluated according to the following criteria:
  • clarity and completeness of the exposition of the project
  • degree of originality of the project with respect to the course contents, i.e. how much novel material is discussed
  • degree of understanding of the theoretical and practical aspects of the subject, as emerging from the project and the oral exam
  • clarity and precision of the exposition in the oral exam