MACHINE LEARNING

[456MI]
A.Y. 2025/2026

2nd year of course - First semester

Attendance: not mandatory

  • 6 CFU
  • 48 hours
  • English
  • Trieste
  • Mandatory
  • Oral Exam
  • SSD ING-INF/05
Curricula: SISTEMI

Syllabus

Knowledge and understanding.
- Know and understand the main kinds of problems which can be tackled with ML.
- Know the terminology and common mathematical notation for the key concepts of ML systems.
- Know and understand the main supervised and unsupervised ML techniques.
- Know and understand the phases of design, development, and assessment of an ML system.
- Know and understand the main assessment metrics and procedures suitable for supervised and unsupervised ML systems; know and understand how to evaluate the effectiveness, efficiency, applicability, and interpretability of ML systems.

Applying knowledge and understanding.
- Formulate a formal problem statement, using the proper terminology and mathematical notation, for simple practical problems in order to tackle them with ML techniques.
- Design and develop simple end-to-end ML systems, possibly re-using existing software libraries.
- Experimentally assess simple end-to-end ML systems in terms of effectiveness, efficiency, applicability, and interpretability.

Making judgements.
- Judge if a problem can be tackled with ML.
- Judge the technical soundness of an ML system.
- Judge the technical soundness of the assessment of an ML system.

Communication skills.
- Describe, both in written and oral form, the motivations behind choices in the design, development, and assessment of an ML system, using the proper terminology and possibly exploiting simple plots.

Learning skills.
- Retrieve information from scientific publications about ML techniques not explicitly presented in this course.

- Basics of statistics: basic graphical tools of data exploration; summary measures of variable distribution (mean, variance, quantiles); fundamentals of probability.
- Basics of linear algebra: vectors, matrices, matrix operations.
- Basics of programming and data structures: algorithm, data types, loops, recursion, parallel execution, tree.
- Familiarity with manipulation of mathematical notation.

- Definition of Machine Learning; examples of applications of ML; taxonomy of ML problems; phases of design, development, and assessment of an ML system; terminology and mathematical notation for the key concepts.
- Supervised learning.
- Assessment.
- Tree-based methods.
- Support Vector Machines (SVM).
- Naive-Bayes classification.
- The K-nearest neighbors classifier.
- Unsupervised learning.
- Clustering.
- Applying ML to text (text mining).
- Sentiment analysis.
- Features for text mining: bag of words, tf-idf, n-grams.
- Common pre-processing steps.

Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. An Introduction to Statistical Learning, with applications in R. Springer, Berlin: Springer Series in Statistics, 2014.

- Definition of Machine Learning; examples of applications of ML; taxonomy of ML problems; phases of design, development, and assessment of an ML system; terminology and mathematical notation for the key concepts.
- Supervised learning.
- Assessment.
- Axes of assessment: effectiveness, efficiency, interpretability/explainability, and applicability.
- Bounds for effectiveness: random classifier, dummy classifier, Bayes classifier.
- Metrics for binary classification (accuracy, error, FPR, FNR, precision, recall, EER, AUC, and others), multiclass classification (accuracy, error, weighted accuracy), and regression (MAE, MSE, RMSE, MAPE).
- Methods of assessment with respect to data division.
- Tree-based methods.
- Decision and regression trees: learning and prediction; role of the parameters and overfitting.
- Hyperparameter tuning.
- Tree aggregation: bagging, Random Forest.
- No free lunch theorem.
- Support Vector Machines (SVM).
- Separating hyperplane: maximal margin classifier; support vectors; learning as an optimization problem; maximal margin classifier limitations.
- Soft margin classifier: learning, role of the parameter C.
- Non-linearly separable problems; kernel: brief background and main options (linear, polynomial, radial); intuition behind radial kernel; SVM.
- Multiclass classification with SVM.
- Naive-Bayes classification.
- The K-nearest neighbors classifier.
- Unsupervised learning.
- Clustering.
- Hierarchical methods.
- Partitional methods (k-means algorithm).
- Applying ML to text (text mining).
- Sentiment analysis.
- Features for text mining: bag of words, tf-idf, n-grams.
- Common pre-processing steps.
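As an illustration of the binary classification metrics listed under Assessment, the following is a minimal sketch, not course material. The names `ConfusionCounts`, `count_outcomes`, and `binary_metrics` are hypothetical, and the label convention (1 = positive, 0 = negative) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary classifier (hypothetical helper type)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def count_outcomes(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN; `positive` is the label assumed to be the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c):
    """Compute accuracy, error, FPR, FNR, precision, and recall from the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = _ratio(c.tp + c.tn, total)
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "fpr": _ratio(c.fp, c.fp + c.tn),   # false positive rate
        "fnr": _ratio(c.fn, c.fn + c.tp),   # false negative rate
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
    }


# Made-up labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(count_outcomes(y_true, y_pred))
```

Threshold-based metrics such as EER and AUC need classifier scores rather than hard labels, so they are left out of this sketch.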

Approx. 80% of the course: lectures in which the teacher projects and explains the slides (which are publicly available). Approx. 20%: exercises, under the teacher’s supervision, on simple problems to be tackled with ML techniques: the students work in small groups on their own laptops to design, implement, and assess solutions to small ML problems assigned by the teacher.

The exam can be taken in two ways: (a) a project and a written test; (b) a written test only.
In case (a), the final grade is the average of the two grades; the exam is considered failed if at least one of the two grades is <18.
Students must register for the exam session of their interest using the online system (esse3). Note that there are deadlines for registration (usually 1 week before the session date).
Written test: questions on theory and application with short open answers. Each test consists of approx. 3 questions requiring a longer answer and 3 questions requiring a short answer. Some of the questions are on the “theory”: they ask for the definition of a key concept, the differences between a few key concepts, or the suitable options for a given case. Some of the questions are on the “practice”: they ask the student to describe an algorithm (possibly in the form of pseudocode), sketch the design of an ML system for a given case, or solve a simple numerical problem. The grade of the written test is the weighted average of the grades obtained for each question, with "long questions" weighing twice as much as "short questions".
Project (home assignment): the student chooses a problem from a closed, teacher-defined set of problems and proposes a solution based on ML or EC techniques. The expected outcome is a written document (a few pages) including: the problem statement; one or more performance indexes that capture a solution's ability to solve the problem; a description of the proposed solution from the algorithmic point of view; the results and a discussion of the experimental assessment of the solution with, if applicable, information about the data used. Students may form groups for the project: in this case, the document must show, for each student in the group, which activities the student took part in. The project is evaluated according to clarity (~ 50%), technical soundness (~ 33%), and results (~ 17%).
In the final overall grade, honors (lode) are awarded if and only if the grade for every part is greater than or equal to 30/30 and the average of all the parts exceeds 30/30.
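The grading rules above can be sketched as a short calculation. This is only an illustration under stated assumptions, not the official procedure: the function names are hypothetical, every question is assumed to be graded on the 30-point scale, no rounding is applied, and the 18/30 pass threshold is assumed to apply to the written-only option as well.

```python
def written_test_grade(long_grades, short_grades):
    """Weighted average of question grades: long questions count double."""
    weighted_sum = 2 * sum(long_grades) + sum(short_grades)
    total_weight = 2 * len(long_grades) + len(short_grades)
    return weighted_sum / total_weight


def final_grade(written, project=None):
    """Combine the parts of the exam.

    Option (a): project and written test, averaged.
    Option (b): written test only (project is None).
    """
    parts = [written] if project is None else [project, written]
    average = sum(parts) / len(parts)
    # Failed if at least one part is below 18 (assumed to hold for option (b) too).
    passed = all(part >= 18 for part in parts)
    # Honors: every part >= 30 and the average strictly above 30.
    honors = passed and all(part >= 30 for part in parts) and average > 30
    return {"average": average, "passed": passed, "honors": honors}


# Made-up grades, for illustration only.
written = written_test_grade(long_grades=[28, 30, 26], short_grades=[30, 24, 27])
result = final_grade(written, project=30)
```

With these made-up grades, the written test averages 249/9 (about 27.7), and the combined result is a pass without honors.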

In any type of content produced by the student for admission to or participation in an exam (projects, reports, exercises, tests), the use of Large Language Model tools (such as ChatGPT and the like) must be explicitly declared. This requirement must be met even in the case of partial use.
Regardless of the method of assessment, the teacher reserves the right to further investigate the student's actual contribution with an oral exam for any type of content produced.

This course explores topics closely related to one or more goals of the United Nations 2030 Agenda for Sustainable Development (SDGs).

Goals: 4, 9