NATURAL LANGUAGE PROCESSING

[478SM]
Academic year 2025/2026

2nd year of course - First semester

Attendance: not mandatory

  • 6 CFU
  • 48 hours
  • English
  • Trieste
  • Optional
  • Standard teaching
  • Oral Exam
  • SSD INF/01
Curricula: FOUNDATIONS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
Syllabus

In this course you will learn how to handle textual data from different
sources (formal and informal): how to process it, which representation
techniques work best, and which popular tasks require them.
Knowledge and understanding: text processing and representation, methods
for approaching Natural Language Processing tasks (traditional
approaches, machine learning, and deep learning), evaluation metrics.

Applying knowledge and understanding: being able to process and
represent textual data, recognize Natural Language Processing tasks,
choose which solution to apply, and evaluate it.

Making judgements: being able to navigate the analysis of textual data,
with careful attention to the collection and constitution of good
corpora and to the multiple facets of feasible Natural Language
Processing tasks.

Communication skills: being able to motivate and present methodologies
and results to both experts and non-experts.

Learning skills: being capable of understanding the core ideas and
improvements of state-of-the-art research.

- Proficiency in Python
- Linear Algebra
- Basic Probability and Statistics
- Foundations of Machine Learning

Processing text data and a statistical toolkit for NLP:
i) Corpus constitution, levels of analysis and preparation of text data
ii) Content analysis by correspondence analysis, clustering and topic
detection
iii) Keyness analysis and explainable text classification
iv) Chronological corpora and temporal content mapping
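One of the toolkit topics above, keyness analysis, can be sketched in plain Python using Dunning's log-likelihood (G²) statistic, which scores how unusually frequent a word is in a target corpus relative to a reference corpus. The tiny corpora below are invented purely for illustration, not course material:

```python
import math
from collections import Counter

def keyness_g2(target_tokens, reference_tokens):
    """Dunning's log-likelihood (G2) keyness score for each word in the
    target corpus, relative to a reference corpus; higher scores mark
    words that are unusually frequent in the target."""
    tc, rc = Counter(target_tokens), Counter(reference_tokens)
    n_t, n_r = sum(tc.values()), sum(rc.values())
    scores = {}
    for w, a in tc.items():
        b = rc.get(w, 0)
        # expected counts under the null hypothesis of equal usage
        e1 = n_t * (a + b) / (n_t + n_r)
        e2 = n_r * (a + b) / (n_t + n_r)
        g2 = 2 * (a * math.log(a / e1)
                  + (b * math.log(b / e2) if b else 0.0))
        scores[w] = g2
    return scores

# toy corpora for illustration only
target = "the model learns embeddings embeddings embeddings".split()
reference = "the cat sat on the mat".split()
scores = keyness_g2(target, reference)
top = max(scores, key=scores.get)  # most "key" word in the target
```

Here "embeddings" gets the highest score because it is frequent in the target and absent from the reference, while shared function words like "the" score near zero.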

Deep Neural Networks for NLP:
i) Foundations: multi-layer perceptrons for text classification and
language modelling;
ii) Word embeddings from deep learning (word2vec, GloVe);
iii) Recurrent neural networks, encoder-decoder architectures, and the
attention mechanism;
iv) Transformers: multi-head self-attention layers, self-supervised tasks
for language, and encoder-only, decoder-only, and encoder-decoder models
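The scaled dot-product attention at the heart of the Transformer topics above can be sketched in plain Python. This is a minimal single-head sketch with identity Q/K/V projections for brevity; the toy token vectors are made up for illustration:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    and its output is the attention-weighted average of the values."""
    d = len(K[0])  # key dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in Q:
        scores = [dot(q, k) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # weights sum to 1 over the sequence
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])
    return out

# toy sequence of three 2-dimensional token vectors (illustrative only);
# with identity projections, queries = keys = values = the inputs
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = self_attention(X, X, X)
```

Because the attention weights form a convex combination, each output row of `Y` is a weighted average of the input rows, so every coordinate stays within the range of the inputs. A real multi-head layer adds learned Q/K/V projection matrices and runs several such heads in parallel.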

Recommended readings:
1. Dirk Hovy. Text Processing with Python for Social Scientists.
2. Dan Jurafsky and James H. Martin. Speech and Language Processing.
3. Yoav Goldberg. A Primer on Neural Network Models for Natural
Language Processing.
4. I. J. Goodfellow, Y. Bengio, and A. C. Courville. Deep Learning. MIT
Press, 2016.

Frontal lectures (50%) and hands-on sessions (50%), both individual and
in groups. Each lecture consists of a first part of frontal teaching
followed by hands-on training.

Bring your own laptop.

The exam will consist of:
I) A written test assessing understanding and the ability to connect and
interpret the main topics.
The grade is based on the degree of understanding of the theoretical and
practical aspects of the subject and on the degree of individual
elaboration and personal assimilation of the material.

II) 1. An individual or group project (groups of max 3 students).
Each group should choose a domain and a dataset, implement an NLP
solution, and evaluate the results. The students should write a short
report and give a brief presentation about the project.
2. An oral exam consisting of three theoretical questions on the topics
of the course.
The final grade is the average of the I) and II) evaluations.