CODE 102300
ACADEMIC YEAR 2020/2021
CREDITS
SCIENTIFIC DISCIPLINARY SECTOR SECS-S/01
TEACHING LOCATION
  • GENOVA
SEMESTER 2° Semester
TEACHING MATERIALS AULAWEB

OVERVIEW

Experts introduce or present advances on statistical techniques that they use in their work by illustrating their applications through concrete examples.

AIMS AND CONTENT

LEARNING OUTCOMES

Provide statistical tools relevant to specific applications and the experience of on-field experts. 

AIMS AND LEARNING OUTCOMES

Measurement models in psychometrics (16 hours of in-person lectures)
The course introduces statistical issues in psychometric theory and the use of statistical software (R) for carrying out basic psychometric analyses.

Demography in Italy and in the world: topics, data and measurements (8 hours of lectures)
To illustrate, through a complex example, the issues involved in communicating demographic data to the general public.

Official statistics (8 hours of lectures)
The system of official statistics.

Multiple statistical tests in biomedical research (8 hours of lectures)
To understand the problems that arise when performing multiple statistical hypothesis tests and to learn how to handle them critically.
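As a small illustration of why multiple testing needs care: if several independent tests are each run at significance level alpha, the probability of at least one false positive grows quickly with the number of tests. The sketch below assumes independent tests and uses the Bonferroni correction as one possible remedy. The function names and the numbers are illustrative assumptions, not course material.

```python
# Minimal sketch (assumption: the tests are independent).
# Function names are hypothetical, chosen for illustration only.

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise rate at most alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 5, 20, 100):
        fwer = familywise_error_rate(alpha, m)
        per_test = bonferroni_threshold(alpha, m)
        print(f"{m:>3} tests: FWER = {fwer:.3f}, Bonferroni per-test level = {per_test:.5f}")
```

The Bonferroni correction is simple but conservative. Other procedures that control different error rates exist and may be preferable when many tests are performed.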

Further seminar activities (not evaluated) may be organised each year. They are usually presented by data scientists who work in applied contexts such as private companies, consumer companies and public bodies.

TEACHING METHODS

Combination of traditional lectures and lab sessions with the software packages Matlab and R.

SYLLABUS/CONTENT

Pattern recognition and applications 
After an overview on pattern recognition and the criteria for applications, the following topics are addressed.

  • Bayesian decision theory. Maximum a posteriori probability. Classification and regression. Naïve Bayes. Construction of the optimal classifier. Parameter estimation. Performance evaluation. Cross-validation.
  • General statistical classifiers. Gaussian mixtures and the EM algorithm. Outlier detection. Some simple non-parametric techniques. Introduction to Bayesian networks and inference on graphs.
  • Dimensionality reduction. Feature selection. Genetic methods. Linear transformations of the sample space: PCA/LDA/ICA. Non-linear maps (t-SNE).
  • Decision trees. The CART method. Bagging and random forests. Boosting. Statistical modelling with trees.
  • Neural nets for classification. Multi-layer models and learning algorithms. Devising a neural classifier. Neural nets as generalised approximators. Introduction to deep learning (convolutional networks and stacked autoencoders).

Theoretical lectures are intertwined with examples of applications such as:

  • Optical character recognition. Construction of classifiers with different levels of complexity (from Naïve Bayes to convolutional networks) for the recognition of handwritten or printed text.
  • Automatic counting systems and event detectors. Image analysis for the detection of faces, people, vehicles, … Identification of key features and definition of an optimal binary acceptance test via boosting techniques.
  • Statistical modelling of complex machinery. Definition of non-linear input-output relationships for forecasting a target variable (e.g. energy consumption) from instrumental data with random forests and neural nets.
  • Quality control and predictive maintenance. Probability distribution of sensor data and anomaly/outlier detection. Estimation of the system residual lifetime (TTF).

The applications will be illustrated with the aid of suitable Matlab toolboxes and original data during guided hands-on sessions.

Psychometrics
Classical test theory
Psychological variables or constructs
Definition of the content domain of a construct and its operationalizations
Measurement models in psychology: reflective indicators models and formative indicators models
Item analysis and reliability
Exploratory factor analysis
Confirmatory factor analysis
Structural Equation Models
Applications in R (packages 'psych', 'lavaan' and 'semPlot') will be shown.

Demography

The course is based on the careful reading and analysis of the volume Tutto quello che non vi hanno mai detto sull'immigrazione (2015, Laterza) by Gianpiero Dalla Zuanna and Stefano Allievi. Collecting, analyzing and presenting data to help society transform opportunities into new realities.

 

RECOMMENDED READING/BIBLIOGRAPHY

Pattern recognition and applications
Handouts available at the web site http://www.onairweb.com/corsoPR/
Further reading:
R.Duda, P.Hart, D.Stork, Pattern Classification, Wiley, (2001)
S.Theodoridis, K.Koutroumbas, Pattern Recognition, Academic Press, (2006)
C.Bishop, Pattern Recognition and Machine Learning, Springer, (2007)
S.Theodoridis, Machine Learning, a Bayesian and Optimization Perspective, Academic Press, (2015)

Psychometrics
Rust, J. & Golombok, S. (2009). Modern psychometrics, 3rd ed. Hove: Routledge (chapters 1, 2, 3, 4, and 7). 
Handouts and other teaching material (e.g., R codes) will be shared online.  

Demography
Gianpiero Dalla Zuanna and Stefano Allievi (2015). Tutto quello che non vi hanno mai detto sull'immigrazione, Laterza.

TEACHERS AND EXAM BOARD

Exam Board

EVA RICCOMAGNO (President)

MARTA NAI RUSCONE

CARLO CHIORRI (President Substitute)

MARIA PIERA ROGANTIN (Substitute)

LESSONS

LESSONS START

The class will start according to the academic calendar.

Class schedule

The timetable for this course is available here: Portale EasyAcademy

EXAMS

EXAM DESCRIPTION

Pattern recognition and applications
Written exam with multiple-choice questions, followed by a discussion of it.

Psychometrics
Written exam and its discussion.

Demography
Written exam with multiple-choice or open questions.

The final mark is the weighted average of the marks of the three parts. The weights are proportional to the hours of classroom lectures.
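The weighting rule above can be sketched as follows. This is an illustrative calculation only: the marks and the lecture hours used in the example are hypothetical placeholders, and the function name is an assumption, not an official grading tool.

```python
# Minimal sketch of a weighted average with weights proportional to lecture hours.
# All marks and hours below are hypothetical values for illustration.

def weighted_final_mark(marks: dict[str, float], hours: dict[str, float]) -> float:
    """Average of the part marks, each weighted by that part's lecture hours."""
    total_hours = sum(hours[part] for part in marks)
    if total_hours <= 0:
        raise ValueError("total lecture hours must be positive")
    return sum(marks[part] * hours[part] for part in marks) / total_hours


if __name__ == "__main__":
    # Hypothetical marks (out of 30) and hypothetical lecture hours per part.
    example_marks = {"part A": 28.0, "part B": 30.0, "part C": 25.0}
    example_hours = {"part A": 16.0, "part B": 16.0, "part C": 8.0}
    print(f"Final mark: {weighted_final_mark(example_marks, example_hours):.2f}")
```

Because the weights are normalised by the total hours, only the ratio between the hours of the parts matters, not their absolute values.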

ASSESSMENT METHODS

Pattern recognition and applications
The exam consists of 25 multiple-choice questions covering all topics discussed during the course. Answers can be numeric or true/false and may require elementary calculations. Lecture notes and other material are not allowed. A pocket calculator may be useful but is not essential. The duration is 45 minutes. The correction takes place immediately after the exam. Students may justify some answers by providing suitable reasoning.

Psychometrics: Students will be presented with the R output of some statistical analyses carried out on real data. The written part of the exam tests the students' ability to apply what they have learnt from the lectures and the course materials: they will be asked to interpret and comment on the results and to detect flaws in the statistical analyses. In the oral discussion, issues with the answers to the written exam will be reviewed and knowledge of psychometric theory will be tested.

Demography: The exam assesses the acquired skills in identifying, within a complex text, specific information and data as well as the statistical analyses supporting them.

Exam schedule

Date        Time   Location  Degree type     Notes
10/05/2021  09:00  GENOVA    Written + Oral
14/06/2021  09:00  GENOVA    Written + Oral
16/07/2021  09:00  GENOVA    Written + Oral

FURTHER INFORMATION

Web pages: http://www.onairweb.com/corsoPR/ https://www.dropbox.com/s/groq642v7rbviha/Lezioni%20SMID%202016.zip?dl=0

Prerequisites: Applied Statistics 1

Attendance is highly recommended.