CODE  52480 

ACADEMIC YEAR  2021/2022 
CREDITS 

SCIENTIFIC DISCIPLINARY SECTOR  SECS-S/01 
LANGUAGE  Italian 
TEACHING LOCATION 

SEMESTER  2nd semester 
TEACHING MATERIALS  AULAWEB 
The course introduces students to the exploratory statistical analysis of multivariate data, emphasising its mathematical aspects and developing the skills essential for interpreting the data under investigation. Laboratory sessions give students the opportunity to analyse, discuss, and solve real problems.
To provide the main concepts and methodologies for the exploratory analysis of univariate and multivariate data.
Exploratory analysis of univariate and bivariate data.
Qualitative/categorical variables. Counts and frequencies, distribution of a variable. Joint and marginal distributions of two variables, conditional distributions (row and column profiles). Independence. Graphical representations.
Quantitative variables. Distribution and cumulative distribution functions, quantile function, and their graphical representations. Measures of centrality and dispersion based on moments and quantiles; their properties and links to the L1 and L2 metrics. Covariance and correlation between two quantitative variables. Geometrical interpretation of variance, covariance and correlation.
Exploratory analysis of multivariate data.
Cluster analysis. Hierarchical clustering: linkages based on distance and inertia; dendrogram; induced ultrametric; variable clustering. K-means clustering: initialization and stopping criteria of the algorithm, stable clusters.
Principal component analysis. "Best" representation of multivariate data (row points of the data matrix) in a vector space of lower dimension; accuracy of the representation. Change of basis (eigenvectors of the correlation matrix). Properties of principal components. Geometrical representation of correlations.
Multiple regression. Vector space generated by the explanatory variables (column points of the data matrix). Linear least squares method and geometrical meaning of residual minimization. Variance decomposition of the response variable. Descriptive goodness-of-fit: residual plots and R² index (with geometrical interpretation). One-way ANOVA (analysis of variance) and between/within variance decomposition.
Practical lab sessions using the R software.
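As an illustration of the regression topics listed above (least squares, variance decomposition of the response variable, and the R² index), here is a minimal sketch for the special case of a single explanatory variable. It is not course material: the labs use R, Python is used here only for illustration, and the function names and the small data set are invented for the example.

```python
# Minimal sketch: least squares with one explanatory variable and the
# decomposition total variance = explained variance + residual variance.
# The function names and the data below are hypothetical, for illustration only.

def least_squares_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def variance_decomposition(x, y):
    """Return (total, explained, residual) variances of the response y."""
    intercept, slope = least_squares_fit(x, y)
    n = len(y)
    mean_y = sum(y) / n
    fitted = [intercept + slope * xi for xi in x]
    total = sum((yi - mean_y) ** 2 for yi in y) / n
    explained = sum((fi - mean_y) ** 2 for fi in fitted) / n
    residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / n
    return total, explained, residual


# Made-up illustrative data (not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = least_squares_fit(x, y)
total, explained, residual = variance_decomposition(x, y)
r_squared = explained / total  # share of the variance explained by the fit

print(f"intercept={intercept:.4f}, slope={slope:.4f}")
print(f"total={total:.4f}, explained={explained:.4f}, residual={residual:.4f}")
print(f"R squared={r_squared:.4f}")
```

Geometrically, the fitted values are the orthogonal projection of the response onto the space spanned by the constant and the explanatory variable, which is why the explained and residual variances add up to the total.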
Rogantin M. P. (2016). Statistica descrittiva.
(available on AulaWeb and at http://www.dima.unige.it/~rogantin/StDescrittiva2/StatDescrittiva.pdf)
Maindonald J., Braun W. J. (2010). Data analysis and graphics using R: an example-based approach. 3rd ed. Cambridge University Press.
Jolliffe I. T. (2002). Principal Component Analysis. Springer Series in Statistics.
Office hours: By appointment
FRANCESCO PORRO (President)
SARA SOMMARIVA
ALBERTO SORRENTINO (President Substitute)
The class will start according to the academic calendar.
Date  Time  Location  Type  Notes 

19/01/2022  09:00  GENOVA  Written + Oral  only for students who attended the course in academic year 2020/21 or earlier 
04/02/2022  09:00  GENOVA  Written + Oral  only for students who attended the course in academic year 2020/21 or earlier 
10/06/2022  09:00  GENOVA  Written  
14/06/2022  09:00  GENOVA  Oral  
13/07/2022  09:00  GENOVA  Written  
15/07/2022  09:00  GENOVA  Oral  
01/09/2022  09:00  GENOVA  Written  
02/09/2022  09:00  GENOVA  Oral 