## MATHEMATICAL STATISTICS

• Code: 52503
• Academic year: 2018/2019
• Credits: 11 credits during the 3rd year of 8766 Mathematical Statistics and Data Management (L-35), GENOVA; 7 credits during the 1st year of 9011 Mathematics (LM-40), GENOVA; 7 credits during the 2nd year of 9011 Mathematics (LM-40), GENOVA
• Scientific disciplinary sector: MAT/06
• Language: Italian (English on demand)
• Teaching location: GENOVA (Mathematical Statistics and Data Management)
• Semester: 1st semester
• Prerequisites: you can take the exam for this unit if you have passed the following exam(s): PROBABILITY 87081 (Mathematical Statistics and Data Management 8766, cohorts 2016/2017, 2017/2018 and 2018/2019)

## OVERVIEW

An introduction to the classical theory of statistical models (model identification and estimation, parametric and non-parametric models, exponential models), point estimation (method of moments, maximum likelihood and invariant estimators) and methods of evaluating estimators (UMVU estimators, Fisher information, Cramér-Rao inequality).

In the second part, the above theory is applied to a class of statistical models that is fundamental in applications. Lab sessions with the software packages SAS and R are an integral part of the course.

## AIMS AND CONTENT

### LEARNING OUTCOMES

To formalise estimation problems (parametric and non-parametric) and statistical hypothesis testing in a rigorous mathematical framework; to formulate and apply appropriate regression models to various types of data sets.

### AIMS AND LEARNING OUTCOMES

At the end of the course students will be able to:

• recognise estimation problems (both parametric and non-parametric) in applied contexts
• formulate them in a rigorous mathematical framework
• identify suitable regression models for data analysis
• analyse the data with advanced software
• summarise the results of the analysis in a report, including an interpretation of the results and of their reliability.

### TEACHING METHODS

Classroom lectures and exercise sessions. In the second part there will be computer laboratory sessions, whose aim is to practise applying the theoretical models learnt during classroom lectures to describe and predict a phenomenon of interest, based on real case studies and data sets. During the lab sessions students will be able to check their level of understanding of the theory and its application.

### SYLLABUS/CONTENT

Program of the first part of the course:
Review of essential probability including the notion of conditional probability and multivariate normal distribution.

Statistical models and statistics: the ideas of data sample and of statistical model, identifiability and regular models, the exponential family. Statistics and their distributions. Sufficient, minimal sufficient, ancillary and complete statistics. The Neyman-Fisher lemma. Basu's theorem.

Point estimators and their properties: methods to find point estimators (method of moments, least squares method, maximum likelihood method, invariant estimators). Methods to evaluate estimators: theorems of Rao-Blackwell and Lehmann-Scheffé. UMVU estimators. Expected Fisher information, Cramér-Rao inequality and efficient estimators.

Statistical hypothesis testing: theorem of Neyman-Pearson for simple hypotheses, likelihood ratio test.
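As an illustration of the last item, a standard textbook form of the Cramér-Rao inequality for a one-dimensional parameter is sketched below. The notation ($T$ for an unbiased estimator of $g(\theta)$, $I_n$ for the expected Fisher information of the sample, $L$ for the likelihood) is assumed here for illustration and is not taken from the course notes; the regularity conditions are omitted.

```latex
\operatorname{Var}_\theta(T) \;\ge\; \frac{\bigl(g'(\theta)\bigr)^{2}}{I_n(\theta)},
\qquad
I_n(\theta) = \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}
  \log L(\theta; X_1,\dots,X_n)\right)^{2}\right]
```

An unbiased estimator whose variance attains this lower bound for every $\theta$ is called efficient.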

Introduction to Bayesian statistics: prior and posterior probability distributions, conjugate priors, improper and flat priors, comparison with the frequentist approach to estimation.

At most one of the last two topics is part of the course in any given year.

Program of the second part of the course:
General linear models. ANOVA: crossed and nested factors; unbalanced data. Overparametrised models: reparametrisation and generalised inverse, with theoretical considerations and practical implications. Multivariate linear regression models and models for repeated measures.

Generalised linear models. Exponential family. Link function. Models for categorical data (binomial, multinomial and Poisson models). Iterative methods for estimating the coefficients: Newton-Raphson, scoring. Asymptotic distributions for likelihood-based statistics. Statistical hypothesis testing and goodness-of-fit criteria: deviance, chi-squared. Residuals. Tests and confidence intervals for (subsets of) the model parameters. Odds ratios and log-odds ratios. Models for ordinal data and contingency tables.

Lab sessions based on the software packages SAS and R.
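The iterative estimation methods listed above can be sketched for the simplest case, a logistic regression with an intercept and one covariate. The example is written in Python purely for illustration (the course labs use SAS and R); the function name `fit_logistic` and the toy data are hypothetical. Because the logit link is canonical, Newton-Raphson and Fisher scoring coincide here.

```python
import math


def fit_logistic(x, y, iterations=25, tol=1e-10):
    """Newton-Raphson / Fisher scoring for logit(p_i) = b0 + b1 * x_i."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Fitted probabilities under the current coefficients.
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: gradient of the log-likelihood.
        u0 = sum(yi - pi for yi, pi in zip(y, p))
        u1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Fisher information matrix [[i00, i01], [i01, i11]].
        w = [pi * (1.0 - pi) for pi in p]
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        # Update beta <- beta + I^{-1} U, with the 2x2 inverse written out.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (-i01 * u0 + i00 * u1) / det
        b0, b1 = b0 + d0, b1 + d1
        # Stop once the step is negligible.
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical toy data (not linearly separable, so the MLE exists).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic(x, y)
print(b0, b1)
```

At convergence the score vector is zero, which is the likelihood equation that both iterative schemes solve.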

First part:

Recommended textbooks:

G. Casella and R.L. Berger, Statistical Inference, Wadsworth 62-2002-02, 62-2002-09
D. A. Freedman, Statistical Models: Theory and Practice, Cambridge 62-2009-05

L. Pace and A. Salvan, Teoria della statistica, CEDAM 62-1996-01
M. Gasparini, Modelli probabilistici e statistici, CLUT 60-2006-08
D. Dacunha-Castelle and M. Duflo, Probabilités et Statistiques, Masson 60-1982-18/19/26 and 60-1983-22/23/24
A.C. Davison, Statistical Models, Cambridge University Press, Cambridge, 2003

David J. Hand, Statistics: A Very Short Introduction, Oxford 62-2008-05
L. Wasserman, All of Statistics, Springer
J. Jacod and P. Protter, Probability Essentials, Springer 60-2004-09
S.L. Lauritzen, Graphical Models, Oxford University Press 62-1996-14
D. Williams, Probability with Martingales, Cambridge Mathematical Textbooks, 1991

Handouts distributed in class

Second part:

Dobson A. J. (2001). An Introduction to Generalized Linear Models, 2nd Edition. Chapman and Hall.
Rogantin M.P. (2010). Modelli lineari generali e generalizzati. Available online.

## TEACHERS AND EXAM BOARD

### Exam Board

EVA RICCOMAGNO (President)

MARIA PIERA ROGANTIN (President)

EMANUELA SASSO

## LESSONS

### LESSONS START

The class will start according to the academic calendar.

### Class schedule

MATHEMATICAL STATISTICS

## EXAMS

### EXAM DESCRIPTION

The two parts of the course are examined together. There is a written exam and an oral exam. The marks for each question and the time allowed (usually three hours) are stated on the exam paper.

### ASSESSMENT METHODS

The written exam consists of three or four exercises. One of the exercises consists of commenting on the output of an analysis done with statistical software. Past exams with solutions are available on the websites of the two parts of the course. The oral exam consists of questions on both parts of the course. The course work done during the lab sessions may be discussed in the oral exam, so bring that course work with you to the exam.

### Exam schedule

| Date | Time | Location | Type | Notes |
|---|---|---|---|---|
| 15/01/2019 | 09:00 | GENOVA | Written | |
| 16/01/2019 | 09:00 | GENOVA | Oral | |
| 04/02/2019 | 09:00 | GENOVA | Written | |
| 05/02/2019 | 09:00 | GENOVA | Oral | |
| 06/06/2019 | 09:00 | GENOVA | Written | |
| 12/07/2019 | 09:00 | GENOVA | Written | |
| 09/09/2019 | 09:00 | GENOVA | Written | |

### FURTHER INFORMATION

Web pages of the course:
First part: http://www.dima.unige.it/~riccomag/Teaching/StatisticaMatematica.html
Second part: http://www.dima.unige.it/~rogantin/ModStat/

Prerequisites for the first part: Mathematical Analysis 1 and 2, Probability.
Prerequisites for the second part: topics in statistical inference and the first part of Mathematical Statistics (the latter taken in parallel), together with their corresponding prerequisites.