Provide students with the basic skills for extracting knowledge from large data sets.
Develop the basic skills for extracting knowledge from large data sets, in particular by forming an
At the end of the course students will
Combination of traditional lectures and lab sessions
First part: introduction to data mining and applications in fraud detection
Introduction to data mining, data science and big data analytics
Main techniques
The data mining process - CRISP
Seven classes of algorithms
Supervised learning – classification
Unsupervised learning – clustering
Outlier detection
Regression
Reinforcement learning
Ranking
Deep learning
Top ten data mining algorithms
Examples and applications using WEKA
Applications to marketing, finance and medicine
Big data and Hadoop
The NoSQL paradigm
Second part: machine learning algorithms for data mining
Introduction to data mining and machine learning
Taxonomy of data mining problems
Statistical inference
Support Vector Machines (extension to kernels)
Support Vector Regression (extension to kernels)
K-means and spectral clustering
Decision trees and random forests
Model selection and error estimation
Office hours: by appointment, arranged by email with Luca Oneto <luca.oneto@unige.it> and Fabrizio Malfanti <fabrizio.malfanti@intelligrate.it>. For organizational issues, contact Eva Riccomagno by email <riccomagno@dima.unige.it>.
FABRIZIO MALFANTI (President)
EVA RICCOMAGNO (President)
LUCA ONETO
25 September 2018
DATA MINING
To take the exam, you must sign up online. The examination of the first part consists of the discussion of a group project on a topic agreed with the lecturer and of a written examination, on which the oral examination may be based. The examination of the second part consists of the discussion of a project on a topic agreed with the lecturer and developed independently by the student. The final mark is the average of the marks of the two parts, weighted by the number of ECTS credits of each part, namely 3 ECTS each.
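As an illustration of the weighting rule only, the final mark can be sketched as a credit-weighted average. The function name, parameter names and the example marks below are hypothetical; the source does not say how the result is rounded, so no rounding is applied here.

```python
def final_mark(mark_first: float, mark_second: float,
               ects_first: float = 3, ects_second: float = 3) -> float:
    """Credit-weighted average of the marks of the two course parts.

    The default weights (3 ECTS per part) come from the course description;
    everything else in this sketch is an illustrative assumption.
    """
    total_ects = ects_first + ects_second
    return (mark_first * ects_first + mark_second * ects_second) / total_ects


# Hypothetical marks: with equal weights the result is the plain mean.
example = final_mark(28, 30)
print(example)
```

Because both parts carry the same number of credits, the weighted average reduces to the simple mean of the two marks.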
The exam checks whether the student has learned the methodologies and techniques for extracting knowledge from a large data set, by means of a small project that requires solving a real-world data mining problem.
The web page of the first part of the course is https://sites.google.com/view/lucaoneto/teaching/dm-smid