CODE | 61884 |
---|---|
ACADEMIC YEAR | 2017/2018 |
CREDITS | 9 credits during the 1st year of 9014 Computer Science (LM-18) GENOVA |
SCIENTIFIC DISCIPLINARY SECTOR | INF/01 |
LANGUAGE | English |
TEACHING LOCATION | GENOVA (Computer Science) |
SEMESTER | 1st Semester |
TEACHING MATERIALS | AULAWEB |
When the volume of structured and unstructured data exceeds the capacity of conventional database management systems, advanced tools and methods are required for capturing, storing, and managing it. Such data are usually stored in large-scale distributed environments and processed using dedicated data processing environments; they may already be available or may arrive as a stream at processing time, and they usually require specific management tools.
Students will be provided with a sound grounding in the theoretical, methodological, and technological fundamentals of data management for advanced data processing architectures, with specific reference to large-scale distributed environments. Students will learn key elements of NoSQL and stream-based systems as well as basic issues in parallel and distributed query processing, multi-query processing, and high-throughput transactional systems. Students will be involved in project activities.
Class, project and outside preparation
Introduction to data management in distributed systems
Introduction to Big Data
Introduction to distributed architectures
Principles of large scale data management
Architectural approaches for large scale data management
Environments for large scale data processing (data-intensive computing)
Batch processing and MapReduce paradigm
From (Hadoop) MapReduce to Spark
High level languages for large scale data processing
Systems for large-scale data management
Introduction to NoSQL systems
NoSQL data models
Column-family data stores
Graph-based data stores
Stream-based data management
Introduction to stream data management
Models and languages for stream-data management
Large-scale stream data management
Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart. Web Data Management. Cambridge University Press, 2011.
Martin Kleppmann. Designing Data-Intensive Applications. O'Reilly, 2017.
Additional material and references provided by the instructors.
Office hours: by appointment via email. Office: Valle Puggia – 301
Office hours: by appointment via email. Office: Valle Puggia – 328
BARBARA CATANIA (President)
LAURA DI ROCCO
GIOVANNA GUERRINI
ELENA ZUCCA
Tuesday, October 17th 2017
Written examination, oral examination (including project discussion).
Details on how to prepare for the examination and the required degree of knowledge for each topic will be provided during the lessons.
During the semester, we will propose some group work assignments as well as a project, which must be delivered just before the written examination.
If the exercises are graded positively:
If the exercises are graded negatively:
Date | Time | Location | Type | Notes |
---|---|---|---|---|
16/02/2018 | 09:00 | GENOVA | Exam by appointment | |
27/07/2018 | 09:00 | GENOVA | Exam by appointment | |
21/09/2018 | 09:00 | GENOVA | Exam by appointment | |
28/02/2019 | 09:00 | GENOVA | Exam by appointment | |