SCIENTIFIC DISCIPLINARY SECTOR: INF/01
When the size of structured and unstructured data exceeds the capacity of conventional database management systems, advanced tools and methods are required for capturing, storing and managing data. Such large volumes of data are usually stored in large-scale distributed environments, processed with dedicated data processing environments, and managed with specialized tools. Semantic information plays an important role in this context, and ethical issues need to be taken into account.
Learning the theoretical, methodological, and technological fundamentals of data management for advanced data processing architectures, with specific reference to large-scale distributed environments, including key elements of NoSQL and stream-based systems as well as basic issues in parallel and distributed query processing, multi-query processing, and high-throughput transactional systems.
DESCRIBE the principles for data management in distributed systems, environments for large-scale data processing, systems for large-scale data management and approaches for data stream management
UNDERSTAND the differences between traditional data processing and management and large-scale (semantic) data processing and management
UNDERSTAND the differences between the presented approaches for large-scale (semantic) data management
UNDERSTAND the ethical implications of large-scale (semantic) data management
SELECT the system and the methodology for large-scale (semantic) data management, suitable in a given application context
USE some of the presented systems for large-scale (semantic) data management, for solving simple problems
USE at least one of the presented systems for large-scale (semantic) data management for solving non-trivial problems
ANSWER questions related to large-scale (semantic) data management
SOLVE exercises related to data design in one of the presented systems and to interaction with such systems through the available languages
Prerequisites correspond to basic notions of data management in traditional systems:
Data model, notion of schema and instance
Conceptual data model
Relational model (logical model)
Basics of normalization theory
For LM in Computer Science: Class, project and outside preparation
For LM in Computer Engineering: Class, project (optional) and outside preparation
Recap on large scale distributed architectures and data-intensive computing
Recap on big data and distributed architectures
Principles of large scale data management
Architectural approaches for large scale data management
Recap on environments for large scale data processing (MapReduce, Spark)
Systems for large-scale data management
Introduction to NoSQL systems
NoSQL data models
Key-value data stores
Document-based data stores
Column-family data stores
Graph-based data stores
Semantic data management [only for LM in Computer Science]
The role of semantics in data management
Models, languages, and systems for semantic data management
Knowledge graphs and ontologies for data integration
Ethics-based data management
Principles of ethics-based data management
Techniques for ethics-based data management
Office hours: by appointment, via email or Microsoft Teams. Office: Valle Puggia – 327
Office hours: by appointment, via email or Microsoft Teams. Office: Valle Puggia – 301
BARBARA CATANIA (President)
GIOVANNA GUERRINI (President Substitute)
All class schedules are posted on the EasyAcademy portal.
Written examination, oral examination (including project discussion).
Details on how to prepare for the examination and the required degree of knowledge for each topic will be provided during the lessons.
During the semester, we will propose some group assignments as well as a project, to be developed on one of the presented systems. The project is mandatory.
The written exam consists of a set of questions and exercises on basic topics of the course; the goal of this test is to verify the understanding of the main issues addressed during the lessons.
The oral exam consists of an in-depth discussion of the solutions developed by the student for the given project, in order to assess whether the student has reached an appropriate level of knowledge.
For students who do not successfully complete the assignments, the oral exam will also include theoretical questions and/or practical exercises on the course topics.