
ADVANCED DATA MANAGEMENT

CODE 61884
ACADEMIC YEAR 2022/2023
CREDITS
  • 6 cfu during the 1st year of 11160 COMPUTER ENGINEERING (LM-32) - GENOVA
  • 9 cfu during the 2nd year of 10852 COMPUTER SCIENCE (LM-18) - GENOVA
SCIENTIFIC DISCIPLINARY SECTOR INF/01
LANGUAGE English
TEACHING LOCATION GENOVA
SEMESTER 1st Semester
TEACHING MATERIALS AULAWEB

    OVERVIEW

    When the volume of structured and unstructured data exceeds the capacity of conventional database management systems, advanced tools and methods are required to capture, store, and manage it. Such data are usually stored in large-scale distributed environments and processed with dedicated data processing environments, and they usually require specific management tools. Semantic information plays a relevant role in this context, and ethical issues need to be taken into account.

    AIMS AND CONTENT

    LEARNING OUTCOMES

    Learning the theoretical, methodological, and technological fundamentals of data management for advanced data processing architectures, with specific reference to large-scale distributed environments. Topics include key elements of NoSQL and stream-based systems, as well as basic issues in parallel and distributed query processing, multi-query processing, and high-throughput transactional systems.

    AIMS AND LEARNING OUTCOMES

    DESCRIBE the principles for data management in distributed systems, environments for large-scale data processing, systems for large-scale data management, and approaches for data stream management

    UNDERSTAND the differences between traditional data processing and management and large-scale (semantic) data processing and management

    UNDERSTAND the differences between the presented approaches for large-scale (semantic) data management  

    UNDERSTAND the ethical implications of large-scale (semantic) data management

    SELECT the system and methodology for large-scale (semantic) data management suitable for a given application context

    USE some of the presented systems for large-scale (semantic) data management for solving simple problems

    USE at least one of the presented systems for large-scale (semantic) data management for solving non-trivial problems

    ANSWER questions related to large-scale (semantic) data management

    SOLVE exercises related to data design in one of the presented systems and to the interaction with such systems through the available languages

     

    PREREQUISITES

    Prerequisites correspond to basic notions of data management in traditional systems:

    Data model, notion of schema and instance
    Conceptual data model
    Relational model (logical model)
    Conceptual design
    Logical design
    Basics of normalization theory
    Relational algebra
    SQL
    Index
    Transaction

    TEACHING METHODS

    For LM in Computer Science: Class, project and outside preparation

    For LM in Computer Engineering: Class, project (optional) and outside preparation

    SYLLABUS/CONTENT

    Recap on large-scale distributed architectures and data-intensive computing

    Recap on big data and distributed architectures
    Principles of large-scale data management
    Architectural approaches for large-scale data management
    Recap on environments for large-scale data processing (MapReduce, Spark)

    Systems for large-scale data management

    Introduction to NoSQL systems
    NoSQL data models
    Key-value data stores
    Document-based data stores
    Column-family data stores
    Graph-based data stores

    Semantic data management [only for LM in Computer Science]

    The role of semantics in data management
    Models, languages, and systems for semantic data management
    Knowledge graphs and ontologies for data integration

    Ethic-based data management

    Principles of ethic-based data management
    Techniques for ethic-based data management

    RECOMMENDED READING/BIBLIOGRAPHY

    • Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart. Web Data Management. Cambridge University Press, 2011.
    • P.J. Sadalage, M. Fowler. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley, 2013.
    • Jeff Carpenter, Eben Hewitt. Cassandra: The Definitive Guide. O'Reilly Media, 2016.
    • Ian Robinson, Jim Webber, Emil Eifrem. Graph Databases: New Opportunities for Connected Data, 2nd Edition. O'Reilly, 2015.
    • Additional material and references provided by the instructors.

    TEACHERS AND EXAM BOARD

    Exam Board

    BARBARA CATANIA (President)

    DANIELE TRAVERSARO

    GIOVANNA GUERRINI (President Substitute)

    LESSONS

    Class schedule

    All class schedules are posted on the EasyAcademy portal.

    EXAMS

    EXAM DESCRIPTION

    Written examination, oral examination (including project discussion).

     

    ASSESSMENT METHODS

    Details on how to prepare for the examination and the required degree of knowledge for each topic will be provided during the lessons.

    During the semester, we will propose some group work activities as well as a project to be developed on one of the presented systems. The project is mandatory.

    The written exam consists of a set of questions and exercises on basic topics of the course; the goal of this test is to verify the understanding of the main issues addressed during the lessons.

    The oral exam consists of an in-depth discussion of the solutions developed by the student for the given project, in order to assess whether the student has reached an appropriate level of knowledge.

    For students who do not successfully complete the assignments, the oral exam will also include theoretical questions and/or practical exercises on the course topics.

    Exam schedule

    Date        Time   Location  Type     Notes
    09/01/2023  09:00  GENOVA    Written
    10/02/2023  09:00  GENOVA    Written
    12/06/2023  09:00  GENOVA    Written
    18/07/2023  09:00  GENOVA    Written
    13/09/2023  09:00  GENOVA    Written