Code: 14PD-E Data processing
Lecturer: Ing. Mgr. Michal Jeřábek, Ph.D.
Weekly load: 2P+4C
Completion: A, EX
Department: 16114
Credits: 6
Semester: S
Description:
Students learn tools for data processing and analysis and work through practical examples of the most common data processing techniques, including advanced options for presenting analysis results. As an advanced method, students also carry out an analysis using Bayesian networks. They then independently analyse data from existing open systems.
Contents:
Part 1 introduces data processing tools and is divided into 3 blocks:
Block 1: introduction to R - environment, concept, basics, simple examples, basic libraries, examples and usage (students install R)
Block 2: applied R - applied examples from practice, map library, retrieving data from different sources (GIS, RDBMS, CSV, etc.) and modifying it
Block 3: advanced R - interactive presentation module (shiny), other modules by agreement
Part 2 deals with a specific model for data processing, Bayesian networks, and is also divided into 3 blocks:
Block 1: Basics of Bayesian networks, specialized software for Bayesian networks, modeling, basics of graph theory and probability.
Block 2: Preparing data for subsequent use in Bayesian networks, plotting the first Bayesian network, algorithms for network learning, parameters, inference; linking with GeNIe.
Block 3: Performing inference in Bayesian networks.
Seminar contents:
Part 1 introduces data processing tools and is divided into 3 blocks:
Block 1: introduction to R - environment, concept, basics, simple examples, basic libraries, examples and usage (students install R)
Block 2: applied R - applied examples from practice, map library, retrieving data from different sources (GIS, RDBMS, CSV, etc.) and modifying it
Block 3: advanced R - interactive presentation module (shiny), other modules by agreement
Part 2 deals with a specific model for data processing, Bayesian networks, and is also divided into 3 blocks:
Block 1: Basics of Bayesian networks, specialized software for Bayesian networks, modeling, basics of graph theory and probability.
Block 2: Preparing data for subsequent use in Bayesian networks, plotting the first Bayesian network, algorithms for network learning, parameters, inference; linking with GeNIe.
Block 3: Performing inference in Bayesian networks.
Recommended literature:
Jan Rauch, Milan Šimůnek: Dobývání znalostí z databází, LISp-Miner a GUHA. Praha: Oeconomica VŠE, 2014.
Petr Berka: Dobývání znalostí z databází. Praha: Academia, 2003.
Irena Holubová, Karel Minařík, David Novák, Jiří Kosek: Big Data a NoSQL databáze.
Arun K. Somani, Ganesh Chandra Deka: Big Data Analytics. CRC Press, 2017.
Keywords:
Bayesian networks, data processing, data analysis
