Data Engineering I

7.5 credits

Syllabus, Master's level, 1TD069

A revised version of the syllabus is available.
Code: 1TD069
Education cycle: Second cycle
Main field(s) of study and in-depth level: Computational Science A1N, Computer Science A1N, Data Science A1N, Technology A1N
Grading system: Pass with distinction, Pass with credit, Pass, Fail
Finalised by: The Faculty Board of Science and Technology, 17 October 2022
Responsible department: Department of Information Technology

Entry requirements

120 credits in science/engineering, including 80 credits in computer science and mathematics, of which at least 20 credits are in computer science and 30 credits in mathematics. Computer science is to include at least 10 credits in programming as well as Database Design I. Mathematics is to include linear algebra and probability and statistics. Proficiency in English equivalent to the Swedish upper secondary course English 6.

Learning outcomes

On completion of the course the student shall be able to:

  • Use public and private cloud infrastructure;
  • Discuss key concepts in cloud computing such as Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS);
  • Apply cloud security best practices in solutions;
  • Use modern systems for handling massive datasets;
  • Analyse properties of data-intensive applications and, based on this analysis, suggest suitable strategies and architectures to meet application needs;
  • Implement software based on such an analysis, using technology presented in the course;
  • Use container technology for automated deployment and continuous integration;
  • Critically analyse, discuss and present solutions and implementations in writing and orally.

Content

The course is an application-oriented introduction to cloud computing and data engineering. Basic concepts in cloud computing, such as virtualization, service layers, and basic security. Practical use of cloud infrastructure. Different storage management solutions and their advantages and disadvantages, including cloud-based dynamic allocation of volumes, object storage, distributed file systems and SQL and NoSQL databases. Design and development of batch analysis pipelines for large datasets. The MapReduce programming model and applications based on frameworks such as Apache Hadoop and Apache Spark. Evaluation and analysis of scalability, including concepts such as horizontal and vertical scaling, and strong and weak scaling. Deployment strategies using container technologies and an introduction to continuous integration and deployment.

Instruction

Lectures and seminars, guest lectures and laboratory work. Participants work both in groups and individually.

Assessment

Oral and written presentation of assignments. Written report on a software project. Active participation in seminars.

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.
