Syllabus for Data Engineering II
- 7.5 credits
- Course code: 1TD075
- Education cycle: Second cycle
- Main field(s) of study and in-depth level: Computer Science A1F, Data Science A1F, Computational Science A1F
- Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
- Established: 2020-02-27
- Established by: The Faculty Board of Science and Technology
- Revised: 2022-10-23
- Revised by: The Faculty Board of Science and Technology
- Applies from: Autumn 2023
- Entry requirements: 120 credits including participation in Data Engineering I. Proficiency in English equivalent to the Swedish upper secondary course English 6.
- Responsible department: Department of Information Technology
Learning outcomes
On completion of the course the student shall be able to:
- describe pros and cons of modern systems for handling data streams and use them in practice to address application needs;
- analyse properties of data-intensive applications relying on streaming data and apply the analysis to propose suitable solution architectures, including combinations of batch and streaming data;
- implement software that uses the analysis from the previous point and the technology addressed in the course;
- account for and handle practical aspects related to putting machine learning models into production;
- use frameworks for large-scale distributed machine learning;
- critically analyse, discuss and present solutions and implementations in writing and orally.
Content
The aim of this course is to provide advanced knowledge of technology used for scalable analysis of streaming data, an understanding of processes and technologies for large-scale distributed machine learning, and practical knowledge of how to architect and automate pipelines and workflows that handle the chain from data ingestion to machine learning models in production. The course covers advanced concepts in cloud computing, such as container orchestration and automation; theory and frameworks for streaming data, such as Apache Spark and Apache Kafka; deployment and use of frameworks for distributed machine learning; software and systems for continuous analytics, monitoring and model serving; and lifecycle management of machine learning models.
Instruction
Lectures, guest lectures, laboratory work, seminars and group supervision.
Assessment
Active participation in seminars. Written and oral presentation of assignments, a software project and research papers.
If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.
Reading list
Applies from: Autumn 2023
Some titles may be available electronically through the University library.
Research papers, reports and tutorials.