Syllabus for Large Datasets for Scientific Applications

Stora datamängder inom vetenskapliga tillämpningar


  • 5 credits
  • Course code: 1TD268
  • Education cycle: Second cycle
  • Main field(s) of study and in-depth level: Computer Science A1N, Technology A1N, Computational Science A1N
  • Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
  • Established: 2016-03-10
  • Established by:
  • Revised: 2018-10-26
  • Revised by: The Faculty Board of Science and Technology
  • Applies from: Autumn 2019
  • Entry requirements:

    Alt 1. 120 credits in science/engineering, including Scientific computing I, Database design I, and a second course in computer programming (programming in Java and/or Python). Scientific computing I may be replaced by Scientific computing and calculus (10 credits), Numerical methods and simulation (5 credits), or Scientific computing bridging course (5 credits).
    Alt 2. 120 credits including Introduction to programming, scientific computing and statistics, Database design and Programming in Python.
    Alt 3. 120 credits including 30 credits in mathematics and 30 credits in computer science including 5 credits in database design. Script Programming.

    Proficiency in English equivalent to the Swedish upper secondary course English 6.

  • Responsible department: Department of Information Technology

Learning outcomes

To pass, the student should be able to

  • use state-of-the-art software platforms for management and processing of massive scientific datasets.
  • analyse the characteristics of a data-intensive scientific application and propose suitable strategies to handle the data analytics aspects of the application.
  • implement software to address an application's data analysis needs using the technology presented in the course.
  • critically analyse, discuss and present solutions and implementations in writing and orally.


Content

How to develop scientific applications using the methods and key concepts of large-scale data processing platforms. Distributed file systems and cloud containers such as OpenStack Swift. Batch data processing with MapReduce-based infrastructures such as Hadoop. Effective use of query languages such as Hive for scientific applications. Effective use of indexing. Array databases such as SciDB and data stream processing platforms such as Storm. Overview of techniques, concepts and tools for analysing massive data, such as NoSQL, NoDB, flat namespaces and ontology-based data.


Instruction

Lectures, guest lectures, laboratory work, seminars and group supervision.


Assessment

Active participation in seminars. Written and oral presentation of assignments, a software project and research papers.

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.

Reading list

Applies from: Autumn 2019

Some titles may be available electronically through the University library.

Research papers.

Last modified: 2022-04-26