Knowledge-Based Systems in Bioinformatics

5 credits

Syllabus, Master's level, 1MB416

A revised version of the syllabus is available.

Code: 1MB416
Education cycle: Second cycle
Main field(s) of study and in-depth level: Bioinformatics A1N, Technology A1N
Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
Finalised by: The Faculty Board of Science and Technology, 26 March 2021
Responsible department: Biology Education Centre

Entry requirements

Alt. 1. 120 credits including Genomics and Bioinformatics, Probability and Statistics, and Programming Techniques II.

Alt. 2. 120 credits including Introduction to Bioinformatics, Introduction to Programming, Scientific Computing and Statistics, and Programming in Python.

Alt. 3. 120 credits including 30 credits in mathematics and 30 credits in computer science, as well as Introduction to Bioinformatics and Script Programming. Proficiency in English equivalent to the Swedish upper secondary course English 6.

Learning outcomes

The course aims to provide a good understanding of how logic-based methods can be applied in the construction of knowledge-based systems within Life Sciences. The flood of large and very large data sets, such as gene expressions, molecular interactions and taxonomies, requires efficient handling. Specifically, the course leads to an advanced understanding of how learning methods can be applied to solve several bioinformatics problems.

On completion of the course, the student should be able to

use and describe definitions and mathematical notation for information and decision systems, rough sets and rule systems
use other methods for machine learning, such as clustering and decision trees, and relate them to rough sets
apply knowledge of rule-based systems and Monte Carlo-based selection methods to formulate and solve classification problems in Life Sciences

Content

Introduction to Boolean functions. Transformation and simplification of Boolean expressions. Information, decision systems and rough sets. Features and their synthesis and selection. Training and validation of models. Statistical properties of models. Examples of applications in Life Sciences include: classification of expressions, prediction of gene functions from time profiles and genomic databases, modelling of transcriptional mechanisms, ligand-receptor bindings, drug resistance, prediction of protein function from structure and modelling with clinical and genomic data. Lectures are interspersed with computer labs using real and synthetic data. Ontologies. Machine learning: clustering, rough sets, decision trees, Monte Carlo-based selection, statistical model validity and significance.

Instruction

Lectures, computer exercises, project and problem-solving exercises.

Assessment

Written closed-book exam at the end of the course (3 credits). Written and computer exercises (1 credit). Project (1 credit).

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.