Foundations of Data Science

10 credits

Syllabus, Master's level, 1MS048

Code: 1MS048
Education cycle: Second cycle
Main field(s) of study and in-depth level: Data Science A1F, Mathematics A1F
Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
Finalised by: The Faculty Board of Science and Technology, 29 February 2024
Responsible department: Department of Mathematics

Entry requirements

120 credits including 30 credits in mathematics and 10 credits in computer science. Participation in Introduction to Data Science. Participation in Linear Algebra for Data Science or Linear Algebra II. Proficiency in English equivalent to the Swedish upper secondary course English 6.

Learning outcomes

On completion of the course, the student should be able to:

  • formulate decision problems, including action space and loss function, in particular for hypothesis testing and estimation problems,
  • derive confidence bounds using limit theorems such as the Glivenko-Cantelli lemma and the Dvoretzky-Kiefer-Wolfowitz inequality,
  • use concentration inequalities to derive bounds for specific distributions and give proofs of basic inequalities,
  • derive Bayes-optimal rules for simple decision problems,
  • obtain finite sample bounds for estimators via Vapnik-Chervonenkis dimension,
  • obtain lower bounds on minimax risk,
  • select appropriate model complexity measures to balance bias and variance, for example by penalisation,
  • apply the above to derive/implement algorithms, including those designed for use under resource constraints.
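
As an illustration of the second outcome above, the following is a minimal sketch (not course material) of a distribution-free confidence band for a cumulative distribution function, based on the Dvoretzky-Kiefer-Wolfowitz inequality P(sup_x |F_n(x) - F(x)| > eps) <= 2 exp(-2 n eps^2). The function name `dkw_band`, its parameters, and the uniform sample in the usage example are assumptions made for illustration only.

```python
import math
import random


def dkw_band(sample, alpha=0.05):
    """Return (xs, lower, upper, eps) for a 1 - alpha confidence band.

    xs is the sorted sample; lower and upper bound the true CDF at each
    point of xs; eps is the half-width of the band.
    Hypothetical helper written for illustration.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Solve 2 * exp(-2 * n * eps**2) = alpha for eps.
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

    xs = sorted(sample)
    # Empirical CDF evaluated at the order statistics.
    ecdf = [(i + 1) / n for i in range(n)]
    # Clip to [0, 1], since a CDF cannot leave that range.
    lower = [max(0.0, f - eps) for f in ecdf]
    upper = [min(1.0, f + eps) for f in ecdf]
    return xs, lower, upper, eps


if __name__ == "__main__":
    random.seed(0)
    # Assumed example data: 1000 draws from Uniform(0, 1).
    data = [random.random() for _ in range(1000)]
    xs, lower, upper, eps = dkw_band(data, alpha=0.05)
    print(f"band half-width eps = {eps:.4f}")
```

For n = 1000 and alpha = 0.05 the half-width is about 0.043, regardless of the underlying distribution; it shrinks at rate 1/sqrt(n).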

Content

Uniform limit theorems and empirical processes, concentration of measure, optimality criteria in statistical decision theory, generalisation bounds and learning theory, minimax lower bounds and information theory, low-dimensional approximations, algorithms for decision procedures under constraints, for example computing resources and applicable law.

Instruction

Lectures and problem solving sessions.

Assessment

Assignments during the course with an oral follow-up exam at the end of the course. The examination is divided into two parts: one concerning basic statistical decision theory, and one concerning a priori estimates of generalisation errors and model selection. Each part comprises 5 ECTS credits and is addressed in corresponding assignments and in the oral exam.

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.

Other directives

The course may not be included in the same higher education qualifications as Theoretical Foundations for Data Science (1MS047).
