Syllabus for Computer-Intensive Statistics and Data Mining

Datorintensiv statistik och informationsutvinning

Syllabus

  • 10 credits
  • Course code: 1MS009
  • Education cycle: Second cycle
  • Main field(s) of study and in-depth level: Mathematics A1N

    Explanation of codes

    The code indicates the education cycle and in-depth level of the course in relation to other courses within the same main field of study according to the requirements for general degrees:

    First cycle

    • G1N: has only upper-secondary level entry requirements
    • G1F: has less than 60 credits in first-cycle course/s as entry requirements
    • G1E: contains specially designed degree project for Higher Education Diploma
    • G2F: has at least 60 credits in first-cycle course/s as entry requirements
    • G2E: has at least 60 credits in first-cycle course/s as entry requirements, contains degree project for Bachelor of Arts/Bachelor of Science
    • GXX: in-depth level of the course cannot be classified

    Second cycle

    • A1N: has only first-cycle course/s as entry requirements
    • A1F: has second-cycle course/s as entry requirements
    • A1E: contains degree project for Master of Arts/Master of Science (60 credits)
    • A2E: contains degree project for Master of Arts/Master of Science (120 credits)
    • AXX: in-depth level of the course cannot be classified

  • Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
  • Established: 2007-03-15
  • Established by:
  • Revised: 2021-10-15
  • Revised by: The Faculty Board of Science and Technology
  • Applies from: Autumn 2022
  • Entry requirements:

    120 credits. Participation in Analysis of Regression. Proficiency in English equivalent to the Swedish upper secondary course English 6.

  • Responsible department: Department of Mathematics

Learning outcomes

On completion of the course, the student should be able to:

  • give an account of the theoretical foundation of Markov chain Monte Carlo (MCMC) methods and use such techniques to solve given statistical problems;
  • give an account of the principles behind random number generators;
  • use simulation methods such as Bootstrap and SIMEX;
  • use EM methods;
  • use non-parametric statistical models;
  • use statistical software, preferably R.

Content

The purpose of the course is to give the student a good overview of statistical techniques that have been developed in recent years as a result of increasing computer capacity. Resampling techniques: jackknife, bootstrap. EM algorithms. SIMEX methodology. Markov chain Monte Carlo (MCMC) methods. Random number generators. Smoothing techniques. Kernel estimators, nearest neighbour estimators, orthogonal and local polynomial estimators, wavelet estimators. Splines. Choice of bandwidth and other smoothing parameters. Applications. Use of statistical software.

Instruction

Lectures, problem solving sessions and computer-assisted laboratory work.

Assessment

Written examination (8 credits) at the end of the course and assignments (2 credits) in accordance with instructions given at the start of the course.

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.

Reading list

Applies from: Autumn 2022

Some titles may be available electronically through the University library.

  • Zwanzig, Silvelyn; Mahjani, Behrang. Computer Intensive Methods in Statistics.

    Boca Raton: CRC Press, [2020]
