Syllabus for Computer-Intensive Statistics and Data Mining

Datorintensiv statistik och informationsutvinning

A revised version of the syllabus is available.

Syllabus

  • 10 credits
  • Course code: 1MS009
  • Education cycle: Second cycle
  • Main field(s) of study and in-depth level: Mathematics A1N

    Explanation of codes

    The code indicates the education cycle and in-depth level of the course in relation to other courses within the same main field of study according to the requirements for general degrees:

    First cycle

    • G1N: has only upper-secondary level entry requirements
    • G1F: has less than 60 credits in first-cycle course/s as entry requirements
    • G1E: contains specially designed degree project for Higher Education Diploma
    • G2F: has at least 60 credits in first-cycle course/s as entry requirements
    • G2E: has at least 60 credits in first-cycle course/s as entry requirements, contains degree project for Bachelor of Arts/Bachelor of Science
    • GXX: in-depth level of the course cannot be classified

    Second cycle

    • A1N: has only first-cycle course/s as entry requirements
    • A1F: has second-cycle course/s as entry requirements
    • A1E: contains degree project for Master of Arts/Master of Science (60 credits)
    • A2E: contains degree project for Master of Arts/Master of Science (120 credits)
    • AXX: in-depth level of the course cannot be classified

  • Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
  • Established: 2007-03-15
  • Established by: The Faculty Board of Science and Technology
  • Revised: 2007-11-06
  • Revised by: The Faculty Board of Science and Technology
  • Applies from: Autumn 2008
  • Entry requirements:

    120 credits and the course Analysis of Regression and Variance

  • Responsible department: Department of Mathematics

Learning outcomes

To pass the course (grade 3), the student should

  • have thorough knowledge of statistical techniques developed in recent decades as a result of increasing computing capacity;
  • understand the theoretical foundation of Markov Chain Monte Carlo methods and be able to use such techniques;
  • understand the principles behind random number generators;
  • be able to use simulation methods such as the bootstrap and SIMEX (simulation extrapolation);
  • be able to use computer-intensive non-linear statistical methods;
  • be familiar with EM (expectation-maximization) methods;
  • be able to use non-parametric statistical models;
  • have some experience with applications in image analysis and financial mathematics;
  • be able to use statistical software, preferably R.
Content

    Resampling techniques: jackknife and bootstrap. Non-linear statistical methods. EM algorithms. SIMEX methodology. Markov Chain Monte Carlo (MCMC) methods. Random number generators. Smoothing techniques: kernel estimators, nearest neighbour estimators, orthogonal and local polynomial estimators, wavelet estimators, and splines. Choice of bandwidth and other smoothing parameters. Applications. Use of statistical software.

    Instruction

    Lectures, problem solving sessions and computer-assisted laboratory work.

    Assessment

    Written and possibly oral examination (4 credits) at the end of the course. Assignments and laboratory work (6 credits) during the course.

    Reading list

    Applies from: Autumn 2008

    Some titles may be available electronically through the University library.

    • Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction

      New York: Springer, 2001
