Syllabus for Information Systems C: Data Mining and Data Science

Informationssystem C: Data mining och data science

Syllabus

  • 7.5 credits
  • Course code: 2IS063
  • Education cycle: First cycle
  • Main field(s) of study and in-depth level: Information Systems G2F
  • Grading system: Fail (U), Pass (G), Pass with distinction (VG)
  • Established: 2017-10-26
  • Established by:
  • Revised: 2020-02-27
  • Revised by: The Department Board
  • Applies from: Autumn 2020
  • Entry requirements: 60 credits in information systems or the equivalent including 7.5 credits in databases.
  • Responsible department: Department of Informatics and Media
  • This course has been discontinued.

Decisions and guidelines

The course is part of the minor field Database Technology.

Learning outcomes

In terms of knowledge and understanding, after completing the course the student should be able to:

- explain fundamental terms within the areas of data warehousing, big data analytics, data mining, and data science, as well as how these can support decision making in organisations,

- explain how data mining can support answering a research question in a data science project,

- describe and categorize different data mining methods and explain how they differ,

- explain under which conditions a particular data mining method can be used to answer a given question.

In terms of skills and abilities, after completing the course the student should be able to:

- plan a data science project based on the use of data mining methods, including problem identification, question formulation, selection of data, preprocessing method, data mining method(s), and method for evaluation of results,

- apply elementary data mining methods to perform analyses.

In terms of evaluation and analysis, after completing the course the student should be able to:

- interpret and analyse the results of a data mining process, as well as assess the effects of choices made during the process,

- analyse and evaluate the consequences of data mining and big data for society, taking ethical aspects into account.

Content

The course introduces the student to data mining as a method for answering a research question within the overall framework of data science. Data science is a research-oriented approach that includes problem identification, question formulation, identification and preprocessing of data, choice and application of analysis method, and analysis and evaluation of results. During the course, data will be discussed extensively, including data types and characteristics, data transformation, and how semi-structured or unstructured data, which characterize big data, can be processed. Further, the data mining process and the main data mining tasks (classification, clustering, association analysis, and deviation detection) will be presented, and methods for each will be applied. Special approaches for dealing with textual data, i.e., text mining, will be covered. Data mining applications, and the strengths and weaknesses of different methods, will also be discussed. Finally, students will be exposed to social and ethical perspectives on data mining and big data.

Instruction

Lectures, laboratory exercises, seminars.

Assessment

Exam, assignments, laboratory exercises, seminars.

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the University's disability coordinator or a decision by the department's working group for study matters.

Reading list

Applies from: Autumn 2021

Some titles may be available electronically through the University library.

  • Data mining: practical machine learning tools and techniques. Witten, Ian H.; Frank, Eibe; Hall, Mark A.; Pal, Christopher J.

    4. ed.: Amsterdam: Morgan Kaufmann, [2017]

    Mandatory

Last modified: 2022-04-26