STATISTICS Review Seminar: Väinö Yrjänäinen

Date
11 February 2026, 10:15–11:30
Location
Ekonomikum, H317
Type
Seminar
Organiser
Department of Statistics, Uppsala University

Speaker: Väinö Yrjänäinen, Department of Statistics, Uppsala University

Opponent: Pierre Nyquist, Department of Mathematical Sciences, Chalmers University of Technology and Gothenburg University

Abstract: Data accuracy is crucial for reliable research, accurate decision-making, and high-performance machine learning. However, maintaining data accuracy, especially at scale, is complicated and difficult to verify. By integrating concepts from software engineering, statistical quality control, and branching process theory, we formalize an iterative data curation framework that scales well to large data sets. We go on to prove that the proposed approach asymptotically eliminates all errors in the data with probability one. Additionally, we provide theoretical guarantees that data accuracy tests speed up error reduction. We corroborate these results through simulations on text and tabular data, and a real-world application to the Swedish Parliamentary Corpus, demonstrating the framework’s effectiveness in preserving high-accuracy historical records at scale.