Ludvig Hult: Robust inference for systems under distribution shifts

  • Date: 20 September 2024, 09:15
  • Location: 10134, Polhemsalen, Lägerhyddevägen 1, Uppsala
  • Type: Thesis defence
  • Thesis author: Ludvig Hult
  • External reviewer: Antoine Chambaz
  • Supervisors: Dave Zachariah, Thomas B. Schön
  • DiVA

Abstract

We use statistics and machine learning to make advanced inferences from data. If the context changes, challenges may arise that invalidate these inferences. A change in the data-generating process from one context to another is known as distribution shift, and it may arise for several reasons. This thesis presents five articles on the topic of making robust inferences in the presence of distribution shifts.

Papers 1 to 3 develop mathematical methods for robust inference. Paper 1 addresses the problem that, when there is uncertainty about the structure of the underlying data-generating process, confidence intervals are not generally valid for estimating the impact of interventions. We propose a method for constructing valid confidence intervals for the average treatment effect using linear structural causal models. Paper 2 addresses the problem of model evaluation under distribution shift, using nonparametric statistics. We show that with a small validation sample, one can make finite-sample valid inferences about a machine learning model's performance on a new data set despite distribution shift. Paper 3 addresses the problem that inventory control policies may become invalid without assumptions on the demand. Using a deterministic feedback mechanism, we construct an order policy that guarantees any prescribed service level under weak assumptions on the demand, allowing for distribution shift.

Papers 4 and 5 focus on applications to neurocritical care data. Paper 4 uses machine learning to predict intracranial pressure insults in neurocritical care. Since distribution shift may occur between patients and/or years, the validation methods take this into account. Paper 5 explores the use of causal inference on neurointensive care data. While this may eventually lead to inferences valid under intervention distribution shift, several obstacles to effective application are identified and discussed.
