Akshai Parakkal Sreenivasan: Data-driven insights into the biological functions of chemicals and the progression of multiple sclerosis: A machine learning approach
- Date
- 8 June 2026, 13:15
- Location
- H:SON HOLMDAHLSALN, Akademiska Sjukhuset, Entrance 100, 2nd floor, Uppsala
- Video meeting link
- https://uu-se.zoom.us/j/7205329649?omn=64999650356
- Type
- Thesis defence
- Respondent
- Akshai Parakkal Sreenivasan
- Opponent
- Etminani Farzaneh
- Supervisors
- Kim Kultima, Ola Spjuth, Joachim Burman, Ida Erngren
- Research subject
- Medical Science
- Publication
- https://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-584323
Abstract
Machine learning is widely used to identify complex patterns in data and translate them into useful predictions. The choice of modelling approach depends on both the available data and the question being addressed, while modelling priorities may vary depending on the intended end user. This thesis investigates predictive modelling in two distinct biomedical scenarios: prediction of the biological function of chemicals using network topology-derived data (Paper I), and prediction of disease course in multiple sclerosis (MS) using electronic health records from hospital visits (Papers II, III, and IV).
In Paper I, a deep learning model was developed to predict protein network topology clusters directly from chemical structure representations, enabling functional characterization and identification of biological similarity for compounds not present in existing databases.
The second part of this thesis addresses a clinical prediction problem in MS. Identifying the transition from relapsing-remitting MS (RRMS) to secondary progressive MS (SPMS) remains challenging in clinical practice, despite its importance for clinical trials, treatment decisions, and patient management. In Paper II, we developed a machine learning model combined with conformal prediction to classify RRMS and SPMS using data from the Swedish Multiple Sclerosis Registry (SMSReg). The study demonstrated that uncertainty-aware predictions can improve trustworthiness by identifying cases that are difficult to classify reliably. In Paper III, we prospectively validated this framework using newly collected SMSReg data and showed that the model remained well-calibrated and outperformed the current state-of-the-art method. In Paper IV, we extended this work to an international registry setting using data from SMSReg and from 37 countries in the MSBase registry, based in Australia. We showed that calibrating models to specific countries can improve disease course classification across heterogeneous real-world cohorts, while also highlighting challenges arising from variation in clinical practice and data availability. Across these studies, the explainable AI framework SHAP was used to identify influential variables and improve model interpretability.
Collectively, the findings show that machine learning can extract meaningful biological and clinical patterns from complex data, and that uncertainty quantification and interpretability are essential for clinically useful decision-support tools.
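To illustrate the kind of uncertainty-aware classification described for Papers II to IV, the following is a minimal sketch of generic split (inductive) conformal prediction for a two-class problem. It is not the thesis's implementation: the function names, the nonconformity score (one minus the predicted probability of the true class), and all calibration values are hypothetical and chosen only for illustration.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Nonconformity threshold for a target miscoverage level alpha.

    cal_probs:  list of dicts mapping label -> predicted probability
    cal_labels: true labels of the held-out calibration cases
    """
    # Nonconformity score: 1 minus the probability given to the true class.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration cases: keep every label
    return scores[k - 1]

def prediction_set(probs, threshold):
    """Keep every label whose nonconformity score is within the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}

# Hypothetical calibration data (made up, not registry data).
cal_labels = ["RRMS", "SPMS", "RRMS", "RRMS", "SPMS",
              "RRMS", "SPMS", "RRMS", "SPMS"]
true_class_probs = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.3]
cal_probs = [
    {y: p, ("SPMS" if y == "RRMS" else "RRMS"): 1.0 - p}
    for y, p in zip(cal_labels, true_class_probs)
]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.1)

# A confident prediction yields a single label.
confident = prediction_set({"RRMS": 0.9, "SPMS": 0.1}, threshold)
# A borderline prediction yields both labels, flagging the case as uncertain.
ambiguous = prediction_set({"RRMS": 0.55, "SPMS": 0.45}, threshold)
```

In this sketch, a prediction set containing both labels marks a case the model cannot classify reliably at the chosen confidence level, which is the behaviour the abstract refers to when it says difficult cases can be identified.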