Daniel Schmitz: Beyond GWAS: Novel Methods and Resources for Genetic Epidemiology
- Date: 2 February 2024, 09:00
- Location: A1:107a, BMC, Husargatan 3, Uppsala
- Type: Thesis defence
- Thesis author: Daniel Schmitz
- External reviewer: Tuuli Lappalainen
- Supervisor: Åsa Johansson
- DiVA
Abstract
Since the first human genome assembly’s release, our knowledge of the genetic architecture of complex traits and diseases has grown steadily. Genome-wide association studies (GWAS) played a major role but are limited to common traits and single-nucleotide polymorphisms (SNPs). Technologies and resources like next-generation sequencing, Mendelian Randomization (MR), long-read sequencing and improved reference genomes enable the investigation of variants inaccessible to GWAS, such as copy number variations (CNVs), rare variants and variants in previously unresolved regions.
In project I, we performed a GWAS of estradiol measurements using data from UK Biobank and quantified estradiol’s effect on bone mineral density (BMD) using MR. Fourteen loci were associated with estradiol levels in males, one of which was also significant in females, and one additional locus was female-specific. We found a significant effect of estradiol on BMD, confirming previous research on estrogen’s importance for skeletal health.
In project II, we used the GWAS results from project I to investigate the effect of endogenous estradiol on breast, endometrial and ovarian cancer using MR. Estradiol was associated with ovarian cancer and nominally associated with estrogen receptor-positive breast cancer, demonstrating the effect of endogenous estrogen on cancer risk.
In project III, we quantified the effect of 184,182 CNVs on 438 blood plasma proteins using whole-genome sequencing (WGS) data from a Northern Swedish cohort and validated our findings using long-read sequencing in a subcohort. Fifteen CNVs were associated with 16 proteins; four of these could be validated using long reads, and three more represented more complex variation. Our findings show the effects of CNVs on the plasma proteome and highlight the application of different sequencing technologies for CNV detection.
In project IV, we evaluated the use of T2T-CHM13 as a reference for the SweGen cohort. Compared to GRCh38, mapping quality improved and we identified 9.8 million more variants. Sensitivity for rare, singleton and functionally relevant variants was higher. These findings show how research and clinical applications benefit from T2T-CHM13 by improving detection of previously unknown functionally relevant variation.
This thesis demonstrates the application of novel technologies and resources in genomics to detect variation and study its impact on quantitative traits. By using genotyping and WGS variants from short and long reads, I showed how we can leverage these technologies for research beyond GWAS.