
Approved research

Genomic feature prediction models

Principal Investigator: Dr Peter Sorensen
Approved Research ID: 31269
Approval date: March 21st 2018

Lay summary

The aim is to develop more accurate genomic prediction models. Predicting complex trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans. This is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms, and causal variants individually have small effects on the traits. We hypothesize that mapping molecular polymorphisms to genomic features such as genes and biological pathways, while accounting for different genetic architectures (e.g. additive or non-additive gene actions), could increase the accuracy of genomic predictions of complex traits. Prediction models that account for different genetic architectures and utilize known biological mechanisms could improve both biological understanding and the prediction accuracy of disease risk and medically relevant traits. More accurate predictions could improve population stratification, which in turn could improve preventive health care and disease treatment. We will develop genomic prediction models that utilize prior biological knowledge and handle different genetic architectures. We will evaluate whether these models improve prediction accuracy for selected traits in the UK Biobank. Marker sets are defined by mapping molecular polymorphisms to genomic features such as genes and biological pathways. Genetic marker set tests accounting for different genetic architectures will be used to identify genomic features that are enriched for associated variants. We will investigate to what extent these genetic marker set test results can lead to more accurate prediction models. The accuracy of genomic prediction depends on sample size and trait-specific factors such as heritability and genetic architecture (e.g. the number of causal variants, their effect sizes and the type of effect, such as additive or non-additive gene actions).
To maximize the power of our analyses we would require access to the full cohort and a range of complex trait and disease phenotypes.
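
As an illustration of how a marker set might be defined, the following is a minimal sketch that assigns each polymorphism to every gene whose genomic interval contains it. It is not the project's actual method: the function name `build_marker_sets`, the tuple layouts, and the toy SNP and gene records are all hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict


def build_marker_sets(snps, genes):
    """Group SNP identifiers by the gene interval they fall in.

    snps:  iterable of (snp_id, chromosome, position)
    genes: iterable of (gene_id, chromosome, start, end), bounds inclusive
    Both layouts are assumptions made for this sketch.
    """
    marker_sets = defaultdict(list)
    for snp_id, chrom, pos in snps:
        for gene_id, gene_chrom, start, end in genes:
            # A SNP joins a gene's marker set when it lies on the same
            # chromosome and inside the gene's start..end interval.
            if chrom == gene_chrom and start <= pos <= end:
                marker_sets[gene_id].append(snp_id)
    return dict(marker_sets)


# Toy data, invented purely for illustration.
snps = [("rs1", "1", 150), ("rs2", "1", 900), ("rs3", "2", 150)]
genes = [("GENE_A", "1", 100, 200), ("GENE_B", "2", 100, 300)]

marker_sets = build_marker_sets(snps, genes)
print(marker_sets)
```

In this toy example `rs2` falls outside every gene interval, so it belongs to no marker set. A pathway-level marker set could be built in the same spirit by taking the union of the sets of the genes in that pathway.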