A computational approach to rare disease association analysis on the 100,000 Genomes Project dataset

Hosted by the Bristol Heart Institute

Abstract: The genetic aetiologies of more than half of rare diseases remain unknown. Standardized genome sequencing and phenotyping of large patient cohorts provide an opportunity to discover these unknown aetiologies. Here we demonstrate our computational approach to rare disease association analysis on the 100,000 Genomes Project dataset. The approach comprises building a compact database, the "Rareservoir", containing rare variant genotypes and phenotypes, and applying our rare disease association method, "BeviMed".

In our analysis of protein-coding genes, we identified 241 known and 19 previously unidentified associations and subsequently validated associations between (1) loss-of-function variants in ERG, a gene encoding an Erythroblast Transformation Specific (ETS)-family transcription factor, and primary lymphoedema, (2) truncating variants in the last exon of the transforming growth factor-β regulator PMEPA1 and Loeys–Dietz syndrome and (3) loss-of-function variants in GPR156 and recessive congenital hearing impairment. An equivalent analysis of non-coding genes revealed a striking association between rare variants in functional regions of the snRNA gene RNU4-2 and a novel but frequent neurodevelopmental disorder.

We will discuss elements of the translational pipeline that follows new gene discovery, including opportunities for gaining novel biological insights into common human disease and routes to enhance genetic diagnosis of rare disorders in the clinical genomic medicine service.

All welcome. To join online: https://bristol-ac-uk.zoom.us/j/97955978036