Prioritizing causal variants is one major challenge for the clinical application of sequencing data. Prompted by the observation that 74.3% of missense pathogenic variants locate in protein domains, we developed an approach named domain damage index (DDI). DDI identifies protein domains depleted of rare missense variations in the general population, which can be further used as a metric to prioritize variants. DDI is significantly correlated with phylogenetic conservation, variant-level metrics, and reported pathogenicity. DDI achieved great performance for distinguishing pathogenic variants from benign ones in three benchmark datasets. The combination of DDI with the other two best approaches improved the performance of each individual method considerably, suggesting DDI provides a powerful and complementary way of variant prioritization.
All Science Journal Classification (ASJC) codes