CFTR served as a high-confidence benchmark for validating CodeXome's Deep Time Ancestry approach. Across more than 260 curated pathogenic CFTR amino acid substitutions, only 2 were shared across 80 million years of primate evolution, corresponding to >99% concordance between evolutionary absence and curated pathogenicity.
The CFTR gene is one of the most extensively studied disease genes, with over 2,000 reported variants and hundreds of well-characterized pathogenic substitutions. Its biological function is strictly regulated, and its protein domains exhibit established patterns of intolerance to change. This makes CFTR an ideal gene for evaluating whether deep-time evolutionary recurrence and evolutionary absence align with established clinical pathogenicity classifications. Across primate evolution, CFTR serves as a clean, high-confidence benchmark for validating CodeXome’s Deep Time Ancestry approach.
This study aims to determine whether known pathogenic CFTR amino acid substitutions recur naturally across primates or whether they remain entirely absent from the evolutionary record. If pathogenic variants were tolerated in other species, they would appear as natural substitutions in CodeXome’s primate alignment. Their absence across 80 million years would indicate strong functional constraint. Furthermore, this analysis evaluates whether naturally recurring substitutions align with known benign clinical classifications.
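The recurrence test described above can be illustrated with a minimal Python sketch. It is not CodeXome's implementation: the function name `recurs_in_primates`, the dictionary layout, and the toy residues are all assumptions made for illustration, and the genus labels are placeholders rather than real alignment data.

```python
from typing import Dict, List


def recurs_in_primates(
    alignment_column: Dict[str, str],
    reference_genus: str,
    variant_residue: str,
) -> List[str]:
    """Return the non-reference genera whose aligned residue equals the variant.

    alignment_column maps a genus name to the single-letter amino acid found
    at one aligned protein position (hypothetical layout, for illustration).
    An empty result means the substitution is absent from the alignment at
    that position, which the text reads as a sign of functional constraint.
    """
    return sorted(
        genus
        for genus, residue in alignment_column.items()
        if genus != reference_genus and residue == variant_residue
    )


# Toy column with invented residues; "genus_A" etc. are placeholders.
toy_column = {"Homo": "G", "genus_A": "G", "genus_B": "S", "genus_C": "G"}

print(recurs_in_primates(toy_column, "Homo", "S"))  # substitution seen in one genus
print(recurs_in_primates(toy_column, "Homo", "D"))  # substitution never observed
```

In this toy case the first substitution would count as naturally recurring and the second as evolutionarily absent. A real analysis would also need to handle alignment gaps and missing sequence, which this sketch ignores.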
Using CodeXome’s Gene Profile module, the following analysis was performed:
Out of more than 260 curated pathogenic amino acid changes, CodeXome found that only 2 substitutions are shared across 80 million years of primate evolution. This corresponds to >99% concordance between evolutionary absence and curated pathogenicity, strongly supporting the principle that pathogenic variants are not tolerated in natural variation for CFTR.
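The arithmetic behind the concordance figure can be checked with a short sketch. The exact number of curated substitutions is not stated (only "more than 260"), so 260 is used here as an assumed lower bound; the function name `concordance` is illustrative and not part of any CodeXome API.

```python
def concordance(n_pathogenic: int, n_shared: int) -> float:
    """Fraction of pathogenic substitutions that are absent from primates."""
    if n_pathogenic <= 0:
        raise ValueError("n_pathogenic must be positive")
    if not 0 <= n_shared <= n_pathogenic:
        raise ValueError("n_shared must lie between 0 and n_pathogenic")
    return 1.0 - n_shared / n_pathogenic


# Assumption: 260 is the smallest count consistent with "more than 260".
lower_bound = concordance(260, 2)
print(f"{lower_bound:.2%}")  # prints 99.23%
```

Because the number of shared substitutions is fixed at 2, any larger total only raises the value, so the reported >99% holds for every count above 260.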
CFTR displays dozens of tolerated amino acid substitutions across primates. For these sites, recurring primate substitutions co-occur with ClinVar benign and likely benign variants. Furthermore, for those ClinVar benign/likely benign variants that do not co-occur in primates, our predictive tool, the new Deep Time Ancestry Score (AI/ML), matches their classification with 99% accuracy.
CFTR falls into the average range of evolutionary rates of change when compared with a dataset of 19,244 coding genes. Patterns of natural variation are distinctive and correlated with motifs and regions within:
Comparing evolutionary data with curated clinical classifications:
This supports the idea that evolutionary recurrence is an independent, biological truth set that can be used when clinical data are unknown.
Interpretation
The CFTR case demonstrates that:
In short, evolution provides a functional reference for CFTR that is consistent, biologically meaningful, and highly predictive of pathogenicity. CFTR serves as a foundational validation of CodeXome’s approach.
Using CFTR as an anchor point, researchers can trust that:
This case helps establish CodeXome as a meaningful source of functional evidence across diverse disease genes.
CFTR evolutionary patterns are distinctive across functional domains (NBD1, NBD2, regulatory domains, transmembrane helices). Pathogenic variants are virtually absent from primate evolution; naturally recurring substitutions consistently match benign classifications.
The CodeXome platform lets you browse residue-level evolutionary evidence across 55 primate genera, live in your browser. No signup is required for the Gene Previewer.