Variant prioritization, grounded in 80 million years of primate evolution.

CodeXome adds an empirical evolutionary evidence layer to your annotation pipeline. Variants that recur across primate evolution — tested by selection and tolerated — are flagged in a single step, before manual review and functional follow-up.

Primate genera: 55 · 239 species
Coding genes: 19,244
Sequence datapoints: ~2 billion
Resolution: Residue-level

After standard filtering, hundreds of plausible candidates remain. Most are biologically tolerated, but you still have to look at them.

Standard pipeline output
~300
candidate missense variants per exome

After gnomAD frequency filtering and ML predictor scoring, hundreds of plausible candidates remain — 75–150 hours of manual review per case before functional work can begin.

With CodeXome annotation
40–75
variants worth your follow-up time

50–75% reduction by removing variants that recur across primate evolution. Selection has already tested them, and they passed. Recall on ClinGen-classified pathogenic variants: preserved.

Across 52 ClinGen expert-curated disease genes, no expert-classified pathogenic variant was also observed in another primate species in the database.

Where the evidence comes from

An evolutionary window between population frequency and deep conservation.

Population frequency tools see the last few thousand years. Deep vertebrate conservation sees hundreds of millions. Primate constraint occupies the intermediate window — close enough to humans for biochemical relevance, broad enough to capture patterns that human cohorts alone cannot resolve.

Last 10,000 years

Human population

Variation observed in modern human cohorts. Captures recent allele frequencies but a narrow time window.

Examples: gnomAD · LOEUF · ClinVar

Where CodeXome lives · ~80 million years

Primate evolutionary constraint

Empirical recurrence across 55 primate genera, mapped to human GRCh38 coordinates at residue-level resolution.

Coverage: 239 species · 19,244 genes

100 million – 1 billion years

Deep vertebrate conservation

Conservation across hundreds of millions of years. Sensitive at deeply conserved sites; multiple hits per site at long timescales reduce resolution.

Examples: phyloP · phastCons · GERP++

Inside the primate window, gene change tracks well-resolved speciation — the linear region of the evolutionary clock, where each substitution carries phylogenetic signal. Move further out and multiple hits per site begin to erase that signal. Move closer and the sampling collapses. The intermediate window is where empirical recurrence remains both biologically interpretable and statistically informative.

codexome gene previewer

Try it on a gene.

Enter a gene symbol to see a glimpse of the alignment across 55 primate genera, mapped to human GRCh38 coordinates.

Free
Cross-primate alignment: 55 genera, residue-level resolution

Unlocked with platform access
🔒 Residue-level constraint
🔒 DTAS scoring
🔒 VCF annotation
🔒 TSV export
🔒 Cohort-scale processing
🔒 Gene Profile Module
codexome studies

Benchmarked across rare disease, hereditary cancer, and ClinGen-curated panels.

Cancer predisposition

BRCA Exchange benchmark

100%
Concordance with expert panels

Pathogenic variants in BRCA1/BRCA2 expert-classified by the BRCA Exchange were unique to humans in the primate database. Benign variants were broadly shared.

n = full BRCA Exchange

Expert-curated panel

ClinGen 52-gene concordance

52/52
Genes with full concordance

Across 52 ClinGen expert-curated disease genes, every pathogenic variant was human-specific in the primate database: no expert-classified pathogenic variant was observed in any other primate species.

ClinGen expert panels

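The concordance figure above can be read as a per-gene set check: a gene counts as concordant when none of its expert-classified pathogenic variants also appears among the variants observed in other primates. The following is a minimal sketch of that check, not CodeXome's implementation. The function name, gene names, and variant keys are hypothetical, and the data is invented for illustration.

```python
def concordant_genes(pathogenic_by_gene, primate_observed):
    """Return genes with no pathogenic variant present in primate_observed.

    pathogenic_by_gene: dict mapping gene -> iterable of variant keys
    primate_observed:   set of variant keys seen in at least one other primate
    """
    return [
        gene
        for gene, variants in pathogenic_by_gene.items()
        if set(variants).isdisjoint(primate_observed)
    ]


# Toy, invented data for illustration only (hypothetical genes and variants).
pathogenic_by_gene = {
    "GENE_A": ["GENE_A:p.R10Q", "GENE_A:p.G55D"],
    "GENE_B": ["GENE_B:p.L7P"],
}
primate_observed = {"GENE_A:p.A12T", "GENE_B:p.V30I"}

concordant = concordant_genes(pathogenic_by_gene, primate_observed)
print(f"{len(concordant)}/{len(pathogenic_by_gene)} genes concordant")
```

A single shared variant is enough to drop a gene from the concordant list, which is why the check uses set disjointness rather than a per-variant score.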
Cystic fibrosis

CFTR therapeutic target benchmark

99.3%
Accuracy on 260 drug-target mutations

CodeXome scoring matched 99.3% of 260 mutations targeted by approved CFTR therapies, an external, biology-anchored benchmark relevant to gene-targeted therapy programs.

n = 260 drug-target mutations

ClinVar VUS survey

14,000-gene VUS survey

~20%
Of VUS missense identified as likely benign

Across ~14,000 ClinVar genes, an average of 20% of VUS missense variants (range 10–40% per gene) were shared with other primates — identifiable as likely benign on evolutionary evidence alone.

~14,000 genes · 93% positive

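To make the survey statistic concrete, here is a small sketch of how a per-gene shared fraction and its average could be computed. It assumes the reported average is an unweighted mean of per-gene fractions, which the survey summary does not state. The function name and the toy records are hypothetical.

```python
from collections import defaultdict


def shared_fraction_by_gene(vus_records):
    """Map each gene to the fraction of its VUS missense variants shared with primates.

    vus_records: iterable of (gene, shared_with_primates) pairs, where the
    second element is a bool.
    """
    totals = defaultdict(int)
    shared = defaultdict(int)
    for gene, is_shared in vus_records:
        totals[gene] += 1
        if is_shared:
            shared[gene] += 1
    return {gene: shared[gene] / totals[gene] for gene in totals}


# Toy, invented records for illustration only (hypothetical genes).
toy_records = (
    [("GENE_A", True)] + [("GENE_A", False)] * 4
    + [("GENE_B", True)] * 2 + [("GENE_B", False)] * 3
)

fractions = shared_fraction_by_gene(toy_records)
# Unweighted mean across genes (an assumption about how the average is defined).
mean_fraction = sum(fractions.values()) / len(fractions)
print(fractions, round(mean_fraction, 2))
```

An unweighted mean treats every gene equally regardless of how many VUS it carries. A variant-weighted average would instead pool all records before dividing, and the two can differ noticeably when VUS counts vary widely between genes.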
How it fits your pipeline

Three steps, from VCF in to prioritized candidates out.

01 / Upload

VCF or gene list

GRCh38-aligned VCFs, gene symbols, Ensembl IDs, or cohort-scale uploads. No preprocessing required.

02 / Annotate

Primate recurrence + DTAS

Every variant annotated with cross-primate recurrence, residue-level constraint, and Deep Time Ancestry Score.

03 / Export

TSV, VCF, or Gene Profile

Export annotated variants as TSV or VCF, or pull full Gene Profile reports for structured review.
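As an illustration of what downstream use of a TSV export might look like, the sketch below splits annotated variants into a follow-up list and a deprioritized list based on primate recurrence. The column name `primate_recurrence`, its meaning (number of other primate species carrying the variant), and the treatment of empty values as zero are all assumptions; the actual export schema may differ.

```python
import csv
import io

# Hypothetical column name; the real export schema may differ.
RECURRENCE_COL = "primate_recurrence"


def split_by_recurrence(tsv_text, min_species=1):
    """Split annotated TSV rows into (follow_up, deprioritized) lists.

    Rows whose recurrence count is at least min_species are deprioritized.
    Empty recurrence values are treated as 0 (an assumption of this sketch).
    """
    follow_up, deprioritized = [], []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        seen_in = int(row[RECURRENCE_COL] or 0)
        if seen_in >= min_species:
            deprioritized.append(row)
        else:
            follow_up.append(row)
    return follow_up, deprioritized


# Toy, invented rows for illustration only (hypothetical genes and positions).
example_tsv = (
    "chrom\tpos\tref\talt\tgene\tprimate_recurrence\n"
    "chr1\t1000\tA\tG\tGENE_A\t0\n"
    "chr1\t2000\tC\tT\tGENE_A\t12\n"
    "chr2\t3000\tG\tA\tGENE_B\t\n"
    "chr2\t4000\tT\tC\tGENE_B\t3\n"
)

follow_up, deprioritized = split_by_recurrence(example_tsv)
print(len(follow_up), "for follow-up;", len(deprioritized), "deprioritized")
```

Keeping the deprioritized rows in a separate list, rather than discarding them, leaves an audit trail if a reviewer later wants to revisit a filtered variant.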

Why this exists

A primate biology resource, two decades in the making, applied to human disease.

CodeXome did not begin as a product. It began with a scientific observation: humans share approximately 20,000 protein-coding genes with their closest evolutionary relatives, and 80 million years of primate evolution contains a record of which changes nature has tolerated and which it has not.

The biological resource that makes this observation usable — one of the most comprehensive primate sample collections ever assembled by a public research institution — was preserved through the closure of its original NCI laboratory and relocated to a formal research facility under NSF Phase II funding. 55 genera, 239 species, 19,244 genes. Assembled, curated, and released for researchers to use today.

Early access program

A small cohort of research labs, working with us first.

We're partnering with research groups working on rare disease, hereditary cancer, and gene-targeted therapeutics.

3 / 5 slots remaining