People ask me all the time how the platform works, and the honest answer starts long before CodeXome existed. I spent most of my career at the National Cancer Institute at the National Institutes of Health asking one question: how do genes evolve over time? To get at it, I studied hundreds of species of mammals, ranging from the African elephant to the primates. The primates became my favorite. And somewhere in that work, a simple idea took hold that I still build everything around today.
Here it is. We don't have to guess which changes a gene can tolerate. Nature has been running that experiment for tens of millions of years, in the wild, with no do-overs. Our job isn't to rerun it. It's to read the results.
The wild is a brutally honest lab
When you understand the phylogenetic relationships among the species you're looking at, how they're related and who descended from whom, you get very clear patterns of gene evolution. And what you're really seeing in those patterns is variation that nature has already vetted. The variation that shows up across living species in the wild is, by definition, variation an animal could carry and still survive long enough to pass it on.
That matters because genetic disease is rare in the wild. Species die for lots of other reasons. There's plenty of infectious disease, of course, but inherited genetic disease is uncommon out there. When a deleterious allele arises, very often the animal simply does not survive. Nothing softens that. There's no clinic, no intervention.
It's not about how common a variant is
This is the part I most want researchers to understand, because it's where we differ from the tools you already know. Our approach is not a frequency approach. We're not telling you how often a variant shows up in a population. We're showing you the pattern of change across 80 million years, and those patterns jump out visually.
Eighty million years isn't an arbitrary number. The living primates we sampled arose from a common ancestor around 80 to 90 million years ago and have been diversifying ever since. Compare a human gene only to a chimpanzee and you'll see almost nothing; they're too close. Open the window to the full sweep of primate diversification and the gene's whole rulebook appears: which positions it has held sacred, and which it has let drift.
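To make the window idea concrete, here is a minimal Python sketch. It is a toy illustration, not the CodeXome pipeline: the function name `variable_sites` and the five short sequences are invented for this example and are not real gene data. It simply reports which aligned positions differ among the species you choose to compare.

```python
def variable_sites(aligned):
    """Return 0-based positions where the aligned sequences differ.

    `aligned` maps a species name to a sequence; all sequences are
    assumed to be pre-aligned and of equal length.
    """
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    # A position is variable if more than one residue appears in its column.
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


# Made-up toy sequences for illustration only (NOT real data).
toy_alignment = {
    "human":      "ACDEFGHIKL",
    "chimpanzee": "ACDEFGHIKL",
    "macaque":    "ACDEFGHIRL",
    "marmoset":   "ASDEFGHIRL",
    "lemur":      "ASDEYGHIRL",
}

# Narrow window: human versus chimpanzee only.
narrow = {name: toy_alignment[name] for name in ("human", "chimpanzee")}
print(variable_sites(narrow))         # [] : too close to show anything

# Wide window: every species in the toy alignment.
print(variable_sites(toy_alignment))  # [1, 4, 8]
```

In this toy case the two-species comparison finds no differences at all, while the wider comparison separates the three positions that have drifted from the seven that never changed.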
You can feel this when you look across our histogram of nearly twenty thousand genes, ordered by how fast each has changed. Down at the slow, conserved end sit the genes with vital functions. KRAS, a notorious cancer gene, barely moves; there is one amino acid change, which arose 43 million years ago (MYA) in a common ancestor of New World monkeys, Old World monkeys, great apes, gibbons, and humans. Out at the fast end you find the sensory genes, like the olfactory genes, and the genes involved in resistance to infection, exactly the ones you'd expect to be racing to keep up with a changing world. Where a gene falls on that curve already tells you something about what kind of gene it is.
The puzzles are the whole point
Once you can see what evolution has and hasn't allowed, the interesting cases practically announce themselves.
Sometimes evolution says a particular site is invariant, not allowed to change, and yet humans have acquired a change there anyway. The change is pathogenic, and somehow it's still circulating in our populations. That's a genuinely cool puzzle, and it's the kind of thing this view reveals.
The most famous version of the puzzle is sickle cell: a pathogenic variant that persists because carrying a single copy confers resistance to malaria. That's balancing selection: a "harmful" allele held in place because, in the right context, it also helps. When you can see what nature normally guards and then catch a spot where humans broke the rule, you've found a thread worth pulling.
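The idea of catching a spot where humans broke the rule can be sketched in a few lines of Python. This is a hypothetical illustration under stated assumptions, not how CodeXome works internally: the function `variants_at_invariant_sites`, the toy alignment, and the two example variants are all made up. It checks each human variant against the alignment and keeps only those that land on a position where every species carries the same residue.

```python
def variants_at_invariant_sites(aligned, variants):
    """Return variants that change a position conserved in every species.

    `aligned` maps species name -> pre-aligned sequence of equal length.
    `variants` is a list of (0-based position, alternate residue) pairs.
    Each result is (position, conserved residue, alternate residue).
    """
    seqs = list(aligned.values())
    flagged = []
    for pos, alt in variants:
        column = {s[pos] for s in seqs}
        if len(column) == 1:          # invariant across all species
            (conserved,) = column
            if alt != conserved:      # the variant breaks the rule
                flagged.append((pos, conserved, alt))
    return flagged


# Made-up toy sequences and variants for illustration only (NOT real data).
toy_alignment = {
    "human":      "ACDEFGHIKL",
    "chimpanzee": "ACDEFGHIKL",
    "macaque":    "ACDEFGHIRL",
    "marmoset":   "ASDEFGHIRL",
    "lemur":      "ASDEYGHIRL",
}
toy_variants = [(4, "L"), (6, "R")]

print(variants_at_invariant_sites(toy_alignment, toy_variants))
# [(6, 'H', 'R')]
```

Position 4 already varies among the toy species, so a change there is unremarkable. Position 6 never varies, so the variant there is the one worth a closer look. A flag like this is only a prompt for a question, not evidence that a variant is pathogenic.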
Why we built it this way
The goal, from the start, was to bring biomedicine and wildlife biodiversity together, two fields that almost never talk to each other. The medical world studies humans. The molecular-evolution world studies everything else. CodeXome is meant to be a bridge between those two worlds, so a researcher can stand in one and borrow the wisdom of the other.
We generated all of the comparative data ourselves and put it through a rigorous bioinformatic pipeline, because high-quality, low-noise data mattered to us more than speed. Many of our primate samples come from endangered, hard-to-obtain species, and they show how a species' genetics actually look in the wild, not how they look when a comfortable setting lets an animal survive a bad mutation. And alongside our own evolutionary evidence, we surface what the established resources say too, so you can see the population picture next to the deep-time one, all on a single screen.
I'll also say plainly what we are not: this is a research tool, not a clinical diagnostic test. It's built to help you ask better questions and prioritize where to look, not to hand down a verdict.
Come read the results with us
If you study why certain variants persist, where a candidate gene sits in the bigger evolutionary story, or what nature has quietly tolerated for millions of years, I'd love for you to put your own genes through it and see what jumps out.
Eighty million years of selection already did the hard part. If you're curious what it has to say about the gene you're working on, take a look, and if you find a puzzle worth chasing, come tell me about it. Those are my favorite conversations.
