Announcing The Pleiades Series

Towards a world free from neurodegenerative disease, starting with Pleiades

by Prima Mente

July 16, 2025

Download preprint of Pleiades Series here

Read on bioRxiv here

Our mission is to deeply understand the brain, protect it from disease, and enhance it in health. Today we introduce Pleiades, a series of biological foundation models trained upon the human epigenome. Advancing beyond DNA-only language models, we showcase Pleiades' capabilities in navigating the human genome and enabling precision clinical applications for Alzheimer's disease and Parkinson's disease.

Decoding biology's operating system to tackle complex disease

One of the greatest challenges in medicine is the lack of understanding of the brain. As we age, the brain undergoes subtle but profound molecular changes in gene regulation, cellular identity, immune signaling, and more. These shifts often begin decades before any symptoms emerge. In conditions like Alzheimer's, Parkinson's, and ALS, the earliest damage is molecular, silent, and scattered deep in brain tissue we can't easily observe. While brain biopsies are technically possible, they're invasive, risky, and rarely used outside of extreme cases. That leaves us with a major blind spot: the biology of ageing and neurodegeneration is happening in real time, but we have no routine way to monitor it, let alone intervene.

That's starting to change, not just through better detection, but through better modelling.

Progress in AI for biology has centred on proteins: modelling their structures, predicting their interactions, and accelerating molecule design. More recent advances have focussed on the genome, the static code we inherit at birth. But understanding complex diseases, especially in the brain, requires more than static snapshots. Biology is dynamic. Cells adapt continuously to ageing, inflammation, and environmental stress, and that adaptation is governed by the epigenome, the dynamic set of chemical modifications to DNA that respond to the environment.

DNA methylation is among the most widespread and best-characterised epigenetic mechanisms. Changes in methylation profoundly influence how the genetic code is read, directly affecting downstream cellular activity. Disruption of this regulatory logic plays a key role in the development of neurodegenerative diseases, and differs across brain cell types.

Reading the brain epigenome directly cannot routinely be done in living people, but non-invasive liquid biopsies combined with foundation models provide an important step toward understanding the brain in real time. Pleiades models DNA sequence and methylation jointly, using a transformer-based architecture to infer cellular origin, regulatory state, and disease relevance, all from fragmented DNA found in plasma. This offers us a unique lens on the regulatory architecture behind disease.

Pleiades is a series of transformers (90M, 600M, and 7B parameters) trained on a comprehensive corpus of methylated DNA sequences, amounting to 1.9 trillion methylated and genomic tokens at single-nucleotide resolution. The corpus comprises:

  • A high-resolution atlas of methylation across normal human cell types.
  • Cell-free DNA (cfDNA) from healthy individuals.
  • A knowledge graph of diverse human reference genomes.

With this world-class dataset, Pleiades achieves state-of-the-art understanding of the human genome when compared to DNA-only language models.
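
As a rough illustration of what "methylated and genomic tokens at single-nucleotide resolution" could look like, the sketch below turns a DNA fragment plus a set of methylated positions into one token per base. Everything here is an assumption made for illustration: the vocabulary, the separate `M` symbol for methylated cytosine, and the function names `tokenize_fragment` and `encode` are hypothetical and are not taken from the Pleiades preprint.

```python
from typing import Iterable, List

# Hypothetical vocabulary: four bases, an unknown base, and a separate
# symbol for methylated cytosine. Illustrative assumption only, not the
# vocabulary used by Pleiades.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4, "M": 5}


def tokenize_fragment(sequence: str, methylated_positions: Iterable[int]) -> List[str]:
    """Return one token per nucleotide, marking methylated cytosines as 'M'."""
    methylated = set(methylated_positions)
    # Reject positions that fall outside the fragment.
    for position in methylated:
        if position < 0 or position >= len(sequence):
            raise ValueError(f"methylated position {position} is out of range")
    tokens = []
    for index, base in enumerate(sequence.upper()):
        if base not in "ACGTN":
            raise ValueError(f"unexpected base {base!r} at position {index}")
        if index in methylated:
            # Only cytosines can carry the methylation mark in this sketch.
            if base != "C":
                raise ValueError(f"position {index} is {base!r}, not a cytosine")
            tokens.append("M")
        else:
            tokens.append(base)
    return tokens


def encode(tokens: List[str]) -> List[int]:
    """Map tokens to integer ids using the hypothetical vocabulary."""
    return [VOCAB[token] for token in tokens]


fragment = "ACGTCG"
tokens = tokenize_fragment(fragment, methylated_positions=[1])
token_ids = encode(tokens)
print(tokens)     # ['A', 'M', 'G', 'T', 'C', 'G']
print(token_ids)  # [0, 5, 2, 3, 1, 2]
```

Keeping sequence and methylation in a single token stream is one simple way for a transformer to see both signals at every position. Other encodings, such as a separate methylation channel, would be equally plausible.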

Turning neurobiological data into clinical action

Biological foundation models like Pleiades represent major research advances, but clinical application will determine their utility in the near term and their potential in the long term.

We directly assess Pleiades for the minimally invasive early detection of neurodegenerative diseases. Our approach can detect conditions like Alzheimer's disease and Parkinson's disease by analysing cell-free DNA (cfDNA) derived from a simple blood sample. This provides a crucial alternative to current diagnostic methods, which are often invasive, costly, and carry risks.

In our studies, Pleiades 7B matched the performance of leading protein biomarkers for Alzheimer's detection, and reached an area under the receiver operating characteristic curve (AuROC) of 0.97 when combined with pTau-217. It also performed strongly for detection of Parkinson's, achieving 0.84 AuROC. This provides proof-of-concept evidence for the use of liquid biopsy for the brain. In future, this may enable earlier diagnosis, improved patient outcomes, and targeted clinical trials on specific subsets of individuals with disease.
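
For readers less familiar with the metric, AuROC is the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below computes it directly from that definition. The numbers are made-up toy values, not data from our studies, and averaging two scores is only an assumed stand-in for "combining" a model output with a protein biomarker; it is not the method described in the preprint.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values for illustration only (1 = case, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
model_score = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]   # hypothetical model output
marker_score = [0.8, 0.9, 0.3, 0.2, 0.4, 0.1]  # hypothetical biomarker level

# Assumed combination rule: a plain average of the two scores.
combined = [(m + b) / 2 for m, b in zip(model_score, marker_score)]

print(round(auroc(labels, model_score), 3))   # 0.889
print(round(auroc(labels, marker_score), 3))  # 0.889
print(auroc(labels, combined))                # 1.0
```

In this toy case each score misranks a different individual, so the average separates cases from controls better than either score alone. That is the general intuition behind combining complementary signals.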

Pleiades may also enable accelerated biomarker discovery for brain conditions. By learning epigenetic patterns directly from the raw data, it could fundamentally speed up the discovery of new disease markers and mechanisms that inform novel targets.

Accelerating a new era of AI-led discovery for neuroscience

We are proud of Pleiades, but we're just getting started. We'll be sharing more results and pushing further in the months ahead.

Multi-modal foundation modelling for biology offers promise to enable precision medicine and unlock novel insights for complex diseases. Pleiades is a first step, demonstrating the effectiveness of jointly modelling DNA and methylation in a unified, general-purpose foundation model. Our work lays the groundwork for accelerated biomarker discovery and for deeper, interpretable mechanistic insights into the genomic regulation of brain ageing and disease.

To build a true Brain Foundation Model, we will need to go further. This means incorporating other layers of biology, from single-cell transcriptomics to proteomics and beyond, into unified models capable of capturing the full complexity of brain health and disease. We have established partnerships with leading institutions to access large numbers of patient-derived samples and are embarking on petabyte-scale data generation in our laboratory to support this vision.

As the cost of molecular data generation continues to fall and computational models grow more powerful, inference will shift from sparse clinical markers to dense, multilayered biological signals. This shift will enable a step change in how we detect, track, and treat neurological diseases, from Alzheimer's to Parkinson's and beyond. With advances in AI, it is reasonable to expect a 1,000,000x leap in precision medicine in the next decade.

We are looking for researchers, collaborators, clinicians, and builders to join us on this mission. We are determined to understand the brain in health, ageing, and disease by creating biological superintelligence: models that reason across modalities, make mechanistic predictions, and guide therapeutic innovation for all of medicine.

Patient impact requires moving from computational insight to biological impact. This is why we believe in an end-to-end approach: generating data, building models, and validating them clinically, all in-house. This lets us achieve large-scale outcomes at pace, towards a world free from neurodegenerative disease.
