NEW Preprint: Enabling the prediction of phage receptor specificity from genome data

The Phage Foundry is excited to share a new preprint! Led by postdoctoral researcher Lucas Morinière (LBNL), this work represents a collaboration between PI Vivek Mutalik’s lab at LBNL, co-I Adam Arkin’s lab at UC Berkeley, and the broader Phage Foundry team. Lucas set out to answer a deceptively simple question: can we predict which receptor a phage targets using its genome sequence alone? For most phages, including those infecting the well-studied bacterial host E. coli, the answer was no.

Since no experimental data existed at the scale needed to build predictive models, Lucas and the team conducted 1,050 genome-wide screens across 255 E. coli phages, testing over 1.9 million gene-phage combinations. Phage Foundry postdoctoral researcher Avery Noonan (LBNL, Arkin Lab) then built AI/ML models trained on this data to predict receptor identity from genome sequence alone, with no prior knowledge of receptor binding proteins required.

The results have broad reach: more than half of all E. coli phage genomes in NCBI (National Center for Biotechnology Information) now have a prediction, and the team has experimentally validated many of them. Notably, comparative genomics and structural modeling independently converged on the same domains the models identified.

To make this dataset accessible to the research community for years to come, Phage Foundry postdoctoral researcher Milo Johnson (UC Berkeley, Koskella Lab) developed the Phage Datasheets platform, a compiled, browsable dataset portal.

This research partnership was made possible with support from the National Science Foundation (NSF) and the U.S. Department of Energy (DOE) Office of Science.

Looking ahead, the team is extending this workflow to develop phage countermeasures against antimicrobial-resistant (AMR) bacterial pathogens and to enable precision microbiome engineering across a range of applications.

Moriniere L, Noonan AJC, Kazakov A, Pena M, Svab M, Rivera-Lopez EO, Maucourt F, Johnson MS, Roux S, Koskella B, Deutschbauer AM, Dudley EG, Mutalik VK, Arkin AP. Enabling the prediction of phage receptor specificity from genome data. bioRxiv 2026:2026.04.02.716166. https://doi.org/10.64898/2026.04.02.716166.