CRADLE-1 Seeks to Rewrite the Economics of Protein Engineering
Drug discovery has two big machine learning stories. The first — de novo design, where AI conjures entirely new protein binders from scratch — has dominated recent headlines. The second is quieter but economically heavier: lead optimization, the year-long manual grind of refining a promising molecule until it actually works well enough to become a drug. That step typically consumes 12 to 36 months and $5M to $15M per candidate, and it has largely resisted automation. A new preprint from Cradle introduces CRADLE-1, a machine learning system that tackles it head-on, reporting optimization four to seven times faster than traditional methods across an unusually wide range of protein types — antibodies, enzymes, peptides, CRISPR systems, and vaccines — and often improving half a dozen properties at once. If the results hold up, the bottleneck that has defined pre-clinical drug development for decades may be about to move.
A Costly Bottleneck in Drug Discovery
Lead optimization — the painstaking process of refining a promising molecule until it’s actually fit for clinical or industrial use — has long been the grinding middle step of drug discovery. It typically consumes 12 to 36 months and burns through $5M to $15M per candidate, and only about one in fifteen candidates ever reaches a successful product launch. For a therapeutic antibody, optimization might mean simultaneously boosting affinity, improving developability, and reducing immunogenicity. For an industrial enzyme, it could mean cranking up catalytic turnover while maintaining stability under punishing pH or temperature conditions. The iterative “design-build-test-learn” cycle that underpins this work has remained stubbornly slow, even as machine learning has transformed neighboring steps like de novo binder design.
A new preprint from Cradle, posted on bioRxiv in March 2026, argues that this is about to change. The authors introduce CRADLE-1, an automated machine learning framework that tackles multi-property lead optimization across an unusually broad sweep of protein modalities. Their headline claim: the system is four to seven times faster than traditional rational design, as measured by the number of wet lab rounds required to hit a target product profile.
What CRADLE-1 Actually Does
At its core, CRADLE-1 is a system for generating improved variants of a protein “template” — the starting lead sequence — while simultaneously optimizing multiple functional properties. The process is deliberately lightweight from the user’s perspective: a scientist passes a template sequence and any available assay data through an API or UI, waits roughly a day for designs, synthesizes and tests them in the lab (a single 96-well plate is enough), and then feeds the results back for the next round. The paper reports successful optimization across one to three rounds, with campaigns never requiring more than that.
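The round-based loop described above can be sketched in code. This is a toy illustration only: every function name here (`propose_variants`, `assay`, `optimize_lead`) is a hypothetical placeholder, not Cradle's actual API, and the "designer" and "wet lab" are stand-in stubs so the loop runs end to end.

```python
import random

random.seed(0)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def propose_variants(template, n):
    """Stand-in designer: random single-point mutants of the template."""
    variants = []
    for _ in range(n):
        pos = random.randrange(len(template))
        aa = random.choice(AMINO_ACIDS)
        variants.append(template[:pos] + aa + template[pos + 1:])
    return variants


def assay(seq):
    """Stand-in wet lab measurement: a toy sequence-based score."""
    return seq.count("W") / len(seq)


def optimize_lead(template, target_score, max_rounds=3, plate_size=96):
    """Run up to three design rounds, each sized to one 96-well plate.

    Each round: design variants of the current best lead, "synthesize and
    test" them, fold the results back in, and stop early if the target
    product profile (here, a single score threshold) is met.
    """
    history = []            # (sequence, measurement) pairs across rounds
    best = template
    for _ in range(max_rounds):
        for variant in propose_variants(best, plate_size):
            history.append((variant, assay(variant)))
        best = max(history, key=lambda kv: kv[1])[0]
        if assay(best) >= target_score:
            break
    return best, history
```

The structure mirrors the paper's description: one plate per round, results fed back between rounds, and a hard cap of three rounds.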
What sets the system apart is its generality. The authors demonstrate results on VHHs (nanobodies), scFvs, full IgGs, peptides, industrial enzymes, CRISPR systems, and vaccines. They optimize properties ranging from binding affinity and thermostability to immunogenicity, aggregation, polyreactivity, cell binding, expression, and on- and off-target editing activity. In some campaigns, up to six properties are tuned at once; private benchmarks reportedly go as high as eight.
Case Studies
The results section reads like a greatest-hits tour of protein engineering challenges. In a public design competition hosted by Adaptyv Bio, CRADLE-1 won against 400 other entries by engineering twelve variants of an scFv-formatted Cetuximab for binding to EGFR — all twelve bound, with the best reaching 339 pM, compared to the second-place competitor at 5.18 nM. For a SARS-CoV-2 nanobody, the system pushed wild-type binding from 2.01 nM to 186 pM and melting temperature from 58.5 °C to 70.9 °C across three rounds, while also improving Omicron cross-reactivity and expression. A haloalkane dehalogenase campaign raised the enzyme’s melting temperature by a remarkable 20 °C in just two rounds, while simultaneously doubling expression and modestly improving activity.
Perhaps the most striking commercial anecdote involves a P450 enzyme. An unnamed pharmaceutical partner had previously spent eight rounds and 1,201 candidates on rational design, reaching only a 17.9× improvement in activity — the project was slated for cancellation. CRADLE-1 took over and delivered a 40.6× improvement in three rounds, with median round-over-round gains 3.82× higher than the rational design effort. An IgG campaign for a top-50 pharma partner — optimizing potency, aggregation, polyreactivity, cell binding, immunogenicity, and expression simultaneously — produced ten viable candidates after the partner’s own screening and in-house optimization had failed. Two CRISPR projects pushed on-target editing activity from below 25% to 68%, and in a separate campaign improved on-target activity to 75% while dropping worst-site off-target editing from 0.4% to 0.1%.
The authors are also transparent about a failure: a serine–pyruvate aminotransferase campaign improved thermostability dramatically but only managed a 1.96× activity gain, which was judged insufficient. Notably, two experienced human protein engineers given the same problem did no better, suggesting the ceiling was biological rather than algorithmic.
Under the Hood
Technically, CRADLE-1 is built on protein language models rather than the structure-prediction methods that dominate much of the de novo design literature. The pipeline starts with a foundation PLM pre-trained on large protein databases like UniRef, then fine-tunes it in two stages. First, an unsupervised “evotuning” step adapts the model to the evolutionary neighborhood of the template sequence. Second, when wet lab data are available, two specialized models are derived from the evotuned backbone:
- A “logiter” trained via group Direct Preference Optimization (g-DPO), a computationally efficient variant of DPO that the authors introduce in a companion paper, which learns to rank mutations by their effect on function.
- A “predictor” with a regression head that directly estimates property values from sequence.
Generation then proceeds via a “double beam” search: candidates are proposed by the logiter, ranked by the predictor for both predicted function and diversity, and refined over iterations with rising sampling temperature. When no assay data are available — as in the first round of most campaigns — the system falls back to a zero-shot mode that relies purely on evolutionary signal, which the authors note works because properties like thermostability and binding are often evolutionarily conserved.
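The propose-rank-refine structure of the generation step can be made concrete with a small sketch. Everything here is an assumption-laden toy: `logiter_score` and `predictor` are trivial stand-ins for the two fine-tuned models, the mutation sampler is not how a language model decodes, and the diversity filter is a simple greedy Hamming-distance rule — the point is only to show candidates proposed by one model, ranked by another for both function and diversity, over iterations with rising sampling temperature.

```python
import random

random.seed(1)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def logiter_score(seq):
    """Stand-in for the preference-trained ranking model ("logiter")."""
    return seq.count("K") - seq.count("P")


def predictor(seq):
    """Stand-in for the regression head estimating a property value."""
    return seq.count("K") / len(seq)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def sample_mutant(seq, temperature):
    """Toy sampler: higher temperature introduces more mutations at once."""
    n_mut = max(1, int(temperature))
    out = list(seq)
    for _ in range(n_mut):
        out[random.randrange(len(out))] = random.choice(AMINO_ACIDS)
    return "".join(out)


def double_beam(template, beam=8, iters=5, t0=1.0, t_step=0.5):
    """Toy double-beam loop: logiter proposes, predictor ranks."""
    pool = [template]
    for i in range(iters):
        temp = t0 + i * t_step  # sampling temperature rises each iteration
        # Proposal step: logiter-guided mutants (oversample, keep top half).
        raw = [sample_mutant(s, temp) for s in pool for _ in range(8)]
        raw.sort(key=logiter_score, reverse=True)
        proposed = raw[: len(raw) // 2]
        # Ranking step: predicted function, then a greedy diversity filter.
        proposed.sort(key=predictor, reverse=True)
        kept = []
        for c in proposed:
            if all(hamming(c, k) >= 2 for k in kept):
                kept.append(c)
            if len(kept) == beam:
                break
        pool = kept or pool
    return max(pool, key=predictor)
```

In the real system the zero-shot first round would run this loop with no supervised models at all, relying on evolutionary signal from the evotuned backbone alone.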
Three Findings That Matter Beyond the Numbers
Beyond the speed claims, the paper makes three broader arguments that may be more consequential in the long run. The first is that the entire process is automatable — reduced to an API call — even when the wet lab data are noisy and exhibit batch effects. The second is that sequence-function data can be consumed as a “black box,” without the model needing to know anything about the underlying biochemical mechanism, the structure of a binding target, or the details of a multi-step enzymatic pathway. The third, and perhaps most provocative, is that structural data can largely be superseded by sequence-function pairs — a direct challenge to the structure-first orthodoxy that has dominated recent protein ML.
The authors also highlight a reliability improvement: they estimate CRADLE-1 achieves a 90–95% success rate against target product profiles, compared to the roughly 85% baseline Paul et al. reported for conventional lead optimization. And they note a subtler organizational benefit — the system produces a quantitative estimate of “optimization headroom,” giving teams a principled signal to stop pursuing campaigns that have hit a biological ceiling, rather than falling victim to the sunk cost fallacy.
Limitations and What’s Next
The paper is honest about current constraints. The system can only optimize what can be measured, so properties requiring expensive assays — like immunogenicity, which may need MHC-associated peptide proteomics or T-cell activation assays — inherit the cost and turnaround time of those measurements. The system also assumes a minimum of about 85 sequence-function pairs for reliable supervised fine-tuning, which rules out the very smallest experiments. And the authors acknowledge that hyperparameter tuning was done manually and could likely be improved.
The commercial statement at the end of the paper is candid: the methods described reflect Cradle’s workflow from roughly two years before publication, released now as a deliberate trade-off between commercial protection and open science. For a field where de novo design has absorbed most of the attention and most of the headline demos, CRADLE-1 represents a bet that the next wave of value will come from the less glamorous but economically heavier step that follows — turning a promising hit into something that actually works.
About the Company
Cradle — the Dutch-Swiss biotech behind CRADLE-1 — was founded in 2021 in Amsterdam by a team blending protein science with engineering pedigree from big tech, including CEO Stef van Grieken (ex-Google) and co-founders Jelle Prins, Elise de Reus, Eli Bixby, and Harmen van Rossum. The company has since raised over $100 million across seed, Series A, and a $73 million Series B led by IVP in 2024, and now runs more than 50 active projects powering drug discovery at six of the top 25 big pharma companies, including Johnson & Johnson, AbbVie, and Novo Nordisk.

What makes Cradle interesting beyond its own numbers is the model it represents. Most AI-in-biology companies have followed the “biobucks” playbook — partnering with pharma on specific molecules in exchange for milestones and royalties, which ties the AI firm’s fate to individual drug programs. Cradle instead sells its platform as straightforward SaaS, letting scientists inside pharma run protein engineering themselves without the IP entanglements.

That shift matters for the broader AI-in-research sector: it reframes applied AI from a bespoke research collaboration into infrastructure, much as cloud computing became infrastructure for software. Pair that with Cradle’s insistence on running its own wet lab — van Grieken likens a purely computational biology company to a self-driving car company without cars — and you get a template that’s starting to influence how the whole field thinks about building AI tools for science: own the data pipeline, productize the models, and let domain experts stay in the driver’s seat rather than handing their problems off to outside ML teams.
