New AI Model Designed to Find Cures for Cancer Gets Google CEO Excited

Google DeepMind, together with Yale University, revealed that researchers had experimentally validated an AI-generated hypothesis about a cancer-related cellular mechanism. The announcement came alongside a new model for identifying genetic variants in tumors, built in partnership with researchers at the University of California, Santa Cruz Genomics Institute.

The model, called C2S-Scale 27B and built on the Gemma model family, did more than improve predictions or classifications: it proposed a new scientific insight, a potential therapeutic pathway, that was not obvious from prior knowledge. DeepSomatic, meanwhile, is a tumor-diagnosis tool that leverages machine learning to identify genetic variants in tumor cells more accurately than current methods.

What Got Sundar Pichai so Excited?

Overview of the approach

  • Unlike the AlphaFold models, which are designed for protein structure prediction, the Gemma family of models is designed for biology and cellular modeling: analogous in spirit to large language models (LLMs), but trained on biological data (e.g. single-cell RNA, molecular interactions) rather than text.
  • C2S-Scale 27B is a larger model (27 billion parameters) in that family, optimized for single-cell-level inference and hypothesis generation.
  • The AI system was tasked not only with pattern recognition but with suggesting context-conditioned hypotheses. In this case, it proposed that a known small-molecule drug, silmitasertib (CX-4945), might act as a conditional amplifier, increasing tumor cells' visibility to the immune system in certain contexts. (A minimal sketch of this dual-context screening idea follows this list.)
  • Experimental collaborators then tested this hypothesis in living human cells and validated that, under certain cellular conditions, silmitasertib indeed behaves consistently with the prediction in petri-dish tests. (blog.google)
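To make the "conditional amplifier" idea concrete, here is a minimal Python sketch of a dual-context virtual screen. Everything here is illustrative: `predict` is a hypothetical stand-in for a model such as C2S-Scale (not its real API) that returns a predicted antigen-presentation score for a drug in a given cellular context.

```python
# Minimal sketch of a dual-context virtual screen. `predict` is a
# hypothetical scoring function: (cell_profile, drug) -> predicted
# antigen-presentation score. Nothing here is the real C2S-Scale API.
from dataclasses import dataclass

@dataclass
class ScreenHit:
    drug: str
    immune_ctx_score: float    # predicted effect with immune signaling present
    neutral_ctx_score: float   # predicted effect without immune signaling

def find_conditional_amplifiers(drugs, immune_profile, neutral_profile,
                                predict, min_gap=0.5):
    """Keep drugs predicted to boost antigen presentation only in the
    immune-primed context (the 'conditional amplifier' pattern)."""
    hits = []
    for drug in drugs:
        with_ctx = predict(immune_profile, drug)
        without_ctx = predict(neutral_profile, drug)
        if with_ctx - without_ctx >= min_gap:  # effect must be context-dependent
            hits.append(ScreenHit(drug, with_ctx, without_ctx))
    # Rank by how strongly the predicted effect depends on context
    hits.sort(key=lambda h: h.immune_ctx_score - h.neutral_ctx_score,
              reverse=True)
    return hits
```

As described in the announcement, silmitasertib fits this pattern: a predicted effect in immune-primed contexts and little effect otherwise.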

Significance claimed

  • The team frames this as evidence that scaling up biological AI models can yield novel mechanistic insight, not just incremental improvements in predictive performance. (blog.google)
  • It suggests that such models might be usable for high-throughput virtual screens, discovering “context-conditioned biology,” and generating hypotheses that guide wet-lab work more efficiently. (THE DECODER)
  • If broadly successful, this kind of approach could accelerate target discovery, drug repurposing, or the identification of treatment-reinforcing strategies (e.g. combination therapies), rather than starting from scratch.

Possible limitations and caveats

The announcement, while significant in a line of AI-led discoveries, doesn't erase a number of practical and conceptual challenges. The result was only validated in cell culture (human cells in vitro), a necessary first step; however, only a small fraction of findings that hold in vitro replicate in animal models or human trials, owing to biological complexity, pharmacokinetics, off-target effects, the tumor microenvironment, immune response, and more. The announcement comes a month after AlphaEvolve, an LLM-based coding agent from Google, was used to find and verify combinatorial structures that improve hardness results for approximately solving certain optimization problems.

Context specificity: the hypothesis is said to be context-conditioned, i.e. it may work only under specific cellular states, tumor genotypes, or microenvironment conditions. That limits its portability and generality, and makes it unlikely that this discovery alone becomes the next big drug candidate for a cancer cure. It does, however, show promise for finding personalized treatments in individual cases. In sum, while this announcement is a compelling proof-of-concept, it should be viewed as an early milestone rather than a guarantee of immediate therapies.

The Google Gemma work is part of a broader movement in biomedicine: applying machine learning, generative AI, and systems modeling to drug discovery, target identification, and precision oncology. Here are several parallel or complementary efforts:

AI’s strengths and maturity

  1. Pattern recognition and prediction
    AI is quite strong at analyzing high-dimensional molecular, genomic, imaging, or other omics data to find patterns, classify disease subtypes, predict outcomes, or uncover latent structure.
  2. Acceleration and cost reduction
    AI can reduce the search space (e.g. candidate molecules, binding sites) drastically, thereby saving time and cost in early phases of drug discovery.
  3. Hypothesis generation and repurposing
    AI is increasingly used to repurpose existing drugs for new indications, or generate mechanistic hypotheses to guide experiments (as with Gemma).
  4. Personalized or precision medicine support
    AI models combining genomics, transcriptomics, imaging, and clinical data enable more tailored therapy suggestions or biomarker discovery.
  5. Automation / closed loops
    The most advanced systems integrate AI with robotic lab work, building closed loops of “design → test → refine” faster than traditional cycles (e.g. high-throughput screening plus AI filtering). A sketch of such a loop follows this list.
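Here is a minimal sketch of the closed loop described in item 5, assuming hypothetical `propose`, `run_assay`, and `update` callables standing in for a generative model, a robotic or simulated assay, and a retraining step; no real lab-automation API is implied.

```python
# Illustrative "design -> test -> refine" loop. All callables are
# hypothetical stand-ins: `propose` generates candidates from the model,
# `run_assay` measures them, and `update` retrains on the results.
def closed_loop(model, propose, run_assay, update, rounds=5, batch_size=96):
    history = []
    for _ in range(rounds):
        candidates = propose(model, batch_size)  # design: AI proposes a batch
        results = run_assay(candidates)          # test: measure outcomes
        model = update(model, results)           # refine: learn from results
        history.extend(results)
    return model, history
```

The appeal of this structure is that each round's experimental results feed directly back into the model, so the search narrows with every cycle instead of restarting from a fixed screening library.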

AI’s key limitations and challenges

  1. Data quality, heterogeneity, bias
    Biomedical data is noisy, sparse, and heterogeneous (different labs, different platforms). AI models are vulnerable to these artifacts, which can mislead predictions.
  2. Interpretability and trust
    Many AI models are “black boxes.” In medicine, decisions often require explainability and confidence measures, which are harder to ensure.
  3. Translational gap
    Success in cell lines or animal models does not guarantee efficacy or safety in humans. The multi-scale complexity of human biology is still a huge barrier.
  4. Regulation, validation, reproducibility
    Clinical translation requires rigorous validation, trials, regulatory approvals, and reproducibility across independent datasets.
  5. Overfitting / spurious correlation
    Models may pick up correlations that do not reflect causal mechanisms. Distinguishing causation from correlation is imperative in therapeutic development (a toy illustration follows this list).
  6. Computational and scaling costs
    Training massive models on biological data demands huge compute, memory, and energy; not every lab or company can afford that.
  7. Ethical, privacy, bias issues
    Patient data is sensitive; misuse or unrecognized biases (e.g. underrepresented populations) can exacerbate inequities.
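To illustrate the spurious-correlation point in item 5, here is a toy Python example using only NumPy and purely synthetic noise: with far more measured features than samples, a common situation in genomics, some feature will correlate strongly with any outcome by chance alone.

```python
# Toy illustration of spurious correlation in high-dimensional data:
# 20,000 random "genes" measured in only 30 "patients", with a random
# outcome. All values are pure noise, yet some gene will appear to be
# a strong biomarker by chance.
import numpy as np

rng = np.random.default_rng(0)
expression = rng.normal(size=(30, 20_000))  # 30 samples x 20k genes
outcome = rng.normal(size=30)               # random outcome, no real signal

# Pearson correlation of every gene with the outcome, computed in bulk
x = expression - expression.mean(axis=0)
y = outcome - outcome.mean()
corr = (x.T @ y) / (np.linalg.norm(x, axis=0) * np.linalg.norm(y))

print(f"max |r| across genes: {np.abs(corr).max():.2f}")  # typically > 0.6
```

A naive analysis would report that top gene as a promising target; causal validation through perturbation experiments and independent cohorts is what separates signal from artifact.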

AI in its current state is most reliable in supporting decision-making, narrowing hypothesis spaces, prioritizing experiments, and improving efficiency. However, it is less reliable as a standalone drug designer; human oversight, domain expertise, and rigorous laboratory validation remain indispensable. The field is evolving: improvements in model architectures, multi-omics integration, transfer learning, interpretability, federated learning, and better experimental feedback loops will push capabilities further. Dozens of large companies have already invested heavily in new research ventures that seek to automate discovery.

Google’s Outlook towards AI

The Google Gemma / C2S-Scale result is an exciting demonstration of what next-generation AI models might achieve in hypothesis generation. It signals a possible shift from AI as a passive predictive tool to AI as a creative collaborator in science. However, it is still constrained by data, interpretability, and validation challenges, which will take a while longer to resolve.

Over the coming years, progress will likely depend on tighter integration between AI-driven hypothesis generation, automated experimentation, robust validation, and rigorous interdisciplinary collaboration between computational scientists and biologists. The Gemma work may well become a landmark case study in that evolving journey.

Read more about it here
