Large Language Models (‘OncoLlama’) for Pan-Cancer Data Enrichment
Topics: Platforms, Funding, AI Applications
AI Centre receives funding from NHSE ‘Driver’ project to deploy OncoLlama in five cancer centres across the UK.
Every year, 3 million people with cancer across England receive NHS care. This care generates notes containing rich and detailed information about their diagnosis, genetic biomarkers, disease course, treatments, and outcomes. However, these crucial details are buried in millions of documents - such as letters between clinicians, multidisciplinary team meeting reports and test reports - written in everyday medical language instead of computer-readable codes. As a result, the coded records used for research and service planning are incomplete and capture only basic details. This gap severely limits our ability to understand cancer patterns, improve services, and find patients for clinical trials.
This project brings together five major cancer centres from London, Cambridge, Leeds, and Kent-Medway to deploy 'OncoLlama' - an NHS-developed AI tool that reads clinical documents and automatically extracts detailed cancer information with >98.5% accuracy. Rather than relying on expensive and time-consuming manual curation, OncoLlama can be run inside NHS firewalls to safely turn millions of cancer records into comprehensive cancer datasets.
We aim to demonstrate value through three studies. First, we will validate OncoLlama-generated data and conduct a joint analysis of cancer inequalities across our populations to understand differences in diagnosis, access to treatment including trials, and outcomes. Second, we will show how this technology can improve the completeness of national cancer records. Third, we will demonstrate how enriched data can accelerate and broaden patient identification and workflows in clinical trials, potentially opening up access to innovative treatments for marginalised populations.
In a broad collaboration, global technology partners will support this project through cloud infrastructure and AI expertise, while international life sciences companies will help validate our approach against industry standards.
We will build this solution to be portable and shareable. Any NHS cancer centre will be able to use our deployment package, documentation, and training materials to implement OncoLlama. We will also demonstrate how such collaborations can be conducted at scale across the NHS SDE network.
By demonstrating that scalable, high-quality cancer data curation is possible, this project will revolutionise how we understand and improve cancer care, ultimately helping patients receive the right treatment sooner.