Population Health Data Science
We are building tooling and pipelines for population health data science and AI, including software applications for data cleaning, and several models and segmentation applications that are in later stages of development.
This work now supports two real-world pathways in a number of London Integrated Care Boards (ICBs). It is the first step in establishing a full Learning Health System, and will be the foundation for future data-driven population health in London.
Infrastructure-as-code
Building blocks for advanced analytics, as reproducible code pipelines
- Platform Infrastructure + Standardised Data
The London Data Service in Snowflake provides a shared platform and common data model for London that supports the use of open-source code pipelines and tooling for analytics, data science, and machine learning.
- Tooling + Pipelines, Reproducible and Shareable as Code
Given compatible data, code and tools can be used across different teams and different locations. This tooling is additional ‘infrastructure’ that supports the entire data science and AI lifecycle. It includes tools for understanding data quality and data drift in detail, and ways to represent patients that are clinically meaningful.
- Data Science + Machine Learning
This comes last. With richer data and richer patient representation, and a platform to conduct experiments and deploy models, data science can be embedded routinely into practice, and predictive modelling can be quickly scaled in response to urgent priorities.
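As an illustration of the kind of shareable tooling described above, the following is a minimal sketch of a data drift check using the Population Stability Index (PSI), a common general-purpose drift measure. It is not taken from the project's own pipelines: the function name `population_stability_index`, its parameters, and the example values are all hypothetical, and the choice of PSI itself is an assumption made for the example.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare the distribution of one numeric field across two extracts.

    Hypothetical helper for illustration only. Bins are equal-width and
    derived from the baseline; values outside that range are counted in
    the first or last bin.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread, so it cannot be binned")
    width = (hi - lo) / n_bins

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clip out-of-range values
            counts[idx] += 1
        # Floor at eps so that empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Made-up example values: patient ages in an earlier and a later extract.
ages_earlier = [float(a) for a in range(100)]
ages_later = [a + 30.0 for a in ages_earlier]

print(population_stability_index(ages_earlier, ages_earlier))
print(population_stability_index(ages_earlier, ages_later))
```

A PSI of zero means the two binned distributions are identical. As a widely used rule of thumb, values above roughly 0.25 are treated as a substantial shift, although any threshold would need to be agreed for the specific dataset. Because the check is plain code, the same function could in principle be run by different teams on any data that follows the common data model.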