A Deep Dive into Data Readiness for AI-Enabled Biologic Therapeutics Design: From Data Foundations to Scalable AI‑Integrated Discovery Workflows
AI/ML adoption in biologics discovery is often constrained not by model capability, but by fragmented data, inconsistent experimental protocols, and limited interoperability across discovery workflows. This workshop focuses on the data, workflow, and operational foundations required to deploy AI meaningfully and at scale across protein‑based discovery campaigns.
Participants will explore:
- Strategies for generating, capturing, and standardizing diverse experimental datasets to support AI‑ready biologics discovery, including how to leverage and derisk historical datasets
- How to benchmark biologics data maturity against small-molecule discovery – which lessons translate, and which do not?
- What it takes to digitalize discovery workflows, including: NGS data alignment to antibody CDRs (see the CDR‑extraction sketch after this list); epitope discovery and sequence diversification; and library analysis and assay‑spanning data integration
- Implementing harmonized experimental protocols and metadata standards to enable robust AI/ML training and validation (a minimal metadata‑schema sketch follows this list)
- Integrating in silico tools with conventional computational methods and wet‑lab validation through human‑in‑the‑loop QA/QC (see the review‑gate sketch below)
- Best practices for scaling AI/ML workflows across the enterprise, including: MLOps and dataset versioning (a content‑hash versioning sketch follows this list); bias detection and mitigation; and AI/ML model generalizability across discovery campaigns
- Emerging federated learning and consortium‑led approaches (e.g. FAITE) to address data scarcity, heterogeneity, and inter‑lab/inter‑protocol variability (a minimal federated‑averaging sketch follows this list)
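
To make the NGS‑to‑CDR alignment topic concrete, here is a minimal sketch of slicing CDR loops out of an antibody chain once IMGT numbering has been assigned upstream (for example by a tool such as ANARCI). The `(position, residue)` input format and the toy fragment are illustrative assumptions, not any specific tool's output; the IMGT CDR boundaries themselves (27–38, 56–65, 105–117) are standard.

```python
# Minimal sketch: slicing CDR loops out of an IMGT-numbered antibody chain.
# Numbering is assumed to have been assigned upstream (e.g. by a tool such
# as ANARCI); the (position, residue) input format is an illustrative
# assumption, and insertion codes are ignored for simplicity.

# IMGT CDR boundaries (inclusive IMGT positions).
IMGT_CDRS = {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)}

def extract_cdrs(imgt_numbered):
    """imgt_numbered: list of (imgt_position, residue) tuples, e.g. [(27, 'G'), ...]."""
    cdrs = {name: "" for name in IMGT_CDRS}
    for pos, residue in imgt_numbered:
        for name, (start, end) in IMGT_CDRS.items():
            if start <= pos <= end and residue != "-":
                cdrs[name] += residue
    return cdrs

# Toy fragment covering IMGT positions 25-40 (fabricated for illustration):
fragment = list(zip(range(25, 41), "TVSGGSISSYYWSWIR"))
print(extract_cdrs(fragment)["CDR1"])  # residues at IMGT positions 27-38
```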
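
On harmonized metadata standards, the following is a minimal sketch of what a shared assay‑record schema with light validation could look like. The field names and controlled vocabularies here are assumptions chosen for illustration, not an established community standard.

```python
# Minimal sketch of a harmonized assay-metadata record with light validation.
# Field names and controlled vocabularies are illustrative assumptions, not
# an established community standard.
from dataclasses import dataclass

ALLOWED_ASSAYS = {"SPR", "ELISA", "BLI"}
ALLOWED_UNITS = {"pM", "nM", "uM"}

@dataclass
class AssayRecord:
    sample_id: str
    assay_type: str        # e.g. "SPR"
    readout_value: float   # e.g. binding affinity (KD)
    readout_units: str     # controlled vocabulary, e.g. "nM"
    protocol_version: str  # ties the record to a harmonized SOP version
    instrument_id: str
    run_date: str          # ISO 8601, e.g. "2024-05-01"

    def validate(self):
        errors = []
        if self.assay_type not in ALLOWED_ASSAYS:
            errors.append(f"unknown assay_type: {self.assay_type}")
        if self.readout_units not in ALLOWED_UNITS:
            errors.append(f"unknown readout_units: {self.readout_units}")
        if self.readout_value < 0:
            errors.append("readout_value must be non-negative")
        return errors

rec = AssayRecord("mAb-001", "SPR", 3.2, "nM", "SOP-BIND-v2", "Biacore-07", "2024-05-01")
print(rec.validate())  # [] when the record conforms to the shared schema
```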
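
For the human‑in‑the‑loop QA/QC point, one simple pattern is a confidence‑based review gate: in silico predictions above a threshold move on toward wet‑lab queuing, while the rest are routed to manual review. The threshold value and record layout below are assumptions for the sake of the sketch.

```python
# Minimal sketch of a human-in-the-loop gate between in silico scoring and
# wet-lab validation. Threshold and record fields are illustrative assumptions.

REVIEW_THRESHOLD = 0.8  # assumed confidence cutoff

def triage(predictions):
    """predictions: list of dicts with 'candidate_id' and 'confidence' keys.
    Returns (auto_approved, needs_human_review)."""
    auto, review = [], []
    for p in predictions:
        (auto if p["confidence"] >= REVIEW_THRESHOLD else review).append(p)
    return auto, review

preds = [
    {"candidate_id": "Ab-17", "confidence": 0.93},
    {"candidate_id": "Ab-42", "confidence": 0.61},
]
approved, flagged = triage(preds)
print([p["candidate_id"] for p in flagged])  # ['Ab-42'] goes to manual QA/QC
```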
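
On MLOps and dataset versioning, a minimal sketch of content‑addressed dataset snapshots: hashing a canonical serialization of the records yields a version identifier that can be logged alongside every trained model. Dedicated tools such as DVC handle this in practice; the helper below is an illustrative assumption, not their API.

```python
# Minimal sketch of dataset versioning via content hashing: any change to the
# records changes the fingerprint, which can be logged with each model run.
# In practice a dedicated tool (e.g. DVC) would manage this; this helper is
# an illustrative assumption.
import hashlib
import json

def dataset_fingerprint(records):
    """records: list of JSON-serializable dicts (assay rows, sequences, ...)."""
    ordered = sorted(records, key=lambda r: json.dumps(r, sort_keys=True))
    canonical = json.dumps(ordered, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

v1 = [{"sample_id": "mAb-001", "kd_nM": 3.2}, {"sample_id": "mAb-002", "kd_nM": 15.0}]
v2 = v1 + [{"sample_id": "mAb-003", "kd_nM": 0.9}]
print(dataset_fingerprint(v1) == dataset_fingerprint(v2))  # False: new version
```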
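
Finally, on federated approaches, a minimal federated‑averaging sketch: each lab trains or updates a model locally and shares only parameter vectors plus sample counts, so raw sequences and assay data never leave the site. The parameter values below are toy assumptions; real consortium deployments add secure aggregation and much more.

```python
# Minimal sketch of federated averaging (FedAvg-style): sites share only
# model parameters and sample counts, never raw data. Parameter vectors
# below are toy values assumed for illustration.

def federated_average(site_updates):
    """site_updates: list of (params, n_samples), where params is an
    equal-length list of floats trained locally at each lab."""
    total = sum(n for _, n in site_updates)
    dim = len(site_updates[0][0])
    global_params = [0.0] * dim
    for params, n in site_updates:
        weight = n / total
        for i, p in enumerate(params):
            global_params[i] += weight * p
    return global_params

# Three labs with differently sized local datasets:
updates = [([0.10, -0.40], 500), ([0.05, -0.35], 1500), ([0.20, -0.50], 1000)]
print(federated_average(updates))  # sample-size-weighted global parameters
```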