
Thesis Defense: Filipp Gusev | December 10, 2025 | 1pm
CPCB is proud to announce the following thesis defense:
Title:
Active Learning Guidance for Chemical Experimentation and Discovery
Committee:
Olexandr Isayev, CMU, Chair
Jacob Durrant, PITT
Maria Kurnikova, CMU
Frank Leibfarth, UNC Chapel Hill
Mellon Institute, Room 355
Abstract:
This thesis examines how Machine Learning (ML), and Active Learning (AL) in particular, can transform chemical discovery from a trial-and-error process into a strategic and data-efficient endeavor. Across three aims and six case studies spanning drug discovery, autonomous experimentation, and materials science, AL is employed as a unifying framework to navigate vast chemical and experimental design spaces while evaluating only a small fraction of candidates. Aim I develops AL-guided workflows that couple ML with physics-based binding free energy calculations for structure-based drug discovery in ultra-large virtual libraries. For the SARS-CoV-2 papain-like protease, AL-guided thermodynamic integration identifies 133 improved compounds, including 16 with more than 100-fold potency enhancement, using only 253 calculations from an 8,175-compound library derived from 1.3 billion purchasable molecules. For the LRRK2 WDR domain, AL-guided relative binding free energy optimization achieves a 23% experimental hit rate (8 binders from 35 tested), demonstrating that AL can render gold-standard simulations practical at scale.
Aim II integrates pool-based AL and AutoML with a closed-loop flow-synthesis platform to discover 19F MRI copolymers, identifying high-performance formulations while sampling less than 1% of a large combinatorial space. In parallel, a human-in-the-loop AL framework is applied to the development of an anomaly detector for automated HPLC systems, enabling real-time quality control in emerging self-driving laboratories. Aim III extends ML- and AL-guided strategies to materials discovery under multi-objective and multi-scale constraints: ML-guided screening of crystallizable organic semiconductors achieves a 50% success rate (3 of 6 candidates crystallizing as platelets) from an initial library of 462,000 molecules, while human-in-the-loop reinforcement learning for 3D-printable elastomers discovers 19 materials that double average toughness to 15.9 MPa, with >10 MPa strength and >200% extensibility. Taken together, these studies establish AL as a general-purpose engine for intelligent navigation in chemical and experimental spaces.