Collective variable discovery and enhanced sampling in biomolecular simulation using autoencoders

Andrew Ferguson, University of Chicago

Usage Details

Andrew Ferguson, Wei Chen

The aim of this work is to establish a nonlinear machine learning approach that discovers collective variables for protein folding and uses these variables to perform enhanced sampling in molecular dynamics simulations. The predictive capacity of molecular dynamics simulations of protein folding is limited because the accessible simulation time scales are short compared to the characteristic time scales required to surmount high free energy barriers. Existing nonlinear dimensionality reduction techniques can discover good collective variables for accelerated sampling that are correlated with important molecular motions, but they do not furnish an explicit mapping from atomic coordinates to the collective variables, so biased sampling must be conducted inefficiently and indirectly in proxy variables.

We recently developed the theoretical underpinnings of the first technique capable of performing on-the-fly collective variable discovery and accelerated sampling, and validated the approach in small peptide systems. The dual goals of this work, enabled by the unique capabilities of Blue Waters, are to extend the approach to, and validate it on, large proteins of biological relevance to anti-cancer and anti-HIV drug development, and to explicitly incorporate solvent degrees of freedom into collective variable discovery, since solvent is a crucial determinant of the dynamics of large proteins.
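
To illustrate why an explicit coordinate mapping matters for biased sampling, the following is a minimal sketch, not the authors' implementation. It assumes the collective variable is the output of a small neural network encoder (the encoding half of an autoencoder) acting on a vector of input coordinates. Because the mapping is an explicit differentiable function, a harmonic bias in the collective variable can be converted directly into forces on the inputs through the chain rule. The network size, the random weights, and the names `encoder`, `encoder_gradient`, and `bias_force` are all hypothetical choices made for illustration.

```python
import math
import random


def encoder(x, W, b, v):
    """Hypothetical one-hidden-layer encoder.

    Returns the scalar collective variable
        xi(x) = sum_j v[j] * tanh(W[j] . x + b[j])
    together with the hidden activations (reused by the gradient).
    """
    hidden = [
        math.tanh(sum(w * xc for w, xc in zip(row, x)) + bj)
        for row, bj in zip(W, b)
    ]
    xi = sum(vj * hj for vj, hj in zip(v, hidden))
    return xi, hidden


def encoder_gradient(x, W, b, v):
    """Analytic gradient d(xi)/dx obtained with the chain rule."""
    _, hidden = encoder(x, W, b, v)
    grad = [0.0] * len(x)
    for row, vj, hj in zip(W, v, hidden):
        scale = vj * (1.0 - hj * hj)  # derivative of tanh is 1 - tanh^2
        for i, w in enumerate(row):
            grad[i] += scale * w
    return grad


def bias_force(x, W, b, v, xi0, k):
    """Force from the harmonic bias U = 0.5 * k * (xi(x) - xi0)^2.

    F = -dU/dx = -k * (xi(x) - xi0) * d(xi)/dx
    """
    xi, _ = encoder(x, W, b, v)
    grad = encoder_gradient(x, W, b, v)
    return [-k * (xi - xi0) * g for g in grad]


# Illustrative setup: random weights stand in for a trained encoder.
random.seed(0)
n_in, n_hidden = 6, 4
W = [[random.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
b = [random.uniform(-1.0, 1.0) for _ in range(n_hidden)]
v = [random.uniform(-1.0, 1.0) for _ in range(n_hidden)]
x = [random.uniform(-1.0, 1.0) for _ in range(n_in)]

# Bias force pulling the collective variable toward xi0 = 0.5.
force = bias_force(x, W, b, v, xi0=0.5, k=10.0)
```

In a method without an explicit mapping, the gradient in `encoder_gradient` is unavailable, which is why the bias would have to be applied in proxy variables instead.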