Protein structure prediction using deep neural networks
The overall goal of this project is to develop a data-driven, deep-learning approach for protein structure prediction. Traditional approaches to protein 3D structure prediction rely on either comparative modeling or ab initio folding. Instead, we propose a research plan that aims to explore the protein conformational space broadly and to infer an accurate scoring function for evaluating predicted protein structures. Using existing protein structure prediction algorithms and sampling techniques, we will generate a massive dataset of predicted protein models and randomly perturbed near-native decoys for all single-domain proteins. Based on this dataset, we will develop a novel structure-motif-based deep neural network to infer the hidden sequence-structure relationship and assess the structural quality of predictions. Furthermore, we will apply this deep neural network to improve existing structure prediction algorithms. In collaboration with Dr. Matthew Turk's group, we will exploit the high-performance CPU and GPU resources at NCSA and develop efficient distributed implementations to accelerate both structure generation and the training of deep neural networks.
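As a rough illustration of the decoy-generation idea, the sketch below perturbs a native structure with random noise and labels each decoy with its deviation from the native, which could serve as a training target for a quality-assessment network. This is only a minimal sketch under stated assumptions, not the project's method: the Gaussian noise model, the use of unsuperposed coordinate RMSD as the quality label, and the names `perturb`, `rmsd`, and `make_decoys` are all hypothetical choices made for illustration.

```python
import math
import random


def perturb(coords, sigma, rng):
    """Return a copy of coords with independent Gaussian noise
    (standard deviation sigma, in angstroms) added to every coordinate.
    The Gaussian noise model is an assumption for illustration."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in atom) for atom in coords]


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate
    lists. No superposition is performed, which is adequate here only
    because the decoys are not rotated or translated."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


def make_decoys(native, n_decoys, max_sigma, seed=0):
    """Generate (decoy, rmsd_to_native) pairs. Each decoy draws its own
    noise level uniformly from [0, max_sigma] so the dataset spans a
    range of structural quality."""
    rng = random.Random(seed)
    decoys = []
    for _ in range(n_decoys):
        sigma = rng.uniform(0.0, max_sigma)
        decoy = perturb(native, sigma, rng)
        decoys.append((decoy, rmsd(native, decoy)))
    return decoys


if __name__ == "__main__":
    # Toy "native": three C-alpha atoms spaced 3.8 angstroms apart.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    for decoy, score in make_decoys(native, n_decoys=3, max_sigma=2.0):
        print(f"RMSD to native: {score:.2f}")
```

In practice the perturbations would more likely be applied in torsion-angle space or through a sampling protocol, and the label would be computed after optimal superposition; the Cartesian version above is chosen only because it is short and self-contained.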