Project Info

Nonlinear Dimension Reduction and Machine Learning Algorithms for Coarse-grained Simulations of Biomolecular Complexes

Steve Pankavich
Alex Pak

Project Goals and Description:

Biomolecules (e.g., proteins, lipids, nucleic acids, and carbohydrates) assemble into higher-order structures, so-called biomolecular complexes, to perform a variety of biological functions. Scientists are interested in redesigning biomolecular complexes for applications across biotechnology, including sustainability and healthcare. Doing so requires a molecular understanding of emergent biomolecular assembly, which is difficult to probe through experiments alone. These insights can be revealed through multiscale dynamical simulations that leverage coarse-graining, a process that derives lower-resolution models from high-resolution (e.g., atomic) models. The project focuses on implementing multiscale dynamical simulations of macromolecular complexes by developing a holistic modeling framework that uses (i) supervised, nonlinear dimension reduction techniques to coarse-grain atomistic variables and (ii) machine learning algorithms to restore the fine, atomic details of coarsened structures. This will require the exploration of new computational feature projection algorithms (e.g., Principal Component Analysis) and probabilistic generative models (e.g., Autoencoders).
All team members will work together as part of a collaborative research group. IMURF participants will regularly meet with professors and PhD students in a communal setting. Everyone on the project will meet weekly or bi-weekly as appropriate to discuss progress and share new ideas.
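
For readers new to coarse-graining, the sketch below shows one of its simplest forms: grouping atoms into "beads" placed at each group's center of mass. This is only an illustration of deriving a lower-resolution model from atomic coordinates, not the project's method, which targets supervised, nonlinear reductions. The function name `coarse_grain`, the atom-to-bead assignment, and the coordinates and masses are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def coarse_grain(
    positions: Sequence[Vec3],
    masses: Sequence[float],
    bead_of_atom: Sequence[int],
) -> List[Vec3]:
    """Map atoms onto beads located at each group's center of mass.

    bead_of_atom[i] is the (hypothetical) index of the bead that atom i
    belongs to; bead indices are assumed to run from 0 to n_beads - 1.
    """
    if not (len(positions) == len(masses) == len(bead_of_atom)):
        raise ValueError("positions, masses, and bead_of_atom must have equal length")
    if not positions:
        return []

    n_beads = max(bead_of_atom) + 1
    total_mass = [0.0] * n_beads
    weighted_sum = [[0.0, 0.0, 0.0] for _ in range(n_beads)]

    # Accumulate the total mass and the mass-weighted coordinates per bead.
    for pos, mass, bead in zip(positions, masses, bead_of_atom):
        total_mass[bead] += mass
        for axis in range(3):
            weighted_sum[bead][axis] += mass * pos[axis]

    if any(m <= 0.0 for m in total_mass):
        raise ValueError("every bead needs at least one atom with positive mass")

    # Center of mass = mass-weighted coordinate sum / total bead mass.
    return [
        (
            weighted_sum[b][0] / total_mass[b],
            weighted_sum[b][1] / total_mass[b],
            weighted_sum[b][2] / total_mass[b],
        )
        for b in range(n_beads)
    ]


# Four made-up atoms reduced to two beads.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 4.0, 0.0)]
masses = [1.0, 1.0, 3.0, 1.0]
bead_of_atom = [0, 0, 1, 1]

beads = coarse_grain(positions, masses, bead_of_atom)
print(beads)  # [(1.0, 0.0, 0.0), (10.0, 1.0, 0.0)]
```

A fixed mapping like this discards the atomic detail inside each bead. Recovering that detail is the task that the generative models mentioned above are meant to address.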

More Information:

Grand Challenge: Engineer better medicines.

Primary Contacts:

Steve Pankavich | Alex Pak

Student Preparation


Interested students should be familiar with linear algebra (MATH 332) and scientific computing (MATH 307) and have some previous coding experience in Python or a related language. Prior experience with data science algorithms, for instance from the course DSCI 303, is also preferred. Finally, students should be open to learning more about the foundational biophysics underlying the problems of interest.




Students will gain theoretical, analytical, and computational skills while learning and implementing new methods from data science and machine learning.


The students will meet weekly with Profs. Pak and Pankavich and PhD students in QBE, CBE, or AMS.

Preferred Student Status
