Variation in the amino acid sequence of proteins connects changes in the genome to the resulting changes in phenotype. However, the biophysics of these molecular machines is complex, and predicting how a mutation in the protein sequence will manifest as a functional change remains an unsolved challenge. This is especially true for amino acid substitutions that do not severely disrupt protein function, such as those that lead to genetic disease or accumulate over time.
A protein's function is inherently tied to its structure, but it is important to remember that these macromolecules are not locked into a single conformation. Instead, the protein structure constantly undergoes thermal fluctuations between energetically similar states. In the cover image for the January 19 issue of Biophysical Journal, we show variants in calmodulin as an ensemble of structures derived from molecular dynamics simulations.
The crystal structure is shown in orange, while populated “meta-conformations” are illustrated by the colored ribbons. Each ribbon is derived from a cluster of thousands of simulation-generated structures, and together they represent the color-coded conformational profile of the variant protein. When the structures are aligned by the bound peptide, the differences in the orientation of the loop and helix structures highlight the divergent effects of the different mutations on the protein’s conformational dynamics.
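The idea of reducing thousands of simulation frames to a few representative structures can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the clustering procedure used in the study: it assumes each frame has already been aligned and flattened into a coordinate vector, uses plain k-means with a deterministic farthest-point start, and runs on synthetic data. All function names are hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two flattened coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_point_init(frames, k):
    """Start from the first frame, then repeatedly add the frame
    farthest from every center chosen so far (deterministic)."""
    centers = [frames[0]]
    while len(centers) < k:
        centers.append(
            max(frames, key=lambda f: min(distance(f, c) for c in centers))
        )
    return [list(c) for c in centers]


def cluster_frames(frames, k, iterations=25):
    """Plain k-means (Lloyd's algorithm) over pre-aligned frames.
    Assumption: the study may have used a different algorithm."""
    centers = farthest_point_init(frames, k)
    labels = [0] * len(frames)
    for _ in range(iterations):
        # Assign every frame to its nearest center.
        labels = [
            min(range(k), key=lambda j: distance(f, centers[j])) for f in frames
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def representatives(frames, labels, centers):
    """Map each non-empty cluster to the index of the frame closest to its
    centroid; that frame stands in for the whole cluster."""
    reps = {}
    for j, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            reps[j] = min(members, key=lambda i: distance(frames[i], center))
    return reps


# Synthetic stand-in for a trajectory: two well-separated conformational states.
rng = random.Random(0)
frames = [[rng.gauss(0.0, 0.1) for _ in range(6)] for _ in range(50)]
frames += [[rng.gauss(5.0, 0.1) for _ in range(6)] for _ in range(50)]

labels, centers = cluster_frames(frames, k=2)
reps = representatives(frames, labels, centers)
print("cluster sizes:", [labels.count(j) for j in range(2)])
print("representative frame per cluster:", reps)
```

With real trajectories, the frames would come from a simulation package and the distance would typically be an RMSD after superposition. The structure of the calculation stays the same: group similar frames, then pick one frame per group to draw.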
The high dimensionality of the structural ensemble presents challenges for understanding protein structure, but it also contains information that can be leveraged to understand protein function. In our study, we show that these dynamic representations of variant protein structures capture information related to the type and severity of the functional disruption. Moreover, the performance of machine learning models that use these features suggests that they can be used to distinguish between closely related disease mechanisms.
Innovative approaches that are pushing the limits of biophysical simulations, decreasing computational costs, and the rapid development of machine learning algorithms will continue to give us a better understanding of the intricate relationship between sequence, structure, and function, and provide us with increasingly useful tools for understanding the complexity of life.
- Matthew McCoy, John Hamre, Dmitri Klimov, Mohsin Jafri