Got my structure – now what? Beyond protein structure prediction

Mark Agostino

15 January 2025

#molecular modeling #molecular modelling #alphafold 3 #molecular model #molecular dynamics

Molecular dynamics simulations of different missense variants of UBE3A

AlphaFold 3, the AI system that predicts a protein’s 3D structure from its amino acid sequence, has transformed how accurately protein structures can be predicted.

In our blog, we have previously covered details of how it works, as well as problems of biochemical and pharmaceutical interest that still remain, despite the availability of AlphaFold 3 and the general growth in AI/machine learning methods.

Notable limitations of AlphaFold 3 are that it provides limited insight into protein dynamics and does not provide realistic protein folding pathways.

The problem of movement

An important aspect of protein function is often overlooked: proteins are not static but continuously move and undergo changes in shape in liquid environments. Even relatively subtle changes to a protein’s shape can have a significant impact on function. This movement can be explored using molecular dynamics (MD) simulations.

However, MD simulations have some important limitations:

Large-scale changes in a protein’s shape, such as a conformational transition, may take several to hundreds of microseconds, while most MD simulations access timescales of tens to hundreds of nanoseconds, depending on the size of the protein. The timescales we can routinely simulate are therefore approximately 1,000-fold too short to capture conformational changes in proteins.
MD simulations tend to explore a specific region of protein conformational space. Other conformational states that might also exist under the given conditions could be missed. Even if it were practical to access the timescales at which changes in protein shape occur, we might only see conformations similar to the one from which the simulation was started.
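To put the timescale gap in concrete terms, here is a rough back-of-the-envelope sketch. The 2 fs timestep and the throughput of 100 ns per day are illustrative assumptions, not figures from this post; real values depend heavily on system size, software, and hardware.

```python
# Back-of-the-envelope cost of reaching a target timescale with unbiased MD.
# Assumptions (not from the post): a 2 fs timestep and 100 ns/day throughput.

FS_PER_NS = 1_000_000        # femtoseconds in one nanosecond
FS_PER_US = 1_000_000_000    # femtoseconds in one microsecond

TIMESTEP_FS = 2              # assumed integration timestep
NS_PER_DAY = 100             # assumed simulation throughput


def steps_needed(duration_fs: int, timestep_fs: int = TIMESTEP_FS) -> int:
    """Number of integration steps needed to cover duration_fs."""
    return duration_fs // timestep_fs


def wall_clock_days(duration_fs: int, ns_per_day: float = NS_PER_DAY) -> float:
    """Days of compute needed at the assumed throughput."""
    return (duration_fs / FS_PER_NS) / ns_per_day


routine = 100 * FS_PER_NS    # a routine 100 ns simulation
target = 100 * FS_PER_US     # a 100 microsecond conformational transition

for label, duration in [("100 ns", routine), ("100 us", target)]:
    print(f"{label}: {steps_needed(duration):.1e} steps, "
          f"{wall_clock_days(duration):.0f} days")

print(f"ratio: {target // routine}-fold")
```

Under these assumptions, the 100 ns run finishes in a day, while the 100 microsecond run would occupy the same hardware for roughly a thousand days.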

Fortunately, we can address these issues by using enhanced sampling approaches.

Overcoming MD limitations with enhanced sampling approaches

Commonly used enhanced sampling approaches include adaptive bias methods, such as metadynamics and its variants (see movie below), and replica exchange approaches.

In adaptive bias approaches, an energetic potential (the bias) is applied to encourage exploration of one or more chosen variables in a biomolecular system.

The variables should be chosen so that they both promote sampling of likely states of the system and effectively distinguish between states; examples could include different states during and related to a given conformational change (path collective variable), relative distance and orientation between different system components (such as domains within a given protein or different molecules), or protein compactness and secondary structure content (as might be used to study protein folding).
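As a minimal illustration of two such variables, the sketch below computes a compactness measure (the mass-weighted radius of gyration) and the distance between the centres of mass of two domains. The function names, the toy coordinates, and the domain definitions are all hypothetical; in practice these quantities would be computed from simulation coordinates by the simulation software itself.

```python
import numpy as np


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms (coords: N x 3)."""
    return np.average(coords, axis=0, weights=masses)


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration, a simple measure of compactness."""
    com = center_of_mass(coords, masses)
    sq_dist = np.sum((coords - com) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def domain_distance(coords, masses, domain_a, domain_b):
    """Distance between the centres of mass of two atom selections."""
    com_a = center_of_mass(coords[domain_a], masses[domain_a])
    com_b = center_of_mass(coords[domain_b], masses[domain_b])
    return float(np.linalg.norm(com_a - com_b))


# Toy "protein": four unit-mass beads on a line (arbitrary length units).
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0],
                   [4.0, 0.0, 0.0]])
masses = np.ones(4)

# Hypothetical domain definitions: first two beads versus last two beads.
rg = radius_of_gyration(coords, masses)
dist = domain_distance(coords, masses, slice(0, 2), slice(2, 4))
print(f"radius of gyration: {rg:.3f}, inter-domain distance: {dist:.3f}")
```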

The most likely state(s) of the system can generally be identified by analysing the applied energetic potential over the simulation timecourse.
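The idea can be sketched on a toy problem. The example below runs plain metadynamics on a one-dimensional double-well potential using overdamped Langevin dynamics in reduced units, then estimates the free energy as the negative of the accumulated bias. Everything here (the potential, the parameters, and the function names) is an illustrative assumption, not the setup used in the UBE3A study; real biomolecular work would rely on a dedicated MD engine.

```python
import numpy as np

KT = 1.0        # thermal energy (reduced units)
BARRIER = 5.0   # barrier height of the toy potential, in units of kT
HEIGHT = 0.5    # height of each deposited Gaussian (assumed)
WIDTH = 0.2     # width of each deposited Gaussian (assumed)


def potential_force(x):
    """Force -dU/dx for the double well U(x) = BARRIER * (x**2 - 1)**2."""
    return -4.0 * BARRIER * x * (x**2 - 1.0)


def bias(x, centers):
    """Bias potential: sum of Gaussians deposited at past values of x."""
    d = x - np.asarray(centers)
    return float(np.sum(HEIGHT * np.exp(-d**2 / (2.0 * WIDTH**2))))


def bias_force(x, centers):
    """Force -dV/dx from the deposited Gaussians."""
    d = x - np.asarray(centers)
    gauss = HEIGHT * np.exp(-d**2 / (2.0 * WIDTH**2))
    return float(np.sum(gauss * d / WIDTH**2))


def run_metadynamics(n_steps=20000, stride=100, dt=1e-3, seed=0):
    """Overdamped Langevin dynamics with a Gaussian added every `stride` steps."""
    rng = np.random.default_rng(seed)
    x = -1.0                       # start in the left-hand well
    centers = []
    traj = np.empty(n_steps)
    noise_scale = np.sqrt(2.0 * KT * dt)
    for step in range(n_steps):
        if step % stride == 0:
            centers.append(x)      # deposit a Gaussian at the current position
        force = potential_force(x) + bias_force(x, centers)
        x += force * dt + noise_scale * rng.standard_normal()
        traj[step] = x
    return traj, centers


traj, centers = run_metadynamics()

# Free energy estimate: negative of the final bias, shifted so the minimum is 0.
grid = np.linspace(-1.5, 1.5, 61)
free_energy = -np.array([bias(g, centers) for g in grid])
free_energy -= free_energy.min()
print(f"sampled range of x: {traj.min():.2f} to {traj.max():.2f}")
```

Because the bias keeps growing wherever the system lingers, the simulation is pushed out of the starting well and over the barrier, which an unbiased run of the same length would rarely manage. The plain variant shown here gives only a noisy free energy estimate; refined variants of metadynamics exist to improve convergence.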

Adaptive bias approaches come in very useful when exploring problems like receptor activation, molecular association, and protein folding. As long as appropriate variables can be identified for exploration, imagination is often the limit on what can be achieved.

And the best part: these simulations can often be implemented for only a slightly increased computational cost relative to unbiased simulations.

As an example of the power of these approaches, I have previously used metadynamics with a path collective variable to explore conformations of the ubiquitin E3 ligase UBE3A, variations of which are associated with neurodevelopmental disorders including Angelman Syndrome and Autism Spectrum Disorder. This study demonstrated that wildtype UBE3A and likely benign variants prefer to adopt active protein states, while disease-associated variants (e.g., Leu614Pro, shown in the video above) prefer to adopt alternative protein states, providing a structural biological explanation for their association with disease.

In replica exchange approaches, many replica simulations that differ in a given parameter are run in parallel and periodically attempt to exchange configurations with one another. A commonly varied parameter is temperature, as simulations at higher temperatures can capture states that are more distant from the initial state; if these states are also relevant at a low/standard temperature, they will ultimately appear within the lowest temperature replica (see figure below). Replica exchange approaches have been widely used to explore peptide dynamics and folding of small proteins. Their advantage over adaptive bias approaches is that no knowledge of the likely underlying dynamics of the system is needed; on the flip side, the need for many simulations to be conducted in parallel means that in most cases replica exchange approaches need to be run on supercomputers.

Figure: Graphical schematic of temperature replica exchange molecular dynamics, depicting four parallel simulations at different temperatures that attempt to exchange every m steps. Image source: Christopher Rowley, CC BY-SA 4.0, via Wikimedia Commons.
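The exchange step itself is compact. The sketch below shows the standard Metropolis criterion for swapping configurations between two temperatures, together with a geometrically spaced temperature ladder, which is one common way of choosing replica temperatures. The function names, the four-replica ladder from 300 K to 400 K, and the example energies are assumptions made purely for illustration.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def temperature_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced temperatures from t_min to t_max (inclusive)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio**i for i in range(n_replicas)]


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of swapping the configurations held at two
    temperatures, given their potential energies in kJ/mol."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # min(1, exp(delta)), written so that exp never overflows.
    return math.exp(min(0.0, delta))


ladder = temperature_ladder(300.0, 400.0, 4)

# Colder replica holds the higher-energy configuration: the swap moves the
# lower-energy configuration to the lower temperature and is always accepted.
p_favourable = swap_probability(-100.0, -120.0, ladder[0], ladder[1])

# Colder replica already holds the lower-energy configuration: the swap is
# accepted only some of the time.
p_unfavourable = swap_probability(-120.0, -100.0, ladder[0], ladder[1])

print([round(t, 1) for t in ladder])
print(p_favourable, round(p_unfavourable, 3))
```

The acceptance probability drops quickly as the energy distributions of neighbouring temperatures stop overlapping, which is why larger systems need more closely spaced replicas and therefore more parallel simulations.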

How can I make use of these approaches?

Unfortunately, it is unlikely that these kinds of calculations will be available in a “black box” manner anytime soon, although graphical tools to facilitate setup of enhanced sampling simulations may increase their uptake.

Successful implementation of these approaches still requires specialist knowledge in protein structural biology and molecular simulation, as each problem requires a specific setup and careful selection of parameters.

Additionally, while the approaches covered in this post allow for a thorough exploration of protein dynamics, we are often also interested in how proteins interact with other partners, such as ligands and other proteins. This requires alternative approaches that we will cover in another upcoming blog post.