Blending AI and Physics to See Proteins in 3D

Led by Professor Zhang Yang from NUS Computing, NUS Biochemistry, and the Cancer Science Institute of Singapore, a research team has developed a hybrid framework that combines deep learning with physics-based modelling to improve predictions of complex protein structures.

Inside every cell, proteins carry out the work of life. They catalyse chemical reactions, relay signals, regulate genes and form structural scaffolds. Yet proteins begin as something deceptively simple: a linear sequence of amino acids encoded in DNA.

From that one-dimensional chain, a precise three-dimensional (3D) structure must emerge. The sequence bends, twists, and folds into a specific form – and it is this form that determines function. A slight change in shape can alter how a protein behaves, or whether it works at all.

For decades, predicting that final folded structure from sequence alone remained one of biology’s most enduring challenges. Experimental techniques such as X-ray crystallography and cryo-electron microscopy can reveal structures in remarkable detail, but they are resource-intensive and cannot easily scale to the vast number of proteins found in nature.

In recent years, AI has transformed that landscape. Deep learning systems trained on large databases of known protein structures began predicting new ones with striking accuracy, raising hopes that structure prediction at scale was finally within reach.

Prof Zhang observed these developments with admiration – and caution.

“Deep learning models are extremely powerful at recognising structural patterns,” he says. “But proteins are physical systems. Their shapes are governed by molecular forces. If we want structures that are not only plausible, but physically sound, those forces still matter.”

Rather than viewing AI as a replacement for physics-based modelling, Prof Zhang saw an opportunity to integrate the two.

That conviction led to his team’s latest study, published in Nature Biotechnology. The work introduces a hybrid computational framework – D-I-TASSER – that combines AI-derived structural restraints with physics-based assembly and refinement to improve predictions of complex, multi-domain protein structures.

Where Prediction Becomes Subtle

Some proteins are relatively compact and fold into a single structural domain. Others – especially in humans – are multi-domain proteins, composed of several folded regions connected together.

Each domain may fold independently. But how those domains position themselves relative to one another can determine how the protein behaves.

Predicting that assembly is not straightforward.

Recent AI-based systems have achieved remarkable success in modelling many protein structures. Yet multi-domain proteins remain more challenging. Even when individual domains are predicted accurately, determining how they assemble – how they orient and stabilise in three-dimensional space – adds another layer of complexity.

For Prof Zhang, this was precisely where integration could make a difference.

Letting AI and Physics Inform Each Other

D-I-TASSER begins with deep learning. AI models are used to infer spatial restraints – predictions about which amino acid residues are likely to be near each other and how different regions may be oriented relative to one another.

These predictions provide a structural starting map.

But instead of treating that map as definitive, the framework incorporates physics-based simulations to assemble and refine candidate structures. Molecular interactions are explicitly modelled. Conformations are evaluated for energetic plausibility. Domains are adjusted until the final configuration is both statistically supported by data and consistent with physical principles.

In practical terms, AI proposes structural possibilities; physics evaluates whether those possibilities can realistically hold together.
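To make the "AI proposes, physics evaluates" idea concrete, here is a deliberately simplified sketch in Python. It is not D-I-TASSER's algorithm or energy function: the bead-chain model, the function names, the restraint list and every numeric value (bond length, clash distance, temperature) are assumptions chosen purely for illustration. Predicted distance restraints stand in for the AI output, a bond-length and steric-clash penalty stands in for the physics, and a basic Metropolis Monte Carlo search looks for a shape that satisfies both.

```python
import math
import random


def restraint_energy(coords, restraints):
    """Penalty for deviating from 'AI-predicted' pairwise distances (hypothetical)."""
    return sum(
        (math.dist(coords[i], coords[j]) - target) ** 2
        for i, j, target in restraints
    )


def physics_energy(coords, bond_length=3.8, clash_distance=3.0):
    """Toy physical terms: keep neighbouring beads bonded, forbid overlaps."""
    energy = 0.0
    n = len(coords)
    for i in range(n - 1):
        # Consecutive beads should stay close to the assumed bond length.
        energy += (math.dist(coords[i], coords[i + 1]) - bond_length) ** 2
    for i in range(n):
        for j in range(i + 2, n):
            # Non-bonded beads are penalised only when they overlap.
            d = math.dist(coords[i], coords[j])
            if d < clash_distance:
                energy += (clash_distance - d) ** 2
    return energy


def total_energy(coords, restraints, weight=1.0):
    """Combined score: data-driven restraints plus physical plausibility."""
    return weight * restraint_energy(coords, restraints) + physics_energy(coords)


def refine(coords, restraints, steps=5000, step_size=0.5, temperature=1.0, seed=0):
    """Metropolis Monte Carlo: perturb one bead at a time, keep the best shape seen."""
    rng = random.Random(seed)
    current = [tuple(p) for p in coords]
    current_e = total_energy(current, restraints)
    best, best_e = list(current), current_e
    for _ in range(steps):
        k = rng.randrange(len(current))
        moved = tuple(c + rng.uniform(-step_size, step_size) for c in current[k])
        trial = current[:k] + [moved] + current[k + 1:]
        trial_e = total_energy(trial, restraints)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


# A fully extended 8-bead chain and three made-up long-range restraints.
start = [(3.8 * i, 0.0, 0.0) for i in range(8)]
restraints = [(0, 7, 6.0), (1, 6, 6.0), (2, 5, 5.0)]

initial_energy = total_energy(start, restraints)
final_coords, final_energy = refine(start, restraints)
print(f"energy before: {initial_energy:.1f}, after: {final_energy:.1f}")
```

The point of the sketch is the shape of the scoring function: the restraint term pulls the chain toward what the data suggests, while the physical term rejects shapes that stretch bonds or let beads overlap. The `weight` argument controls how much the search trusts the predicted restraints relative to the physical terms.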

For multi-domain proteins, this interplay is especially important. The way domains pack and interact can influence biological function. By combining pattern recognition with physical modelling, the framework seeks to improve accuracy at this delicate stage of structural prediction.

What the Results Show

In benchmark evaluations reported in the study, the hybrid framework demonstrated improved structural accuracy on complex multi-domain protein targets compared with earlier versions of the team’s modelling methods. It also showed competitive or improved performance relative to leading AI-based approaches assessed under comparable testing conditions.

The gains were particularly evident in modelling inter-domain orientations – spatial relationships that are often among the hardest to predict reliably.

These findings do not suggest that AI-based methods are insufficient. Rather, they indicate that integrating physics-based refinement can strengthen predictions in specific structural contexts, particularly where domain assembly is intricate.

Complementary, Not Competitive

Scientific advances are often framed as disruption – one method replacing another. But protein structure prediction tells a more layered story.

Deep learning has dramatically expanded the scale and accessibility of structural modelling. Physics-based approaches encode decades of insight into how molecules behave under real-world forces. Each captures a different aspect of the problem.

Prof Zhang’s work reflects a belief that progress does not require abandoning foundations. Sometimes it requires reconnecting them – allowing data-driven learning and physical law to reinforce one another.

As computational tools continue to evolve, the future of structural biology may lie not in choosing between methods, but in combining them thoughtfully.

Better Understanding of Human Biology

One reason for the framework’s improved performance lies in its domain-splitting strategy. By breaking complex proteins into constituent domains before assembling them through physics-based refinement, D-I-TASSER is able to model large, multi-domain proteins more effectively.

This approach is particularly valuable when structural complexity is high or when sequence information alone provides limited guidance. In such cases, integrating physical modelling can help refine domain orientations and improve overall structural coherence.

With improved predictions for human multi-domain proteins, the implications extend beyond computational benchmarking. Many human proteins – including those involved in disease pathways – consist of multiple interacting domains. A clearer understanding of how these domains assemble in three-dimensional space can provide deeper insight into biological mechanisms.

While computational models do not replace laboratory validation, stronger predictive frameworks can help guide experimental design, prioritise targets and inform downstream research in areas ranging from fundamental biology to therapeutic discovery.

Artificial intelligence has transformed protein structure prediction. By integrating AI with physics-based modelling, Prof Zhang and his team demonstrate that complementary approaches can further expand what is possible.

Between data-driven learning and physical law, a more detailed map of human biology continues to emerge.
