A New Grasp on Robotics: Teaching Robots to Hold the Future

4 June 2025

Walk into any modern warehouse or high-tech factory and you’ll find robots moving with impressive precision. They can zip down aisles, lift heavy loads, and work around the clock. But look closer, especially when they need to pick up something delicate or oddly shaped, and you’ll likely notice a struggle. Despite decades of progress, robots still have trouble with what might seem like a simple task: grasping objects as reliably and flexibly as a human hand.

This seemingly basic problem of “dexterous grasping” isn’t just a nuisance. It’s a fundamental barrier to the next wave of robotic applications.  Think of robots assisting in surgery, delicately assembling electronics, or even helping out at home by picking up toys or sorting groceries. The stakes are high, and the solution has remained elusive.

That’s what makes the recent research from Assistant Professor Shao Lin at NUS Computing so exciting. A team of scientists from NUS School of Computing and Shanghai Jiao Tong University has developed a novel framework called D(R, O) Grasp that promises to bring human-like dexterity within reach of robotic hands. It combines deep learning with a new spatial representation that allows robots to learn grasping strategies that are fast, accurate, and, critically, generalizable across different robot hands and objects.

The research, grounded in both theory and real-world validation, may well mark a turning point in robotic manipulation. And yes, it all begins with a grasp.

 

Why Grasping Is So Hard for Robots

Humans make grasping look easy. We can pick up a slippery glass, turn a doorknob, or gently cradle a bird. But under the surface, these actions require sophisticated spatial awareness, precise motor control, and the ability to adapt to countless variations in shape, size, weight, and texture.

For robots, there have traditionally been two approaches to this problem. The first is the robot-centric method, where the robot learns to move its own joints and fingers to pick up an object. These models are fast, but they’re also brittle: change the object slightly, or swap out the robot hand, and they often fail. The second approach is object-centric. These models analyze the object’s shape and attempt to compute good grasps independent of the robot body. They’re more flexible but often slow and computationally costly.

In other words, robot-centric methods are fast but narrow; object-centric methods are general but slow. What if there were a way to get the best of both?

 

Introducing D(R, O) Grasp: A Unified Approach

That’s exactly what D(R, O) Grasp tries to do. The core idea is deceptively simple: instead of focusing only on the object or only on the robot, D(R, O) Grasp looks at the relationship between the two.

This relationship is captured in something called a D(R, O) matrix, which encodes the relative spatial distances between key points on the robot’s hand (in its desired grasp pose) and points on the object. Think of it as a 3D map that shows exactly how the hand and object should align for a successful grasp. The beauty of the D(R, O) representation is that it works across different hand designs and object geometries. It captures the essence of the interaction between the robot and what it’s trying to hold.
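For readers who like to see the idea in code, here is a minimal sketch, in Python with NumPy, of how such a distance matrix could be assembled from two point clouds. The point counts and the random stand-in data are illustrative only, not the authors’ implementation.

```python
import numpy as np

def dro_matrix(hand_points: np.ndarray, object_points: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between hand points and object points.

    hand_points:   (N, 3) points sampled on the robot hand in its grasp pose.
    object_points: (M, 3) points sampled on the object surface.
    Returns an (N, M) matrix whose entry (i, j) is the distance between
    hand point i and object point j.
    """
    diff = hand_points[:, None, :] - object_points[None, :, :]  # (N, M, 3)
    return np.linalg.norm(diff, axis=-1)                        # (N, M)

# Toy usage with random stand-in point clouds.
hand = np.random.rand(512, 3)
obj = np.random.rand(512, 3)
print(dro_matrix(hand, obj).shape)  # (512, 512)
```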

 

Learning to Know Its Own Hand

But creating this kind of generalizable knowledge requires something more than just data; it requires self-awareness, or at least something close to it.

Before D(R, O) Grasp learns how to grasp objects, it needs to learn about itself. Specifically, it undergoes a training process called configuration-invariant pretraining. During this phase, the system is shown different poses of its own hand—from fully open to tightly closed—and learns to identify which points on the hand remain the same across these poses.

This might sound trivial, but it’s not. Robot hands change shape dramatically when they grasp. By learning which parts of the hand are “invariant” regardless of pose, the model develops a stable internal map of its own structure. This self-knowledge becomes crucial when the robot needs to plan how to move its joints into a specific grasp pose.
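One way to picture this pretraining step is as a matching game: the same hand is shown in two configurations, and the network is rewarded for giving each physical point the same feature in both. The PyTorch sketch below uses a toy per-point encoder and a simple contrastive objective to illustrate that idea; the actual system’s network and training recipe differ, so treat every name and number here as a placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointEncoder(nn.Module):
    """Toy per-point encoder, a stand-in for a real point-cloud network."""
    def __init__(self, dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, dim),
        )

    def forward(self, pts):                        # pts: (N, 3)
        return F.normalize(self.mlp(pts), dim=-1)  # unit-length point features

def invariance_loss(feat_a, feat_b, temperature=0.1):
    """Contrastive loss: point i in pose A should match point i in pose B."""
    logits = feat_a @ feat_b.T / temperature   # (N, N) similarity scores
    targets = torch.arange(feat_a.shape[0])    # the correct match is the diagonal
    return F.cross_entropy(logits, targets)

# One hypothetical training step: the same hand points in two configurations,
# paired by index because forward kinematics tells us where each point went.
encoder = PointEncoder()
pts_open = torch.rand(256, 3)    # hand points, fully open pose (stand-in data)
pts_closed = torch.rand(256, 3)  # the same points after the fingers close
loss = invariance_loss(encoder(pts_open), encoder(pts_closed))
loss.backward()
```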

 

The Pipeline: From Sight to Touch

So, how does it all work when a robot is presented with a new object?

First, the system obtains two point clouds: it samples points on the hand’s link meshes and transforms them with forward kinematics to generate the hand points, then captures the object points with a depth camera. These point clouds are then encoded using neural networks, with the robot-hand encoder leveraging its earlier self-understanding.
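As a rough sketch of the first of those steps, assuming each link’s pre-sampled mesh points and its forward-kinematics pose are already available as NumPy arrays (the names and shapes here are hypothetical):

```python
import numpy as np

def hand_point_cloud(link_points: list, link_transforms: list) -> np.ndarray:
    """Assemble the robot-hand point cloud in the world frame.

    link_points:     per-link (N_i, 3) points pre-sampled on each link mesh.
    link_transforms: per-link (4, 4) world poses from forward kinematics
                     at the current joint configuration.
    """
    clouds = []
    for pts, T in zip(link_points, link_transforms):
        homo = np.hstack([pts, np.ones((pts.shape[0], 1))])  # (N_i, 4) homogeneous
        clouds.append((homo @ T.T)[:, :3])                   # transform, drop w
    return np.concatenate(clouds, axis=0)
```

The object point cloud, by contrast, comes straight from the depth camera, so no kinematics are needed on that side.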

Next, the system uses a cross-attention transformer, a powerful deep learning architecture, to match features between the robot and object. This is followed by a conditional variational autoencoder (CVAE), a structure that allows the model to generate multiple valid grasp poses, not just one. After all, there’s usually more than one good way to pick something up.
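In code, the cross-attention step might look roughly like the short PyTorch snippet below, where robot-hand point features act as the queries and object point features as the keys and values. The dimensions are illustrative, and the CVAE that follows is omitted.

```python
import torch
import torch.nn as nn

# Minimal cross-attention block: robot-hand features attend to object features.
dim, heads = 128, 4
cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)

robot_feats = torch.rand(1, 512, dim)   # (batch, robot points, feature dim)
object_feats = torch.rand(1, 512, dim)  # (batch, object points, feature dim)

fused, attn_weights = cross_attn(query=robot_feats, key=object_feats, value=object_feats)
print(fused.shape)  # (1, 512, 128): robot features enriched with object context
```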

The CVAE produces a D(R, O) matrix that defines how far each point on the robot hand should be from the object. Using this matrix, the robot then estimates the final 3D positions for its hand points using a method called multilateration (similar to how GPS works). It aligns its rigid finger segments with these positions and finally solves an optimization problem to compute the specific joint angles needed to execute the grasp—all while staying within its physical constraints.
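The multilateration step has a pleasingly simple core: subtracting one range equation from the others turns the problem into ordinary linear least squares. The self-contained sketch below shows that geometric idea for a single point; the real system additionally keeps finger segments rigid and respects joint limits, which this toy version ignores.

```python
import numpy as np

def multilaterate(anchors: np.ndarray, distances: np.ndarray) -> np.ndarray:
    """Recover a 3D point from its distances to known anchor points.

    anchors:   (K, 3) known positions (here, points on the object), K >= 4.
    distances: (K,) predicted distances from the unknown point to each anchor.
    """
    a0, d0 = anchors[0], distances[0]
    A = 2.0 * (anchors[1:] - a0)                     # linearized system: A x = b
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.dot(a0, a0)
         - distances[1:] ** 2 + d0 ** 2)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Sanity check with a known point.
rng = np.random.default_rng(0)
anchors = rng.random((10, 3))
true_point = np.array([0.3, 0.5, 0.2])
dists = np.linalg.norm(anchors - true_point, axis=1)
print(multilaterate(anchors, dists))  # approximately [0.3, 0.5, 0.2]
```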

 

Performance in the Real World

All this theory is impressive, but does it actually work?

The results say yes—resoundingly. D(R, O) Grasp was tested on three different robot hands: the Barrett Hand, the Allegro Hand, and the Shadow Hand. Despite their vastly different shapes and numbers of joints, the system adapted to all of them. In simulation, it demonstrated strong success rates across both known and novel objects.

But perhaps more importantly, the system was fast. The entire pipeline—from perception to grasp execution—took less than a second. That makes D(R, O) Grasp viable for real-time use in applications like pick-and-place tasks or object sorting.

The researchers also tested D(R, O) Grasp in real-world experiments using a robotic arm and hand setup. It achieved an 89% success rate when grasping objects it had never seen before. It even handled partial observations (i.e., situations where parts of the object were obscured) remarkably well.

 

Why This Matters: Unlocking the Next Generation of Robotics

The implications of this work are far-reaching.  In industrial automation, robots could be deployed more flexibly on production lines, quickly adapting to new tools or product shapes without manual reprogramming. In logistics, warehouse robots could handle a wider range of packages, reducing errors and improving throughput.

In the medical field, robotic assistants could assist surgeons by reliably handling instruments, even in dynamic and high-pressure environments. In eldercare or home robotics, helper robots could handle everything from medication bottles to laundry—objects they weren’t explicitly trained on.

And in space exploration or disaster recovery, environments where robots need to deal with unknown objects and environments, D(R, O) Grasp could make all the difference between failure and success.

But perhaps most exciting of all is the idea that the intelligence learned by one robot hand can be transferred to another. This kind of cross-embodiment generalization, demonstrated in the research, opens up new pathways for “robot brains” that aren’t tethered to a specific hardware body. Learn once, apply anywhere!

 

The Road Ahead

D(R, O) Grasp represents more than just a clever new algorithm; it embodies a new philosophy for robotic manipulation. It moves beyond the siloed thinking of robot-centric or object-centric models and instead embraces the complexity of interaction.

Its success hinges not on brute force but on understanding: the robot understanding itself, the object, and the task. And in doing so, it takes a significant step toward the dream of versatile, intelligent, reliable, and trustworthy robotic agents.

There’s still much work to be done. Future improvements could include more dynamic grasps (think soft deformation), learning from tactile feedback, or incorporating planning in cluttered environments. But with D(R, O) Grasp, the foundation is now in place.

If robots are to truly become helpful partners in our daily lives, not just in factories, but in homes, hospitals, and public spaces, then grasping is not a footnote. It’s the starting point. Thanks to this innovative work at NUS Computing, we’re now much closer to getting a firm hold on the future.  

 

Further Reading: Wei, Z., Xu, Z., Guo, J., Hou, Y., Gao, C., Cai, Z., Luo, J., and Shao, L. (2025) “D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping,” IEEE International Conference on Robotics and Automation (ICRA 2025), Atlanta, GA, May 19-23; available at https://arxiv.org/abs/2410.01702
