DUAN Jiafei
NUS Presidential Young Professor
Research Scientist, A*STAR
- Ph.D. (Computer Science & Engineering, University of Washington, 2026)
- M.S. (Computer Science & Engineering, University of Washington, 2024)
- B.Eng. (Electrical & Electronic Engineering, Nanyang Technological University, 2021)
Jiafei Duan is a Presidential Young Professor in the School of Computing at the National University of Singapore (NUS). He received his Ph.D. in Computer Science & Engineering from the University of Washington, where he was advised by Professors Dieter Fox and Ranjay Krishna, and his undergraduate degree in Electrical and Electronic Engineering from Nanyang Technological University (NTU). His research lies at the intersection of robotics and computer vision, focusing on robotics foundation models, with an emphasis on scalable data collection and generation, grounding vision–language models in robotic reasoning, and improving robust generalisation in robot learning. His work has been featured in MIT Technology Review, GeekWire, VentureBeat, and Business Wire. His research has appeared in top AI and robotics venues, including ICLR, ICML, RSS, CoRL, ECCV, IJCAI, CoLM, and EMNLP, and has received several honours, including Best Paper at Ubiquitous Robots 2023, Best Paper at the CoRL RememberRL Workshop 2025, Best Paper Awards at numerous ICRA 2026 workshops, and a Spotlight Award at ICLR 2024. He was also named a finalist for the CVPR 2026 Rising Star in Spatial Intelligence. In service to the community, he has served as an Area Chair for top conferences such as CoRL.
RESEARCH AREAS
Artificial Intelligence
- Machine Learning
- Computer Vision
- Robotics
RESEARCH INTERESTS
Embodied AI
Robot Learning
3D Vision
Multimodal Large Language Model Reasoning
Robotic Manipulation
SELECTED PUBLICATIONS
- MolmoAct: Action Reasoning Models that Reason in Space
- MolmoAct2: Action Reasoning Models for Real-World Deployment
- SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation
- AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
- THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation
- Manipulate-Anything: Automating Real-World Robots using Vision-Language Models
- A Survey of Embodied AI: From Simulators to Research Tasks
AWARDS & HONOURS
International Conference on Ubiquitous Robots Best Paper Award
CVPR Rising Star in Spatial Intelligence (Finalist)
National Science Fellowship (PhD)