NUS hosts NVIDIA Chief Scientist William Dally for Distinguished Speakers Series on the future of AI computing

Artificial intelligence systems are growing larger, faster and more power-hungry by the month. But the question that increasingly preoccupies the computing world isn’t what AI can do – it’s whether the infrastructure beneath it can keep up.
That challenge was at the heart of a lecture by Dr William Dally, Chief Scientist and Senior Vice President of Research at NVIDIA, at the NUS 120 Distinguished Speakers Series. A pioneer of parallel computing and high-performance interconnects, Dr Dally spoke to an audience of researchers, students, and industry leaders about the future of AI systems, the limits of semiconductor scaling, and why architectural innovation matters more than ever.
Opening the session, NUS President Tan Eng Chye spoke about the need for scalable, energy-efficient computing infrastructure to power the next wave of AI breakthroughs – and pointed to ongoing efforts at NUS to advance this through interdisciplinary research and industry collaboration.
He noted that while advances in AI models often capture public attention, the underlying systems powering them are equally consequential.
“Behind every frontier AI model lies a coordination challenge – getting thousands of processors to work in concert, communicating at speed and scale,” Prof Tan said. “Often, it is this connective challenge, not raw compute alone, that determines what is possible.”
The end of easy gains
In a wide-ranging discussion moderated by Professor Tulika Mitra, Dean of the NUS School of Computing, Dr Dally made the case that future leaps in AI performance will come less from shrinking transistors and more from rethinking how systems are designed.
“We’re seeing diminishing returns from semiconductor scaling itself,” he said. “Increasingly, more of the gains are coming from architecture rather than process technology.”
One key thread was how AI systems balance flexibility with efficiency. GPUs need to stay programmable enough to support rapidly evolving workloads, but they also need careful specialisation to maximise performance and energy efficiency.
“The art of good architecture,” Dr Dally said, “is designing instructions that are complex enough to amortise overhead, but still general enough to support almost any application.”
He also pointed to continued advances in low-precision computing beyond today’s FP4 systems, suggesting there is still room to push efficiency further. “I think cleverness will continue pushing us forward,” he said.
When communication becomes the constraint

As AI models scale into the trillions of parameters, raw compute is no longer the only bottleneck. Moving data quickly and reliably across thousands of GPUs has become just as critical.
Dr Dally, whose early research helped shape modern high-performance interconnects, noted that large language models today often need tens of GPUs working in concert, even for inference.
“If you’re interested in good interactivity, that’s fundamentally limited by latency,” he said. “You need very good communication between those systems.”
Many of the principles behind efficient interconnection networks, he added, were laid down decades ago – and still hold up in today’s AI era.
The conversation also turned to energy. While acknowledging the trade-offs between performance, power consumption and chip area, Dr Dally said NVIDIA treats energy efficiency as a core design priority.
“Every generation, we’re trying to improve efficiency,” he said. “And there’s still substantial room to do better.”
Why hardware needs more humans, not fewer

Perhaps the most striking part of the session was about people, not processors.
Asked about the role of academia when large technology companies dominate the field, Dr Dally argued that universities still have a distinct advantage: the freedom to take risks.
“Academia can take longer-term perspectives and pursue high-risk ideas that industry may not be willing to attempt,” he said. “That ability to experiment is one of academia’s greatest strengths.”
He also suggested that hardware design may matter more, not less, for the next generation of computing talent – even as AI automates more software tasks.
“Hardware design still requires substantial human creativity,” he said. “I believe there will continue to be strong opportunities in hardware.”
The session closed with an extended audience Q&A spanning topics from DeepSeek’s efficiency breakthroughs and agentic AI workloads to photonic communications and the prospect of AI-assisted GPU design – a reminder that the questions shaping this field are moving as fast as the technology itself.
