Helping computers see the world in 3D

9 September 2019

It is no simple feat to have a research paper accepted at a top-tier computer science conference – let alone to achieve this as an undergraduate student. For recent Computer Science graduate Tang Yew Siang, what started out as an opportunity to learn about research turned into an accepted paper at the International Conference on Computer Vision (ICCV), one of the top computer science conferences.

During his third year of undergraduate studies, Yew Siang chose to take the Undergraduate Research Opportunities Programme (UROP) – a year-long programme for outstanding undergraduate students to undertake research work. While exploring various problems in computer vision, Yew Siang realised he could help computers gain a deeper visual understanding by tackling the problem of 3D object detection. This led him to develop a novel approach to help computers learn how to estimate the 3D position of objects.

We asked Yew Siang more about his UROP experience and his research on 3D object detection:

What is UROP and what did you do in the programme?

UROP is a one-year programme which I started in the third year of my undergraduate studies. For my UROP research with Assistant Professor Lee Gim Hee, I explored different research directions and problems within fully- and weakly-supervised 3D object detection. We then decided to work on the problem of semi-supervised 3D object detection for my final year project (FYP) thesis, which later became the ICCV paper.

Why did you choose to take on UROP and to pursue research in Computer Vision?

Machine intelligence is an area of research that intrigues and excites me because of its significance to mankind. It is amazing how billions of seemingly haphazard neurons in our brains are able to help us to reason and act in our world. What we are currently developing in AI is only scratching the surface of what we (humans) are able to do.

As we work towards machine intelligence, I believe we will discover more about the basis behind our mental capabilities and use them to solve pressing and difficult problems around us such as traffic accidents, shortage of skilled doctors, global warming and more.

When I chose to take on UROP, I knew I wanted to deepen my expertise within this field in order to develop core AI technology to solve practical problems in the future. I decided to work on Computer Vision as a start – amongst other important problems in AI – because I found it interesting to get our machines to understand the semantics of our 3D world, and because I saw its potential to be applied in robotics and AR/VR applications.

Can you share more about your research project? What was it about?

Humans are able to easily understand the 3D world around them and have the ability to identify the objects that they see and even predict their 3D positions. This ability is known as 3D object detection in computer vision. It is important for home robots, self-driving cars, AR/VR applications and others because they need to know the precise positions of objects in order to be useful. For example, a home robot needs to know what a bottle looks like and where it is in order to retrieve it while a self-driving car needs to know where the pedestrians are in order to avoid them.

However, for most 3D object detection algorithms, learning to estimate the 3D position of objects can be a very costly process. Currently, most 3D object detection algorithms require a large dataset of images containing the object they want to predict. To predict the 3D position of sofas, you will need a substantial number of sofa images annotated with 3D bounding box labels. Even if you have a large dataset of annotated table and chair images, algorithms trained on them tend to work poorly on sofas. Annotated 3D box labels are very expensive to obtain and so will add substantial cost to the project. We think that this reliance on expensive 3D annotations is impractical, so we wanted to reduce the cost by using cheaper 2D annotations of the desired object and reusing 3D annotations of other objects.

In order to achieve this, we developed an algorithm that is able to learn 3D information from 2D annotations through a relaxation of the reprojection error of predicted 3D boxes. Additionally, the algorithm is able to transfer 3D information from related 3D annotations of objects from different classes. This is made possible by understanding the relationship between the predicted 3D boxes and our input point clouds, and correcting the 3D box predictions such that the 3D box will fit the input point cloud well.
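To make the idea of a relaxed reprojection error more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions and not the method from the paper: it assumes an axis-aligned 3D box in the camera frame, a simple pinhole camera, and one possible relaxation in which any gap smaller than a pixel tolerance is ignored. The function names and the `tolerance` parameter are hypothetical.

```python
from itertools import product


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned 3D box in the camera frame."""
    cx, cy, cz = center
    w, h, d = size
    return [
        (cx + sx * w / 2, cy + sy * h / 2, cz + sz * d / 2)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


def project_to_2d_box(corners, focal, principal):
    """Project corners with a pinhole camera and take their 2D bounding box.

    Assumes every corner lies in front of the camera (z > 0).
    """
    us = [focal * x / z + principal[0] for x, _, z in corners]
    vs = [focal * y / z + principal[1] for _, y, z in corners]
    return (min(us), min(vs), max(us), max(vs))


def relaxed_reprojection_error(projected, annotated, tolerance):
    """Sum the per-edge pixel gaps, ignoring gaps within `tolerance`.

    This is one hypothetical relaxation: a predicted 3D box is not
    penalised for small disagreements with the 2D annotation.
    """
    return sum(
        max(0.0, abs(p - a) - tolerance)
        for p, a in zip(projected, annotated)
    )


if __name__ == "__main__":
    # Hypothetical predicted box: 2 x 2 x 2 units, centred 3 units ahead.
    corners = box_corners(center=(0.0, 0.0, 3.0), size=(2.0, 2.0, 2.0))
    projected = project_to_2d_box(corners, focal=100.0, principal=(320.0, 240.0))
    annotated = (265.0, 192.0, 380.0, 290.0)  # made-up 2D label (x1, y1, x2, y2)
    print(projected)
    print(relaxed_reprojection_error(projected, annotated, tolerance=5.0))
```

With a tolerance of zero the function reduces to a plain sum of absolute edge differences, so the tolerance controls how much slack the cheaper 2D annotation is given when it supervises a 3D prediction.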

With our algorithm, 3D object detection applications are able to achieve good performance while reducing annotation costs by up to ten times. This would make 3D object detectors more practical and more likely to be used in different applications.

What was one challenging moment you encountered as you did your research?

When reality is far from our expectations, we feel sorely disappointed. Indeed, in my first year, I thought I was making a lot of progress on the research problem and was on track towards submitting a paper to the conference. However, in the last few days before submission, I realised that I had been using wrong and outdated evaluation code, which meant that my results weren’t as good as I had thought. It was a difficult and painful time trying to grapple with this reality because I had spent a long time working on the problem. Nevertheless, as with all problems, we eventually have to accept them and move forward. Dealing with failure was definitely difficult, but it added a different and important dimension to my research experience.

Despite the setback, I really enjoyed the stories and conversations that I’ve had in the Computer Vision and Robotic Perception Laboratory. During my time in the lab, I’ve chatted and played badminton with students from China, Thailand, the Philippines, Indonesia, Sri Lanka, Bangladesh, Iran and India. These shared experiences and friendships are something that I will cherish for years to come.

How would you describe the experience of getting your paper accepted at ICCV?

It was a really sweet moment because, after all the hard work, I really wanted our work to be recognised as a good piece of research and to be built upon by other researchers in the future. I am very glad that ICCV will provide such a platform to showcase our work.

Did anyone play a significant role in your research journey thus far?

This experience would not have been possible without the support, encouragement, and feedback from Gim Hee and his PhD students Zijian, Lichen, Jiahao, and Mika. They refined my ideas through discussions, encouraged me when I faced setbacks and supported me in different ways. I am grateful to my girlfriend Jia Ying for always being my cheerleader, for being so understanding of my busy schedule and for bringing joy to our time together.

I am especially thankful to Gim Hee for placing high expectations on an undergraduate student like me. He provided me with the space and computing resources, guided me with his direct and honest opinions, and encouraged me to aim for the very best. I can’t imagine how my research journey would have turned out without such support and expectations. In all, I am very grateful to have met the people in the lab.

Do you have any advice for students interested in UROP?

It is okay to take up UROP without a deep understanding or passion for the research area. UROP is an exploratory process for undergraduate students to learn how to conduct research and how to learn. That is more important than the actual content that you will acquire from the research. Hence, you can afford to take up UROP to explore whether the field of research is meant for you and whether you like that particular research area.

However, to get the most out of your research journey, my advice would be to set yourself a goal – whether it is to solve a difficult problem, create a novel application, or publish a paper – and to work really hard towards it. If you are willing to put in the effort, I’m sure you will gain something significant along the way.

Yew Siang graduated with a Bachelor’s degree in Computer Science (Highest Distinction) in July 2019. He won the IEEE Singapore Computer Society Book Prize at Commencement for the Best Honours Thesis. Yew Siang will be starting his Master’s programme in Computer Science at Stanford University, USA, in September this year.

Paper:
Transferable Semi-supervised 3D Object Detection from RGB-D Data