05 November 2015 – Fourth-year PhD student Rajiv Ratn Shah, Associate Professor Roger Zimmermann and their collaborators won second place in the Yahoo-Flickr Event Summarization Challenge at ACM Multimedia 2015 (MM 2015), held in Brisbane, Australia, in late October.

ACM Multimedia is a key international event that gathers experts and showcases innovations in the field of multimedia. The Yahoo-Flickr Challenge was part of the Multimedia Grand Challenge, a series of challenges that sought to address pressing societal and industrial multimedia problems. For the Yahoo-Flickr Challenge, participants were asked to ‘automatically uncover structure within a collection of 100 million photos/videos in the form of detecting and identifying events, and summarizing them succinctly for consumer consumption’.

The solution from Rajiv’s team to the Yahoo-Flickr Challenge was EventBuilder. “EventBuilder helps users get a quick overview of different events by first detecting events from the large collection of user generated content (UGC), such as photos and videos from Flickr, and next producing text summaries from descriptions of UGC and Wikipedia. Moreover, our system provides a visualization of UGC using representative photos and videos on Google Map[s] to show the geographical distribution of [events]. Experiments and user studies confirm that our system is fast, scalable, and provides useful event summaries.”

Speaking about the experience, Rajiv added, “It is challenging to build a real-time system that produces summaries from a large collection of UGC. Moreover, usually the metadata such as tags, descriptions, and others are noisy and it is difficult to extract useful semantic information from there. This system requires the understanding of multidisciplinary areas such as multimedia, database, NLP, and others. We [managed to] build a very fast and interactive event summarization system which produces event summaries in run-time from a large collection of photos and videos. Moreover, the representative photos and videos detected by our system are very accurate and relevant to events.”