Date/Time : Wednesdays, 4.00pm
Venue: Video Conference Room (VC), COM1-02-13, School of Computing

The seminar series organised by the SoC Graduate Studies' Office involves research talks given by senior PhD students, faculty members, and industry partners.

=========================================================

Calendar of Talks

The slides/materials used for the talks can be found here.

AY2018/2019 Semester 2

30-Jan-2019           

Title: Non-Supervised Learning for Understanding Complex Activities from Video
Speaker: Assistant Professor Angela YAO, Department of Computer Science
Abstract: I will be talking about learning and interpretation of complex activities from video sequences. A complex activity is a procedural task with multiple steps or sub-activities that follow some loose ordering. Complex activities are commonly found in the everyday tasks that we perform, as well as in instructional videos on the web.

In this talk, I will showcase some recent works which use non-supervised learning approaches for complex activity understanding. The first is a fully unsupervised approach for segmentation. Given a collection of videos of the same complex activity, we apply an iterative approach which alternates between discriminatively learning a mapping from the videos' visual features to sub-activity labels and generatively modelling the temporal structure of the sub-activities with a Generalized Mallows Model.

In the second part of the talk, I will highlight our recent submission on action prediction and present a hierarchical model that generalizes instructional knowledge from large-scale text-corpora and transfers the knowledge to the visual domain. Given a portion of an instructional video, our model predicts coherent and plausible actions multiple steps into the future, all in rich natural language.

13-Feb-2019

Title: Music and Mobile for Health and Learning
Speaker: Associate Professor WANG Ye, Department of Computer Science
Abstract:
The use of music as an aid for improving body and mind has received enormous attention over the last 20 years from a wide range of disciplines, including neuroscience, cognitive science, physical therapy, exercise science, psychological medicine, and pedagogy. It is important to translate insights gained from the scientific study of music, learning, and medicine into real-life applications. Such applications should be delivered widely, effectively, and accurately, harnessing the synergy of sound and music computing (SMC), wearable computing, and cloud computing technologies to promote learning and to facilitate disease prevention, diagnosis, and treatment in both developed countries and resource-poor developing countries. In this talk, I will highlight our recent projects at the NUS Sound and Music Computing Lab (www.smcnus.org) that are developed to facilitate joyful learning and to motivate physical rehabilitation.

20-Feb-2019

Title: Price of Incomplete Information in Decision Making: Bridging Multi-armed Bandits and Information Theory
Speaker: Debabrota Basu, PhD Student, Department of Computer Science
Abstract:
Multi-armed bandits are an archetypal setting of reinforcement learning in which an agent plays an action at each step. Each action yields a reward drawn from a corresponding reward distribution unknown to the agent. While the agent tries to maximise the sum of accumulated rewards, the unavailability of information about the reward distributions forces her to explore the actions. Thus, the central problem in designing a multi-armed bandit algorithm is the trade-off between exploration, to gain information, and exploitation of present information, to maximise the accumulated reward.
Different approaches to solving the multi-armed bandit problem, and different tools for its theoretical analysis, have been proposed in the literature over many decades. Information theory has played a central role in this development, both in developing practical algorithms and in establishing algorithm-independent performance bounds. In this talk, I will present our work exploring information geometry, which studies the geometry of the space of probability distributions, in order to accumulate the unavailable information and to leverage it. We propose a new algorithm called BelMan using this approach. I will introduce the bandit problem, discuss its interaction with information theory, and present our contributions at the crossroads of these two fields.
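The exploration-exploitation trade-off described above can be illustrated with a minimal epsilon-greedy sketch. This is not the BelMan algorithm from the talk; the arm means, horizon, and exploration rate below are purely illustrative.

```python
import random

def eps_greedy_bandit(arm_means, horizon=10000, eps=0.1, seed=0):
    """Minimal epsilon-greedy bandit: with probability eps explore a random
    arm, otherwise exploit the arm with the highest empirical mean reward."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k          # number of pulls per arm
    est = [0.0] * k           # empirical mean reward per arm
    total = 0.0
    for _ in range(horizon):
        if rng.random() < eps:
            arm = rng.randrange(k)                      # explore
        else:
            arm = max(range(k), key=lambda i: est[i])   # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0  # Bernoulli reward
        counts[arm] += 1
        est[arm] += (reward - est[arm]) / counts[arm]   # incremental mean update
        total += reward
    return total, counts

# Three arms with success probabilities unknown to the agent.
total_reward, pulls = eps_greedy_bandit([0.3, 0.5, 0.7])
```

With enough pulls, the best arm accumulates most of the plays; more refined algorithms replace the fixed eps with confidence bounds or Bayesian beliefs.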
6-Mar-2019

Title: Fairness, transparency and collaboration in data-driven environments
Speaker: Assistant Professor Yair ZICK, Department of Computer Science
Abstract:
Recent years have seen data-driven algorithms deployed in increasingly high-stakes environments. These algorithms often employ a complex infrastructure, making them effectively “black boxes”; this potentially exposes various stakeholders (such as end-users, or the agencies deploying said algorithms) to risks, such as unfair treatment or inadvertent data breaches. In response, government agencies and professional societies have highlighted fairness and transparency as key design paradigms in AI/ML applications.
In this talk I will discuss our recent work on the foundations of algorithmic transparency and fairness. From the transparency perspective, I will discuss how we design transparency measures that are guaranteed to satisfy certain natural desiderata; in addition, I will discuss a recent line of work showing how some natural transparency measures may be used by an adversary in order to extract private user information. Regarding fairness, I will discuss how we apply fairness paradigms to algorithms, in particular our work on designing and deploying fair allocation algorithms; our results show that humans respond well to provably fair algorithms, and are willing to collaborate effectively even in strategic domains. Finally, I will discuss how we apply learning-theoretic approaches to fairness via a novel paradigm for adapting game-theoretic solution concepts to data-driven domains.
13-Mar-2019

Title: Representation Learning with Graph Convolutional Network
Speaker: Feng Fuli, Dean’s Graduate Award winner (AY2018/2019 Sem1)
Abstract:
Graph Convolutional Network (GCN) is an emerging technique that performs learning and reasoning on graph data. It learns features over the graph structure by aggregating the features of connected nodes to obtain an embedding for each node. Owing to this strong representation power, recent research shows that GCN achieves state-of-the-art performance on several graph-based applications such as recommendation and linked-document classification. This talk will introduce recent advances in GCN and two of our attempts to enhance it: Temporal-GCN and Cross-GCN. Temporal-GCN performs feature aggregation in a time-sensitive manner by dynamically calculating the strength of connections between nodes. Cross-GCN enhances the feature-learning ability by considering arbitrary-order cross features (feature combinations).
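The neighbour-aggregation step described above can be sketched as a single graph-convolution layer. This follows the common symmetric-normalisation formulation; it is a minimal illustration, not the Temporal-GCN or Cross-GCN models from the talk, and the toy graph and weights are arbitrary.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: add self-loops, symmetrically normalise
    the adjacency matrix, aggregate neighbour features, then apply a linear
    map and a ReLU non-linearity."""
    A_hat = A + np.eye(A.shape[0])                 # adjacency with self-loops
    d = A_hat.sum(axis=1)                          # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))         # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # normalised aggregation
    return np.maximum(A_norm @ X @ W, 0.0)         # ReLU(A_norm X W)

# A 3-node path graph with 2-dimensional node features and identity weights.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3, 2)
W = np.eye(2)
H = gcn_layer(A, X, W)   # each row is the new embedding of one node
```

Stacking such layers lets information propagate over multi-hop neighbourhoods, which is what gives GCN its representation power.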
20-Mar-2019

Title: Emotional AI in Visual Analytics
Speaker: Associate Professor Stefan Winkler, Department of Computer Science
Abstract:
While most “AI” researchers focus mainly on the “IQ” aspect of intelligence, emotional intelligence or “EQ” is just as important for machines to be able to interact with humans effectively and naturally. In this seminar, I will discuss our work on visual analytics projects where we explore the emotional aspects of image understanding.
The first is photowork: Ubiquitous and affordable digital cameras have led to an explosion of the amount of image material both amateurs and professionals have to work with. Assessing, selecting, editing, organizing, annotating, and browsing this large amount of visual data is tedious and time-consuming. Our aim in this project is to automate some of these processes. Our approaches are content-based and focus on family photo collections, where people and their relationships play a major role.
The second is on profiling people, with a focus on their affective states. Facial expressions in particular are an essential component for conveying and understanding human emotions. Contrary to most existing approaches in computer vision, we avoid the classification of emotions into a few predefined categories, and instead follow a dimensional paradigm as represented by the circumplex model. Based on the tracking of facial landmark points and relevant geometrical features, we directly estimate arousal, valence, and intensity of emotion. We discuss the benefits of our method, and also present some of its applications.
27-Mar-2019

Title: Compositional Static Race Detection at Scale, without False Positives
Speaker: Associate Professor Sergey Ilya, Department of Computer Science
Abstract:
Automatic static detection of data races is one of the most basic problems in reasoning about concurrency.

In my talk, I will present RacerD—a static program analysis for detecting data races in Java programs which is fast, can scale to large code, and has proven effective in an industrial software engineering scenario. RacerD is the first inter-procedural, compositional data race detector which has been empirically shown to have non-trivial precision and impact. Due to its compositionality, it can analyse code changes quickly, and this allows it to perform continuous reasoning about a large, rapidly changing codebase as part of deployment within a continuous integration ecosystem. RacerD has been in deployment for over a year at Facebook, where it has flagged over 2500 issues that have been fixed by developers before reaching production.

In contrast to previous static race detectors, RacerD's design favours reporting high-confidence bugs over ensuring their absence. In the second part of my talk I will explain a True Positives Theorem stating that, under certain assumptions, an idealised theoretical version of RacerD never reports a false positive. I will also describe an empirical evaluation of an implementation of this analysis against the original RacerD, showing that the loss of precision in reporting races does not significantly affect the overall practical impact of the analysis. This result thus suggests that, in the future, theorems of this variety might be generally useful in understanding, justifying, and designing effective static analyses for bug catching.

This is joint work with Sam Blackshear, Nikos Gorogiannis, and Peter O'Hearn (Facebook), published at OOPSLA'18 and POPL'19.

KEYWORDS: Concurrency, Static Analysis, Race Freedom, Scalability, Abstract Interpretation

AY2018/2019 Semester 1

29-Aug-2018            

Title: Overview of research in mobile sensing and wireless sensor network protocols
Speaker: Associate Professor Chan Mun Choon, Department of Computer Science
Abstract: In this talk, I will cover recent research work done by my research group on mobile computing and wireless sensor network protocols. For mobile sensing, I will touch on the use of sensors available on wearables/smartphones for inference of user interactions, indoor localization, and context detection. For wireless sensor network protocols, I will present research that exploits synchronous transmissions to mitigate wireless contention in order to design some of the fastest multi-hop network protocols for data dissemination and sharing. Finally, I will also briefly cover recent work on edge computing and software-defined networking with a focus on data-plane programmability.

5-Sep-2018 

Title: Internet-of-Things Security: Benefits and Risks of Sensing
Speaker: Assistant Professor Han Jun, Department of Computer Science
Abstract: With the emergence of the Internet-of-Things (IoT) and Cyber-Physical Systems (CPS), we are witnessing a wealth of exciting applications that enable computational devices to interact with the physical world via an overwhelming number of sensors and actuators. However, such interactions pose new challenges to traditional approaches to security and privacy.
In this talk, I will present how I utilize sensor data to provide security and privacy protections in IoT/CPS scenarios, and further introduce novel security threats arising from similar sensor data. Specifically, I will highlight a few of my recent projects that leverage sensor data for defense and attack scenarios in applications such as smart homes and semi-autonomous vehicles. I will also briefly introduce interesting research problems that I am working on in newer application domains such as smart vehicles, buildings, and cities.

12-Sep-2018

Title: Overview of research in next generation low latency TCP and software defined networking
Speaker: Associate Professor Ben Leong, Department of Computer Science
Abstract: In this talk, I will describe recent research work done by my research group on next-generation low-latency TCP (Transmission Control Protocol) and software-defined networking using the new P4 language (https://p4.org/). We have seen in recent times the emergence of a large number of low-latency TCP variants. Surprisingly, these modern low-latency TCP variants can match the performance of TCP CUBIC and even outperform CUBIC for large-RTT flows. We found that the likely reason is that the bottleneck buffers are relatively shallow, and so these variants are likely throttling CUBIC by inflicting significant losses on the network. We have developed a new rate-based congestion control algorithm that incorporates a new buffer estimation technique, which allows a flow to infer its own buffer occupancy as well as that of the competing flows sharing the same bottleneck buffer. With this mechanism, the flow is able to determine its operating environment and, when in a low-latency environment, to collaboratively regulate the bottleneck buffer occupancy with other flows. We believe that the current Internet is facing a transition into another phase with new low-latency TCP variants, but the transition will not be easy. Our approach will allow the Internet to transition smoothly to a low-latency future. For the P4-based work, we recently developed a new system written in the P4 programming language, called BurstRadar, that monitors microbursts in the data plane. BurstRadar incurs 10 times less data collection and processing overhead than existing solutions. Furthermore, BurstRadar can handle simultaneous microburst traffic at gigabit line rates on multiple egress ports while consuming very few resources in the switching ASIC.

19-Sep-2018

Title: Enabling New Applications through Efficient, High-Performance Acceleration
Speaker: Assistant Professor Trevor E. Carlson, Department of Computer Science
Abstract: Faster computing devices each year, as we have seen with mobile phones, are what consumers have come to expect from the rapid pace of technology development. But, given two significant trends in technology scaling, this progress might hit a brick wall: transistors are becoming more expensive going forward, while we can keep fewer transistors active at a time to get more work done. Does this spell the end of computing as we know it? Will computers stop getting faster?

As silicon technology improvements have slowed, research into alternative technologies has increased. Nevertheless, these alternative technologies could still take decades to reach the performance and cost that current CMOS technology provides. One near-term solution is to adapt the computer's architecture to use the transistors that we have more efficiently. By working smarter, our aim is to continue to provide more functionality in the face of these technological headwinds. To enable new applications, from mobile-based AR and VR to new machine learning approaches, we need to pursue innovative architectural directions.

To accomplish these goals, our research focuses on building flexible processors that can enable these next-generation applications. In this talk, I will present some of our recent work as well as future research directions that propose one direction to move us closer to that goal. In addition, I will also present some critical challenges and potential next steps that we will need to address in the coming years.

3-Oct-2018

Title: Towards Boosting Performance of Healthcare Analytics: Resolving Challenges in Electronic Medical Records
Speaker: Ms Zheng Kaiping, Dean’s Graduate Award winner (AY2017/2018 Sem2)
Abstract: In recent years, the increasing availability of Electronic Medical Records (EMR) has brought promising opportunities to automate healthcare data analytics. However, certain challenges in EMR data, if not well handled, degrade healthcare analytics performance and lead to a gap between the potential of EMR data for analytics and its usability in practice. It is therefore vital to resolve these challenges in order to boost performance and help derive more medical insights, contributing to better patient management and faster medical research advancement.
In this talk, I will focus on two representative challenges in EMR data, namely irregularity and bias, and then present our devised solutions to them. First, I will argue that the irregularity challenge should be resolved at the feature level to reduce the loss of time information. I will then demonstrate our proposal to incorporate fine-grained feature-level time-span information and show the resulting improvement in analytic performance. Second, I will explain that irregularity is the observed phenomenon, while bias is the underlying cause. I will present our solution for transforming biased EMR time series into unbiased data and illustrate the improvement it brings in terms of missing-data imputation accuracy and the prediction accuracy of data analytic applications.

Title: Storing videos efficiently and securely… in DNA!
Speaker: Assistant Professor Djordje Jevdjic, Department of Computer Science
Abstract:  The digital universe is exploding rapidly and we are running out of storage space to save it in an economical way. The vast majority of this digital content is multimedia, most notably videos. In the first part of this talk, I will introduce a concept of approximate storage as a new way of efficient and secure storage of videos on very dense, but unreliable emerging storage mediums. In the second part of the talk I will introduce a new, extremely dense, durable, Nature’s best storage medium — DNA! I will quickly cover the process of reading and writing to and from DNA. In the last part of the talk I will propose a few exciting projects related to efficient and secure storage of digital videos and images in DNA.

10-Oct-2018 

Title: Beyond SAT Revolution
Speaker:  Assistant Professor Kuldeep Singh Meel, Department of Computer Science
Abstract: The paradigmatic NP-complete problem of Boolean satisfiability (SAT) solving is a central problem in Computer Science. While mentions of SAT can be traced to the early 19th century, efforts to develop practically successful SAT solvers go back to the 1950s. The past 20 years have witnessed a "SAT revolution" with the development of conflict-driven clause-learning (CDCL) solvers. Such solvers combine a classical backtracking search with a rich set of effective heuristics. While 20 years ago SAT solvers could only handle instances with at most a few hundred variables, modern SAT solvers solve instances with up to millions of variables in a reasonable time.
The "SAT revolution" opens up opportunities to design practical algorithms, with rigorous guarantees, for problems in complexity classes beyond NP, by replacing an NP oracle with a SAT solver. In this talk, we will discuss how we use the SAT revolution to design practical algorithms for two fundamental problems in artificial intelligence and formal methods: constrained sampling and counting.
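For intuition about what a SAT solver does, the backtracking core that CDCL solvers build on can be sketched as a minimal DPLL procedure. This is a toy illustration, not a practical solver; CDCL adds clause learning, non-chronological backtracking, and restarts on top of this skeleton.

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL SAT solver over CNF clauses given as lists of signed
    integers (DIMACS-style literals, e.g. [[1, -2], [2, 3]]).
    Returns a satisfying assignment dict {var: bool} or None."""
    if assignment is None:
        assignment = {}
    # Simplify the clauses under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified: conflict, backtrack
        simplified.append(rest)
    if not simplified:
        return assignment  # all clauses satisfied
    # Unit propagation: a one-literal clause forces its variable's value.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(clauses, {**assignment, abs(lit): lit > 0})
    # Branch on the first unassigned variable, trying both values.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
model = dpll([[1, -2], [2, 3], [-1, -3]])
```

The exponential worst case is unavoidable for an NP-complete problem; the "revolution" is that modern heuristics make the common case fast enough to serve as an oracle for problems beyond NP.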

17-Oct-2018 

Title:  Exploiting Knowledge Graph for Personalized Recommendation
Speaker:  Mr Wang Xiang, Dean’s Graduate Award winner (AY2017/2018 Sem2)
Abstract: In the era of information overload, recommender systems have gained widespread adoption across industry to drive various online customer-oriented services. They help users discover, from overwhelming choices, a small set of relevant items that meet their personalized interests. Generally, the modeling of user-item interactions is at the heart of personalized recommendation. Nowadays, diverse kinds of auxiliary information on users and items are becoming increasingly available in online platforms, such as user demographics, social relations, and item knowledge.
To date, incorporating knowledge-aware channels, especially knowledge graphs, into recommender systems has attracted increasing interest, since they can provide deep factual knowledge and rich semantics on items. The use of such knowledge can better capture the underlying and complex user-item relationships, and further achieve higher recommendation quality. Furthermore, a knowledge graph enables us to uncover valuable evidence as well as the reasons why a recommendation is made.
 
Title: Securing Applications from Untrusted Operating Systems using Enclaves
Speaker: Ms Shweta Shinde, Dean's Graduate Award winner (AY2017/2018 Sem2)
Abstract: For decades, we have been building software with the default assumption of a trusted underlying stack, such as the operating system. From a security standpoint, the status quo has been a hierarchical trust model, where trusting one layer implies trusting all the layers underneath it. However, with new usage models such as outsourced computing and analytics on third-party cloud services, trusting the operating system is no longer an option. As a result, modern CPUs have started supporting new abstractions which address the threats of an untrusted operating system. Intel SGX is one such security capability, available in commodity CPUs shipping since 2015. It allows user-level application code to execute in enclaves which are isolated from all other software on the system, even from the privileged OS or hypervisor. However, these architectural solutions offer a trade-off between security, ease of use, and compatibility with legacy software (both OS and applications). In this talk, I will present a low-TCB, POSIX-compatible, side-channel-resistant, and formally verified solution which allows users to securely execute their applications on an untrusted operating system.

24-Oct-2018

Title: Adversarial Machine Learning
Speaker: Assistant Professor Reza Shokri, Department of Computer Science
Abstract: Machine learning models are used in many critical systems and applications. This makes them very attractive targets for a number of security and privacy attacks, including data poisoning, evasion attacks, and inference attacks.

In this talk, I will present these attacks, and a systematic way of mitigating their risks. The solution is simple: know your enemy and anticipate their attacks. This is known as adversarial machine learning.

31-Oct-2018 

Title:  3 Projects on Computer System Performance
Speaker:  Professor Tay Yong Chiang, Department of Computer Science
Abstract: This talk describes 3 current projects on the performance of computer systems:
(1.Database) For 20-odd years, developers and researchers have used the TPC benchmarks to compare their products and algorithms. These benchmarks have fixed schemas that bear no relation to current applications. The target of the database project is to replace TPC benchmarks with synthetic versions of application datasets. The idea is to first scale the empirical dataset to the appropriate size, then tweak the data in the resulting dataset to enforce application-specific properties. The ambition is to have a repository of tweaking tools contributed by the developer community, and current work is on building a collaborative framework to facilitate tool interoperability.
(2.Memory) Most of the current hot topics in computer science will become cold within 10 years, but caching will remain an issue 50 years from now. Most caching algorithms try to strike a heuristic balance between recency (e.g. LRU) and frequency (i.e. popularity). The target of the memory project is to use a Cache Miss Equation to do a scientific study of this balance.
(3.Networking) Over the last 2 years, Google has moved their production traffic to a TCP variant called BBR. This may start a paradigm shift for TCP congestion control, from one based on packet loss to one based on bandwidth-delay product. BBR requires estimates for the minimum round-trip time R and the maximum bandwidth X. BBR measures R and X by periodically changing its packet sending rate. The target of the networking project is to show that the estimation can be done differently and passively. The underlying idea works for any TCP version (CUBIC, Reno, etc.), and even for choosing between hardware/software architectures for video games.
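The recency heuristic mentioned in the memory project can be sketched with a minimal LRU cache. This is an illustrative toy, not the Cache Miss Equation itself; the capacity and keys below are arbitrary.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least-recently-used key on overflow.
    This is pure recency; frequency-aware policies (e.g. LFU) would also
    track how often each key is hit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order == recency order

    def get(self, key):
        if key not in self.data:
            return None  # cache miss
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes most recently used
cache.put("c", 3)  # capacity exceeded: evicts "b", the LRU key
```

A purely recency-based policy like this can evict a frequently used key that simply was not touched recently, which is exactly the recency/frequency tension the project studies.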

AY2017/2018 Semester 2

31-Jan-2018            

Title: 3 Projects on Computer System Performance
Speaker:
Tay Yong Chiang, Professor, Department of Computer Science
Abstract:
This talk describes 3 current projects on the performance of computer systems:

(1.Database) For 20-odd years, developers and researchers have used the TPC benchmarks to compare their products and algorithms. These benchmarks have fixed schemas that bear no relation to current applications. The target of the database project is to replace TPC benchmarks with synthetic versions of application datasets. The idea is to first scale the empirical dataset to the appropriate size, then tweak the data in the resulting dataset to enforce application-specific properties. The ambition is to have a repository of tweaking tools contributed by the developer community, and current work is on building a collaborative framework to facilitate tool interoperability.

(2.Memory) Most of the current hot topics in computer science will become cold within 10 years, but caching will remain an issue 50 years from now. Most caching algorithms try to strike a heuristic balance between recency (e.g. LRU) and frequency (i.e. popularity).  The target of the memory project is to use a Cache Miss Equation to do a scientific study of this balance.

(3.Networking) Over the last 2 years, Google has moved their production traffic to a TCP variant called BBR.  This may start a paradigm shift for TCP congestion control, from one based on packet loss to one based on bandwidth-delay product.  BBR requires estimates for minimum round-trip time R and maximum bandwidth X.  BBR measures R and X by periodically changing its packet sending rate.  The target of the networking project is to show that the estimation can be done differently and passively.  The underlying idea works for any TCP version (CUBIC, Reno, etc.), and even for choosing between hardware/software architectures for video games.

7-Feb-2018

Title: Privacy and Security in (Outsourced) Machine Learning
Speaker:
Reza Shokri, Assistant Professor, Department of Computer Science
Abstract:
I will talk about the security and privacy threats against machine learning, notably when its training is outsourced. I will discuss how and why machine learning models leak information about the individual data records on which they were trained, and how an attacker can train a deep neural network in such a way that it leaks even more information. I will also talk about security issues with respect to outsourced machine learning, and how we can evaluate such attacks.
14-Feb-2018

Title: Constrained Counting and Sampling: Bridging the Gap between Theory and Practice
Speaker:
Kuldeep Singh Meel, Assistant Professor, Department of Computer Science
Abstract:
Constrained counting and sampling are two fundamental problems in Computer Science with numerous applications, including network reliability, privacy, probabilistic reasoning, and constrained-random verification. In constrained counting, the task is to compute the total weight, subject to a given weighting function, of the set of solutions of the given constraints. In constrained sampling, the task is to sample randomly, subject to a given weighting function, from the set of solutions to a set of given constraints. In this talk, I will introduce a novel algorithmic framework for constrained sampling and counting that combines the classical algorithmic technique of universal hashing with the dramatic progress made in Boolean reasoning over the past two decades. This has allowed us to obtain breakthrough results in constrained sampling and counting, providing a new algorithmic toolbox for machine learning, probabilistic reasoning, privacy, and design verification. I will demonstrate the utility of the above techniques on various real applications, including probabilistic inference, design verification, and our ongoing collaboration on estimating the reliability of critical infrastructure networks during natural disasters.
21-Feb-2018

Title: Preparing for a Low-Latency Future Internet
Speaker:
Ben Leong, Associate Professor, Department of Computer Science
Abstract:
Google has deployed BBR, a new low-latency TCP variant. We show that to transition smoothly to a low-latency Internet of the future, we need a TCP variant that not only can contend effectively against CUBIC in the current Internet, but that is also able to reduce its level of aggressiveness in a low-latency environment. We present EvaRate, a rate-based congestion control algorithm that incorporates a new buffer estimation technique which allows an EvaRate flow to infer its own buffer occupancy as well as that of the competing flows sharing the same bottleneck buffer. With this mechanism, an EvaRate flow is able to determine its operating environment and, when in a low-latency (or benevolent) environment, collaboratively regulate the bottleneck buffer occupancy with other EvaRate flows. EvaRate highlights a new point in the congestion control design space that deserves further attention.
7-Mar-2018

Title: Super Speaking -- Tricks of the Trade
Speaker: Terence Sim, Associate Professor, Department of Computer Science
Abstract: Most of us in academia are engaged in this typical sequence of activities: (a) do research; (b) write a report/paper about it; (c) give an oral presentation. While many of us are good at research skills (a), and can write reasonably well (b), we are less confident in speaking about our work (c). Indeed, presenting our work in front of an audience often causes knees to wobble and stomachs to cramp. It gets worse when we realize, halfway through the talk, that the audience is getting restless or bored because they do not understand our message.

In this talk, I will share some techniques that will improve the intelligibility of our technical presentations. I learned many of these "tricks of the trade" in school -- the School of Hard Knocks. Others I picked up by observing the habits of good speakers; still others from the wise counsel of my seniors. While I cannot guarantee to take away the nervousness when you give a talk, I can certainly offer practical tips that will hopefully improve the clarity of your communication. At the very least, you can get a kick out of seeing whether I practice what I preach.

14-Mar-2018

Title: Information Theory and Machine Learning
Speaker:
Jonathan Scarlett, Assistant Professor, Department of Computer Science
Abstract:
The field of information theory was introduced as a means for understanding the fundamental limits of data compression and transmission, and has shaped the design of practical communication systems for decades.  In this talk, I will discuss the emerging viewpoint that information theory is not only a theory of communication, but a far-reaching theory of data that is applicable to seemingly unrelated learning problems such as estimation, prediction, and optimization.  This perspective leads to principled approaches for certifying the near-optimality of practical algorithms, as well as understanding where further improvements are possible.  I will provide a gentle introduction to some of the main ideas and insights offered by this perspective, and present examples in the problems of group testing, graphical model selection, sparse regression, and black-box function optimization.
21-Mar-2018

Title: Correcting Language Errors using Machine Translation Techniques
Speaker: Shamil Chollampatt Muhammed Ashraf, Dean’s Graduate Award winner (AY2017/2018 Sem1)
Abstract: 
Grammatical error correction (GEC) tools play an important role in second language learning and in assisting non-native writers. Currently, the leading approach to GEC is the machine translation approach, in which potentially erroneous sentences are “translated” into fluent, well-formed sentences. This talk will introduce various machine translation techniques that have been successfully applied and adapted to GEC, such as word- and character-level statistical machine translation, neural network joint models, and neural encoder-decoder approaches.

Title: Linguistic Properties Matter for Implicit Discourse Relation Recognition: Combining Semantic Interaction, Topic Continuity and Attribution
Speaker: Lei Wenqiang, PhD Student, Department of Computer Science
Abstract:
Modern solutions for implicit discourse relation recognition largely build universal models to classify all of the different types of discourse relations. In contrast to such learning models, we build our model from first principles, analyzing the linguistic properties of the individual top-level Penn Discourse Treebank (PDTB) styled implicit discourse relations: Comparison, Contingency and Expansion.
We find that the semantic characteristics of each relation type, together with two cohesion devices (topic continuity and attribution), contribute to these linguistic properties. We encode those properties as complex features and feed them into a Naïve Bayes classifier, besting baselines (including deep neural network ones) to achieve a new state-of-the-art performance level. Over a strong, feature-based baseline, our system improves one-versus-other binary classification by 4.83% for the Comparison relation and 3.94% for Contingency, and improves four-way classification by 2.22%.
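
As an illustration of the classification setup described above, here is a minimal, hand-rolled Bernoulli Naïve Bayes over binary feature dictionaries. This is a toy stand-in: the talk's actual features encoding semantic interaction, topic continuity and attribution are far richer, and the data in the example below is invented.

```python
import math
from collections import defaultdict

def train_nb(examples, labels):
    """Fit a Bernoulli Naive Bayes model on binary feature dicts."""
    counts = defaultdict(lambda: defaultdict(int))  # label -> feature -> #true
    label_counts = defaultdict(int)
    features = set()
    for feats, y in zip(examples, labels):
        label_counts[y] += 1
        for f, v in feats.items():
            features.add(f)
            if v:
                counts[y][f] += 1
    return counts, label_counts, features

def predict_nb(model, feats):
    """Return the label with the highest (log) posterior score."""
    counts, label_counts, features = model
    n = sum(label_counts.values())
    best, best_score = None, -math.inf
    for y, c in label_counts.items():
        score = math.log(c / n)                    # log prior
        for f in features:
            p = (counts[y][f] + 1) / (c + 2)       # Laplace smoothing
            score += math.log(p if feats.get(f) else 1 - p)
        if score > best_score:
            best, best_score = y, score
    return best
```

A classifier like this trained on, say, a binary "contrast cue" feature will assign Comparison to instances exhibiting the cue and another relation otherwise.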

28-Mar-2018

Title: (Gap/S)-ETH Hardness of SVP
Speaker: Divesh Aggarwal, Assistant Professor, Department of Computer Science
Abstract: There has been a lot of research in the last two decades on constructing cryptosystems whose security relies on the hardness of the shortest vector problem (SVP) on integer lattices. The SVP is well known to be NP-hard. However, such hardness proofs tell us very little about the quantitative or fine-grained complexity of SVP. E.g., does the fastest possible algorithm for SVP still run in time at least, say, 2^{n/5} , or is there an algorithm that runs in time 2^{n/100} or even 2^{\sqrt{n}}? The above hardness results cannot distinguish between these cases, but we certainly need to be confident in our answers to such questions if we plan to base the security of widespread cryptosystems on these answers.

In this talk, I will give a partial answer to this question by showing the following quantitative hardness results for the Shortest Vector Problem in the \ell_p norm (SVP_p), where n is the rank of the input lattice.
1) For "almost all" p > 2.14, there is no 2^{n/C_p}-time algorithm for SVP_p for some explicit constant C_p > 0 unless the (randomized) Strong Exponential Time Hypothesis (SETH) is false.
2) For any p > 2, there is no 2^{o(n)}-time algorithm for SVP_p unless the (randomized) Gap-Exponential Time Hypothesis (Gap-ETH) is false.
3) There is no 2^{o(n)}-time algorithm for SVP_2 unless either (a) (non-uniform) Gap-ETH is false; or (b) there is no family of lattices with exponential kissing number in the \ell_2 norm.

This is joint work with Noah Stephens-Davidowitz.
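
For reference, the complexity hypotheses invoked above can be stated informally as follows (these are the standard formulations, not taken from the talk):

```latex
% Exponential Time Hypothesis and its strong variant (informal statements)
\textbf{ETH:}\quad \exists\, \delta > 0 \ \text{such that 3-SAT on } n \text{ variables cannot be solved in time } O(2^{\delta n}).
\\[4pt]
\textbf{SETH:}\quad \forall\, \varepsilon > 0\ \exists\, k \ \text{such that } k\text{-SAT on } n \text{ variables cannot be solved in time } O(2^{(1-\varepsilon)n}).
```

Gap-ETH strengthens ETH by asserting hardness even for distinguishing satisfiable 3-SAT formulas from those where only a constant fraction of clauses can be satisfied.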

4-Apr-2018

Title: Your Toolbox for Privacy in the Cloud
Speaker: Tople Shruti Shrikant, Dean’s Graduate Award winner (AY2017/2018 Sem1)
Abstract: 
Use of cloud services is becoming popular, with users uploading terabytes of data every day. The state-of-the-practice method to secure this data is encryption. But encryption alone is not enough. As cloud services offer complex functionalities at scale, my research raises several fundamental questions that are important for ensuring practical privacy in the cloud. Concretely: 1) Can we compute on encrypted data in real time? 2) What are the limits of defenses that hide side channels appearing in encrypted computation techniques? 3) Can we design an ideally efficient side-channel defense for hiding the specific data access patterns that arise in a large class of applications?
In this talk, I will present various tools that I have developed in my research that answer the above questions and enable practical privacy in the cloud. My first work enables practical arbitrary computation on encrypted data by switching between efficient cryptographic schemes with minimum trust in software. This work opens a new direction in the area of encrypted computation by bridging the gap between two independent lines of approach --- cryptographic primitives and trusted computing. Next, I will present an intractability result for hiding side channels that leak information in encrypted computation. Lastly, I will show a construction that achieves ideal efficiency (constant latency) for hiding data access patterns in the read-only class of applications.

Title: Quantum Communication Using Coherent Rejection Sampling
Speaker: Anurag Anshu, Dean’s Graduate Award winner (AY2017/2018 Sem1)
Abstract: 
Compression of a message up to the information it carries is key to many tasks in classical and quantum information theory. Schumacher [B. Schumacher, Phys. Rev. A 51, 2738 (1995)] provided one of the first quantum compression schemes, and several more general schemes have been developed since [M. Horodecki, J. Oppenheim, and A. Winter, Commun. Math. Phys. 269, 107 (2007); I. Devetak and J. Yard, Phys. Rev. Lett. 100, 230501 (2008); A. Abeyesinghe, I. Devetak, P. Hayden, and A. Winter, Proc. R. Soc. A 465, 2537 (2009)]. However, the one-shot characterization of these quantum tasks is still under development, and often lacks a direct connection with analogous classical tasks. Here we show a new technique for the compression of quantum messages with the aid of entanglement. We devise a new tool that we call the convex split lemma, which is a coherent quantum analogue of the widely used rejection sampling procedure in classical communication protocols. As a consequence, we exhibit new explicit protocols with tight communication cost for quantum state merging, quantum state splitting, and quantum state redistribution (up to a certain optimization in the latter case). We also present a port-based teleportation scheme which uses fewer ports when information about the input is available.

Based on a joint work with Vamsi Krishna Devabathini and Rahul Jain. https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.119.120506

11-Apr-2018

Title: Mining Clinical Data
Speaker: Vaibhav Rajan, Assistant Professor, Department of Information Systems and Analytics
Abstract:
Clinical data analysis poses several modeling challenges that arise due to data heterogeneity, temporality, sparsity, bias and noise. I will outline these challenges in the context of identifying patients at risk of developing complications in hospitals, and present two projects.

Nursing notes contain regular and valuable assessments of patients' condition but often have inconsistent abbreviations and lack the grammatical structure of formal documents, thereby making automated analysis difficult. We design a new approach that effectively utilizes the structure of the notes, is robust to inconsistencies in the text and surpasses the accuracy of previous methods.

Healthcare data often contains heterogeneous datatypes that exhibit complex feature dependencies. Our algorithm for dependency clustering uses copulas to effectively model a wide range of dependencies and can fit mixed -- continuous and ordinal -- data. It scales linearly with size and quadratically with dimensions of input data, which is significantly faster than state-of-the-art correlation clustering methods for mixed data.

I'll conclude with a summary of my current research.

AY2017/2018 Semester 1

30-Aug-2017            

Title: Analysis of Source Code and Binaries for Vulnerability Detection and Patching
Speaker: Abhik Roychoudhury, Professor, Department of Computer Science
Abstract: Because source code is often unavailable for parts of a software system, analysis methods that work on both source code and binaries are valuable. We have studied vulnerability detection techniques that work on both. Our detection techniques combine the essential ingredients of various aspects of fuzz testing: model-based blackbox fuzzing, coverage-based greybox fuzzing, and symbolic-execution-based whitebox fuzzing. Apart from detecting security vulnerabilities, these capabilities can also be used for reproducing crashes from crash reports or clustering "similar" crashes. Finally, we have also studied methods for automated program repair, where vulnerability patch suggestions can be generated automatically.
All of our fuzz testing and patching techniques have been evaluated at scale on well-known systems, for example by detecting vulnerabilities in real-life applications such as Adobe Acrobat Reader and Windows Media Player.
The talk will also provide a glimpse into the growing field of semantic program repair and its applications, which was started at NUS and has been gaining traction ever since.
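
The coverage-guided greybox fuzzing mentioned above can be sketched in a few lines. This is a generic toy loop, not the group's actual tooling; `run_with_coverage` and the single bit-flip mutator are illustrative placeholders for instrumented execution and a real mutation engine.

```python
import random

def mutate(data: bytes) -> bytes:
    """Flip one random bit; real fuzzers use many mutation operators."""
    if not data:
        return bytes([random.randrange(256)])
    i = random.randrange(len(data))
    return data[:i] + bytes([data[i] ^ (1 << random.randrange(8))]) + data[i + 1:]

def coverage_guided_fuzz(run_with_coverage, seeds, iterations=1000):
    """Toy greybox loop: keep mutants that reach new coverage, record crashes.
    run_with_coverage(data) returns the set of branch ids the input covers."""
    corpus = list(seeds)
    seen = set()
    for s in corpus:
        seen |= run_with_coverage(s)
    crashes = []
    for _ in range(iterations):
        child = mutate(random.choice(corpus))
        try:
            cov = run_with_coverage(child)
        except Exception:        # a crash is the interesting outcome
            crashes.append(child)
            continue
        if cov - seen:           # new coverage: keep this input for later mutation
            seen |= cov
            corpus.append(child)
    return corpus, crashes
```

Blackbox fuzzing drops the coverage feedback, while whitebox fuzzing replaces random mutation with symbolic execution to solve for inputs that reach new branches.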

6-Sep-2017

Title: Continuing Moore’s Law: Challenges and Opportunities in Computer Architecture
Speaker: Trevor Erik Carlson, Assistant Professor, Department of Computer Science
Abstract: Ever faster, cheaper mobile phones (as well as other computing devices) are what consumers have come to expect from technology. But given two recent trends in technology scaling (today's chips are limited by power and cost because scaling has slowed significantly), it is widely expected that scaling will no longer provide significant help in building these faster devices. Does this spell the end of computing as we know it? Will computers stop getting faster?

As silicon technology improvements have slowed, research into alternative technologies has increased. Nevertheless, these technologies could still take decades to reach the performance and cost of current CMOS. One solution to the problem of slowing technology scaling is to adapt the computer's architecture to use the transistors we have more efficiently. This is the main focus of our research.

To enable a variety of new applications (AR, VR, machine learning, etc.) while still providing longer battery life and higher performance, we need to pursue innovative architectural directions. To this end, our research focuses on building the general-purpose (programmable) processors and accelerators that are now a necessity for these new applications. In this talk, I will present some recent developments in computer architecture that move us closer to that goal, and discuss some critical challenges (and potential solutions) that we will need to address in the coming years.

13-Sep-2017

Title: Learning From Multiple Social Networks for Research And Business: A PhD Journey
Speaker: Aleksandr Farseev, Dean’s Graduate Award winner (AY2016/2017 Sem2)
Abstract:
The Web has changed drastically over the past decade, which has seen exponential growth in social networking services. The reason for this growth is that social media users concurrently produce and consume data. In this context, millions of users, who follow different lifestyles and belong to different demographic groups, regularly contribute multi-modal data to various online social networks, such as Twitter, Facebook, Foursquare, Instagram, and Endomondo. Traditionally, social media users are encouraged to complete their profiles by explicitly providing personal attributes such as age, gender, and interests (the individual user profile). Additionally, users are likely to join interest-based groups devoted to various topics (the group user profile). Such information is essential for many applications but, unfortunately, is often not publicly available. This gives rise to automatic user profiling, which aims to automatically infer users' hidden information from observable information such as an individual's behavior or utterances. The talk focuses on user profiling across multiple social networks in different application domains.
20-Sep-2017

Title: Adapting User Technologies: Bridging Designers, Machine Learning and Psychology through Collaborative, Dynamic, Personalized Experimentation
Speaker: Joseph Jay Williams, Assistant Professor, Department of Information Systems and Analytics
Abstract:
Enhancing people's real-world learning and thinking is a challenge for HCI and psychology, while AI aims to build systems that can behave intelligently in the real world. This talk presents a framework for redesigning the everyday websites people interact with to function as: (1) intelligent adaptive agents that implement machine learning algorithms to dynamically discover how to optimize and personalize people’s learning and reasoning; (2) micro-laboratories for psychological experimentation and data collection.

I present an example of how this framework is used to create “MOOClets” that embed randomized experiments into real-world online educational contexts – like learning to solve math problems. Explanations (and experimental conditions) are crowdsourced from learners, teachers and scientists. Dynamically changing randomized experiments compare the learning benefits of these explanations in vivo with users, continually adding new conditions as new explanations are contributed.

Algorithms (for multi-armed bandits, reinforcement learning, Bayesian Optimization) are used for real-time analysis (of the effect of explanations on users’ learning) and optimizing policies that provide the explanations that are best for different learners. The framework enables a broad range of algorithms to discover how to optimize and personalize users’ behavior, and dynamically adapt technology components to trade off experimentation (exploration) with helping users (exploitation).
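
One of the simplest algorithms in the family described above is Thompson sampling for a Bernoulli multi-armed bandit. The sketch below is hypothetical (invented function names and success rates, not the MOOClet platform's code): each "arm" is an explanation, and a reward is a learner answering correctly after seeing it.

```python
import random

def thompson_choose(successes, failures):
    """Sample each arm's Beta posterior; pick the arm with the largest draw."""
    draws = [random.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda i: draws[i])

def run_experiment(true_rates, rounds=2000, rng_seed=0):
    """Simulate showing one explanation per learner and updating the chosen
    arm's posterior with whether the learner then answered correctly."""
    random.seed(rng_seed)
    k = len(true_rates)
    succ, fail = [0] * k, [0] * k
    for _ in range(rounds):
        arm = thompson_choose(succ, fail)
        if random.random() < true_rates[arm]:   # learner answered correctly
            succ[arm] += 1
        else:
            fail[arm] += 1
    return succ, fail
```

Over time most traffic shifts to the better explanation (exploitation), while the worse one still receives occasional exploratory trials, which is exactly the trade-off described above.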

Bio: Joseph Jay Williams is an Assistant Professor at the National University of Singapore's School of Computing, department of Information Systems & Analytics. He was previously a Research Fellow at Harvard's Office of the Vice Provost for Advances in Learning, and a member of the Intelligent Interactive Systems Group in Computer Science. He completed a postdoc at Stanford University in the Graduate School of Education in Summer 2014, working with the Office of the Vice Provost for Online Learning and the Open Learning Initiative. He received his PhD from UC Berkeley in Computational Cognitive Science, where he applied Bayesian statistics and machine learning to model how people learn and reason. He received his B.Sc. from University of Toronto in Cognitive Science, Artificial Intelligence and Mathematics, and is originally from Trinidad and Tobago. More information about his research and papers is at www.josephjaywilliams.com.

4-Oct-2017

Title: Improving Medication Compliance: How CS Can Help
Speaker: Ooi Wei Tsang, Associate Professor, Department of Computer Science
Abstract:
Medication compliance refers to the degree to which a patient accurately follows medical advice given by healthcare professionals, including whether they take medication as prescribed, at the right dosage, and at the right time. Compliance is challenging for children and young adult patients who need long-term medication, due to their lifestyle and the need to balance study, social activities, and possibly work. This talk aims to (i) highlight the importance of the problem and the challenges that patients face, (ii) review some existing work in the computing literature that addresses this problem, and (iii) identify some open research challenges towards improving medication compliance that involve computer networking, sensors, multimedia-multimodal data, AI, and HCI research.
11-Oct-2017

Title: Introduction to blockchain and cryptocurrency research
Speaker: Luu The Loi, Dean’s Graduate Award winner (AY2016/2017 Sem2)
Abstract: Cryptocurrencies, such as Bitcoin, Ethereum and 250 similar alt-coins, embody at their core a blockchain protocol: a mechanism that lets an open, decentralized network, even one containing malicious nodes, periodically agree on a set of new transactions. Two of the most popular cryptocurrencies, Bitcoin and Ethereum, support the ability to encode rules or scripts for processing transactions. This feature has evolved to give practical shape to the idea of smart contracts: full-fledged programs that run on blockchains. Recently, Ethereum’s smart contract system has seen steady adoption, supporting millions of contracts holding billions of dollars' worth of virtual coins.
In this talk I will give a brief introduction to blockchain and smart contract research. I will also discuss a few interesting applications and research papers in this space. The talk concludes with open research problems that the community is focusing on.

Title: Bounds on Distributed Information Spreading in Networks with Latencies
Speaker: Suman Sourav, PhD Student, Department of Computer Science
Abstract: Consider the problem of disseminating information (broadcast) in a large-scale distributed system: one (or more) nodes in a network have information that they want to share/aggregate/reconcile with others. Classic examples include distributed database replication, sensor network data aggregation, and P2P publish-subscribe systems. We study the performance of these distributed systems under the gossip protocol, in which a node is restricted to communicate with only one other neighboring node per round, and show both theoretical upper and lower bounds for the case where networks have arbitrary varying latencies.

The network is modeled as a weighted graph, where the network nodes are represented by the vertices, network links by the graph edges and the link latencies by the edge weights. We define a parameter called the weighted conductance and choose a particular latency as the critical latency for the graph. The weighted conductance characterizes how well connected the graph is with respect to the critical latency. We show that this weighted conductance provides an accurate characterization of connectedness by showing that the time required for information spreading has a tight dependence on the weighted conductance.

We view our results as a step towards a more accurate characterization of connectivity in networks with delays, and we believe that the metric can prove useful in solving numerous other graph problems.
In this talk, I will briefly share the motivation, the possible impact, the current solutions we have, and the research opportunities for the problem.
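
To make the latency model concrete: if the one-neighbour-per-round restriction were dropped and every informed node immediately forwarded the rumour to all neighbours, then the time at which each node learns it would simply be its weighted shortest-path distance from the source, as in the sketch below. The gossip restriction studied in the talk is precisely what makes the true spreading time deviate from this easy baseline.

```python
import heapq

def flood_broadcast_times(adj, source):
    """Earliest time each node hears the rumour under unrestricted flooding:
    a message sent over an edge of latency w arrives w time units later.
    adj maps a node to a list of (neighbour, latency) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > dist[u]:              # stale heap entry
            continue
        for v, w in adj[u]:
            if t + w < dist.get(v, float('inf')):
                dist[v] = t + w
                heapq.heappush(heap, (t + w, v))
    return dist
```

This is Dijkstra's algorithm in disguise; under gossip, by contrast, an informed node with many uninformed neighbours must serialize its pushes, which is where the weighted conductance enters the analysis.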

25-Oct-2017

Title: Making Software Secure: Hardening & Analysis
Speaker: Roland Yap, Associate Professor, Department of Computer Science
Abstract: Software plays a critical role in everyday life, from both a personal and an enterprise/government standpoint. Unfortunately, it is common that much critical software suffers from vulnerabilities, partly because such software is usually written in, or has components in, unsafe languages such as C and C++. An important question, then, is how to protect ourselves from the inevitable bugs.
This talk looks at two important ingredients to address this critical problem.
Firstly, how to harden real-world low-level C/C++ code: making it safer while preserving its essential properties, for example by finding and preventing memory errors, type confusion, and undefined behaviour. Some of these research directions build on extending existing work on low-fat pointers, a state-of-the-art defence mechanism against buffer overflows.
Another direction is how to find such errors. Symbolic execution is the main method used to analyse the behaviour of programs without test cases, because it can simulate program execution in a general fashion. Symbolic execution brings the challenge of how to solve the constraints used to model programs effectively, e.g. string operations such as regular expression matching, how to deal with the heap, etc. Such analysis can also go hand in hand with optimizing and improving the code hardening.

Title: Interpretable Machine Learning for User Friendly, Healthy Interventions
Speaker: Brian Lim, Assistant Professor, Department of Computer Science
Abstract: Advances in artificial intelligence, sensors and big data management have far-reaching societal impacts. These systems augment our everyday lives and can provide healthy interventions to improve our behaviors. AI-driven systems can help consumers directly, such as by recognizing and recommending healthy foods, or indirectly, by generating insights from data analytics to help drive policy decisions for urban populations. However, it is becoming increasingly important for people to understand these systems and remain in control. As we employ more sophisticated sensors and more accurate machine learning models, how can we gain users’ trust in, and understanding of, these applications?
In this talk, I will give an overview of my group’s research into building AI-based, user-centered, and explainable applications spanning healthcare disease risk prediction, mobile food recognition logging, public health fitness tracking, context-aware interruption management, and urban mobility. We employ methods from human-computer interaction and machine learning to (i) elicit requirements from target users, (ii) develop deployable hardware prototypes and software interfaces, and (iii) evaluate impact on real users in lab and field studies.

1-Nov-2017

Title: Data Privacy in Machine Learning
Speaker: Reza Shokri, Assistant Professor, Department of Computer Science
Abstract: I will talk about what machine learning privacy is, and will discuss how and why machine learning models leak information about the individual data records on which they were trained. My quantitative analysis will be based on the fundamental membership inference attack: given a data record and (black-box) access to a model, determine whether the record was in the model's training set. I will demonstrate how to build such inference attacks on different classification models, e.g., those trained by commercial "machine learning as a service" providers such as Google and Amazon. Website: http://www.shokri.org
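
The attack described above can be sketched in its simplest form as a confidence-threshold test. This is an illustrative reduction, not the talk's actual method: the published attacks train "shadow models" to learn the member/non-member decision boundary rather than using a fixed threshold like the 0.9 below.

```python
def infer_membership(model_confidence, record, threshold=0.9):
    """Black-box membership inference by confidence thresholding: models are
    typically more confident on records they were trained on, so guess
    'member' when the target model's confidence on the record is high.
    model_confidence(record) is the model's confidence in its prediction;
    the 0.9 threshold is purely illustrative."""
    return model_confidence(record) >= threshold
```

For an overfit model, e.g. one that effectively memorizes its training set, training records yield confidence near 1 and are flagged as members, which is exactly the leakage the talk quantifies.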

8-Nov-2017

Title: Analyzing Filamentary Structured Objects in Biomedical Images: Segmentation, Tracing, and Synthesis
Speaker: Cheng Li, Adjunct Assistant Professor, Department of Computer Science
Abstract: Filamentary structured objects are abundant in biomedical images, such as neuronal images, retinal fundus images, and angiography, to name a few.
In this talk, we will discuss our recent research efforts in addressing the tasks of segmentation, tracing, and synthesis for such images. More details can be found at our project website https://web.bii.a-star.edu.sg/archive/machine_learning/Projects/filaStructObjs/project.htm.

AY2016/2017 Semester 2

25-Jan-2017

Title: Transparency & Discrimination in Big Data Systems
Speaker: Yair Zick, Assistant Professor, Department of Computer Science
Abstract: Big data and machine learning techniques are being increasingly used to make decisions about important, often sensitive, aspects of our lives; these include healthcare, finance and law enforcement. These algorithms often learn from data; for example, they might try to predict someone's income levels based on various features, such as their age, salary or marital status. These algorithms are often very, very good at their job (hence their popularity): they are able to process a huge amount of data and offer accurate predictions that would have otherwise been made by human decision makers with only very partial, biased data (and would certainly require much more time). It is often thought that algorithms are unbiased, in the sense that they do not hold any prior opinions that affect their decisions. In particular, we would not like our algorithms to base their predictions on sensitive features - such as ethnicity or gender.

So, did a big data algorithm base its decisions on "protected" user features? The problem is that in many cases it is very hard to tell: big data algorithms are often extremely complex, so we cannot be sure whether an algorithm used a protected feature (say, gender), or based its prediction on a correlated input.

Our research aims at developing formal methods that offer some transparency into the way algorithms use their inputs. Using tools from game theory, formal causality analysis and statistics, we offer influence measures that indicate how important a feature was in making a decision about an individual or a protected group. In this talk, I will review some of the latest developments on algorithmic transparency, and its potential impact on interpretable ML.
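
A minimal version of such an influence measure is a randomized intervention: replace one feature with a value drawn from the population and see how often the model's decision flips. The sketch below (invented function and toy data, not the talk's actual measure) captures the idea that a feature the model never consults has zero influence.

```python
import random

def influence(predict, dataset, feature, trials=200, seed=0):
    """Randomized-intervention influence: how often does replacing `feature`
    with a value drawn from the population change the model's decision?"""
    rng = random.Random(seed)
    pool = [row[feature] for row in dataset]   # the feature's marginal
    changed = 0
    for _ in range(trials):
        row = rng.choice(dataset)
        intervened = dict(row)
        intervened[feature] = rng.choice(pool)  # intervene on one feature
        if predict(intervened) != predict(row):
            changed += 1
    return changed / trials
```

If a classifier bases decisions only on a correlated proxy, that proxy, rather than the protected feature itself, shows high influence, which is the distinction such measures are designed to surface.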

1-Feb-2017

Title: The emerging security and privacy issues in the tangled web
Speaker: Jia Yaoqi, Dean’s Graduate Award winner (AY2016/2017 Sem1)
Abstract: The World Wide Web has gradually become an essential part of our daily life in the digital age. With the advent of cloud services and peer-to-peer techniques, new security and privacy issues are emerging in the tangled web. In this talk, I first illustrate how cloud services affect the web/local boundary provided by browsers, and then briefly present the privacy leakage in P2P web overlays as well as solutions using onion routing and oblivious RAM.

First, browsers such as Chrome adopt process-based isolation design to protect “the local system” from “the web”. However, as billions of users now use web-based cloud services (e.g., Dropbox and Google Drive), which are integrated into the local system, the premise that browsers can effectively isolate the web from the local system has become questionable. We argue that if the process-based isolation disregards the same-origin policy as one of its goals, then its promise of maintaining the “web/local system (local)” separation is doubtful. Specifically, we show that existing memory vulnerabilities in Chrome’s renderer can be used as a stepping-stone to drop executables/scripts in the local file system, install unwanted applications and misuse system sensors. These attacks are purely data-oriented and do not alter any control flow or import foreign code. Thus, such attacks bypass binary-level protection mechanisms, including ASLR and in-memory partitioning. Finally, we discuss various full defenses and present a possible way to mitigate the attacks presented.

Second, the web infrastructure used to be a client-server model, in which clients (or browsers) request and fetch web contents such as HTML, JavaScript and CSS from web servers. Recently, peer-to-peer (P2P) techniques (supported by real-time communications, or RTC) have been introduced into the web infrastructure, enabling browsers to communicate directly with each other and form a P2P web overlay. This also brings open, unsolved problems of P2P systems, such as privacy issues, to the new web overlays. We investigate the security and privacy issues in web overlays, and propose solutions to address these issues using cryptographic and hardware primitives such as onion routing and oblivious RAM. First, we present inference attacks on peer-assisted CDNs built on top of web overlays, which can infer a user’s online activities such as browsing history. To thwart such attacks, we propose an anonymous peer-assisted CDN (APAC), which employs onion-routing techniques to conceal users’ identities and uses a region-based circuit selection algorithm to reduce performance overhead. Second, to hide users' online activities (or access patterns) against long-term global analysis, we design an oblivious peer-to-peer content sharing system (OBLIVP2P), which uses new primitives such as distributed ORAM in the P2P setting.

8-Feb-2017

Title: From networked chips to cities
Speaker: Peh Li Shiuan, Provost's Chair Professor, Department of Computer Science
Abstract: As a new faculty member of SoC, I am currently actively scouting for PhD students for my group. This talk is pitched at the students, providing an overview of the kind of research my group has done in the past, and briefly discussing our next steps.
This talk will give an overview of my group’s research, starting from our foray into networks-on-a-chip that enables scalable many-core processors. With many-core processors making their way into mobile devices, providing unprecedented compute power on such devices, we then explore how these powerful mobile devices can enable next-generation applications in smart cities.

15-Feb-2017

Title: On Modeling the Time-Energy Performance of Data-Parallel Applications on Heterogeneous Systems
Speaker: Dumitrel Loghin, Dean’s Graduate Award winner (AY2016/2017 Sem1)
Abstract: The increasing volume of data to be processed leads to an energy usage issue in datacenter computing. Traditionally, datacenters employ homogeneous brawny servers based on x86/64 CPUs which are known to be power-hungry. In contrast, heterogeneous systems combining CPU and GPU cores represent a promising alternative for energy-efficient data-parallel processing. Moreover, the last few years have witnessed a significant performance improvement of low-power, wimpy systems, traditionally used in mobile devices. However, selecting the best configuration in terms of software parameters and system resources is a daunting task because of the very large configuration space exposed by data-parallel frameworks and heterogeneous systems. To alleviate this, we have developed measurement-driven analytic models to determine and analyze suitable system configurations for Hadoop MapReduce, which represents the most popular data-parallel framework.  Using baseline measurements on a single node with small inputs, our models determine the execution time and energy usage on scale-out clusters and workloads. To evaluate the models, we have used two types of systems and five representative MapReduce workloads covering domains such as financial analysis, data mining and simulations. The systems consist of both cloud-based Amazon EC2 instances with discrete GPUs and self-hosted Nvidia Jetson TK1 nodes with integrated GPUs representing brawny and wimpy heterogeneous systems, respectively. Our model-based analysis supports the following key results. Firstly, for both brawny and wimpy systems, we show that heterogeneous clusters consisting of nodes with CPUs and GPUs are almost always more time-energy-efficient than homogeneous clusters with CPU-only nodes. Secondly, we show that multiple wimpy nodes achieve the same time performance as a single brawny node while saving up to 90% of the energy used. 
In contrast with the related work, we are the first to design an energy usage model for MapReduce and to apply this model to analyze the performance of wimpy heterogeneous systems with GPU.
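
The flavour of such measurement-driven analytic models can be conveyed with a deliberately simplified back-of-the-envelope sketch. This is NOT the paper's model; all parameter names and numbers below are invented: time is startup plus input divided by aggregate throughput, and energy is aggregate power times time.

```python
def predict_time_energy(input_gb, nodes, node_rate_gb_per_s, startup_s, node_power_w):
    """Toy analytic scaling model for a data-parallel job:
    time   = startup + input / (nodes * per-node rate)
    energy = nodes * per-node power * time."""
    time_s = startup_s + input_gb / (nodes * node_rate_gb_per_s)
    energy_j = nodes * node_power_w * time_s
    return time_s, energy_j
```

Even this toy model reproduces the wimpy-vs-brawny trade-off highlighted above: eight 10 W nodes running at one-eighth the per-node rate finish a 100 GB job in the same time as one 200 W brawny node while using 60% less energy.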

1-Mar-2017

Title: Real world opportunities for NLP Research to Impact Global Education through MOOCs
Speaker: Kan Min Yen, Associate Professor, Department of Computer Science
Abstract: Massive Open Online Courses (MOOCs) have been heralded as a game-changer, as they have the potential to disseminate the best lectures by top educators to the masses. However, many students who enrol drop out, in part due to the difficulty of finding the motivation to complete the assignments. Part of this is due to the lack of active participation by instructional staff in deliberations in the course, especially in terms of dialogue and discussions with students through the courses' discussion forums.

We leverage natural language processing technologies to better analyse student conversations and identify opportunities for timely instructor intervention that produce better learning outcomes. We discuss how diversity in MOOC offerings has compromised the validity of previously published results, how automatic discourse parsing can improve prediction, and the real problem of bias introduced by the user interface, which affects instructors' decisions to intervene.

We are actively recruiting interested individuals to continue work on these and allied topics.

8-Mar-2017

*Note: venue at Seminar Room 3 (COM1-02-12)

Title: Power Papers -- Some Practical Pointers, Part 1
Speaker: Terence Sim, Associate Professor, Department of Computer Science
Abstract: If I write with the flowery flourish of Shakespeare, but my prose proves problematic, then my words become like a noisy gong or a clanging cymbal.  If I have the gift of mathematical genius and can fathom all theorems, but cannot articulate the arcane, my genius appears no different from madness.  If I achieve breakthrough research that can change the world, but cannot explain its significance, the world gains nothing and I labor in vain.

Writing a good research paper takes effort; more so if there is a page limit.  Yet this skill is required of every researcher, who, more often than not, fumbles his or her way through.  Good grammar is only a start; care and craft must be applied to turn a mediocre paper into a memorable one.  Writing skills can indeed be honed.

In this talk, I will highlight the common mistakes many authors make, and offer practical pointers to pack more punch into your paper. Needless to say, the talk will be biased: I will speak not from linguistic theories, but from personal experience, sharing what has, and has not, worked for me.  Students and staff are all welcome to participate: your views and insights will certainly benefit us all.

15-Mar-2017

Title: Cache Miss Equation, and Synthetic Dataset Scaling
Speaker: Zhang Jiangwei, Research Achievement Award winner (AY2016/2017 Sem1)
Abstract:

Cache Miss Equation: Science seeks to discover what is forever true of nature. For Computer Science, what can we discover that will be forever true about computation or, at least, immune to changes in technology?  Computation fundamentally requires cycles, memory, bandwidth and time. The memory in a computer system has innumerable caches, and our research on this resource focuses on developing an equation to describe cache misses for all levels of the memory hierarchy. It works for a disk cache, database buffers, garbage-collected heaps, nonvolatile memory and content-centric networking. For more details, please check:  http://www.math.nus.edu.sg/~mattyc/CME.html

Synthetic Dataset Scaling: Benchmarks are ubiquitous in the computing industry and academia. Developers use benchmarks to compare products, while researchers use them to evaluate their work. For 20-odd years, the popular benchmarks for database management systems were the ones defined by the Transaction Processing Performance Council (TPC). However, the small number of TPC benchmarks are increasingly irrelevant to the myriad of diverse applications, and the TPC standardization process is too slow. This led to a proposal for a paradigm shift, from top-down design of domain-specific benchmarks by committee consensus to bottom-up collaboration to develop tools for application-specific benchmarking. A database benchmark must have a dataset. For the benchmark to be application-specific, it must start with an empirical dataset D. This D may be too small or too large for the benchmarking experiment, so the first tool to develop would be for scaling D to a desired size. This motivates the Dataset Scaling Problem (DSP): Given a set of relational tables D and a scale factor s, generate a database state D' that is similar to D but s times its size. For more details, please check: http://www.comp.nus.edu.sg/~upsizer/

In this talk, I will briefly share the motivation, the possible impact, the current solutions we have, and the research opportunities for both problems.
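As a toy illustration of the Dataset Scaling Problem (not the Upsizer solution itself), naive row replication scales a single table by an integer factor s while keeping per-column value frequencies; the table and column names below are hypothetical, and preserving key/foreign-key relationships and cross-column correlations is exactly the hard part the real work addresses:

```python
# Toy sketch of the Dataset Scaling Problem: scale a table to s times
# its size by replicating rows and renumbering the primary key.
# This preserves marginal value frequencies but NOT foreign-key
# relationships or cross-column correlations.
D = [  # hypothetical empirical table: (order_id, customer, amount)
    (1, "alice", 30),
    (2, "bob", 45),
    (3, "alice", 12),
]

def scale(table, s):
    """Return a table s times the size of `table` (integer s >= 1)."""
    out = []
    for copy in range(s):
        for (pk, customer, amount) in table:
            # Renumber the key so each replica stays unique.
            out.append((pk + copy * len(table), customer, amount))
    return out

D2 = scale(D, 3)
print(len(D2))   # 9 rows: 3x the original, with unique keys
```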

22-Mar-2017

Title: Computer Vision for Robotics Perception
Speaker: Lee Gim Hee, Assistant Professor, Department of Computer Science
Abstract: Cameras are good sensors for robotic perception compared with the traditionally used Lidar, because they are low-cost and rich in information; however, computer vision algorithms are often computationally expensive and sensitive to noise and outliers. In this talk, I will present my work on making some of these algorithms more efficient and robust, so that robots can perceive the world through cameras.

29-Mar-2017

Title: Hardening Programs Against Software Vulnerabilities AND Constraint Solvers for Problems in Security
Speaker: Roland Yap, Associate Professor, Department of Computer Science
Abstract: The talk covers two distinct but partially related topics. The first, and the main focus of the talk, is preventing the exploitation of software vulnerabilities. Memory bugs are still the main route by which software is attacked. Indeed, one might regard such bugs as inevitable in most of today's complex software written in low-level languages such as C and C++. A long-term strategy, then, is to harden programs so that these bugs cannot be exploited, e.g. to corrupt the stack. There are many kinds of memory errors; the best known are perhaps spatial and temporal errors. I will talk about a research direction that opens up the area, from simple to complex kinds of program hardening. For students interested in knowing a bit more beforehand, a recent paper at NDSS 2017 on protecting stack objects is:

Stack Object Protection with Low Fat Pointers https://www.internetsociety.org/events/ndss-symposium/ndss-symposium-2017/ndss-2017-programme/ndss-2017-session-10-software-and

The second topic, which I will touch on more briefly, is research on constraint solving. Constraint solving has broad applicability, ranging from theoretical computer science to verification to security. I will mention some problems in constraints, with links to verification and security.

5-Apr-2017

Title: Analyzing the Behaviors of Articulated Objects in 3D: Applications to Humans and Animals
Speaker: Cheng Li, Adjunct Assistant Professor, Department of Computer Science
Abstract: Recent advances in depth cameras have opened the door to many interesting applications. In this talk, I will discuss our research efforts toward addressing the related tasks of pose estimation, tracking, and action and behavior analysis for a range of articulated objects (the human upper body, the human hand, fish, mice) using such 3D cameras. In particular, I will talk about our recent Lie-group-based approach, which enables us to tackle these problems under a unified framework. Looking forward, the results could be applied to everyday scenarios such as natural user interfaces, behavior analysis and surveillance, and gaming, among others.