Date/Time: Wednesdays, 4.00pm
Venue: Via Zoom
The seminar series organised by the SoC Graduate Studies’ Office involves research talks given by senior PhD students, faculty members and industry partners.
Calendar of Talks
The slides/materials used for the talks can be found here.
AY2020/2021 Semester 2
27-Jan-2021 |
Title: Fast and Secure Processing on the Edge: Efficiently Protecting Your Personal Data Footprint Speaker: Trevor E. Carlson, Assistant Professor, Department of Computer Science Abstract: Edge computing is a key technology that aims to enable both fast and efficient local data processing, where your requests are serviced by local, efficient endpoints instead of distant cloud services. But maintaining systems that are both physically secure and efficient can be extremely difficult, as modern solutions tend to degrade performance as security is added. With the end of Dennard scaling and Moore’s Law, addressing these concerns requires moving beyond technology-only solutions to cross-stack solutions spanning the compiler, architecture and circuit design. In this presentation, we present our recent work in physical security and side-channel security, as well as privacy protections, to bring efficient AI and data processing to the edge. Our recently released, open-source LABS hardware protection framework, together with new initiatives in security and high-performance processing, aims to provide the foundation for a larger secure processor design that incorporates efficient privacy and security protections: physical, timing side-channel, software, and others. |
3-Feb-2021 | Title: Hardening and Defences to Make Software Secure Speaker: Roland Yap, Associate Professor, Department of Computer Science Abstract: A large fraction of software in common use is insecure. By this, we mean that any sufficiently complex software is likely to have security bugs. This is exacerbated by the fact that a large percentage of the critical software stack is written in low-level languages. In this talk, we will discuss why the state of software security is poor, and the challenges and tradeoffs faced by real-world software. We will then discuss approaches to help address this situation, ranging from analysis and hardening defences (including sanitizers) to safer programming languages. |
17-Feb-2021 |
Title: Towards Generating Human-like Deep Questions |
3-Mar-2021 | Title: GPU-accelerated Graph Processing Speaker: Sha Mo, Dean’s Graduate Award Winner Abstract: Graph processing is of vital significance for investigating complex relationships and mining underlying knowledge in different fields. The rapidly increasing scale of problems and the strict requirement for real-time solutions have drastically raised interest in the area of high-performance graph processing. Specifically, graph processing on graphics processing units (GPUs) has recently attracted a great deal of attention in both industry and academia due to the GPU's enormous potential for boosting the graph processing efficiency as an accelerator. Although prior studies migrate various graph applications to GPUs and demonstrate a significantly boosted graph processing performance achieved on GPUs, many challenges still impede the popularization of GPUs as a graph processing accelerator in broader application scenarios. Therefore, in this talk, we aim to propose a systematic solution to make GPU-accelerated graph processing more practical, i.e., to develop a hardware-agnostic graph processing system that handles large-scale and dynamic graphs. |
AY2020/2021 Semester 1
26-Aug-2020 |
Title: New algorithms for efficient statistical inference Speaker: Arnab Bhattacharyya, Assistant Professor, Department of Computer Science Abstract: I will describe some of our recent work on computational problems arising in statistics and causal inference. The talk consists of three parts. (1) In the first part, we discuss efficient distance approximation algorithms for several popular classes of structured high-dimensional distributions, such as Bayes networks, Ising models, and multivariate Gaussians. Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. [Joint work with Sutanu Gayen, Kuldeep Meel, and N.V. Vinodchandran] (2) In the second part, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. Our algorithms achieve sample complexity that scales with the model sparsity instead of the dimension. For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and a Lasso penalty term. [Joint work with Rathin Desai, Sai Nagarajan, and Ioannis Panageas] (3) In the third part, we consider testing independence among a set of variables where the samples arrive as a stream. Improving upon past work by Indyk and McGregor (SODA '08) and Braverman et al. (STOC '10, STACS '10), we give new algorithms with improved space complexity bounds for approximating the distance between the input joint distribution on n variables and the product distribution of its marginals. [Joint work with Rathin Desai, Yi Li, and David P. Woodruff] |
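For part (2), the estimator described in the abstract has a compact form. As a sketch (notation mine: L̂ denotes the finite-population negative log-likelihood on the truncated samples, λ the penalty weight):

```latex
\hat{w} \;\in\; \arg\min_{w}\; \hat{L}(w) \;+\; \lambda \,\lVert w \rVert_{1}
```

The \ell_1 term is the Lasso penalty; it is what lets the sample complexity scale with the sparsity of w rather than with the ambient dimension.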
2-Sep-2020 |
Title: Fast and Accurate Deep Neural Network Training |
9-Sep-2020 | Title: End-to-End Advanced Data Analytics Speaker: Ooi Beng Chin, Distinguished Professor, Department of Computer Science Abstract: Big Data, Data Science and Data-driven insights have drawn many into data management and analytics. Instead of focusing on particular modules or functionalities and leaving it to others to integrate these into useful systems, it is important to build end-to-end solutions, from data cleaning, through data curation with human-in-the-loop (crowdsourcing) and big data processing, all the way to complex (machine learning and deep learning based) data analytics. As a systems researcher, I have to work on both algorithmic research and system development. In this talk, I shall briefly step through some of my work over the years, and highlight some common techniques shared across various data processing systems. I shall also share my experience as a database system researcher. |
16-Sep-2020 |
Title: Voice-based Interactions for Editing Text On the Go Speaker: Debjyoti Ghosh, Dean’s Graduate Award winner Abstract: Towards an envisioned interaction paradigm where computing is more seamlessly integrated with the users' everyday mobility, voice-based interaction is likely to play a pivotal role: speaking is a natural form of human communication, and it offers an untethered, device-independent channel of communication with a computing interface. Additionally, the interaction is hands- and eyes-free, leaving users free to engage in other tasks. Yet, to lower the interaction burden when interactions are embedded into the users' everyday mobility and activities, the traditional (existing) interaction vocabulary for on-the-go (mobile) interactions needs to be redesigned under the new paradigm. To this end, the talk presents voice-based and multimodal interaction techniques for text input/editing as an everyday mobile computing task, for both eyes-free interfaces and heads-up computing-based interfaces such as Augmented Reality Smart Glasses (ARSG), which better support flexible and natural interactions consistent with the users' mobility needs. |
AY2019/2020 Semester 2
29-Jan-2020 |
Title: Advantages and Risks of Sensing for Cyber-Physical Security Speaker: Han Jun, Assistant Professor, Department of Computer Science Abstract: With the emergence of the Internet-of-Things (IoT) and Cyber-Physical Systems (CPS), we are witnessing a wealth of exciting applications that enable computational devices to interact with the physical world via an overwhelming number of sensors and actuators. However, such interactions pose new challenges to traditional approaches to security and privacy. In this talk, I will present how I utilize sensor data to provide security and privacy protections for IoT/CPS scenarios, and further introduce novel security threats arising from similar sensor data. Specifically, I will highlight three of my recent projects that leverage sensor data for defense and attack scenarios in applications such as smart homes and semi-autonomous vehicles. Furthermore, I will introduce my future research directions, such as identifying and defending against unforeseen security challenges in newer application domains such as smart vehicles, buildings, and cities. |
5-Feb-2020 *Note: Venue at Seminar Room 2 (COM1-02-04) |
Title: Power Papers -- Some Practical Pointers (Part 1) Speaker: Terence Sim, Associate Professor, Department of Computer Science Abstract: Writing a good research paper takes effort; more so if there is a page limit. Yet this skill is required of every researcher, who, more often than not, fumbles his or her way through. Good grammar is only a start; care and craft must be applied to turn a mediocre paper into a memorable one. Writing skills can indeed be honed. In this reprise talk, I will highlight the common mistakes many researchers make, and offer practical pointers to pack more punch into your paper. Needless to say, the talk will be biased: I will speak not from linguistic theories, but from personal experience, sharing what has, and has not, worked for me. Students and staff are all welcome; your views and insights will certainly benefit us all. |
12-Feb-2020 *Note: Venue at Seminar Room 2 (COM1-02-04) |
Title: Power Papers -- Some Practical Pointers (Part 2) Speaker: Terence Sim, Associate Professor, Department of Computer Science Abstract: If I write with the flowery flourish of Shakespeare, but my prose proves problematic, then my words become like a noisy gong or a clanging cymbal. If I have the gift of mathematical genius and can fathom all theorems, but cannot articulate the arcane, my genius appears no different from madness. If I achieve breakthrough research that can change the world, but cannot explain its significance, the world gains nothing and I labor in vain. Writing a good research paper takes effort; more so if there is a page limit. Yet this skill is required of every researcher, who, more often than not, fumbles his or her way through. Good grammar is only a start; care and craft must be applied to turn a mediocre paper into a memorable one. Writing skills can indeed be honed. In this reprise talk, I will highlight the common mistakes many researchers make, and offer practical pointers to pack more punch into your paper. Needless to say, the talk will be biased: I will speak not from linguistic theories, but from personal experience, sharing what has, and has not, worked for me. Students and staff are all welcome; your views and insights will certainly benefit us all. |
19-Feb-2020 | Title: Privacy at the intersection of trustworthy machine learning (robustness and interpretability) Speaker: Reza Shokri, Assistant Professor, Department of Computer Science Abstract: Machine learning algorithms have shown an unprecedented predictive power for many complex learning tasks. As they are increasingly being deployed in large scale critical applications for processing various types of data, new questions related to their trustworthiness would arise. Can machine learning algorithms be trusted to have access to individuals’ sensitive data? Can they be robust against noisy or adversarially perturbed data? Can we reliably interpret their learning process, and explain their predictions? In this talk, I will go over the challenges of building trustworthy yet privacy-preserving machine learning algorithms in centralized and distributed (federated) settings, and will discuss the inter-relation between privacy, robustness, and interpretability. |
4-Mar-2020 | Title: Learning Visual Attributes for Discovery of Actionable Media Speaker: Francesco Gelli, Dean’s Graduate Award winner (AY2019/2020 Sem1) Abstract: Since the advent of social media, it has become common practice for marketers to browse user-generated content on social networking websites to discover actionable media that resonates with the brand identity and is likely to engage a target audience. Because there is still no concrete understanding of what the visual attributes of actionable media are, we formalize the task of discovering actionable media and investigate the role of three different classes of attributes. By learning generic visual attributes, brand attributes and user attributes, we achieve higher performance and provide a better understanding of the properties of actionable media. We design a discovery framework that integrates the three classes of attributes and addresses the challenges arising from the subjective nature of visual actionability. Our comprehensive set of experiments and visualizations confirms that this work is a valuable concrete step toward using AI to discover actionable media for brands. |
11-Mar-2020 | Title: Part 1: Rigorous Verification of Neural Networks, Part 2: How to be a PhD student Speaker: Kuldeep S. Meel, Assistant Professor, Department of Computer Science Abstract: The first part of the talk will focus on a rigorous verification approach for neural networks. Relevant paper: https://teobaluta.github.io/NPAQ/. Last semester, I spent 15 minutes of my talk distilling my observations on what it takes to succeed in a PhD program. Several students found it very helpful, so I plan to do the same again. |
18-Mar-2020 | Title: Corpus-Level End-to-End Exploration for Interactive Systems Speaker: Grace Hui Yang, Visiting Associate Professor, Georgetown University Abstract: A core interest in building Artificial Intelligence (AI) agents is to let them interact with and assist humans. One example is Dynamic Search (DS), which models the process in which a human works with a search engine agent to accomplish a complex and goal-oriented task. Early DS agents using Reinforcement Learning (RL) have achieved only limited success due to (1) their lack of direct control over which documents to return and (2) the difficulty of recovering from wrong search trajectories. In this paper, we present a novel corpus-level end-to-end exploration (CE3) method to address these issues. In our method, an entire text corpus is compressed into a global low-dimensional representation, which enables the agent to gain access to the full state and action spaces, including the under-explored areas. We also propose a new form of retrieval function, whose linear approximation allows end-to-end manipulation of documents. Experiments on the Text REtrieval Conference (TREC) Dynamic Domain (DD) Track show that CE3 outperforms the state-of-the-art DS systems. |
25-Mar-2020 | Title: Finding Fair and Efficient Allocations When Valuations Don't Add Up Speaker: Yair Zick, Assistant Professor, Department of Computer Science Abstract: In this paper, we present new results on the fair and efficient allocation of indivisible goods to agents that have monotone, submodular, non-additive valuation functions over bundles. Despite their simple structure, these agent valuations are a natural model for several real-world domains. We show that, if such a valuation function has binary marginal gains, a socially optimal (i.e. utilitarian social welfare-maximizing) allocation that achieves envy-freeness up to one item (EF1) exists and is computationally tractable. We also prove that the Nash welfare-maximizing and the leximin allocations both exhibit this fairness-efficiency combination, by showing that they can be achieved by minimizing any symmetric strictly convex function over utilitarian optimal outcomes. To the best of our knowledge, this is the first valuation function class not subsumed by additive valuations for which it has been established that an allocation maximizing Nash welfare is EF1. Moreover, for a subclass of these valuation functions based on maximum (unweighted) bipartite matching, we show that a leximin allocation can be computed in polynomial time. |
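The EF1 criterion at the heart of this abstract has a simple operational reading: agent i may envy agent j only if the envy survives the removal of every single item from j's bundle. A minimal checker as a sketch (the additive demo valuation at the end is an illustrative assumption; the talk's valuations are non-additive):

```python
def is_ef1(allocation, value):
    """Check envy-freeness up to one item (EF1).

    allocation: dict mapping each agent to a frozenset of items
    value: value(agent, bundle) -> number; may be any monotone,
           possibly non-additive, valuation as in the talk
    """
    for i in allocation:
        mine = value(i, allocation[i])
        for j in allocation:
            if i == j or not allocation[j]:
                continue
            # Envy is a violation only if it persists after dropping
            # every possible single item g from j's bundle.
            if all(mine < value(i, allocation[j] - {g}) for g in allocation[j]):
                return False
    return True

# Illustrative additive valuation, for the demo only
vals = {"a": {1: 3, 2: 1, 3: 1}, "b": {1: 1, 2: 2, 3: 2}}
value = lambda agent, bundle: sum(vals[agent][g] for g in bundle)
alloc = {"a": frozenset({1}), "b": frozenset({2, 3})}
print(is_ef1(alloc, value))  # True: any envy vanishes after removing one item
```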
1-Apr-2020 |
Title: Systems Design in the Post-Moore’s Law Era |
8-Apr-2020 | Title: Human-imperceptible Privacy Protection Against Machines Speaker: Shen Zhiqi, Research Engineer, Department of Computer Science Abstract: Privacy concerns with social media have recently been under the spotlight, due to a few incidents of user data leakage on social networking platforms. With the current advances in machine learning and big data, computer algorithms often act as a first-step filter for privacy breaches, by automatically selecting content with sensitive information, such as photos that contain faces or vehicle license plates. In this paper we propose a novel algorithm to protect sensitive attributes against machines, while keeping the changes imperceptible to humans. In particular, we first conducted a series of human studies to investigate the factors that influence human sensitivity to visual changes. We discover that human sensitivity is influenced by multiple factors, from low-level features such as illumination and texture to high-level attributes like object sentiment and semantics. Based on our human data, we propose for the first time the concept of a human sensitivity map. With the sensitivity map, we design a human-sensitivity-aware image perturbation model, which is able to modify the computational classification results of sensitive attributes while preserving the remaining attributes. Experiments on real-world data demonstrate the superior performance of the proposed model on human-imperceptible privacy protection. |
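A rough sketch of the general idea of sensitivity-aware perturbation (the sign-step update, the budget value, and all names here are illustrative assumptions, not the paper's actual model):

```python
import numpy as np

def sensitivity_aware_perturb(image, grad, sensitivity_map, eps=0.03):
    """Perturb an image against a sensitive-attribute classifier while
    keeping changes small where humans notice them most.

    image:           HxWxC float array in [0, 1]
    grad:            gradient of the classifier's score w.r.t. the image
    sensitivity_map: HxW array in [0, 1]; 1 = humans notice changes easily
    eps:             global perturbation budget (assumed value)
    """
    # Allow larger changes where humans are *less* sensitive
    per_pixel_budget = eps * (1.0 - sensitivity_map)[..., None]
    # Step against the classifier (FGSM-style sign update)
    perturbation = -np.sign(grad) * per_pixel_budget
    return np.clip(image + perturbation, 0.0, 1.0)
```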
AY2019/2020 Semester 1
21-Aug-2019 |
Title: Quantum Monte Carlo |
28-Aug-2019 | Title: Benefits and Risks of Sensing for Emerging Internet-of-Things Applications Speaker: Han Jun, Assistant Professor, Department of Computer Science Abstract: With the emergence of the Internet-of-Things (IoT) and Cyber-Physical Systems (CPS), we are witnessing a wealth of exciting applications that enable computational devices to interact with the physical world via an overwhelming number of sensors and actuators. However, such interactions pose new challenges to traditional approaches to security and privacy. In this talk, I will present how we utilize sensor data to provide security and privacy protections for IoT/CPS scenarios, and further introduce novel security threats arising from similar sensor data. Specifically, I will highlight a few of our recent projects that leverage sensor data for defense and attack scenarios in applications such as smart homes, semi-autonomous vehicles, and drone delivery. I will also briefly introduce interesting research problems that I am working on in newer application domains such as smart vehicles, buildings, and cities. |
4-Sep-2019 | Title: Dialog Systems Go Multimodal Speaker: Liao Lizi, Dean’s Graduate Award winner (AY2018/2019 Sem2) Abstract: The next generation of user interfaces aims at intelligent systems that are able to adapt to common forms of human dialogs and hence provide more intuitive and natural ways of interaction. This ambitious goal, however, poses new challenges for the design and implementation of such systems. First of all, as visual perception is one of the major means of perceiving the environment in addition to text (through speech), it motivates the development of dialog systems with multimodal understanding ability. Second, to make the system “smart” in generating substantive responses, knowledge should be incorporated as a foundation to achieve human-like abilities. In this talk, we aim to discuss how task-oriented dialog systems could go multimodal. Specifically, we investigate the critical issues in multimodal dialog system design and propose a novel multimodal dialog system framework which can be realised as fully-fledged prototype systems. |
11-Sep-2019 | Title: Formal Methods and AI: Yet Another Entanglement Speaker: Kuldeep Singh Meel, Assistant Professor, Department of Computer Science |
18-Sep-2019 | Title: DDoS and Bitcoin Attacks Exploiting Internet Routing Speaker: Kang Minsuk, Assistant Professor, Department of Computer Science Abstract: Knowledge of Internet architecture and inter-domain routing can be extremely useful for mounting strong and stealthy attacks. In this talk, I will present two such recent examples. First, I will discuss a new adaptive link-flooding attack strategy (IEEE S&P 2019), called a detour-learning attack, that can detect any adaptive rerouting defense attempts by victim networks that are under link-flooding attacks, such as Crossfire or Coremelt. We show that under current BGP routing, any adaptive defense is defeated by our adaptive link-flooding attack because the defense, unfortunately, is inherently slower than the attack. In the second part of the talk, I will present our recent, powerful Bitcoin partitioning attack (IEEE S&P 2020), called an Erebus attack. A previous attack by Apostolaki et al. has shown that network adversaries (e.g., ISPs) can perform a BGP prefix hijacking attack against Bitcoin nodes. However, due to the nature of BGP operation, such a hijacking is globally observable and thus enables immediate detection of the attack and the identification of the perpetrator. Our Erebus attack partitions the Bitcoin network without any routing manipulations, making the attack undetectable to control-plane and even to data-plane detectors. We show that the Erebus attack is readily available to large ISPs against the vast majority of public Bitcoin nodes with a negligible attack traffic rate and a modest (e.g., 5–6 weeks) attack execution period. As the attack exploits the topological advantage of being a network adversary rather than specific vulnerabilities in Bitcoin Core, no quick patches seem to be available. I will discuss some suggested modifications to Bitcoin Core. |
AY2018/2019 Semester 2
30-Jan-2019 |
Title: Non-Supervised Learning for Understanding Complex Activities from Video Abstract: In this talk, I will showcase some recent works which use non-supervised learning approaches for complex activity understanding. The first is a fully unsupervised approach for segmentation. Given a collection of videos of the same complex activity, we apply an iterative approach which alternates between discriminatively learning the appearance of sub-activities (mapping the videos’ visual features to sub-activity labels) and generatively modelling the temporal structure of sub-activities using a Generalized Mallows Model. In the second part of the talk, I will highlight our recent submission on action prediction and present a hierarchical model that generalizes instructional knowledge from large-scale text corpora and transfers the knowledge to the visual domain. Given a portion of an instructional video, our model predicts coherent and plausible actions multiple steps into the future, all in rich natural language. |
13-Feb-2019 |
Title: Music and Mobile for Health and Learning |
20-Feb-2019 | Title: Price of Incomplete Information in Decision Making: Bridging Multi-armed Bandits and Information Theory Speaker: Debabrota Basu, PhD Student, Department of Computer Science Abstract: Multi-armed bandits are an archetypal setting of reinforcement learning where an agent plays an action at each step. Each of the actions yields a reward from a corresponding reward distribution unknown to the agent. While the agent tries to maximise the sum of accumulated rewards, the unavailability of information about the reward distributions forces the agent to explore the actions. Thus, the central problem in designing a multi-armed bandit algorithm is the trade-off between exploration, to gain information, and exploitation of present information, to maximise the accumulated reward. Different approaches to solving the multi-armed bandit problem and different tools for theoretical analysis have been proposed in the literature over many decades. Information theory has played a central role in this development, both in developing practical algorithms and in establishing algorithm-independent performance bounds. In this talk, I will present our work exploring the aspect of information geometry, which studies the geometry of the space of probability distributions, in order to accumulate the unavailable information and to leverage it. We propose a new algorithm called BelMan using this approach. I will introduce the bandit problem, discuss its interaction with information theory, and present our contributions at the crossroads of these two fields. |
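For readers new to the setting, a minimal epsilon-greedy baseline makes the exploration/exploitation trade-off concrete (BelMan itself takes an information-geometric approach; this sketch only illustrates the basic bandit loop):

```python
import random

def epsilon_greedy(arms, pulls=10_000, eps=0.1):
    """Minimal multi-armed bandit baseline.

    arms: list of zero-argument callables, each returning a random reward
          drawn from that arm's (unknown) reward distribution.
    """
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    total = 0.0
    for _ in range(pulls):
        if random.random() < eps:                 # explore: gain information
            a = random.randrange(len(arms))
        else:                                     # exploit: use current estimates
            a = max(range(len(arms)), key=lambda i: means[i])
        r = arms[a]()
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]    # incremental mean update
        total += r
    return total, means

# Two Bernoulli arms with success probabilities 0.4 and 0.6 (illustrative)
arms = [lambda: random.random() < 0.4, lambda: random.random() < 0.6]
total, means = epsilon_greedy(arms)
print(means)  # estimates converge near [0.4, 0.6]
```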
6-Mar-2019 | Title: Fairness, transparency and collaboration in data-driven environments Speaker: Assistant Professor Yair ZICK, Department of Computer Science Abstract: Recent years have seen data-driven algorithms deployed in increasingly high-stakes environments. These algorithms often employ a complex infrastructure, making them effectively “black boxes”; this potentially exposes various stakeholders (such as end-users, or the agencies deploying said algorithms) to risks, such as unfair treatment or inadvertent data breaches. In response, government agencies and professional societies have highlighted fairness and transparency as key design paradigms in AI/ML applications. In this talk I will discuss our recent work on the foundations of algorithmic transparency and fairness. From the transparency perspective, I will discuss how we design transparency measures that are guaranteed to satisfy certain natural desiderata; in addition, I will discuss a recent line of work showing how some natural transparency measures may be used by an adversary in order to extract private user information. Regarding fairness, I will discuss how we apply fairness paradigms to algorithms, in particular our work on designing and deploying fair allocation algorithms; our results show that humans respond well to provably fair algorithms, and are willing to collaborate effectively even in strategic domains. Finally, I will discuss how we apply learning-theoretic approaches to fairness via a novel paradigm for adapting game-theoretic solution concepts to data-driven domains. |
13-Mar-2019 | Title: Representation Learning with Graph Convolutional Network Speaker: Feng Fuli, Dean’s Graduate Award winner (AY2018/2019 Sem1) Abstract: Graph Convolutional Network (GCN) is an emerging technique that performs learning and reasoning on graph data. It performs feature learning on the graph structure by aggregating the features of connected nodes to obtain the embedding of each node. Owing to its strong representation power, recent research shows that GCN achieves state-of-the-art performance on several graph-based applications such as recommendation and linked document classification. This talk will introduce recent advances in GCN and two of our attempts to enhance GCN: Temporal-GCN and Cross-GCN. Temporal-GCN performs feature aggregation in a time-sensitive manner by dynamically calculating the strength of connections between nodes. Cross-GCN enhances the feature learning ability by considering arbitrary-order cross features (feature combinations). |
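The aggregation step described above has a standard compact form, H' = σ(D^{-1/2}(A+I)D^{-1/2} H W); a minimal one-layer sketch (a common GCN formulation, not necessarily the exact variant used in Temporal-GCN or Cross-GCN):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: aggregate neighbours, then transform.

    A: (n, n) adjacency matrix
    H: (n, d_in) node features
    W: (d_in, d_out) learnable weights
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)         # aggregate + ReLU

# Tiny 3-node graph (illustrative)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
W = np.random.randn(4, 2)
print(gcn_layer(A, H, W).shape)  # (3, 2)
```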
20-Mar-2019 | Title: Emotional AI in Visual Analytics Speaker: Associate Professor Stefan Winkler, Department of Computer Science Abstract: While most “AI” researchers focus mainly on the “IQ” aspect of intelligence, emotional intelligence or “EQ” is just as important for machines to be able to interact with humans effectively and naturally. In this seminar, I will discuss our work on visual analytics projects where we explore the emotional aspects of image understanding. The first is photowork: Ubiquitous and affordable digital cameras have led to an explosion of the amount of image material both amateurs and professionals have to work with. Assessing, selecting, editing, organizing, annotating, and browsing this large amount of visual data is tedious and time-consuming. Our aim in this project is to automate some of these processes. Our approaches are content-based and focus on family photo collections, where people and their relationships play a major role. The second is on profiling people, with a focus on their affective states. Facial expressions in particular are an essential component for conveying and understanding human emotions. Contrary to most existing approaches in computer vision, we avoid the classification of emotions into a few predefined categories, and instead follow a dimensional paradigm as represented by the circumplex model. Based on the tracking of facial landmark points and relevant geometrical features, we directly estimate arousal, valence, and intensity of emotion. We discuss the benefits of our method, and also present some of its applications. |
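In the circumplex model mentioned above, an emotional state is a point in the valence-arousal plane rather than one of a few categories. A tiny illustrative reading of that plane (value ranges and function names are assumptions, not the talk's pipeline):

```python
import math

def circumplex(valence, arousal):
    """Read an emotion off the circumplex model's valence-arousal plane.

    valence, arousal: floats in [-1, 1], e.g. regressed from facial
    landmark geometry. Intensity is distance from the neutral origin;
    the angle locates the emotion category on the circumplex.
    """
    intensity = math.hypot(valence, arousal)
    angle = math.degrees(math.atan2(arousal, valence)) % 360
    return intensity, angle

print(circumplex(0.7, 0.5))  # strong, pleasant-activated state (~35 degrees)
```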
3-Apr-2019 | Title: Compositional Static Race Detection at Scale, without False Positives Speaker: Associate Professor Sergey Ilya, Department of Computer Science Abstract: Automatic static detection of data races is one of the most basic problems in reasoning about concurrency. In my talk, I will present RacerD—a static program analysis for detecting data races in Java programs which is fast, can scale to large code, and has proven effective in an industrial software engineering scenario. RacerD is the first inter-procedural, compositional data race detector which has been empirically shown to have non-trivial precision and impact. Due to its compositionality, it can analyse code changes quickly, and this allows it to perform continuous reasoning about a large, rapidly changing codebase as part of deployment within a continuous integration ecosystem. RacerD has been in deployment for over a year at Facebook, where it has flagged over 2500 issues that have been fixed by developers before reaching production. In contrast to previous static race detectors, RacerD's design favours reporting high-confidence bugs over ensuring their absence. In the second part of my talk I will explain a True Positives Theorem stating that, under certain assumptions, an idealised theoretical version of RacerD never reports a false positive. I will also describe an empirical evaluation of an implementation of this analysis, versus the original RacerD, showing that the loss of precision in reporting races does not significantly affect the overall practical impact of the analysis. This result, thus, suggests that, in the future, theorems of this variety might be generally useful in understanding, justifying and designing effective static analyses for bug catching. This is joint work with Sam Blackshear, Nikos Gorogiannis, and Peter O’Hearn (Facebook), published in OOPSLA'18 and POPL'19 KEYWORDS: Concurrency, Static Analysis, Race Freedom, Scalability, Abstract Interpretation |
10-Apr-2019 | Title: HyCUBE: Programmable Accelerator for Next-Generation Wearables Speaker: Dissanayaka Mudiyanselage Emil Manupa Karunaratne, Dean’s Graduate Award winner (AY2018/2019 Sem1) Abstract: Internet of Things (IoT) — a giant, ever-growing network of billions (estimated to be 25 billion by 2020) of devices embedded within physical objects — is expected to revolutionize our future. The increasing demand from customers is continuously pushing the computation envelope towards in-situ processing of data in wearable/IoT devices operating at ultra-low power. General-purpose processors are incapable of providing such performance-per-watt capability. Thus, computations are generally offloaded to a smartphone or the cloud. In this talk, I will present our research woven around a novel programmable accelerator called HyCUBE, designed to meet the power-performance demands of modern wearable/IoT applications on the device itself. Essentially, HyCUBE is a simple array of processing elements (PEs), connected by a programmable interconnect that offers single-cycle multi-hop connections between PEs. The simple yet efficient design of the architecture provides ample room for various levels of parallelism at very low power, which is exploited by an intelligent compiler to achieve superior performance. Moreover, we have fully fabricated the chip in 40nm technology, with a full compiler chain that is able to map applications written in high-level programming languages such as C. |
AY2018/2019 Semester 1
29-Aug-2018 |
Title: Overview of research in mobile sensing and wireless sensor network protocols |
5-Sep-2018 |
Title: Internet-of-Things Security: Benefits and Risks of Sensing Speaker: Assistant Professor Han Jun, Department of Computer Science Abstract: With the emergence of the Internet-of-Things (IoT) and Cyber-Physical Systems (CPS), we are witnessing a wealth of exciting applications that enable computational devices to interact with the physical world via an overwhelming number of sensors and actuators. However, such interactions pose new challenges to traditional approaches to security and privacy. In this talk, I will present how I utilize sensor data to provide security and privacy protections for IoT/CPS scenarios, and further introduce novel security threats arising from similar sensor data. Specifically, I will highlight a few of my recent projects that leverage sensor data for defense and attack scenarios in applications such as smart homes and semi-autonomous vehicles. I will also briefly introduce interesting research problems that I am working on in newer application domains such as smart vehicles, buildings, and cities. |
12-Sep-2018 |
Title: Overview of research in next generation low latency TCP and software defined networking Speaker: Associate Professor Ben Leong, Department of Computer Science Abstract: In this talk, I will describe recent research work done by my research group on next generation low latency TCP (Transmission Control Protocol) and software defined networking using the new P4 language (https://p4.org/). We have seen in recent times the emergence of a large number of low-latency TCP variants. Surprisingly, these modern low-latency TCP variants can match the performance of TCP CUBIC and even outperform CUBIC for large RTT flows. We found that the likely reason is that the bottleneck buffers are relatively shallow, and so these variants are likely throttling CUBIC by inflicting significant losses on the network. Our new rate-based congestion control algorithm incorporates a buffer estimation technique that allows a flow to infer its own buffer occupancy as well as that of the competing flows sharing the same bottleneck buffer. With this mechanism, the flow is able to determine its operating environment and, when in a low-latency environment, to collaboratively regulate the bottleneck buffer occupancy with other flows. We believe that the current Internet is facing a transition into another phase with new low latency TCP variants, but the transition will not be easy. Our approach will allow the Internet to transition smoothly to a low-latency future. For P4-based work, we recently developed a new system written in the P4 programming language, called BurstRadar, that monitors microbursts in the dataplane. BurstRadar incurs 10 times less data collection and processing overhead than existing solutions. Furthermore, BurstRadar can handle simultaneous microburst traffic at gigabit line rates on multiple egress ports while consuming very few resources in the switching ASIC. |
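The buffer-occupancy inference described above rests on a simple queueing identity: bytes queued at the bottleneck ≈ (current RTT − minimum RTT) × bottleneck bandwidth. A rough sketch under that assumption (the names and the proportional-share split are mine, not the group's implementation):

```python
def buffer_occupancy(rtt_samples, bottleneck_bw, own_rate):
    """Rough bottleneck-buffer estimate from a flow's own measurements.

    rtt_samples:   recent RTT measurements in seconds
    bottleneck_bw: estimated bottleneck bandwidth in bytes/second
    own_rate:      this flow's current sending rate in bytes/second
    """
    rtt_min = min(rtt_samples)                      # propagation-delay proxy
    queuing_delay = max(rtt_samples[-1] - rtt_min, 0.0)
    total_queued = queuing_delay * bottleneck_bw    # all bytes in the buffer
    own_queued = queuing_delay * own_rate           # this flow's rough share
    return total_queued, total_queued - own_queued  # (everyone, competitors)
```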
19-Sep-2018 |
Title: Enabling New Applications through Efficient, High-Performance Acceleration Speaker: Assistant Professor Trevor E. Carlson, Department of Computer Science Abstract: The development of faster computing devices each year, like what we have seen with mobile phones, has been what consumers have come to expect from the rapid pace of technology development. But, given two significant trends in technology scaling, this progress might hit a brick wall: transistors are becoming more expensive going forward, while fewer transistors can be active at a time to get more work done. Does this spell out the end of computing as we know it? Will computers stop getting faster? As silicon technology improvements have slowed, research into alternative technologies has increased. Nevertheless, these alternative technologies could still take decades to reach the performance and cost that current CMOS technology provides. One near-term solution is to adapt the computer’s architecture to more efficiently use the transistors that we have. By working smarter, our aim is to continue to provide more functionality in the face of these technological headwinds. To enable new applications, from mobile-based AR and VR to new machine learning approaches, we need to pursue innovative architectural directions. To accomplish these goals, our research focuses on building flexible processors that can enable these next-generation applications. In this talk, I will present some of our recent work as well as future research directions that propose one way to move us closer to that goal. In addition, I will also present some critical challenges and potential next steps that we will need to address in the coming years. |
3-Oct-2018 |
Title: Towards Boosting Performance of Healthcare Analytics: Resolving Challenges in Electronic Medical Records
Speaker: Ms Zheng Kaiping, Dean’s Graduate Award winner (AY2017/2018 Sem2) Abstract: In recent years, the increasing availability of Electronic Medical Records (EMR) has brought more promising opportunities to automate healthcare data analytics. However, some challenges in EMR data can degrade healthcare analytics performance if not well handled, and lead to a gap between the potential of EMR data for analytics and its usability in practice. Therefore, it is vitally important to resolve these challenges in order to boost performance and help derive more medical insights, contributing to better patient management and faster medical research advancement. In this talk, I will focus on two representative challenges in EMR data, namely irregularity and bias, and then present our solutions to them. First, I will justify that the irregularity challenge should be resolved at the feature level to reduce the loss of time information. Then I will demonstrate our proposal to incorporate fine-grained feature-level time span information and show the resulting improvement in analytics performance. Second, I will explain that irregularity is a phenomenon, while bias is the underlying reason. I will present our solution to transform biased EMR time series into unbiased data and illustrate the improvement in missing-data imputation accuracy and in the prediction accuracy of data analytics applications.
Title: Storing videos efficiently and securely… in DNA! Speaker: Assistant Professor Djordje Jevdjic, Department of Computer Science Abstract: The digital universe is exploding rapidly and we are running out of storage space to save it in an economical way. The vast majority of this digital content is multimedia, most notably videos. In the first part of this talk, I will introduce the concept of approximate storage as a new way to store videos efficiently and securely on very dense but unreliable emerging storage media. In the second part of the talk I will introduce a new, extremely dense, durable, Nature’s best storage medium — DNA! I will quickly cover the process of reading and writing to and from DNA. In the last part of the talk I will propose a few exciting projects related to the efficient and secure storage of digital videos and images in DNA. |
10-Oct-2018 |
Title: Beyond SAT Revolution
Speaker: Assistant Professor Kuldeep Singh Meel, Department of Computer Science Abstract: The paradigmatic NP-complete problem of Boolean satisfiability (SAT) solving is a central problem in Computer Science. While the mention of SAT can be traced to the early 19th century, efforts to develop practically successful SAT solvers go back to the 1950s. The past 20 years have witnessed a "SAT revolution" with the development of conflict-driven clause-learning (CDCL) solvers. Such solvers combine a classical backtracking search with a rich set of effective heuristics. While 20 years ago SAT solvers were able to solve instances with at most a few hundred variables, modern SAT solvers solve instances with up to millions of variables in reasonable time. The "SAT revolution" opens up opportunities to design practical algorithms with rigorous guarantees for problems in complexity classes beyond NP by replacing an NP oracle with a SAT solver. In this talk, we will discuss how we use the SAT revolution to design practical algorithms for two fundamental problems in artificial intelligence and formal methods: Constrained Sampling and Counting. |
17-Oct-2018 |
Title: Exploiting Knowledge Graph for Personalized Recommendation
Speaker: Mr Wang Xiang, Dean’s Graduate Award winner (AY2017/2018 Sem2) Abstract: In the era of information overload, recommender systems have gained widespread adoption across industry to drive various online customer-oriented services. They help users discover a small set of relevant items, which meet their personalized interests, from overwhelming choices. Generally, the modeling of user-item interactions is at the heart of personalized recommendation. Nowadays, diverse kinds of auxiliary information on users and items are increasingly available on online platforms, such as user demographics, social relations, and item knowledge. To date, incorporating knowledge-aware channels, especially knowledge graphs, into recommender systems is attracting increasing interest, since they can provide deep factual knowledge and rich semantics on items. The use of such knowledge can better capture the underlying and complex user-item relationships, and further achieve higher recommendation quality. Furthermore, knowledge graphs enable us to uncover valuable evidence as well as reasons why a recommendation is made.
Title: Securing Applications from Untrusted Operating Systems using Enclaves
Speaker: Ms Shweta Shinde, Dean's Graduate Award winner (AY2017/2018 Sem2) Abstract: For decades, we have been building software with the default assumption of a trusted underlying stack such as the operating system. From a security standpoint, the status quo has been a hierarchical trust model, where trusting one layer implies trusting all the layers underneath it. However, with new usage models such as outsourced computing and analytics on third-party cloud services, trusting the operating system is no longer an option. As a result, modern CPUs have started supporting new abstractions which address the threats of an untrusted operating system. Intel SGX is one such new security capability, available in commodity CPUs shipping since 2015. It allows user-level application code to execute in enclaves which are isolated from all other software on the system, even from the privileged OS or hypervisor. However, these architectural solutions offer a trade-off between security, ease of use, and compatibility with legacy software (both OS and applications). In this talk, I will present a low-TCB, POSIX-compatible, side-channel-resistant, and formally verified solution which allows users to securely execute their applications on an untrusted operating system. |
24-Oct-2018 |
Title: Adversarial Machine Learning Speaker: Assistant Professor Reza Shokri, Department of Computer Science Abstract: Machine learning models are used in many critical systems and applications. This makes them very attractive targets for a number of security and privacy attacks, including data poisoning, evasion attacks, and inference attacks. In this talk, I will present all these attacks, and a systematic way for mitigating their risks. The solution is simple: know your enemy and anticipate their attacks. This is known as adversarial machine learning. |
31-Oct-2018 |
Title: 3 Projects on Computer System Performance
Speaker: Professor Tay Yong Chiang, Department of Computer Science Abstract: This talk describes 3 current projects on the performance of computer systems: (1.Database) For 20-odd years, developers and researchers have used the TPC benchmarks to compare their products and algorithms. These benchmarks have fixed schemas that bear no relation to current applications. The target of the database project is to replace TPC benchmarks with synthetic versions of application datasets. The idea is to first scale the empirical dataset to the appropriate size, then tweak the data in the resulting dataset to enforce application-specific properties. The ambition is to have a repository of tweaking tools contributed by the developer community, and current work is on building a collaborative framework to facilitate tool interoperability.
(2.Memory) Most of the current hot topics in computer science will become cold within 10 years, but caching will remain an issue 50 years from now. Most caching algorithms try to strike a heuristic balance between recency (e.g. LRU) and frequency (i.e. popularity); a toy version of this balance is sketched after this entry. The target of the memory project is to use a Cache Miss Equation to do a scientific study of this balance.
(3.Networking) Over the last 2 years, Google has moved its production traffic to a TCP variant called BBR. This may start a paradigm shift for TCP congestion control, from one based on packet loss to one based on bandwidth-delay product. BBR requires estimates for the minimum round-trip time R and maximum bandwidth X. BBR measures R and X by periodically changing its packet sending rate. The target of the networking project is to show that the estimation can be done differently and passively. The underlying idea works for any TCP version (CUBIC, Reno, etc.), and even for choosing between hardware/software architectures for video games. |
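A toy illustration of the recency/frequency balance studied in the memory project above (the scoring rule is an illustrative assumption, not the Cache Miss Equation):

```python
class RecencyFrequencyCache:
    """Toy cache scoring items by a weighted mix of frequency and recency.
    alpha = 1 behaves like LFU (pure frequency); alpha = 0 like LRU
    (pure recency). Real policies tune this balance heuristically."""

    def __init__(self, capacity, alpha=0.5):
        self.capacity, self.alpha, self.clock = capacity, alpha, 0
        self.items = {}  # key -> (last_access_time, access_count)

    def access(self, key):
        self.clock += 1
        _, count = self.items.get(key, (0, 0))
        self.items[key] = (self.clock, count + 1)
        if len(self.items) > self.capacity:
            def score(k):
                last, cnt = self.items[k]
                recency = last / self.clock          # newer -> closer to 1
                return self.alpha * cnt + (1 - self.alpha) * recency
            # Evict the item with the worst combined score
            del self.items[min(self.items, key=score)]
```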
AY2017/2018 Semester 2
31-Jan-2018 |
Title: 3 Projects on Computer System Performance Speaker: Tay Yong Chiang, Professor, Department of Computer Science Abstract: (1.Database) For 20-odd years, developers and researchers have used the TPC benchmarks to compare their products and algorithms. These benchmarks have fixed schemas that bear no relation to current applications. The target of the database project is to replace TPC benchmarks with synthetic versions of application datasets. The idea is to first scale the empirical dataset to the appropriate size, then tweak the data in the resulting dataset to enforce application-specific properties. The ambition is to have a repository of tweaking tools contributed by the developer community, and current work is on building a collaborative framework to facilitate tool interoperability. (2.Memory) Most of the current hot topics in computer science will become cold within 10 years, but caching will remain an issue 50 years from now. Most caching algorithms try to strike a heuristic balance between recency (e.g. LRU) and frequency (i.e. popularity). The target of the memory project is to use a Cache Miss Equation to do a scientific study of this balance. (3.Networking) Over the last 2 years, Google has moved its production traffic to a TCP variant called BBR. This may start a paradigm shift for TCP congestion control, from one based on packet loss to one based on bandwidth-delay product. BBR requires estimates for the minimum round-trip time R and maximum bandwidth X. BBR measures R and X by periodically changing its packet sending rate. The target of the networking project is to show that the estimation can be done differently and passively. The underlying idea works for any TCP version (CUBIC, Reno, etc.), and even for choosing between hardware/software architectures for video games. |
7-Feb-2018 | Title: Privacy and Security in (Outsourced) Machine Learning Speaker: Reza Shokri, Assistant Professor, Department of Computer Science Abstract: I will talk about the security and privacy threats against machine learning, notably when its training is outsourced. I will discuss how and why machine learning models leak information about the individual data records on which they were trained, and how an attacker can train a deep neural network in such a way that it leaks even more information. I will also talk about security issues with respect to outsourced machine learning, and how we can evaluate such attacks. |
14-Feb-2018 | Title: Constrained Counting and Sampling: Bridging the Gap between Theory and Practice Speaker: Kuldeep Singh Meel, Assistant Professor, Department of Computer Science Abstract: Constrained counting and sampling are two fundamental problems in Computer Science with numerous applications, including network reliability, privacy, probabilistic reasoning, and constrained-random verification. In constrained counting, the task is to compute the total weight, subject to a given weighting function, of the set of solutions of the given constraints. In constrained sampling, the task is to sample randomly, subject to a given weighting function, from the set of solutions to a set of given constraints. In this talk, I will introduce a novel algorithmic framework for constrained sampling and counting that combines the classical algorithmic technique of universal hashing with the dramatic progress made in Boolean reasoning over the past two decades. This has allowed us to obtain breakthrough results in constrained sampling and counting, providing a new algorithmic toolbox in machine learning, probabilistic reasoning, privacy, and design verification. I will demonstrate the utility of the above techniques on various real applications including probabilistic inference, design verification and our ongoing collaboration in estimating the reliability of critical infrastructure networks during natural disasters. |
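The universal-hashing idea in this abstract can be shown in miniature: each random XOR (parity) constraint halves the solution space in expectation, so counting the solutions in one hashed cell and scaling up estimates the total. A toy sketch over an explicit solution set (a real counter would query a SAT solver for the models in each cell):

```python
import random

def approx_count(solutions, n_vars, m, trials=31):
    """Estimate |solutions| with m random XOR (parity) constraints.

    solutions: set of ints in [0, 2**n_vars) standing in for a SAT oracle.
    Each constraint halves the space in expectation, so a cell holds
    about |solutions| / 2**m models; scale the cell count back up.
    """
    def parity(x):
        return bin(x).count("1") % 2

    estimates = []
    for _ in range(trials):
        # Each constraint: parity of a random subset of bits equals a random bit
        constraints = [(random.getrandbits(n_vars), random.getrandbits(1))
                       for _ in range(m)]
        in_cell = sum(all(parity(mask & s) == bit for mask, bit in constraints)
                      for s in solutions)
        estimates.append(in_cell << m)
    return sorted(estimates)[trials // 2]   # median of independent trials

random.seed(0)
S = set(random.sample(range(2 ** 16), 5000))
print(approx_count(S, 16, m=6))  # close to 5000
```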
21-Feb-2018 | Title: Preparing for a Low-Latency Future Internet Speaker: Ben Leong, Associate Professor, Department of Computer Science Abstract: Google has deployed BBR, a new low-latency TCP variant. We show that to transition smoothly to a low-latency Internet of the future, we need a TCP variant that not only can contend effectively against CUBIC in the current Internet, but that is also able to reduce its level of aggressiveness in a low-latency environment. We present EvaRate, a rate-based congestion control algorithm that incorporates a new buffer estimation technique which allows an EvaRate flow to infer its own buffer occupancy as well as that of the competing flows sharing the same bottleneck buffer. With this mechanism, an EvaRate flow is able to determine its operating environment and, when in a low-latency (or benevolent) environment, collaboratively regulate the bottleneck buffer occupancy with other EvaRate flows. EvaRate highlights a new point in the congestion control design space that deserves further attention. |
7-Mar-2018 | Title: Super Speaking -- Tricks of the Trade Speaker: Terence Sim, Associate Professor, Department of Computer Science Abstract: Most of us in academia are engaged in this typical sequence of activities: (a) do research; (b) write a report/paper about it; (c) give an oral presentation. While many of us are good at research skills (a), and can write reasonably well (b), we are less confident in speaking about it (c). Indeed, presenting our work in front of an audience often causes knees to wobble and stomachs to cramp. It gets worse when we realize, halfway through the talk, that the audience is getting restless or bored because they are not understanding our message. In this talk, I will share some techniques that will improve the intelligibility of our technical presentations. I learned many of these "tricks of the trade" in school -- the School of Hard Knocks. Others I picked up by observing the habits of good speakers; still others from the wise counsel of my seniors. While I cannot guarantee to take away the nervousness when you give a talk, I can certainly offer practical tips that will hopefully improve the clarity of your communication. At the very least, you can get a kick out of seeing whether I practice what I preach. |
14-Mar-2018 | Title: Information Theory and Machine Learning Speaker: Jonathan Scarlett, Assistant Professor, Department of Computer Science Abstract: The field of information theory was introduced as a means for understanding the fundamental limits of data compression and transmission, and has shaped the design of practical communication systems for decades. In this talk, I will discuss the emerging viewpoint that information theory is not only a theory of communication, but a far-reaching theory of data that is applicable to seemingly unrelated learning problems such as estimation, prediction, and optimization. This perspective leads to principled approaches for certifying the near-optimality of practical algorithms, as well as understanding where further improvements are possible. I will provide a gentle introduction to some of the main ideas and insights offered by this perspective, and present examples in the problems of group testing, graphical model selection, sparse regression, and black-box function optimization. |
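One representative tool behind such algorithm-independent limits (my illustration; the talk may rely on different machinery) is Fano's inequality: if a target V is drawn uniformly from a finite class 𝒱 and an estimator observes data Y, then

```latex
P_{\mathrm{err}} \;\ge\; 1 - \frac{I(V;Y) + \log 2}{\log \lvert \mathcal{V} \rvert},
```

so upper-bounding the mutual information I(V;Y) yields sample-complexity lower bounds for problems such as group testing and sparse regression.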
21-Mar-2018 |
Title: Correcting Language Errors using Machine Translation Techniques
Title: Linguistic Properties Matter for Implicit Discourse Relation Recognition: Combining Semantic Interaction, Topic Continuity and Attribution |
28-Mar-2018 | Title: (Gap/S)-ETH Hardness of SVP Speaker: Divesh Aggarwal, Assistant Professor, Department of Computer Science Abstract: There has been a lot of research in the last two decades on constructing cryptosystems whose security relies on the hardness of the shortest vector problem (SVP) on integer lattices. The SVP is well known to be NP-hard. However, such hardness proofs tell us very little about the quantitative or fine-grained complexity of SVP. E.g., does the fastest possible algorithm for SVP still run in time at least, say, 2^{n/5}, or is there an algorithm that runs in time 2^{n/100} or even 2^{\sqrt{n}}? The above hardness results cannot distinguish between these cases, but we certainly need to be confident in our answers to such questions if we plan to base the security of widespread cryptosystems on these answers. In this talk, I will give a partial answer to this question by showing the following quantitative hardness results for the Shortest Vector Problem in the \ell_p norm (SVP_p), where n is the rank of the input lattice. 1) For "almost all" p > 2.14, there is no 2^{n/C_p}-time algorithm for SVP_p for some explicit constant C_p > 0 unless the (randomized) Strong Exponential Time Hypothesis (SETH) is false. 2) For any p > 2, there is no 2^{o(n)}-time algorithm for SVP_p unless the (randomized) Gap-Exponential Time Hypothesis (Gap-ETH) is false. 3) There is no 2^{o(n)}-time algorithm for SVP_2 unless either (1) (non-uniform) Gap-ETH is false; or (2) there is no family of lattices with exponential kissing number in the \ell_2 norm. This is joint work with Noah Stephens-Davidowitz. |
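For reference, the problem under discussion: given a basis B of a rank-n lattice, SVP_p asks for a shortest nonzero lattice vector in the \ell_p norm:

```latex
\mathrm{SVP}_p:\quad \text{find } v \in \mathcal{L}(B) \setminus \{0\} \text{ minimizing } \lVert v \rVert_p, \qquad \mathcal{L}(B) = \{\, Bz : z \in \mathbb{Z}^n \,\}.
```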
4-Apr-2018 |
Title: Your Toolbox for Privacy in the Cloud
Title: Quantum Communication Using Coherent Rejection Sampling Based on joint work with Vamsi Krishna Devabathini and Rahul Jain. https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.119.120506 |
11-Apr-2018 | Title: Mining Clinical Data Speaker: Vaibhav Rajan, Assistant Professor, Department of Information Systems and Analytics Abstract: Clinical data analysis poses several modeling challenges that arise due to data heterogeneity, temporality, sparsity, bias and noise. I will outline these challenges in the context of identifying patients at risk of developing complications in hospitals, and present two projects. Nursing notes contain regular and valuable assessments of patients' condition but often have inconsistent abbreviations and lack the grammatical structure of formal documents, thereby making automated analysis difficult. We design a new approach that effectively utilizes the structure of the notes, is robust to inconsistencies in the text and surpasses the accuracy of previous methods. Healthcare data often contains heterogeneous datatypes that exhibit complex feature dependencies. Our algorithm for dependency clustering uses copulas to effectively model a wide range of dependencies and can fit mixed -- continuous and ordinal -- data. It scales linearly with size and quadratically with dimensions of input data, which is significantly faster than state-of-the-art correlation clustering methods for mixed data. I'll conclude with a summary of my current research. |
AY2017/2018 Semester 1
30-Aug-2017 |
Title: Analysis of Source Code and Binaries for Vulnerability Detection and Patching |
6-Sep-2017 | Title: Continuing Moore’s Law: Challenges and Opportunities in Computer Architecture Speaker: Trevor Erik Carlson, Assistant Professor, Department of Computer Science Abstract: Ever faster, cheaper mobile phones (as well as other computing devices) have been what consumers have come to expect from technology for many years. But, given two recent trends in technology scaling (today’s chips are limited by power and costs because scaling has slowed significantly), it is widely expected that we will no longer receive significant help from scaling to help us build these faster devices. Does this spell out the end of computing as we know it? Will computers stop getting faster? As silicon technology improvements have slowed, research into alternative technologies has increased. Nevertheless, these technologies could still take decades to reach the performance and cost that current CMOS provides. One solution to the problem of slowing technology scaling is to adapt the computer’s architecture to more efficiently use the transistors that we have. This is the main focus of our research. To enable a variety of new applications (AR, VR, machine learning, etc.) while still providing longer battery life and higher performance, we need to pursue innovative architectural directions. To do this, our research focuses on building general-purpose (programmable) processors and accelerators that are now a necessity to enable these new applications. In this talk, I will present some recent developments in computer architecture to move us closer to that goal, and present some critical challenges (and potential solutions) that we will need to address in the coming years. |
13-Sep-2017 | Title: Learning From Multiple Social Networks for Research And Business: A PhD Journey Speaker: Aleksandr Farseev, Dean’s Graduate Award winner (AY2016/2017 Sem2) Abstract: The Web has changed drastically over the past decade, which has seen exponential growth in social networking services. The reason for this growth is that social media users concurrently produce and consume data. In this context, millions of users, who follow different lifestyles and belong to different demographic groups, regularly contribute multi-modal data to various online social networks, such as Twitter, Facebook, Foursquare, Instagram, and Endomondo. Traditionally, social media users are encouraged to complete their profiles by explicitly providing personal attributes such as age, gender, and interests (the individual user profile). Additionally, users are likely to join interest-based groups devoted to various topics (the group user profile). Such information is essential for many applications but, unfortunately, is often not publicly available. This gives rise to automatic user profiling, which aims at automatically inferring users' hidden information from observable information such as an individual's behavior or utterances. The talk focuses on user profiling across multiple social networks in different application domains. |
20-Sep-2017 |
Title: Adapting User Technologies: Bridging Designers, Machine Learning and Psychology through Collaborative, Dynamic, Personalized Experimentation Speaker: Joseph Jay Williams, Assistant Professor, Department of Information Systems & Analytics Abstract: I present an example of how this framework is used to create “MOOClets” that embed randomized experiments into real-world online educational contexts – like learning to solve math problems. Explanations (and experimental conditions) are crowdsourced from learners, teachers and scientists. Dynamically changing randomized experiments compare the learning benefits of these explanations in vivo with users, continually adding new conditions as new explanations are contributed. Algorithms (for multi-armed bandits, reinforcement learning, Bayesian Optimization) are used for real-time analysis (of the effect of explanations on users’ learning) and for optimizing policies that provide the explanations that are best for different learners. The framework enables a broad range of algorithms to discover how to optimize and personalize users’ behavior, and to dynamically adapt technology components to trade off experimentation (exploration) with helping users (exploitation). Bio: Joseph Jay Williams is an Assistant Professor at the National University of Singapore's School of Computing, Department of Information Systems & Analytics. He was previously a Research Fellow at Harvard's Office of the Vice Provost for Advances in Learning, and a member of the Intelligent Interactive Systems Group in Computer Science. He completed a postdoc at Stanford University in the Graduate School of Education in Summer 2014, working with the Office of the Vice Provost for Online Learning and the Open Learning Initiative. He received his PhD from UC Berkeley in Computational Cognitive Science, where he applied Bayesian statistics and machine learning to model how people learn and reason. He received his B.Sc. from University of Toronto in Cognitive Science, Artificial Intelligence and Mathematics, and is originally from Trinidad and Tobago. More information about his research and papers is at www.josephjaywilliams.com. |
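As a minimal sketch of the bandit machinery the abstract mentions (illustrative only, not the MOOClet framework's actual implementation; the benefit probabilities below are invented), Beta-Bernoulli Thompson sampling trades off exploring new explanations against exploiting the best-known one:

```python
# Minimal sketch of the bandit idea behind adaptive experimentation:
# Beta-Bernoulli Thompson sampling over candidate explanations.
# (Illustrative only -- not the MOOClet framework's implementation.)
import numpy as np

rng = np.random.default_rng(42)
true_benefit = [0.30, 0.55, 0.45]        # unknown P(student learns | explanation)
successes = np.ones(3)                   # Beta(1,1) priors per explanation
failures = np.ones(3)

for _ in range(2000):
    theta = rng.beta(successes, failures)      # sample a plausible benefit per arm
    arm = int(np.argmax(theta))                # posterior sampling explores implicitly
    reward = rng.random() < true_benefit[arm]  # simulated learning outcome
    successes[arm] += reward
    failures[arm] += 1 - reward

print("posterior means:", successes / (successes + failures))
# The policy concentrates on explanation 1 while still occasionally trying others.
```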
4-Oct-2017 | Title: Improving Medication Compliance: How CS Can Help Speaker: Ooi Wei Tsang, Associate Professor, Department of Computer Science Abstract: Medical compliance refers to the degree to which a patient accurately follows medical advice given by healthcare professionals, including whether they take medication as prescribed, at the right dosage, and at the right time. Complying is challenging for child and young adult patients who need long-term medication, due to their lifestyle and the need to balance their studies, social activities, and possibly work. This talk aims to (i) highlight the importance of the problem and the challenges that the patients face, (ii) review some existing work in the computing literature that addresses this problem, and (iii) identify some open research challenges towards improving medical compliance that involve computer networking, sensors, multimedia-multimodal data, AI, and HCI research. |
11-Oct-2017 |
Title: Introduction to blockchain and cryptocurrency research
Title: Bounds on Distributed Information Spreading in Networks with Latencies |
25-Oct-2017 |
Title: Making Software Secure: Hardening & Analysis
Title: Interpretable Machine Learning for User Friendly, Healthy Interventions |
1-Nov-2017 | Title: Data Privacy in Machine Learning Speaker: Reza Shokri, Assistant Professor, Department of Computer Science Abstract: I will talk about what machine learning privacy is, and will discuss how and why machine learning models leak information about the individual data records on which they were trained. My quantitative analysis will be based on the fundamental membership inference attack: given a data record and (black-box) access to a model, determine whether the record was in the model's training set. I will demonstrate how to build such inference attacks on different classification models, e.g., those trained by commercial "machine learning as a service" providers such as Google and Amazon. Website: http://www.shokri.org |
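A minimal sketch of the membership inference idea, in the shadow-model style (our construction with illustrative models and parameters, not the exact attack from the talk): train an attack classifier on a shadow model's confidence scores, then apply it to the black-box target.

```python
# Minimal sketch of a membership inference attack via a shadow model
# (illustrative; datasets, models and parameters are our choices, not the talk's).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
# Split into the target's member/non-member sets and a disjoint shadow split.
X_t_in, X_rest, y_t_in, y_rest = train_test_split(X, y, train_size=1000, random_state=1)
X_t_out, X_s, y_t_out, y_s = train_test_split(X_rest, y_rest, train_size=1000, random_state=2)
X_s_in, X_s_out, y_s_in, y_s_out = train_test_split(X_s, y_s, train_size=1000, random_state=3)

target = RandomForestClassifier(random_state=0).fit(X_t_in, y_t_in)
shadow = RandomForestClassifier(random_state=0).fit(X_s_in, y_s_in)

def conf(model, X):
    """Top predicted-class confidence, the attack's only feature here."""
    return model.predict_proba(X).max(axis=1, keepdims=True)

# Learn to tell members from non-members on the shadow model ...
A = np.vstack([conf(shadow, X_s_in), conf(shadow, X_s_out)])
b = np.r_[np.ones(len(X_s_in)), np.zeros(len(X_s_out))]
attack = LogisticRegression().fit(A, b)

# ... then apply the attack to the (black-box) target model.
T = np.vstack([conf(target, X_t_in), conf(target, X_t_out)])
t = np.r_[np.ones(len(X_t_in)), np.zeros(len(X_t_out))]
print("attack accuracy:", attack.score(T, t))  # above 0.5 => membership leakage
```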
8-Nov-2017 | Title: Analyzing Filamentary Structured Objects in Biomedical Images: Segmentation, Tracing, and Synthesis Speaker: Cheng Li, Adjunct Assistant Professor, Department of Computer Science Abstract: Filamentary structured objects are abundant in biomedical images, such as neuronal images, retinal fundus images, and angiography, to name a few. In this talk, we will discuss our recent research efforts in addressing the tasks of segmentation, tracing, and synthesis for such images. More details can be found on our project website: https://web.bii.a-star.edu.sg/archive/machine_learning/Projects/filaStructObjs/project.htm |
AY2016/2017 Semester 2
25-Jan-2017 |
Title: Transparency & Discrimination in Big Data Systems Abstract: So, did a big data algorithm base its decisions on "protected" user features? The problem is that in many cases it is very hard to tell: big data algorithms are often extremely complex, so we cannot be sure whether an algorithm used a protected feature (say, gender), or based its prediction on a correlated input. Our research aims at developing formal methods that offer some transparency into the way that algorithms use their inputs. Using tools from game theory, formal causality analysis and statistics, we offer influence measures that can indicate how important a feature was in making a decision about an individual or a protected group. In this talk, I will review some of the latest developments on algorithmic transparency, and its potential impact on interpretable ML. |
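A minimal sketch of a randomization-based influence measure (our illustration of the general idea, not the talk's specific game-theoretic measure; all names below are ours): randomize one feature at a time and see how often the model's decisions flip.

```python
# Minimal sketch of a randomization-based influence measure (illustrative only;
# the setup and numbers are ours, not the talk's).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
gender = rng.integers(0, 2, n)                 # protected feature
income = rng.normal(size=n) + 0.8 * gender     # correlated proxy feature
X = np.column_stack([gender, income])
y = (income + rng.normal(scale=0.5, size=n) > 0.4).astype(int)
model = LogisticRegression().fit(X, y)

def influence(model, X, j, rng, n_rounds=20):
    """Average fraction of decisions that flip when feature j is randomized."""
    base = model.predict(X)
    flips = []
    for _ in range(n_rounds):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j's association
        flips.append(np.mean(model.predict(Xp) != base))
    return float(np.mean(flips))

print("influence of gender:", influence(model, X, 0, rng))
print("influence of income:", influence(model, X, 1, rng))
# Even if gender's direct influence is small, the correlated proxy can carry it.
```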
1-Feb-2017 |
Title: The Emerging Security and Privacy Issues in the Tangled Web Speaker: Jia Yaoqi, Dean’s Graduate Award winner (AY2016/2017 Sem1) Abstract: The World Wide Web has gradually become an essential part of our daily life in the digital age. With the advent of cloud services and peer-to-peer techniques, new security and privacy issues are emerging in the tangled web. In this talk, I first illustrate how cloud services affect the web/local boundary provided by browsers, and then briefly present the privacy leakage in P2P web overlays as well as solutions using onion routing and oblivious RAM. First, browsers such as Chrome adopt a process-based isolation design to protect "the local system" from "the web". However, as billions of users now use web-based cloud services (e.g., Dropbox and Google Drive), which are integrated into the local system, the premise that browsers can effectively isolate the web from the local system has become questionable. We argue that if process-based isolation disregards the same-origin policy as one of its goals, then its promise of maintaining the web/local separation is doubtful. Specifically, we show that existing memory vulnerabilities in Chrome's renderer can be used as a stepping-stone to drop executables/scripts in the local file system, install unwanted applications and misuse system sensors. These attacks are purely data-oriented and do not alter any control flow or import foreign code. Thus, such attacks bypass binary-level protection mechanisms, including ASLR and in-memory partitioning. Finally, we discuss various defenses and present a possible way to mitigate the attacks presented. Second, the web infrastructure used to be a client-server model, in which clients (or browsers) request and fetch web contents such as HTML, JavaScript and CSS from web servers. Recently, peer-to-peer (P2P) techniques (supported by real-time communications, or RTC) have been introduced into the web infrastructure, enabling browsers to communicate directly with each other and form a P2P web overlay. This also brings open and unsolved problems, such as the privacy issues of P2P systems, to the new web overlays. We investigate the security and privacy issues in web overlays, and propose solutions that address them using cryptographic and hardware primitives such as onion routing and oblivious RAM. First, we present inference attacks on peer-assisted CDNs built on top of web overlays, which can infer a user's online activities such as browsing history. To thwart such attacks, we propose an anonymous peer-assisted CDN (APAC), which employs onion-routing techniques to conceal users' identities and uses a region-based circuit selection algorithm to reduce performance overhead. Second, to hide the online activities (or access patterns) of users against long-term global analysis, we design an oblivious peer-to-peer content sharing system (OblivP2P), which uses new primitives such as distributed ORAM in the P2P setting. |
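As a small illustration of the onion-routing primitive mentioned above (a conceptual sketch, not the APAC system's implementation), each relay key peels exactly one encryption layer:

```python
# Minimal sketch of onion (layered) encryption as used conceptually in
# onion routing -- not the APAC system's implementation.
from cryptography.fernet import Fernet

# Each relay on the circuit has its own key (entry -> middle -> exit).
relay_keys = [Fernet.generate_key() for _ in range(3)]

def wrap(message: bytes, keys) -> bytes:
    """Encrypt for the exit relay first, then wrap layers outward to the entry."""
    for key in reversed(keys):
        message = Fernet(key).encrypt(message)
    return message

onion = wrap(b"GET /index.html", relay_keys)

# Each relay peels exactly one layer; only the exit sees the plaintext,
# and no single relay can link the sender to the destination.
for key in relay_keys:
    onion = Fernet(key).decrypt(onion)
print(onion)  # b'GET /index.html'
```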
8-Feb-2017 |
Title: From networked chips to cities |
15-Feb-2017 |
Title: On Modeling the Time-Energy Performance of Data-Parallel Applications on Heterogeneous Systems |
1-Mar-2017 |
Title: Real World Opportunities for NLP Research to Impact Global Education through MOOCs Abstract: We leverage natural language processing technologies to better analyse student conversations and identify opportunities for timely instructor intervention that produces better learning outcomes. We discuss how diversity in MOOC offerings has compromised the validity of previously published results, how automatic discourse parsing can improve prediction, and the real problem of the bias introduced by the user interface, which affects instructors' decisions to intervene. We are actively recruiting interested individuals to continue work on these and allied topics. |
8-Mar-2017 *Note: venue at Seminar Room 3 (COM1-02-12) |
Title: Power Papers -- Some Practical Pointers, Part 1 Speaker: Terence Sim, Associate Professor, Department of Computer Science Abstract: If I write with the flowery flourish of Shakespeare, but my prose proves problematic, then my words become like a noisy gong or a clanging cymbal. If I have the gift of mathematical genius and can fathom all theorems, but cannot articulate the arcane, my genius appears no different from madness. If I achieve breakthrough research that can change the world, but cannot explain its significance, the world gains nothing and I labor in vain. Writing a good research paper takes effort; more so if there is a page limit. Yet this skill is required of every researcher, who, more often than not, fumbles his or her way through. Good grammar is only a start; care and craft must be applied to turn a mediocre paper into a memorable one. Writing skills can indeed be honed. In this talk, I will highlight the common mistakes many authors make, and offer practical pointers to pack more punch into your paper. Needless to say, the talk will be biased: I will speak not from linguistic theories, but from personal experience, sharing what has, and has not, worked for me. Students and staff are all welcome to participate: your views and insights will certainly benefit us all. |
15-Mar-2017 |
Title: Cache Miss Equation, and Synthetic Dataset Scaling Speaker: Zhang Jiangwei, Research Achievement Award winner (AY2016/2017 Sem1) Abstract: Cache Miss Equation: Science seeks to discover what is forever true of nature. For Computer Science, what can we discover that will be forever true about computation or, at least, immune to changes in technology? Computation fundamentally requires cycles, memory, bandwidth and time. The memory in a computer system has innumerable caches, and our research on this resource focuses on developing an equation to describe cache misses for all levels of the memory hierarchy. It works for a disk cache, database buffers, garbage-collected heaps, nonvolatile memory and content-centric networking. For more details, please check: http://www.math.nus.edu.sg/~mattyc/CME.html Synthetic Dataset Scaling: Benchmarks are ubiquitous in the computing industry and academia. Developers use benchmarks to compare products, while researchers use them similarly in their research. For 20-odd years, the popular benchmarks for database management systems were the ones defined by the Transaction Processing Performance Council (TPC). However, the small number of TPC benchmarks are increasingly irrelevant to the myriad of diverse applications, and the TPC standardization process is too slow. This led to a proposal for a paradigm shift, from a top-down design of domain-specific benchmarks by committee consensus, to a bottom-up collaboration to develop tools for application-specific benchmarking. A database benchmark must have a dataset. For the benchmark to be application-specific, it must start with an empirical dataset D. This D may be too small or too large for the benchmarking experiment, so the first tool to develop would be for scaling D to a desired size. This motivates the Dataset Scaling Problem (DSP): Given a set of relational tables D and a scale factor s, generate a database state D' that is similar to D but s times its size. For more details, please check: http://www.comp.nus.edu.sg/~upsizer/ In this talk, I will briefly share the motivation, the possible impact, the current solutions we have, and the research opportunities for both problems. |
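A deliberately naive sketch of the DSP for intuition (our toy example; the solutions linked above preserve far richer statistics such as value distributions and join-degree correlations):

```python
# Naive sketch of the Dataset Scaling Problem (DSP) for scale factor s = 2.
# Illustrative only -- not the actual scaling tool linked in the abstract.
import pandas as pd

customers = pd.DataFrame({"cid": [1, 2], "region": ["SG", "US"]})
orders = pd.DataFrame({"oid": [10, 11, 12], "cid": [1, 1, 2], "amt": [5.0, 7.5, 3.0]})

def scale(customers, orders, s):
    """Replicate every row s times, remapping keys so foreign keys stay valid."""
    out_c, out_o = [], []
    for k in range(s):
        c = customers.copy()
        c["cid"] = c["cid"] + k * 1000        # shift primary keys per replica
        o = orders.copy()
        o["oid"] = o["oid"] + k * 1000
        o["cid"] = o["cid"] + k * 1000        # keep referential integrity
        out_c.append(c)
        out_o.append(o)
    return pd.concat(out_c, ignore_index=True), pd.concat(out_o, ignore_index=True)

C2, O2 = scale(customers, orders, 2)
print(len(C2), len(O2))  # 4 customers, 6 orders: same join structure, twice the size
```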
22-Mar-2017 |
Title: Computer Vision for Robotics Perception Speaker: Lee Gim Hee, Assistant Professor, Department of Computer Science Abstract: The camera is a good sensor for robotic perception compared to the traditionally used Lidar, because it is low-cost and rich in information; but the algorithms are often computationally too expensive, and sensitive to noise and outliers. In this talk, I will present my work on making some computer vision algorithms more efficient and robust, so that robots can perceive the world through cameras. |
29-Mar-2017 |
Title: Hardening Programs Against Software Vulnerabilities AND Constraint Solvers for Problems in Security Speaker: Roland Yap, Associate Professor, Department of Computer Science Abstract: The talk will cover two distinct but partially related topics. The first, and the main focus of the talk, is preventing the exploitation of software vulnerabilities. Memory bugs are still the main route by which software is attacked. In fact, one might regard such bugs as inevitable in most of today's complex software written in low-level languages such as C and C++. As such, a strategy of hardening the program so that these bugs cannot be exploited, e.g. to corrupt the stack, is perhaps the strategy that needs to be adopted in the long term. There are many kinds of memory errors; perhaps the most well known are spatial and temporal errors. I will talk about a research direction which opens up the area, from simple to complex kinds of program hardening. For students interested in knowing a bit more beforehand, a recent paper at NDSS 2017 on protecting stack objects is Stack Object Protection with Low Fat Pointers: https://www.internetsociety.org/events/ndss-symposium/ndss-symposium-2017/ndss-2017-programme/ndss-2017-session-10-software-and The second topic, which I will touch on more briefly, is research on constraint solving. Constraint solving is broadly applicable to many domains, ranging from theoretical computer science to verification to security. I will mention some problems in constraints with links to verification and security. |
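For the second topic, a minimal backtracking solver conveys the flavor of constraint solving (a toy sketch of the general technique, not the research discussed in the talk):

```python
# Minimal backtracking constraint solver (illustrative of the constraint-solving
# topic only; real solvers for verification/security are far more sophisticated).
def solve(variables, domains, constraints, assignment=None):
    """Depth-first search with constraint checks on partial assignments."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(c(assignment) for c in constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]          # backtrack on failure
    return None

# Toy instance: x, y, z in {0,1,2} with x < y and y != z.
# Each constraint passes until all of its variables are assigned.
constraints = [
    lambda a: "x" not in a or "y" not in a or a["x"] < a["y"],
    lambda a: "y" not in a or "z" not in a or a["y"] != a["z"],
]
print(solve(["x", "y", "z"], {v: range(3) for v in ["x", "y", "z"]}, constraints))
# -> {'x': 0, 'y': 1, 'z': 0}
```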
5-Apr-2017 |
Title: Analyzing the Behaviors of Articulated Objects in 3D: Applications to Humans and Animals Speaker: Cheng Li, Adjunct Assistant Professor, Department of Computer Science Abstract: Recent advances in depth cameras have opened the door to many interesting applications. In this talk, I will discuss our research efforts toward addressing the related tasks of pose estimation, tracking, and action and behavior analysis for a range of articulated objects (human upper body, human hand, fish, mouse) using such 3D cameras. In particular, I will talk about our recent Lie group based approach that enables us to tackle these problems under a unified framework. Looking forward, the results could be applied to everyday scenarios such as natural user interfaces, behavior analysis and surveillance, and gaming, among others. |
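For background on the Lie group machinery mentioned above (standard material, not the speaker's specific formulation), rotations in SO(3) are reached from axis-angle parameters via the exponential map, given by Rodrigues' formula:

```latex
% Exponential map on SO(3) (Rodrigues' formula): a rotation is parameterized
% by an axis-angle vector \omega \in \mathbb{R}^3 with \theta = \|\omega\|.
\exp(\hat{\omega}) \;=\; I
\;+\; \frac{\sin\theta}{\theta}\,\hat{\omega}
\;+\; \frac{1 - \cos\theta}{\theta^{2}}\,\hat{\omega}^{2},
\qquad
\hat{\omega} \;=\;
\begin{pmatrix}
0 & -\omega_3 & \omega_2 \\
\omega_3 & 0 & -\omega_1 \\
-\omega_2 & \omega_1 & 0
\end{pmatrix}
```

Working in this group (and its products, for chains of joints) is what lets pose estimation, tracking, and action recognition share one geometric representation.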