My Conception of a Swarm of Agents. Courtesy of Cocoa Yeo (2008).
what's new
Invited to serve as a World Economic Forum’s Global Future Councils Fellow for the Council on the Future of Artificial Intelligence and Robotics, Sep 2016 - Jun 2018
|
|
Appointed as Deputy Director of AI Research & Technology
AI Singapore, Mar 2018 - Present
|
|
Our AAAI 2019 & AURO submissions are accepted!
|
|
Source codes for our parallel/distributed Gaussian processes (ICML 2016,
AAAI 2015, UAI 2013) are now available!
|
|
Source codes for our online/anytime Gaussian processes (ICML 2015,
AAAI 2014) are now available!
|
|
MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data, SGD $737,461, Jul 2017 - Jul 2020
|
|
SMART Subaward Agreement - FM IRG :
Automatic Probabilistic Machine Learning for Traffic Modeling and Prediction, SGD $45,000, Apr 2018 - Mar 2019
|
|
Invited to serve as senior program committee members of AAAI 2019, AAMAS 2018-2019, IJCAI 2015,
associate editors of IROS 2012 & ICRA 2011,
program committee members of AAAI 2010, 2016-2018, IJCAI 2011, 2015, 2017, 2019, AAMAS 2011-2014, 2016, RSS 2014, 2018,
ICAPS 2010-2012, 2018-2019, and
reviewer of NIPS 2013-2016, 2018, ICML 2019, AISTATS 2019, ICLR 2019
|
|
Students interested to join MapleCG, click here for more info
|
|
members
I am looking for talented undergraduate and graduate students in NUS to join my MapleCG research group.
If you are really excited and motivated to be involved in novel research in the fields of
artificial intelligence, planning under uncertainty (i.e., decision-theoretic, information-theoretic),
robotics, multi-agent systems (i.e., multi-agent coordination, planning, and learning), game theory,
statistical machine learning, optimization, and/or swarm intelligence, please email me and we can set
up a time for discussion. Please also take some time to view our research projects.
I am currently advising the following students and research staff:
 |
Dai, Zhongxiang
Ph.D. (Co-advised by Patrick Jaillet, MIT)
Recipient of SMART SMA3 Graduate Fellowship & ST Electronics Prizes for being the top Year 1 and Year 2 student in Electrical Engineering
B.Eng. in Electrical Engineering (Hons. 1st Class) > National University of Singapore
Research Interests: machine learning
|
 |
Ryutaro Oikawa |及川竜太郎|
Ph.D.
M.Sc. in Statistics > Imperial College London
B.Eng in Social Engineering > Tokyo Institute of Technology
Research Interests: reinforcement learning
|
 |
Sreejith Balakrishnan
Ph.D.
M.Eng. in Electrical Engineering with Specialization in Computer Engineering > National University of Singapore
B.Eng. in Electrical and Electronics Engineering (Hons. 1st Class) with Minor in Computing > Nanyang Technological University
Research Interests: robotics, machine learning
|
 |
Zhang, Jingfeng |张景锋|
Ph.D. (Co-advised by Mohan Kankanhalli)
B.Eng. in Computer Science and Technology > Taishan College > Shandong University
Research Interests: privacy-preserving machine learning
|
 |
Teng, Tong |滕茼|
Ph.D.
B.Eng. in Computer Science > Shandong University
Research Interests: probabilistic machine learning
|
 |
Dmitrii Kharkovskii |Дмитрий Алексеевич Харьковский|
Ph.D.
Specialist degree in Math > St. Petersburg State University
Research Interests: privacy-preserving machine learning, Bayesian optimization
|
 |
Yu, Haibin |于海斌|
Ph.D. (Co-advised by Patrick Jaillet, MIT)
Recipient of SMART SMA3 Graduate Fellowship
B.Eng. in Mechanical Engineering and Automation > Beihang University
Research Interests: Bayesian deep learning
|
 |
Nguyễn, Quốc Phong
Ph.D. (Co-advised by Patrick Jaillet, MIT)
Recipient of Research Achievement Award, SMART SMA3 Graduate Fellowship, Lee Kuan Yew Gold Medal for best performing graduate in B.Eng. (Computing Engineering) programme, & IES Gold Medal for top graduating student in B.Eng. in Computer Engineering
B.Eng. in Computer Engineering > National University of Singapore, 2013
Research Interests: probabilistic machine learning, inverse reinforcement learning
|
 |
Zhang, Yehong |张叶红|
Research Fellow
Recipient of AAAI 2016 Scholarship & Research Achievement Award
B.Eng. in Computer Science > Harbin Institute of Technology
Ph.D. in Computer Science > National University of Singapore, Dec 2017
Ph.D. Thesis: Data-Efficient Machine Learning with Multiple Output Types and High Input Dimensions
Research Interests: probabilistic machine learning, active learning, Bayesian optimization

+ oral defense ◊ nov 20 ◊ 2K+17
|
 |
Lim, Kar Wai
Research Fellow, NUS-Singtel Cyber Security Research and Development Laboratory (Co-advised by Mun Choon Chan)
Recipient of ACML 2016 Best Student Paper Award & AMP Prize for Honours Thesis in Actuarial Studies (Best Thesis Award)
Bachelor of Actuarial Studies (Hons. 1st Class) > Australian National University
Ph.D. in Computer Science > Australian National University, Dec 2016
Ph.D. Thesis: Nonparametric Bayesian Topic Modelling with Auxiliary Data
Research Interests: machine learning, Bayesian nonparametric methods, stochastic processes, point processes, Hawkes processes, Bayesian inference, Markov chain Monte Carlo methods
|
Former Members
 |
Chen, Jie |陈杰|
Associate Research Professor, College of Computer Science and Software Engineering, Shenzhen University, Apr 2018
Recipient of Dean's Graduate Research Excellence Award, Research Achievement Award, & UAI 2012 Scholarship
B.Eng. in Electrical Engineering > Taizhou University
M.Eng. in Computer Science > Zhejiang University
Ph.D. in Computer Science > National University of Singapore, Dec 2013
Postdoctoral Associate, SMART FM (Co-advised by Patrick Jaillet, MIT), Jan 2014 - Mar 2018
Ph.D. Thesis: Gaussian Process-Based Decentralized Data Fusion & Active Sensing Agents: Towards Large-Scale Modeling & Prediction of Spatiotemporal Traffic Phenomena
Research Interests: robotics, multi-agent planning, machine learning
|
 |
Hoàng, Trọng Nghĩa
Research Staff Member, MIT-IBM Watson AI Lab, Aug 2018
Recipient of Dean's Graduate Research Excellence Award, President's Graduate Fellowship, Research Achievement Awards x 2, IJCAI 2013 Travel Grant Award, & AAMAS 2012 Scholarship
B.Sc. in Computer Science (Hons) > University of Science > Vietnam National University
Ph.D. in Computer Science > National University of Singapore, Feb 2015
Postdoctoral Fellow > Massachusetts Institute of Technology (Advised by Jonathan How), Apr 2017
Ph.D. Thesis: New Advances on Bayesian and Decision-Theoretic Approaches for Interactive Machine Learning
Research Interests: machine learning, multi-agent planning

+ oral defense ◊ jan 21 ◊ 2K+15
|
 |
Ouyang, Ruofei |欧阳若飞|
Data Scientist, Wecash (Southeast Asia) Pte. Ltd.
Recipient of AAMAS 2014 Scholarship
B.Sc. in Computer Science > East China Normal University
Ph.D. in Computer Science > National University of Singapore, Dec 2016
Ph.D. Thesis: Exploiting Decentralized Multiagent Coordination for Large-Scale Machine Learning Problems
Research Interests: Gaussian process, active sensing/learning, data fusion, Bayesian optimization

+ oral defense ◊ nov 7 ◊ 2K+16
|
 |
Xu, Nuo |许诺|
Data Scientist, Grab
Recipient of AAAI 2014 Scholarship & Research Achievement Award
Bachelor of Software Engineering > Harbin Institute of Technology
Ph.D. in Computer Science > National University of Singapore, Jan 2017
Ph.D. Thesis: Online Gaussian Process Filtering for Persistent Robot Localization With Arbitrary Sensor Modalities
Research Interests: machine learning, robotics

+ oral defense ◊ jan 11 ◊ 2K+17
|
 |
Prabhu Natarajan
Assistant Professor, DigiPen Institute of Technology Singapore, Jun 2016
Recipient of ICDSC 2012 Best PhD Forum Paper Award, Research Achievement Award, & AAMAS 2012 Scholarship
B.Tech. in Information Technology > Anna University
M.Eng. in Computer Science & Engineering > Anna University
Ph.D. in Computer Science > National University of Singapore, Dec 2013 (Co-advised by Mohan Kankanhalli)
Ph.D. Thesis: A Decision-Theoretic Approach for Controlling & Coordinating Multiple Active Cameras in Surveillance
Research Interests: multi-camera surveillance, decision-theoretic planning & control for sensor networks

+ commencement ceremony ◊ aug 13 ◊ 2K+14
|
 |
Tan, Wesley
Ph.D. in Computer Science, Nanyang Technological University, Aug 2017
Recipient of President's Graduate Fellowship in Nanyang Technological University
B.Sc. (Honors) in Economics > Purdue University
M.Sc. in Risk Management & Financial Engineering > Imperial College London, Oct 2015
M.Comp. in Computer Science > National University of Singapore, Jun 2017
M.Sc. Thesis: Variational Bayesian Actor Critic
Research Interests: reinforcement learning
|
 |
Son, Jaemin |손재민|
Bachelor of Comp Sci & Eng > Osaka University
M.Sc. in Computer Science > National University of Singapore, Nov 2016 (Co-advised by Gary Tan)
M.Sc. Thesis: High-Dimensional Bayesian Optimization with Application to Traffic Simulation
Research Interests: machine learning
|
 |
Cao, Nannan |曹楠楠|
Former Research Assistant
Bachelor of Software Engineering > East China Normal University
M.Sc. in Computer Science > National University of Singapore, Sep 2012
M.Sc. Thesis: Information-Theoretic Multi-Robot Path Planning
Research Interests: machine learning, Gaussian process, environmental sensing

+ commencement ceremony ◊ sep 19 ◊ 2K+13
|
 |
Etkin Barış Özgül
B.Sc. in Computer Science > Bilkent University
M.Sc. in Computer Science > National University of Singapore, Jan 2017
M.Sc. Thesis: Shuttle-line Routing for Mobility-on-Demand Systems with Ridesharing
Research Interests: AI, multi-agent systems
|
 |
Hoang, Quang Minh
Ph.D. in Computer Science, CMU, Aug 2018
Recipient of Lee Kuan Yew Gold Medal for best performing graduate in B.Comp. (Computational Biology) programme, Outstanding Undergraduate Researcher Prize in National University of Singapore, & ICML 2015 Scholarship
B.Comp. in Computational Biology (Hons. 1st Class) > National University of Singapore, 2016
FYP Dissertation: A Probabilistic Approach for Protein Function Prediction with Hierarchical Structured Outputs
Research Interests: machine learning
|
 |
Ling, Chun Kai
Ph.D. in Computer Science, CMU, Aug 2017
Recipient of Lee Kuan Yew Gold Medal for best performing graduate in B.Eng. (Computing Engineering) programme, IES Gold Medal for top graduating student in B.Eng. in Computing Engineering, Defence Science Technology Agency Gold Medal for best local final year student for the degree of B.Eng. (Computer Engineering), Micron Prize for being one of the top two local Year 2 Computer Engineering students, & Alcatel-Lucent Telecommunications Prize for best performance in a module in the area of Communications and Networks in BEng (EE) or BEng (CEG) examinations
B.Eng. in Computer Engineering > National University of Singapore, 2015
FYP Dissertation: Planning and Learning in Spatiotemporal Environmental Phenomena
Research Interests: planning under uncertainty, machine learning
|
 |
Erik Alexander Daxberger
M.Sc. in Computer Science, ETH Zurich, Sep 2017
Recipient of LMU research award for excellent students for the Bachelor's thesis, and LMUexchange and PROSA scholarships for a student exchange program at NUS & ICML 2017 Travel Award
B.Sc. in Computer Science > Ludwig-Maximilians-Universität, Munich, 2017
B.Sc. Thesis: Distributed Batch Bayesian Optimization
Research Interests: machine learning, AI
|
 |
Khor, Shi-Jie
Software Engineer, Google Singapore
Recipient of Lee Kuan Yew Gold Medal for best performing graduate in B.Comp. (Computer Science) programme, IEEE Singapore Computer Society Book Prize for the best student in the Honours Year term project, and Tata Consultancy Services Asia Pacific Prize
B.Comp. in Computer Science (Honours Highest Distinction) > National University of Singapore, 2016
FYP Dissertation: Kernel Search for Gaussian Processes
Research Interests: machine learning
|
 |
Nathan Azaria
Software Engineer, Facebook London
Recipient of National Computer Systems Medal And Prize for the top student in B.Comp. (Computer Science) programme
B.Comp. in Computer Science (Honours Highest Distinction) > National University of Singapore, 2016
FYP Dissertation: Stochastic Variational Inference on Multi-Output Gaussian Process
Research Interests: machine learning
|
 |
Lim, Keng Kiat
Software Engineer, Facebook HQ
B.Comp. in Computer Science > National University of Singapore, 2016
FYP Dissertation: Learning with High-Dimensional Data
Research Interests: machine learning
|
 |
Akshay Viswanathan
Software Engineer, Visa Inc.
B.Eng. in Computer Engineering (Hons. 1st Class) > National University of Singapore, 2015
FYP Dissertation: Scaling up Machine Learning Techniques via Parallelization for Large Data
Research Interests: machine learning
|
 |
Shailendra Khemka
Business Solutions: Software Engineer, Deutsche Bank AG - SG Branch
Recipient of Tata Consultancy Services Asia Pacific Medal and Prize for 2nd best graduate throughout the course of study for B.Comp, Defence Science & Technology Agency Prize for top UROP student in B.Comp., & Sung Kah Kay Memorial Prize Winner in NUS University Scholars Programme (USP)
University Scholars Programme & von Neumann Programme for B.Comp in Computer Science > National University of Singapore, 2013
FYP Dissertation: Autonomous Search for Victims in a Disaster Situation
Research Interests: multi-agent planning
|
 |
Yu, Jiangbo |余江波|
KAI Square
B.Sc. in Computer Science > Peking University
Research Interests: statistical machine learning
|
publications
MOST FREQUENTLY USED WORDS extracted from paper titles and abstracts
ACCEPTED PAPERS
Co-authors : My students and postdocs Former thesis advisors Collaborators
- Collective Online Learning of Gaussian Processes in Massive Multi-Agent Systems.
Trong Nghia Hoang, Quang Minh Hoang, Kian Hsiang Low & Jonathan P. How.
In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, Jan 27 - Feb 1, 2019.
16.2% acceptance rate (oral presentation)
Abstract. This paper presents a novel Collective Online Learning of Gaussian Processes (COOL-GP) framework for enabling a massive number of GP inference agents to simultaneously perform (a) efficient online updates of their GP models using their local streaming data with varying correlation structures and (b) decentralized fusion of their resulting online GP models with different learned hyperparameter settings and inducing inputs. To realize this, we exploit the notion of a common encoding structure to encapsulate the local streaming data gathered by any GP inference agent into summary statistics based on our proposed representation, which is amenable to both an efficient online update via an importance sampling trick as well as multi-agent model fusion via decentralized message passing that can exploit sparse connectivity among agents for improving efficiency and enhance the robustness of our framework against transmission loss. We provide a rigorous theoretical analysis of the approximation loss arising from our proposed representation to achieve efficient online updates and model fusion. Empirical evaluations show that COOL-GP is highly effective in model fusion, resilient to information disparity between agents, robust to transmission loss, and can scale to thousands of agents.
- Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception.
Ruofei Ouyang & Kian Hsiang Low.
In Autonomous Robots (Special Issue on Multi-Robot and Multi-Agent Systems), 2019.
Extended version of our
AAAI-18 paper
Abstract. This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to utilize a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanism for a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on three real-world datasets for up to 128 agents show that our algorithms outperform the state-of-the-art methods.
|
REFEREED PUBLICATIONS
dblp
Sorted by year 2K + 18 |
17 |
16 |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2
- Decentralized High-Dimensional Bayesian Optimization with Factor Graphs.
Trong Nghia Hoang, Quang Minh Hoang, Ruofei Ouyang & Kian Hsiang Low.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), pages 3231-3238, New Orleans, LA, Feb 2-8, 2018.
24.55% acceptance rate
Abstract. This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee (asymptotic) no-regret performance. Empirical evaluation on synthetic and real-world experiments shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.
- Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception.
Ruofei Ouyang & Kian Hsiang Low.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), pages 3876-3883, New Orleans, LA, Feb 2-8, 2018.
24.55% acceptance rate
Abstract. This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to choose a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanism for a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on real-world datasets show that our algorithms outperform the state-of-the-art methods.
- Artificial Intelligence Research in Singapore: Assisting the Development of a Smart Nation.
Pradeep Varakantham, Bo An, Bryan Low & Jie Zhang.
In AI Magazine, volume 38, issue 3, pages 102-105, Fall 2017.
Abstract. Artificial Intelligence (AI) research in Singapore is focused on accelerating the country’s development into a Smart Nation. Specifically, AI has been employed extensively in either augmenting the intelligence of humans or in developing automated methods and systems to improve quality of life in Singapore.
- Distributed Batch Gaussian Process Optimization.
Erik Daxberger & Kian Hsiang Low.
In Proceedings of the 34th International Conference on Machine Learning (ICML-17), pages 951-960, Sydney, Australia, Aug 6-11, 2017.
25.9% acceptance rate
Abstract. This paper presents a novel distributed batch Gaussian process upper confidence bound (DB-GP-UCB) algorithm for performing batch Bayesian optimization (BO) of highly complex, costly-to-evaluate black-box objective functions. In contrast to existing batch BO algorithms, DB-GP-UCB can jointly optimize a batch of inputs (as opposed to selecting the inputs of a batch one at a time) while still preserving scalability in the batch size. To realize this, we generalize GP-UCB to a new batch variant amenable to a Markov approximation, which can then be naturally formulated as a multi-agent distributed constraint optimization problem in order to fully exploit the efficiency of its state-of-the-art solvers for achieving linear time in the batch size. Our DB-GP-UCB algorithm offers practitioners the flexibility to trade off between the approximation quality and time efficiency by varying the Markov order. We provide a theoretical guarantee for the convergence rate of DB-GP-UCB via bounds on its cumulative regret. Empirical evaluation on synthetic benchmark objective functions and a real-world optimization problem shows that DB-GP-UCB outperforms the state-of-the-art batch BO algorithms.
- A Generalized Stochastic Variational Bayesian Hyperparameter Learning Framework for Sparse Spectrum Gaussian Process Regression.
Quang Minh Hoang, Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), pages 2007-2014, San Francisco, CA, Feb 4-9, 2017.
24.6% acceptance rate (oral presentation)
Abstract. While much research effort has been dedicated to scaling up sparse Gaussian process (GP) models based on inducing variables for big data, little attention is afforded to the other less explored class of low-rank GP approximations that exploit the sparse spectral representation of a GP kernel. This paper presents such an effort to advance the state of the art of sparse spectrum GP models to achieve competitive predictive performance for massive datasets. Our generalized framework of stochastic variational Bayesian sparse spectrum GP (sVBSSGP) models addresses their shortcomings by adopting a Bayesian treatment of the spectral frequencies to avoid overfitting, modeling these frequencies jointly in its variational distribution to enable their interaction a posteriori, and exploiting local data for boosting the predictive performance. However, such structural improvements result in a variational lower bound that is intractable to be optimized. To resolve this, we exploit a variational parameterization trick to make it amenable to stochastic optimization. Interestingly, the resulting stochastic gradient has a linearly decomposable structure that can be exploited to refine our stochastic optimization method to incur constant time per iteration while preserving its property of being an unbiased estimator of the exact gradient of the variational lower bound. Empirical evaluation on real-world datasets shows that sVBSSGP outperforms state-of-the-art stochastic implementations of sparse GP models.
- A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models.
Trong Nghia Hoang, Quang Minh Hoang & Kian Hsiang Low.
In Proceedings of the 33rd International Conference on Machine Learning (ICML-16), pages 382-391, New York City, NY, Jun 19-24, 2016.
24.3% acceptance rate
Abstract. This paper presents a novel distributed variational inference framework that unifies many parallel sparse Gaussian process regression (SGPR) models for scalable hyperparameter learning with big data. To achieve this, our framework exploits a structure of correlated noise process model that represents the observation noises as a finite realization of a high-order Gaussian Markov random process. By varying the Markov order and covariance function for the noise process model, different variational SGPR models result. This consequently allows the correlation structure of the noise process model to be characterized for which a particular variational SGPR model is optimal. We empirically evaluate the predictive performance and scalability of the distributed variational SGPR models unified by our framework on two real-world datasets.
- Near-Optimal Active Learning of Multi-Output Gaussian Processes.
Yehong Zhang, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), pages 2351-2357, Phoenix, AZ, Feb 12-17, 2016.
25.75% acceptance rate
Abstract. This paper addresses the problem of active learning of a multi-output Gaussian process (MOGP) model representing multiple types of coexisting correlated environmental phenomena. In contrast to existing works, our active learning problem involves selecting not just the most informative sampling locations to be observed but also the types of measurements at each selected location for minimizing the predictive uncertainty (i.e., posterior joint entropy) of a target phenomenon of interest given a sampling budget. Unfortunately, such an entropy criterion scales poorly in the numbers of candidate sampling locations and selected observations when optimized. To resolve this issue, we first exploit a structure common to sparse MOGP models for deriving a novel active learning criterion. Then, we exploit a relaxed form of sub-modularity property of our new criterion for devising a polynomial-time approximation algorithm that guarantees a constant-factor approximation of that achieved by the optimal set of selected observations. Empirical evaluation on real-world datasets shows that our proposed approach outperforms existing algorithms for active learning of MOGP and single-output GP models.
- Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond.
Chun Kai Ling, Kian Hsiang Low & Patrick Jaillet.
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), pages 1860-1866, Phoenix, AZ, Feb 12-17, 2016.
25.75% acceptance rate
Abstract. This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive ϵ-optimal GPP (ϵ-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of ϵ-GPP with performance guarantee. We empirically demonstrate the effectiveness of our ϵ-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.
- DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks.
Jie Fu, Hongyin Luo, Jiashi Feng, Kian Hsiang Low & Tat-Seng Chua.
In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI-16), pages 1469-1475, New York City, NY, Jul 9-15, 2016.
<25% acceptance rate
Abstract.
The performance of deep neural networks is well-known to be sensitive to the setting of their hyperparameters. Recent advances in reverse-mode automatic differentiation allow for optimizing hyperparameters with gradients. The standard way of computing these gradients involves a forward and backward pass of computations. However, the backward pass usually needs to consume unaffordable memory to store all the intermediate variables to exactly reverse the forward training procedure. In this work, we propose a simple but effective method, DrMAD, to distill the knowledge of the forward pass into a shortcut path, through which we approximately reverse the training trajectory. Experiments on two image benchmark datasets show that DrMAD is at least 45 times faster and consumes 100 times less memory compared to state-of-the-art methods for optimizing hyperparameters with minimal compromise to its effectiveness. To the best of our knowledge, DrMAD is the first research attempt to make it practical to automatically tune thousands of hyperparameters of deep neural networks.
- Multi-Agent Continuous Transportation with Online Balanced Partitioning.
Chao Wang, Somchaya Liemhetcharat & Kian Hsiang Low.
In Proceedings of the
15th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-16), pages 1303-1304, Singapore, May 9-13, 2016.
Abstract. We introduce the concept of continuous transportation task to the context of multi-agent systems. A continuous transportation task is one in which a multi-agent team visits a number of fixed locations, picks up objects, and delivers them to a transportation hub. The goal is to maximize the rate of transportation while the objects are replenished over time. In this extended abstract, we present a hybrid of centralized and distributed approaches that minimize communications in the multi-agent team. We contribute a novel online partitioning-transportation algorithm with information gathering in the multi-agent team.
- Concept-based Hybrid Fusion of Multimodal Event Signals.
Yuhui Wang, Christian von der Weth, Yehong Zhang, Kian Hsiang Low, Vivek Singh & Mohan Kankanhalli.
In Proceedings of the
IEEE International Symposium on Multimedia (ISM'16), pages 14-19, San Jose, CA, Dec 11-13, 2016.
26.1% acceptance rate
Abstract. Recent years have seen a significant increase in the number of sensors and resulting event related sensor data, allowing for a better monitoring and understanding of real-world events and situations. Event-related data come from not only physical sensors (e.g., CCTV cameras, webcams) but also social or microblogging platforms (e.g., Twitter). Given the wide-spread availability of sensors, we observe that sensors of different modalities often independently observe the same events. We argue that fusing multimodal data about an event can be helpful for more accurate detection, localization and detailed description of events of interest. However, multimodal data often include noisy observations, varying information densities and heterogeneous representations, which makes the fusion a challenging task. In this paper, we propose a hybrid fusion approach that takes the spatial and semantic characteristics of sensor signals about events into account. For this, we first adopt the concept of an image-based representation that expresses the situation of particular visual concepts (e.g., "crowdedness", "people marching") called Cmage for both physical and social sensor data. Based on this Cmage representation, we model sparse sensor information using a Gaussian process, fuses multimodal event signals with a Bayesian approach, and incorporates spatial relations between the sensor and social observations. We demonstrate the effectiveness of our approach as a proof-of-concept over real-world data. Our early results show that the proposed approach can reliably reduce the sensor-related noise, localize event place, improve event detection reliability, and add semantic context so that the fused data provide a better picture of the observed events or situations.
- Inverse Reinforcement Learning with Locally Consistent Reward Functions.
Quoc Phong Nguyen, Kian Hsiang Low & Patrick Jaillet.
In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, R. Garnett, editors, Advances in Neural Information Processing Systems 28: 29th Annual Conference on Neural Information Processing Systems (NIPS-15), pages 1747-1755, Curran Associates, Inc., Montreal, Canada, Dec 7-12, 2015.
21.9% acceptance rate
Abstract. Existing inverse reinforcement learning (IRL) algorithms have assumed each expert's demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, hence catering to more realistic and complex experts' behaviors.
Solving our generalized IRL problem thus involves not only learning these reward functions but also the stochastic transitions between them at any state (including unvisited states).
By representing our IRL problem with a probabilistic graphical model, an expectation-maximization (EM) algorithm can be devised to iteratively learn the reward functions and stochastic transitions between them in order to jointly improve the likelihood of the expert's demonstrated trajectories.
As a result, the most likely partition of a trajectory into segments that are generated from different locally consistent reward functions selected by EM can be derived.
Empirical evaluation on synthetic and real-world datasets shows that our IRL algorithm outperforms the state-of-the-art EM clustering with maximum likelihood IRL, which is, interestingly, a reduced variant of our approach.
- Gaussian Process Decentralized Data Fusion and Active Sensing for Spatiotemporal Traffic Modeling and Prediction in Mobility-on-Demand Systems.
Jie Chen, Kian Hsiang Low, Patrick Jaillet & Yujian Yao.
In IEEE Transactions on Automation Science and Engineering
(Special Issue on Networked Cooperative Autonomous Systems), volume 12, issue 3, pages 901-921, Jul 2015.
Abstract. Mobility-on-demand (MoD) systems have recently emerged as a
promising paradigm of one-way vehicle sharing for sustainable personal
urban mobility in densely populated cities. We assume the capability of
a MoD system to be enhanced by deploying robotic shared vehicles that
can autonomously cruise the streets to be hailed by users. A key
challenge of the MoD system is that of real-time, fine-grained mobility
demand and traffic flow sensing and prediction. This paper presents
novel Gaussian process (GP) decentralized data fusion and active
sensing algorithms for real-time, fine-grained traffic modeling and
prediction with a fleet of MoD vehicles. The predictive performance of
our decentralized data fusion algorithms are theoretically guaranteed to
be equivalent to that of sophisticated centralized sparse GP
approximations. We derive consensus filtering variants requiring only
local communication between neighboring vehicles. We theoretically
guarantee the performance of our decentralized active sensing
algorithms. When they are used to gather informative data for mobility
demand prediction, they can achieve a dual effect of fleet rebalancing
to service mobility demands. Empirical evaluation on real-world datasets
shows that our algorithms are significantly more time-efficient and
scalable in the size of data and fleet while achieving predictive
performance comparable to that of state-of-the-art algorithms.
- A Unifying Framework of Anytime Sparse Gaussian Process Regression Models with Stochastic Variational Inference for Big Data.
Trong Nghia Hoang, Quang Minh Hoang & Kian Hsiang Low.
In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 569-578, Lille, France, Jul 6-11, 2015.
26.0% acceptance rate
Abstract. This paper presents a novel unifying framework of anytime sparse Gaussian process regression (SGPR) models that can produce good predictive performance fast and improve their predictive performance over time. Our proposed unifying framework reverses the variational inference procedure to theoretically construct a non-trivial, concave functional that is maximized at the predictive distribution of any SGPR model of our choice.
As a result, a stochastic natural gradient ascent method can be derived that involves iteratively following the stochastic natural gradient of the functional to improve its estimate of the predictive distribution of the chosen SGPR model
and is guaranteed to achieve asymptotic convergence to it. Interestingly, we show that if the predictive distribution of the chosen SGPR model
satisfies certain decomposability conditions, then the stochastic natural gradient is an unbiased estimator of the exact natural gradient and can be computed in constant time (i.e., independent of data size) at each iteration. We empirically evaluate the trade-off between the predictive performance vs. time efficiency of the anytime SGPR models on two real-world million-sized datasets.
- Parallel Gaussian Process Regression for Big Data: Low-Rank Representation Meets Markov Approximation.
Kian Hsiang Low, Jiangbo Yu, Jie Chen & Patrick Jaillet.
In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15), pages 2821-2827, Austin, TX, Jan 25-29, 2015.
26.67% acceptance rate
Abstract. The expressive power of a Gaussian process (GP) model comes at a cost of poor scalability in the data size.
To improve its scalability, this paper presents a low-rank-cum-Markov approximation (LMA) of the GP model that is novel in leveraging the dual computational advantages stemming from complementing a low-rank approximate representation of the full-rank GP based on a support set of inputs with a Markov approximation of the resulting residual process; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint
and is considerably more refined than that of existing sparse GP models utilizing low-rank representations due to its more relaxed conditional independence assumption (especially with larger data).
As a result, our LMA method can trade off between the size of the support set and the order of the Markov property to (a) incur lower computational cost than such sparse GP models while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale.
Interestingly, varying the Markov order produces a spectrum of LMAs
with PIC approximation and full-rank GP at the two extremes.
An advantage of our LMA method is that it is amenable to parallelization on multiple machines/cores, thereby gaining greater scalability.
Empirical evaluation on three real-world datasets in clusters of up to 32 computing nodes shows that our centralized and parallel LMA methods are significantly more time-efficient and scalable than state-of-the-art sparse and full-rank GP regression methods
while achieving comparable predictive performances.
- Recent Advances in Scaling up Gaussian Process Predictive Models for Large Spatiotemporal Data.
Kian Hsiang Low, Jie Chen, Trong Nghia Hoang, Nuo Xu & Patrick Jaillet.
In S. Ravela, A. Sandu, editors,
Dynamic Data-Driven Environmental Systems Science - First International Conference, DyDESS'14, LNCS 8964, pages 167-181, Springer International Publishing, MIT, Cambridge, MA, Nov 5-7, 2014.
Oral presentation
Abstract. The expressive power of Gaussian process (GP) models comes at a cost of poor scalability in the size of the data. To improve their scalability, this paper presents an overview of our recent progress in scaling up GP models for large spatiotemporally correlated data through parallelization on clusters of machines, online learning, and nonmyopic active sensing/learning.
- Multi-Agent Ad Hoc Team Partitioning by Observing and Modeling Single-Agent Performance.
Etkin Baris Ozgul, Somchaya Liemhetcharat & Kian Hsiang Low.
In Proceedings of the
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC'14), pages 1-7, Siem Reap, city of Angkor Wat, Cambodia, Dec 9-12, 2014.
Abstract. Multi-agent research has focused on finding the optimal team for a task. Many approaches assume that the performance of the agents are known a priori. We are interested in ad hoc teams, where the agents' algorithms and performance are initially unknown. We focus on the task of modeling the performance of single agents through observation in training environments, and using the learned models to partition a new environment for a multi-agent team. The goal is to minimize the number of agents used, while maintaining a performance threshold of the multi-agent team. We contribute a novel model to learn the agent's performance through observations, and a partitioning algorithm that minimizes the team size. We evaluate our algorithms in simulation, and show the efficacy of our learn model and partitioning algorithm.
- Scalable Decision-Theoretic Coordination and Control for Real-time Active Multi-Camera Surveillance.
Prabhu Natarajan, Trong Nghia Hoang, Yongkang Wong, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
8th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC'14) (Invited Paper to Special Session on Smart Cameras for Smart Environments), pages 115-120, Venezia, Italy, Nov 4-7, 2014.
Abstract. This paper presents an overview of our novel decision-theoretic multi-agent approach for controlling and coordinating multiple active cameras in surveillance. In this approach, a surveillance task is modeled as a stochastic optimization problem, where the active cameras are controlled and coordinated to achieve the desired surveillance goal in presence of uncertainties. We enumerate the practical issues in active camera surveillance and discuss how these issues are addressed in our decision-theoretic approach. We focus on two novel surveillance tasks: maximize the number of targets observed in active cameras with guaranteed image resolution and to improve the fairness in observation of multiple targets. We discuss the overview of our novel decision-theoretic frameworks: Markov decision process and partially observable Markov decision process frameworks for coordinating active cameras in uncertain and partially occluded environments.
- Active Learning is Planning: Nonmyopic ϵ-Bayes-Optimal Active Learning of Gaussian Processes.
Trong Nghia Hoang, Kian Hsiang Low, Patrick Jaillet and Mohan Kankanhalli.
In T. Calders, F. Esposito, E. Hüllermeier, R. Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML/PKDD-14 Nectar (New Scientific and Technical Advances in Research) Track, Part III, LNCS 8726, pages 494-498, Springer Berlin Heidelberg, Nancy, France, Sep 15-19, 2014.
Abstract. A fundamental issue in active learning of Gaussian processes is that of the exploration-exploitation trade-off. This paper presents a novel nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) approach that jointly optimizes the trade-off. In contrast, existing works have primarily developed greedy algorithms or performed exploration and exploitation separately. To perform active learning in real time, we then propose an anytime algorithm based on ϵ-BAL with performance guarantee and empirically demonstrate using a real-world dataset that, with limited budget, it outperforms the state-of-the-art algorithms.
- Generalized Online Sparse Gaussian Processes with Application to Persistent Mobile Robot Localization.
Kian Hsiang Low, Nuo Xu, Jie Chen, Keng Kiat Lim & Etkin Baris Ozgul.
In T. Calders, F. Esposito, E. Hüllermeier, R. Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML/PKDD-14 Nectar (New Scientific and Technical Advances in Research) Track, Part III, LNCS 8726, pages 499-503, Springer Berlin Heidelberg, Nancy, France, Sep 15-19, 2014.
Abstract. This paper presents a novel online sparse Gaussian process (GP) approximation method that is capable of achieving constant time and memory (i.e., independent of the size of the data) per time step. We theoretically guarantee its predictive performance to be equivalent to that of a sophisticated offline sparse GP approximation method. We empirically demonstrate the practical feasibility of using our online sparse GP approximation method through a real-world persistent mobile robot localization experiment.
- No One is Left "Unwatched": Fairness in Observation of Crowds of Mobile Targets in Active Camera Surveillance.
Prabhu Natarajan, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
21st European Conference on Artificial Intelligence (ECAI-14), including Prestigious Applications of Intelligent Systems (PAIS-14), pages 1155-1160, Prague, Czech Republic, Aug 18-22, 2014.
Abstract. Central to the problem of active multi-camera surveillance is the fundamental issue of fairness in the observation of crowds of targets such that no target is "starved" of observation by the cameras for a long time. This paper presents a principled decision-theoretic multi-camera coordination and control (MC2) algorithm called fair-MC2 that can coordinate and control the active cameras to achieve max-min fairness in the observation of crowds of targets moving stochastically. Our fair-MC2 algorithm is novel in demonstrating how (a) the uncertainty in the locations, directions, speeds, and observation times of the targets arising from the stochasticity of their motion can be modeled probabilistically, (b) the notion of fairness in observing targets can be formally realized in the domain of multi-camera surveillance for the first time by exploiting the max-min fairness metric to formalize our surveillance objective, that is, to maximize the expected minimum observation time over all targets while guaranteeing a predefined image resolution of observing them, and (c) a structural assumption in the state transition dynamics of a surveillance environment can be exploited to improve its scalability to linear time in the number of targets to be observed during surveillance. Empirical evaluation through extensive simulations in realistic surveillance environments shows that fair-MC2 outperforms the state-of-the-art and baseline MC2 algorithms. We have also demonstrated the feasibility of deploying our fair-MC2 algorithm on real AXIS 214 PTZ cameras.
- GP-Localize: Persistent Mobile Robot Localization using Online Sparse Gaussian Process Observation Model.
Nuo Xu, Kian Hsiang Low, Jie Chen, Keng Kiat Lim & Etkin Baris Ozgul.
In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI-14), pages 2585-2592, Quebec City, Canada, Jul 27-31, 2014.
16.6% acceptance rate (oral presentation)
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. Central to robot exploration and mapping is the task of persistent localization in environmental fields characterized by spatially correlated measurements. This paper presents a Gaussian process localization (GP-Localize) algorithm that, in contrast to existing works, can exploit the spatially correlated field measurements taken during a robot's exploration (instead of relying on prior training data) for efficiently and scalably learning the GP observation model online through our proposed novel online sparse GP. As a result, GP-Localize is capable of achieving constant time and memory (i.e., independent of the size of the data) per filtering step, which demonstrates the practical feasibility of using GPs for persistent robot localization and autonomy. Empirical evaluation via simulated experiments with real-world datasets and a real robot experiment shows that GP-Localize outperforms existing GP localization algorithms.
- Nonmyopic ϵ-Bayes-Optimal Active Learning of Gaussian Processes.
Trong Nghia Hoang, Kian Hsiang Low, Patrick Jaillet and Mohan Kankanhalli.
In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 739-747, Beijing, China, Jun 21-26, 2014.
22.4% acceptance rate (cycle 2)
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. A fundamental issue in active learning of Gaussian processes is that of the exploration-exploitation trade-off.
This paper presents a novel nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) approach that jointly and naturally optimizes the trade-off.
In contrast, existing works have primarily developed myopic/greedy algorithms or performed exploration and exploitation separately.
To perform active learning in real time, we then propose an anytime algorithm based on ϵ-BAL with performance guarantee and empirically demonstrate using synthetic and real-world datasets that, with limited budget, it outperforms the state-of-the-art algorithms.
- Multi-Robot Active Sensing of Non-Stationary Gaussian Process-Based Environmental Phenomena.
Ruofei Ouyang, Kian Hsiang Low, Jie Chen & Patrick Jaillet.
In Proceedings of the
13th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-14), pages 573-580, Paris, France, May 5-9, 2014.
23.8% acceptance rate
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. A key challenge of environmental sensing and monitoring is that of sensing, modeling, and predicting large-scale, spatially correlated environmental phenomena, especially when they are unknown and non-stationary.
This paper presents a decentralized multi-robot active sensing (DEC-MAS) algorithm that can efficiently coordinate the exploration of multiple robots to gather the most informative observations for predicting an unknown, non-stationary phenomenon.
By modeling the phenomenon using a Dirichlet process mixture of Gaussian processes (DPM-GPs), our work here is novel in demonstrating how DPM-GPs and its structural properties can be exploited to (a) formalize an active sensing criterion that trades off between gathering the most informative observations for estimating the unknown, non-stationary spatial correlation structure vs. that for predicting the phenomenon given the current, imprecise estimate of the correlation structure, and (b) support efficient decentralized coordination.
We also provide a theoretical performance guarantee for DEC-MAS and analyze its time complexity.
We empirically demonstrate using two real-world datasets that DEC-MAS outperforms state-of-the-art MAS algorithms.
- Decision-Theoretic Approach to Maximizing Fairness in Multi-Target Observation in Multi-Camera Surveillance.
Prabhu Natarajan, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
13th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-14), pages 1521-1522, Paris, France, May 5-9, 2014.
Abstract. Central to the problem of active multi-camera surveillance is the fundamental issue of fairness in the observation of multiple targets such that no target is left unobserved by the cameras for a long time. To address this important issue, we propose a novel principled decision-theoretic approach to control and coordinate multiple active cameras to achieve fairness in the observation of multiple moving targets.
- Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pages 2298-2305, Beijing, China, Aug 3-9, 2013.
13.2% acceptance rate (oral presentation)
Abstract. A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is being undermined due to either the restrictive assumptions of the other agents' behavior, the failure in accounting for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than the state-of-the-art algorithms.
- A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pages 1394-1400, Beijing, China, Aug 3-9, 2013.
28.0% acceptance rate
Abstract. Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior for which FDM's independence and modeling assumptions do not hold. As a result, FDM does not allow the other agent's behavior to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.
- Parallel Gaussian Process Regression with Low-Rank Covariance Matrix Approximations.
Jie Chen, Nannan Cao, Kian Hsiang Low, Ruofei Ouyang, Colin Keng-Yan Tan & Patrick Jaillet.
In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13), pages 152-161, Bellevue, WA, Jul 11-15, 2013.
31.3% acceptance rate
Abstract. Gaussian processes (GP) are Bayesian non-parametric models that are widely used for probabilistic regression. Unfortunately, it cannot scale well with large data nor perform real-time predictions due to its cubic time cost in the data size. This paper presents two parallel GP regression methods that exploit low-rank covariance matrix approximations for distributing the computational load among parallel machines to achieve time efficiency and scalability. We theoretically guarantee the predictive performances of our proposed parallel GPs to be equivalent to that of some centralized approximate GP regression methods: The computation of their centralized counterparts can be distributed among parallel machines, hence achieving greater time efficiency and scalability. We analytically compare the properties of our parallel GPs such as time, space, and communication complexity. Empirical evaluation on two real-world datasets in a cluster of 20 computing nodes shows that our parallel GPs are significantly more time-efficient and scalable than their centralized counterparts and exact/full GP while achieving predictive performances comparable to full GP.
- Gaussian Process-Based Decentralized Data Fusion and Active Sensing for Mobility-on-Demand System.
Jie Chen, Kian Hsiang Low & Colin Keng-Yan Tan.
In Proceedings of the
Robotics: Science and Systems Conference (RSS-13), Berlin, Germany, Jun 24-28, 2013.
30.1% acceptance rate
- Multi-Robot Informative Path Planning for Active Sensing of Environmental Phenomena: A Tale of Two Algorithms.
Nannan Cao, Kian Hsiang Low & John M. Dolan.
In Proceedings of the
12th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-13), pages 7-14, Saint Paul, MN, May 6-10, 2013.
22.9% acceptance rate
Abstract. A key problem of robotic environmental sensing and monitoring is that of active sensing: How can a team of robots plan the most informative observation paths to minimize the uncertainty in modeling and predicting an environmental phenomenon? This paper presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of an important broad class of widely-occurring environmental phenomena called anisotropic fields. Our proposed algorithms are novel in addressing a trade-off between active sensing performance and time efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of Gaussian process-based anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance. We analyze the time complexity of our algorithms and prove analytically that they scale better than state-of-the-art algorithms with increasing planning horizon length. We provide theoretical guarantees on the active sensing performance of our algorithms for a class of exploration tasks called transect sampling, which, in particular, can be improved with longer planning time and/or lower spatial correlation along the transect. Empirical evaluation on real-world anisotropic field data shows that our algorithms can perform better or at least as well as the state-of-the-art algorithms while often incurring a few orders of magnitude less computational time, even when the field conditions are less favorable.
- Adaptive Sampling of Time Series with Application to Remote Exploration.
David R. Thompson, Nathalie Cabrol, Michael Furlong, Craig Hardgrove, Kian Hsiang Low, Jeffrey Moersch & David Wettergreen.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'13), pages 3463-3468, Karlsruhe, Germany, May 6-10, 2013.
Abstract. We address the problem of adaptive information-optimal data collection in time series. Here a remote sensor or explorer agent throttles its sampling rate in order to track anomalous events while obeying constraints on time and power. This problem is challenging because the agent has limited visibility - all collected datapoints lie in the past, but its resource allocation decisions require predicting far into the future. Our solution is to continually fit a Gaussian process model to the latest data and optimize the sampling plan on line to maximize information gain. We compare the performance characteristics of stationary and nonstationary Gaussian process models. We also describe an application based on geologic analysis during planetary rover exploration. Here adaptive sampling can improve coverage of localized anomalies and potentially benefit mission science yield of long autonomous traverses.
- Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena.
Jie Chen, Kian Hsiang Low, Colin Keng-Yan Tan, Ali Oran, Patrick Jaillet, John M. Dolan & Gaurav S. Sukhatme.
In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-12), pages 163-173, Catalina Island, CA, Aug 15-17, 2012.
31.6% acceptance rate
Also appeared in AAMAS-12 Workshop on Agents in Traffic and Transportation (ATT-12), Valencia, Spain, June 4-8, 2012.
Abstract. The problem of modeling and predicting spatiotemporal traffic phenomena over an urban road network is important to many traffic applications such as detecting and forecasting congestion hotspots. This paper presents a decentralized data fusion and active sensing (D2FAS) algorithm for mobile sensors to actively explore the road network to gather and assimilate the most informative data for predicting the traffic phenomenon. We analyze the time and communication complexity of D2FAS and demonstrate that it can scale well with a large number of observations and sensors. We provide a theoretical guarantee on its predictive performance to be equivalent to that of a sophisticated centralized sparse approximation for the Gaussian process (GP) model: The computation of such a sparse approximate GP model can thus be parallelized and distributed among the mobile sensors (in a Google-like MapReduce paradigm), thereby achieving efficient and scalable prediction. We also theoretically guarantee its active sensing performance that improves under various practical environmental conditions. Empirical evaluation on real-world urban road network data shows that our D2FAS algorithm is significantly more time-efficient and scalable than state-of-the-art centralized algorithms while achieving comparable predictive performance.
- Hierarchical Bayesian Nonparametric Approach to Modeling and Learning the Wisdom of Crowds of Urban Traffic Route Planning Agents.
Jiangbo Yu, Kian Hsiang Low, Ali Oran & Patrick Jaillet.
In Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'12)
(Invited Paper to Special Session on Large-Scale Application-Focused Multi-Agent Systems), pages 478-485, Macau, Dec 4-7, 2012.
Abstract. Route prediction is important to analyzing and understanding the route patterns and behavior of traffic crowds. Its objective is to predict the most likely or "popular" route of road segments from a given point in a road network. This paper presents a hierarchical Bayesian non-parametric approach to efficient and scalable route prediction that can harness the wisdom of crowds of route planning agents by aggregating their sequential routes of possibly varying lengths and origin-destination pairs. In particular, our approach has the advantages of (a) not requiring a Markov assumption to be imposed and (b) generalizing well with sparse data, thus resulting in significantly improved prediction accuracy, as demonstrated empirically using real-world taxi route data. We also show two practical applications of our route prediction algorithm: predictive taxi ranking and route recommendation.
- Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments.
Prabhu Natarajan, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
6th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC'12), pages 1-6, Hong Kong, Oct 30 - Nov 2, 2012.
Abstract. A central problem of surveillance is to monitor multiple targets moving in a large-scale, obstacle-ridden environment with occlusions. This paper presents a novel principled Partially Observable Markov Decision Process-based approach to coordinating and controlling a network of active cameras for tracking and observing multiple mobile targets at high resolution in such surveillance environments. Our proposed approach is capable of (a) maintaining a belief over the targets' states (i.e., locations, directions, and velocities) to track them, even when they may not be observed directly by the cameras at all times, (b) coordinating the cameras' actions to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed resolution, and (c) exploiting the inherent structure of our surveillance problem to improve its scalability (i.e., linear time) in the number of targets to be observed. Quantitative comparisons with state-of-the-art multi-camera coordination and control techniques show that our approach can achieve higher surveillance quality in real time. The practical feasibility of our approach is also demonstrated using real AXIS 214 PTZ cameras.
- Decentralized Active Robotic Exploration and Mapping for Probabilistic Field Classification in Environmental Sensing.
Kian Hsiang Low, Jie Chen, John M. Dolan, Steve Chien & David R. Thompson.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 105-112, Valencia, Spain, June 4-8, 2012.
20.4% acceptance rate
Also appeared in
IROS'11 Workshop on Robotics for Environmental Monitoring (WREM-11), San Francisco, CA, Sep 30, 2011.
Abstract. A central problem in environmental sensing and monitoring is to classify/label the hotspots in a large-scale environmental field. This paper presents a novel decentralized active robotic exploration (DARE) strategy for probabilistic classification/labeling of hotspots in a Gaussian process (GP)-based field. In contrast to existing state-of-the-art exploration strategies for learning environmental field maps, the time needed to solve the DARE strategy is independent of the map resolution and the number of robots, thus making it practical for in situ, real-time active sampling. Its exploration behavior exhibits an interesting formal trade-off between that of boundary tracking until the hotspot region boundary can be accurately predicted and wide-area coverage to find new boundaries in sparsely sampled areas to be tracked. We provide a theoretical guarantee on the active exploration performance of the DARE strategy: under reasonable conditional independence assumption, we prove that it can optimally achieve two formal cost-minimizing exploration objectives based on the misclassification and entropy criteria. Importantly, this result implies that the uncertainty of labeling the hotspots in a GP-based field is greatest at or close to the hotspot region boundaries. Empirical evaluation on real-world plankton density and temperature field data shows that, subject to limited observations, DARE strategy can achieve more superior classification of hotspots and time efficiency than state-of-the-art active exploration strategies.
- Decision-Theoretic Approach to Maximizing Observation of Multiple Targets in Multi-Camera Surveillance.
Prabhu Natarajan, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 155-162, Valencia, Spain, June 4-8, 2012.
20.4% acceptance rate
Abstract. This paper presents a novel decision-theoretic approach to control and coordinate multiple active cameras for observing a number of moving targets in a surveillance system. This approach offers the advantages of being able to (a) account for the stochasticity of targets' motion via probabilistic modeling, and (b) address the trade-off between maximizing the expected number of observed targets and the resolution of the observed targets through stochastic optimization. One of the key issues faced by existing approaches in multi-camera surveillance is that of scalability with increasing number of targets. We show how its scalability can be improved by exploiting the problem structure: as proven analytically, our decision-theoretic approach incurs time that is linear in the number of targets to be observed during surveillance. As demonstrated empirically through simulations, our proposed approach can achieve high-quality surveillance of up to 50 targets in real time and its surveillance performance degrades gracefully with increasing number of targets. We also demonstrate our proposed approach with real AXIS 214 PTZ cameras in maximizing the number of Lego robots observed at high resolution over a surveyed rectangular area. The results are promising and clearly show the feasibility of our decision-theoretic approach in controlling and coordinating the active cameras in real surveillance system.
- Intention-Aware Planning under Uncertainty for Interacting with Self-Interested, Boundedly Rational Agents.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 1233-1234, Valencia, Spain, June 4-8, 2012.
Abstract. A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to perform effectively among boundedly rational, self-interested (i.e., non-cooperative) agents (e.g., humans). To address this challenge, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions.
- Active Markov Information-Theoretic Path Planning for Robotic Environmental Sensing.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the
10th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-11), pages 753-760, Taipei, Taiwan, May 2-6, 2011.
22.1% acceptance rate
Abstract. Recent research in multi-robot exploration and mapping has focused on sampling environmental fields, which are typically modeled using the Gaussian process (GP). Existing information-theoretic exploration strategies for learning GP-based environmental field maps adopt the non-Markovian problem structure and consequently scale poorly with the length of history of observations. Hence, it becomes computationally impractical to use these strategies for in situ, real-time active sampling. To ease this computational burden, this paper presents a Markov-based approach to efficient information-theoretic path planning for active sampling of GP-based fields. We analyze the time complexity of solving the Markov-based path planning problem, and demonstrate analytically that it scales better than that of deriving the non-Markovian strategies with increasing length of planning horizon. For a class of exploration tasks called the transect sampling task, we provide theoretical guarantees on the active sampling performance of our Markov-based policy, from which ideal environmental field conditions and sampling task settings can be established to limit its performance degradation due to violation of the Markov assumption. Empirical evaluation on real-world temperature and plankton density field data shows that our Markov-based policy can generally achieve active sampling performance comparable to that of the widely-used non-Markovian greedy policies under less favorable realistic field conditions and task settings while enjoying significant computational gain over them.
- Autonomous Personal Vehicle for the First- and Last-Mile Transportation Services.
Zhuang Jie Chong, Baoxing Qin, Tirthankar Bandyopadhyay, Tichakorn Wongpiromsarn, Edward Samuel Rankin, Marcelo H. Ang, Jr., Emilio Frazzoli, Daniela Rus, David Hsu & Kian Hsiang Low.
In Proceedings of the
5th IEEE International Conference on Cybernetics and
Intelligent Systems and 5th IEEE International
Conference on Robotics, Automation and Mechatronics (CIS-RAM'11), pages 253-260, Qingdao, China, Sep 17-19, 2011.
Also appeared in IROS'11 Workshop on Perception and Navigation for Autonomous Vehicles in Human Environment, San Francisco, CA, Sep 30, 2011.
Abstract. This paper describes an autonomous vehicle testbed that aims at providing the first- and last- mile transportation services. The vehicle mainly operates in a crowded urban environment whose features can be extracted a priori. To ensure that the system is economically feasible, we take a minimalistic approach and exploit prior knowledge of the environment and the availability of the existing infrastructure such as cellular networks and traffic cameras. We present three main components of the system: pedestrian detection, localization (even in the presence of tall buildings) and navigation. The performance of each component is evaluated. Finally, we describe the role of the existing infrastructural sensors and show the improved performance of the system when they are utilized.
- Telesupervised Remote Surface
Water Quality Sensing.
Gregg Podnar, John M. Dolan, Kian Hsiang Low & Alberto Elfes.
In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, Mar 6-13, 2010.
Abstract. We present a fleet of autonomous Robot Sensor Boats (RSBs) developed for lake and river fresh water quality assessment and controlled by our Multilevel Autonomy Robot Telesupervision Architecture (MARTA). The RSBs are low cost, highly maneuverable, shallow draft sensor boats, developed as part of the Sensor Web program supported under the Advanced Information Systems Technology program of NASA's Earth Systems Technology Office. They can scan large areas of lakes, and navigate up tributaries to measure water quality near outfalls that larger research vessels cannot reach. The MARTA telesupervision architecture has been applied to a number of domains from multi-platform autonomous wide area planetary mineral prospecting, to multi-platform ocean monitoring. The RSBs are a complementary expansion of a fleet of NOAA/NASA-developed extended-deployment surface autonomous vehicles that enable in-situ study of meteorological factors of the ocean/atmosphere interface, and which have been adapted to investigate harmful algal blooms under this program. The flexibility of the MARTA telesupervision architecture was proven as it supported simultaneous operation of these heterogenous autonomous sensor platforms while geographically widely separated. Results and analysis are presented of multiple tests carried out over three months using a multi-sensor water sonde to assess water quality in a small recreational lake. Inference Grids were used to produce maps representing temperature, pH, and dissolved oxygen. The tests were performed under various water conditions (clear vs. hair algae-laden) and both before and after heavy rains. Data from each RSB was relayed to a data server in our lab in Pittsburgh, Pennsylvania, and made available over the World Wide Web where it was acquired by team members at the Jet Propulsion Laboratory of NASA in Pasadena, California who monitored the boats and their sensor readings in real time, as well as using these data to model the water quality by producing Inference Grid-based maps.
- Information-Theoretic Approach to Efficient Adaptive Path Planning for Mobile Robotic Environmental Sensing.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS-09), pages 233-240, Thessaloniki, Greece, Sep 19-23, 2009.
33.9% acceptance rate
Also appeared in IPSN-09 Workshop on Sensor Networks for Earth and Space Science Applications (ESSA-09), San Francisco, CA, Apr 16, 2009.
Also orally presented in RSS-09 Workshop on Aquatic Robots and Ocean Sampling, Seattle, WA, Jun 29, 2009.
Abstract. Recent research in robot exploration and mapping has focused on sampling environmental hotspot fields. This exploration task is formalized by Low, Dolan, and Khosla (2008) in a sequential decision-theoretic planning under uncertainty framework called MASP. The time complexity of solving MASP approximately depends on the map resolution, which limits its use in large-scale, high-resolution exploration and mapping. To alleviate this computational difficulty, this paper presents an information-theoretic approach to MASP (iMASP) for efficient adaptive path planning; by reformulating the cost-minimizing iMASP as a reward-maximizing problem, its time complexity becomes independent of map resolution and is less sensitive to increasing robot team size as demonstrated both theoretically and empirically. Using the reward-maximizing dual, we derive a novel adaptive variant of maximum entropy sampling, thus improving the induced exploration policy performance. It also allows us to establish theoretical bounds quantifying the performance advantage of optimal adaptive over non-adaptive policies and the performance quality of approximately optimal vs. optimal adaptive policies. We show analytically and empirically the superior performance of iMASP-based policies for sampling the log-Gaussian process to that of policies for the widely-used Gaussian process in mapping the hotspot field. Lastly, we provide sufficient conditions that, when met, guarantee adaptivity has no benefit under an assumed environment model.
- Cooperative Aquatic Sensing using the Telesupervised Adaptive Ocean Sensor Fleet.
John M. Dolan, Gregg W. Podnar, Stephen Stancliff, Kian Hsiang Low, Alberto Elfes, John Higinbotham, Jeffrey C. Hosler, Tiffany A. Moisan & John Moisan.
In Proceedings of the SPIE Conference on Remote Sensing of the Ocean, Sea Ice, and Large Water Regions, volume 7473, Berlin, Germany, Aug 31 - Sep 3, 2009.
Abstract. Earth science research must bridge the gap between the atmosphere and the ocean to foster understanding of Earth's climate and ecology. Typical ocean sensing is done with satellites or in situ buoys and research ships which are slow to reposition. Cloud cover inhibits study of localized transient phenomena such as Harmful Algal Blooms (HAB). A fleet of extended-deployment surface autonomous vehicles will enable in situ study of characteristics of HAB, coastal pollutants, and related phenomena. We have developed a multiplatform telesupervision architecture that supports adaptive reconfiguration based on environmental sensor inputs. Our system allows the autonomous repositioning of smart sensors for HAB study by networking a fleet of NOAA OASIS (Ocean Atmosphere Sensor Integration System) surface autonomous vehicles. In situ measurements intelligently modify the search for areas of high concentration. Inference Grid and complementary information-theoretic techniques support sensor fusion and analysis. Telesupervision supports sliding autonomy from high-level mission tasking, through vehicle and data monitoring, to teleoperation when direct human interaction is appropriate. This paper reports on experimental results from multi-platform tests conducted in the Chesapeake Bay and in Pittsburgh, Pennsylvania waters using OASIS platforms, autonomous kayaks, and multiple simulated platforms to conduct cooperative sensing of chlorophyll-a and water quality.
- Robot Boats as a Mobile Aquatic Sensor Network.
Kian Hsiang Low, Gregg Podnar, Stephen Stancliff, John M. Dolan & Alberto Elfes.
In Proceedings of the IPSN-09 Workshop on Sensor Networks for Earth and Space Science Applications (ESSA-09), San Francisco, CA, Apr 16, 2009.
Abstract. This paper describes the Multilevel Autonomy Robot Telesupervision Architecture (MARTA), an architecture for supervisory control of a heterogeneous fleet of net-worked unmanned autonomous aquatic surface vessels carrying a payload of environmental science sensors. This architecture allows a land-based human scientist to effectively supervise data gathering by multiple robotic assets that implement a web of widely dispersed mobile sensors for in situ study of physical, chemical or bio-logical processes in water or in the water/atmosphere interface.
- Adaptive Multi-Robot Wide-Area Exploration And Mapping.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the
7th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-08), pages 23-30, Estoril, Portugal, May 12-16, 2008.
22.2% acceptance rate
Also presented as a poster in RSS-09 Workshop on Aquatic Robots and Ocean Sampling, Seattle, WA, Jun 29, 2009.
Abstract. The exploration problem is a central issue in mobile robotics. A complete terrain coverage is not practical if the environment is large with only a few small hotspots. This paper presents an adaptive multi-robot exploration strategy that is novel in performing both wide-area coverage and hotspot sampling using non-myopic path planning. As a result, the environmental phenomena can be accurately mapped. It is based on a dynamic programming formulation, which we call the Multi-robot Adaptive Sampling Problem (MASP). A key feature of MASP is in covering the entire adaptivity spectrum, thus allowing strategies of varying adaptivity to be formed and theoretically analyzed in their performance; a more adaptive strategy improves mapping accuracy. We apply MASP to sampling the Gaussian and log-Gaussian processes, and analyze if the resulting strategies are adaptive and maximize wide-area coverage and hotspot sampling. Solving MASP is non-trivial as it comprises continuous state components. So, it is reformulated for convex analysis, which allows discrete-state monotone-bounding approximation to be developed. We provide a theoretical guarantee on the policy quality of the approximate MASP (aMASP) for using in MASP. Although aMASP can be solved exactly, its state size grows exponentially with the number of stages. To alleviate this computational difficulty, anytime algorithms are proposed based on aMASP, one of which can guarantee its policy quality for MASP in real time.
- Adaptive Sampling for Multi-Robot Wide-Area Exploration.
Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan & Pradeep Khosla.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'07), pages 755-760, Rome, Italy, Apr 10-14, 2007.
Abstract. The exploration problem is a central issue in mobile robotics. A complete coverage is not practical if the environment is large with a few small hotspots, and the sampling cost is high. So, it is desirable to build robot teams that can coordinate to maximize sampling at these hotspots while minimizing resource costs, and consequently learn more accurately about properties of such environmental phenomena. An important issue in designing such teams is the exploration strategy. The contribution of this paper is in the evaluation of an adaptive exploration strategy called adaptive cluster sampling (ACS), which is demonstrated to reduce the resource costs (i.e., mission time and energy consumption) of a robot team, and yield more information about the environment by directing robot exploration towards hotspots. Due to the adaptive nature of the strategy, it is not obvious how the sampled data can be used to provide unbiased, low-variance estimates of the properties. This paper therefore discusses how estimators that are Rao-Blackwellized can be used to achieve low error. This paper also presents the first analysis of the characteristics of the environmental phenomena that favor the ACS strategy and estimators. Quantitative experimental results in a mineral prospecting task simulation show that our approach is more efficient in exploration by yielding more minerals and information with fewer resources and providing more precise mineral density estimates than previous methods.
- Autonomic Mobile Sensor Network with Self-Coordinated Task Allocation and Execution.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews
(Special Issue on Engineering Autonomic Systems), volume 36, issue 3, pages 315-327, May 2006.
Andrew P. Sage Best Transactions Paper Award for the best paper published in IEEE Trans. SMC - Part A, B, and C in 2006
Abstract. This paper describes a distributed layered architecture for resource-constrained multirobot cooperation, which is utilized in autonomic mobile sensor network coverage. In the upper layer, a dynamic task allocation scheme self-organizes the robot coalitions to track efficiently across regions. It uses concepts of ant behavior to self-regulate the regional distributions of robots in proportion to that of the moving targets to be tracked in a nonstationary environment. As a result, the adverse effects of task interference between robots are minimized and network coverage is improved. In the lower task execution layer, the robots use self-organizing neural networks to coordinate their target tracking within a region. Both layers employ self-organization techniques, which exhibit autonomic properties such as self-configuring, self-optimizing, self-healing, and self-protecting. Quantitative comparisons with other tracking strategies such as static sensor placements, potential fields, and auction-based negotiation show that our layered approach can provide better coverage, greater robustness to sensor failures, and greater flexibility to respond to environmental changes.
- An Ensemble of Cooperative Extended Kohonen Maps for Complex Robot Motion Tasks.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Neural Computation, volume 17, issue 6, pages 1411-1445, Jun 2005.
Abstract. Self-organizing feature maps such as extended Kohonen maps (EKMs) have been very successful at learning sensorimotor control for mobile robot tasks. This letter presents a new ensemble approach, cooperative EKMs with indirect mapping, to achieve complex robot motion. An indirect-mapping EKM self-organizes to map from the sensory input space to the motor control space indirectly via a control parameter space. Quantitative evaluation reveals that indirect mapping can provide finer, smoother, and more efficient motion control than does direct mapping by operating in a continuous, rather than discrete, motor control space. It is also shown to outperform basis function neural networks. Furthermore, training its control parameters with recursive least squares enables faster convergence and better performance compared to gradient descent. The cooperation and competition of multiple self-organized EKMs allow a nonholonomic mobile robot to negotiate unforeseen, concave, closely spaced, and dynamic obstacles. Qualitative and quantitative comparisons with neural network ensembles employing weighted sum reveal that our method can achieve more sophisticated motion tasks even though the weighted-sum ensemble approach also operates in continuous motor control space.
- Task Allocation via Self-Organizing Swarm Coalitions in Distributed Mobile Sensor Network.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), pages 28-33, San Jose, CA, Jul 25-29, 2004.
26.7% acceptance rate
Abstract. This paper presents a task allocation scheme via self-organizing swarm coalitions for distributed mobile sensor network coverage. Our approach uses the concepts of ant behavior to self-regulate the regional distributions of sensors in proportion to that of the moving targets to be tracked in a non-stationary environment. As a result, the adverse effects of task interference between robots are minimized and sensor network coverage is improved. Quantitative comparisons with other tracking strategies such as static sensor placement, potential fields, and auction-based negotiation show that our approach can provide better coverage and greater flexibility to respond to environmental changes.
- Reactive, Distributed Layered Architecture for Resource-Bounded Multi-Robot Cooperation: Application to Mobile Sensor Network Coverage.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'04), pages 3747-3752, New Orleans, LA, Apr 26 - May 1, 2004.
Abstract. This paper describes a reactive, distributed layered architecture for cooperation of multiple resource-bounded robots, which is utilized in mobile sensor network coverage. In the upper layer, a dynamic task allocation scheme self-organizes the robot coalitions to track efficiently in separate regions. It uses the concepts of ant behavior to self-regulate the regional distributions of robots in proportion to that of the targets to be tracked in the changing environment. As a result, the adverse effects of task interference between robots are minimized and sensor network coverage is improved. In the lower layer, the robots use self-organizing neural networks to coordinate their target tracking within a region. Quantitative comparisons with other tracking strategies such as static sensor placements, potential fields, and auction-based negotiation show that our approach can provide better coverage and greater flexibility in responding to environmental changes.
- Continuous-Spaced Action Selection for Single- and Multi-Robot Tasks Using Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the IEEE International Conference on Networking, Sensing and Control (ICNSC'04)
(Invited Paper to Special Session on Visual Surveillance), pages 198-203, Taipei, Taiwan, Mar 21-23, 2004.
Abstract. Action selection is a central issue in the design of behavior-based control architectures for autonomous mobile robots. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps so that more complex motion tasks can be achieved. Qualitative and quantitative comparisons for both single- and multi-robot motion tasks show that our framework can provide better action selection than do action superposition methods.
- Action Selection for Single- and Multi-Robot Tasks Using Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pages 1505-1506, Acapulco, Mexico, Aug 9-15, 2003.
27.6% acceptance rate
Abstract. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks. Qualitative and quantitative comparisons for single- and multi-robot tasks show our framework can provide better action selection than do potential fields method.
- Action Selection in Continuous State and Action Spaces by Cooperation and Competition of Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
2nd International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS-03), pages 1056-1057, Melbourne, Australia, Jul 14-18, 2003.
Abstract. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks. Qualitative tests demonstrate the capability of our action selection method for both single- and multi-robot motion tasks.
- Enhancing the Reactive Capabilities of Integrated Planning and Control with Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA'03), pages 3428-3433, Taipei, Taiwan, May 12-17, 2003.
Abstract. Despite the many significant advances made in robot motion research, few works have focused on the tight integration of high-level deliberative planning with reactive control at the lowest level. In particular, the real-time performance of existing integrated planning and control architectures is still not optimal because the reactive control capabilities have not been fully realized. This paper aims to enhance the low-level reactive capabilities of integrated planning and control with Cooperative Extended Kohonen Maps for handling complex, unpredictable environments so that the work-load of the high-level planner can be consequently eased. The enhancements include fine, smooth motion control, execution of more complex motion tasks such as overcoming unforeseen concave obstacles and traversing between closely spaced obstacles, and asynchronous execution of behaviors.
- A Hybrid Mobile Robot Architecture with Integrated Planning and Control.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
1st International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS-02), pages 219-226, Bologna, Italy, Jul 15-19, 2002.
26% acceptance rate
Abstract. Research in the planning and control of mobile robots has received much attention in the past two decades. Two basic approaches have emerged from these research efforts: deliberative vs. reactive. These two approaches can be distinguished by their different usage of sensed data and global knowledge, speed of response, reasoning capability, and complexity of computation. Their strengths are complementary and their weaknesses can be mitigated by combining the two approaches in a hybrid architecture. This paper describes a method for goal-directed, collision-free navigation in unpredictable environments that employs a behavior-based hybrid architecture with asynchronously operating behavioral modules. It differs from existing hybrid architectures in two important ways: (1) the planning module produces a sequence of checkpoints instead of a conventional complete path, and (2) in addition to obstacle avoidance, the reactive module also performs target reaching under the control of a self-organizing neural network. The neural network is trained to perform fine, smooth motor control that moves the robot through the checkpoints. These two aspects facilitate a tight integration between high-level planning and low-level control, which permits real-time performance and easy path modification even when the robot is en route to the goal position.
- Integrated Planning and Control of Mobile Robot with Self-Organizing Neural Network.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA'02), pages 3870-3875, Washington, DC, May 11-15, 2002.
Abstract. Despite the many significant advances made in robotics research, few works have focused on the tight integration of task planning and motion control. Most integration works involve the task planner providing discrete commands to the low-level controller, which performs kinematics and control computations to command the motor and joint actuators. This paper presents a framework of the integrated planning and control for mobile robot navigation. Unlike existing integrated approaches, it produces a sequence of checkpoints instead of a complete path at the planning level. At the motion control level, a neural network is trained to perform motor control that moves the robot from one checkpoint to the next. This method allows for a tight integration between high-level planning and low-level control, which permits real-time performance and easy modification of motion path while the robot is enroute to the goal position.
|
TECHNICAL REPORTS
- New Advances on Bayesian and Decision-Theoretic Approaches for Interactive Machine Learning.
Trong Nghia Hoang.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Feb 2015.
Abstract.
The exploration-exploitation trade-off is a fundamental dilemma in many interactive learning scenarios which include both aspects of reinforcement learning (RL) and active learning (AL): An autonomous agent, situated in an unknown environment, has to actively extract knowledge from the environment by taking actions (or conducting experiments) based on its previously collected information to make accurate predictions or to optimize some utility functions. Thus, to make the most effective use of their resource-constrained budget (e.g., processing time, experimentation cost), the agent must choose carefully between (a) exploiting options (e.g., actions, experiments) which are recommended by its current, possibly incomplete model of the environment, and (b) exploring the other ostensibly sub-optimal choices to gather more information.
For example, an RL agent has to face a dilemma between (a) exploiting the most-rewarding action according to the current statistical model of the environment at the risk of running into catastrophic situations if the model is not accurate, and (b) exploring a sub-optimal action to gather more information so as to improve the models accuracy at the potential price of losing the short-term reward. Similarly, an AL algorithm/agent has to consider between (a) conducting the most informative experiments according to its current estimation of the environment models parameters (i.e., exploitation), and (b) running experiments that help improving the estimation accuracy of these parameters (i.e., exploration).
More often, learning strategies that ignore exploration will likely exhibit sub-optimal performance due to their imperfect knowledge while, conversely, those that entirely focus on exploration might suffer the cost of learning without benefitting from it. Therefore, a good exploration-exploitation trade-off is critical to the success of those interactive learning agents: In order to perform well, they must strike the right balance between these two conflicting objectives. Unfortunately, while this trade-off has been well-recognized since the early days of RL, the studies of exploration-exploitation have mostly been developed for theoretical settings in the respective field of RL and, perhaps surprisingly, glossed over in the existing AL literature. From a practical point of view, we see three limiting factors:
- Previous works addressing the exploration-exploitation trade-off in RL have largely focused on simple choices of the environment model and consequently, are not practical enough to accommodate real-world applications that have far more complicated environment structures. In fact, we find that most recent advances in Bayesian reinforcement learning (BRL) have only been able to analytically trade off between exploration and exploitation under a simple choice of models such as Flat-Dirichlet-Multinomial (FDM) whose independence and modeling assumptions do not hold for many real-world applications.
- Nearly all of the notable works in the AL literature primarily advocate the use of greedy/myopic algorithms whose rates of convergence (i.e., the number of experiments required by the learning algorithm to achieve a desired performance in the worst case) are provably minimax optimal for simple classes of learning tasks (e.g., threshold learning). While these results have greatly ad- vanced our understanding about the limit of myopic AL in worst-case scenarios, significantly less is presently known about whether it is possible to devise nonmyopic AL strategies which optimize the exploration-exploitation trade-off to achieve the best expected performance in budgeted learning scenarios.
- The issue of scalability of the existing predictive models (e.g., Gaussian processes) used in AL has generally been underrated since the majority of literature considers small-scale environments which only consist of a few thousand candidate experiments to be selected by single-mode AL algorithms one at a time prior to retraining the model. In contrast, large-scale environments usually have a massive set of million candidate experiments among which tens or hundreds of thousands should be actively selected for learning. For such data-intensive problems, it is often more cost-effective to consider batch-mode AL algorithms which select and conduct multiple experiments in parallel at each stage to collect observations in batch. Retraining the predictive model after incorporating each batch of observations then becomes a computational bottleneck as the collected dataset at each stage quickly grows up to tens or even hundreds of thousand data points.
This thesis outlines some recent progresses that we have been able to make while working toward satisfactory answers to the above challenges, along with practical algorithms that achieve them:
- In particular, in order to put BRL into practice for more complicated and practical problems, we propose a novel framework called Interactive Bayesian Reinforcement Learning (I-BRL) to integrate the general class of parametric models and model priors, thus allowing the practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the environment as often required in many real-world applications. Interestingly, we show how the nonmyopic Bayes-optimal policy can be derived analytically by solving I-BRL exactly and propose an approximation algorithm to compute it efficiently in polynomial time. Our empirical studies show that the proposed approach performs competitively with the existing state-of-the-art algorithms.
- Then, to establish a theoretical foundation for the exploration-exploitation trade-off in single-mode active learning scenarios with resource-constrained budgets, we present a novel ϵ-Bayes-optimal Decision-Theoretic Active Learning (ϵ-BAL) framework which advocates the use of differential entropy as a performance measure and consequently, derives a learning policy that can approximate the optimal expected performance arbitrarily closely (i.e., within an arbitrary loss bound ϵ). To meet the real-time requirement in time-critical applications, we then propose an asymptotically ϵ-optimal, branch-and-bound anytime algorithm based on ϵ-BAL with performance guarantees. In practice, we empirically demonstrate with both synthetic and real-world datasets that the proposed approach outperforms the state-of-the-art algorithms in budgeted scenarios.
- Lastly, to facilitate the future developments of large-scale, nonmyopic AL applications, we further introduce a highly scalable family of anytime predictive models for AL which provably converge toward a well-known class of sparse Gaussian processes (SGPs). Unlike the existing predictive models of AL which cannot be updated incrementally and are only capable of processing middle-sized datasets (i.e., a few thousands of data points), our proposed models can process massive datasets in an anytime fashion, thus providing a principled trade-off between the processing time and the predictive accuracy. The efficiency of our framework is then demonstrated empirically on a variety of large-scale real-world datasets which contains hundreds of thousand data points.
- Gaussian Process-Based Decentralized Data Fusion and Active Sensing Agents: Towards Large-Scale Modeling and Prediction of Spatiotemporal Traffic Phenomena.
Jie Chen.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Dec 2013.
Abstract.
Knowing and understanding the environmental phenomena is important to many real world applications. This thesis is devoted to study large-scale modeling and prediction of spatiotemporal environmental phenomena (i.e., urban traffic phenomena). Towards this goal, our proposed approaches rely on a class of Bayesian non-parametric models: Gaussian processes (GP).
To accurately model spatiotemporal urban traffic phenomena in real world situation, a novel relational GP taking into account both the road segment features and road network topology information is proposed to model real world traffic conditions over road network. Additionally, a GP variant called log-Gaussian process (lGP) is exploited to model an urban mobility demand pattern which contains skewness and extremity in demand measurements.
To achieve efficient and scalable urban traffic phenomenon prediction given a large phenomenon data, we propose three novel parallel GPs: parallel partially independent training conditional (pPITC), parallel partially independent conditional(pPIC) and parallel incomplete Cholesky factorization (pICF)-based approximations of GP model, which can distribute their computational load into a cluster of parallel/multi-core machines, thereby achieving time efficiency. The predictive performances of such parallel GPs are theoretically guaranteed to be equivalent to that of some centralized approaches to approximate full/exact GP regression. The proposed parallel GPs are implemented using the message passing interface (MPI) framework and tested on two large real world datasets. The theoretical and empirical results show that our parallel GPs achieve significantly better time efficiency and scalability than that of full GP, while achieving comparable accuracy. They also achieve fine speedup performance that is the ratio of time required by the parallel algorithms and their centralized counterparts.
To exploit active mobile sensors to perform decentralized perception of the spatiotemporal urban traffic phenomenon, we propose a decentralized algorithm framework: Gaussian process-based decentralized data fusion and active sensing (D2FAS) which is composed of a decentralized data fusion (DDF) component and a decentralized active sensing (DAS) component. The DDF component includes a novel Gaussian process-based decentralized data fusion (GP-DDF) algorithm that can achieve remarkably efficient and scalable prediction of phenomenon and a novel Gaussian process-based decentralized data fusion with local augmentation (GP-DDF+) algorithm that can achieve better predictive accuracy while preserving time efficiency of GP-DDF. The predictive performances of both GP-DDF and GP-DDF+ are theoretically guaranteed to be equivalent to that of some sophisticated centralized sparse approximations of exact/full GP. For the DAS component, we propose a novel partially decentralized active sensing (PDAS) algorithm that exploits property in correlation structure of GP-DDF to enable mobile sensors cooperatively gathering traffic phenomenon data along a near-optimal joint walk with theoretical guarantee, and a fully decentralized active sensing (FDAS) algorithm that guides each mobile sensor gather phenomenon data along its locally optimal walk.
Lastly, to justify the practicality of the D2FAS framework, we develop and test D2FAS algorithms running with active mobile sensors on real world datasets for monitoring traffic conditions and sensing/servicing urban mobility demands. Theoretical and empirical results show that the proposed algorithms are significantly more time-efficient, more scalable in the size of data and in the number of sensors than the state-of-the-art centralized approaches, while achieving comparable predictive accuracy.
- A Decision-Theoretic Approach for Controlling and Coordinating Multiple Active Cameras in Surveillance.
Prabhu Natarajan.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Dec 2013.
Abstract.
The use of active cameras in surveillance is becoming increasingly popular as they try to meet the demands of capturing high-resolution images/videos of targets in surveillance for face recognition, target identification, forensic video analysis, etc. These active cameras are endowed with pan, tilt, and zoom capabilities, which can be exploited to provide high-quality surveillance. In order to achieve effective, real-time surveillance, an efficient collaborative mechanism is needed to control and coordinate these cameras' actions, which is the focus of this thesis. The central problem in surveillance is to monitor a set of targets with guaranteed image resolution. Controlling and coordinating multiple active cameras to achieve this surveillance task is non-trivial and challenging because: (a) presence of inherent uncertainties in the surveillance environment (targets motion, location, and noisy camera observation); (b) there exists a non-trivial trade-off between number of targets and the resolution of observing these targets; and (c) more importantly, the coordination framework should be scalable with increasing number of targets and cameras.
In this thesis, we formulate a novel decision-theoretic multi-agent planning approach for controlling and coordinating multiple active cameras in surveillance. Our decision-theoretic approach offers advantages of (a) accounting the uncertainties using probabilistic models; (b) the non-trivial trade-off is addressed by coordinating the active cameras' actions to maximize the number of targets with guaranteed resolution; and (c) the scalability in number of targets and cameras is achieved by exploiting the structures and properties that are present in our surveillance problem. We focus on two novel problems in active camera surveillance: (a) maximizing observations of multiple targets (MOMT), i.e., maximizing the number of targets observed in active cameras with guaranteed image resolution; and (b) improving fairness in observation of multiple targets (FOMT), i.e., no target is "starved" of observation by active cameras for long duration of time.
We propose two formal decision-theoretic frameworks (a) Markov Decision Process (MDP) and (b) Partially Observable Markov Decision Process (POMDP) frameworks for coordinating active cameras in surveillance. MDP framework controls active cameras in fully observable surveillance environments where the active cameras are supported by one or more wide-view static/fixed cameras to observe the entire surveillance environment at low-resolution. POMDP framework controls active cameras in partially observable surveillance environments where it is impractical to observe the entire surveillance environment using static/fixed cameras due to occlusions caused by physical infrastructures. Hence the POMDP framework do not have a complete view of the surveillance environment.
Specifically, we propose (a) MDP frameworks to solve MOMT problem and FOMT problem in fully observable surveillance environment; and (b) POMDP framework to solve MOMT problem in partially observable surveillance environment. As proven analytically, our MDP and POMDP frameworks incurs time that is linear in number of targets to be observed during surveillance. We have used max-plus algorithm with our MDP framework to improve its scalability in number of cameras for MOMT problem. Empirical evaluation through simulations in realistic surveillance environment reveals that our proposed approach can achieve high-quality surveillance in real time. We also demonstrate our pro- posed approach with real Axis 214 PTZ cameras to show the practicality of our approach in real world surveillance. Both the simulations and real camera experiments show that our decision-theoretic approach can control and coordinate active cameras efficiently and hence contributes significantly towards improving the active camera surveillance research.
- Information-Theoretic Multi-Robot Path Planning.
Nannan Cao.
M.Sc. Thesis, Department of Computer Science, National University of Singapore, Sep 2012.
Abstract.
Research in environmental sensing and monitoring is especially important in supporting environmental sustainability efforts worldwide, and has recently attracted significant attention and interest. A key direction of this research lies in modeling and predicting the spatiotemporally varying environmental phenomena. One approach is to use a team of robots to sample the area and model the measurement values at unobserved points. For smoothly varying and hotspot fields, there is some work which has been done to model the fields well. However, there is still a class of common environmental fields called anisotropic fields in which the spatial phenomena are highly correlated along one direction and less correlated along the perpendicular direction. We exploit the environmental structure to improve the sampling performance and time efficiency of planning for anisotropic fields.
In this thesis, we cast the planning problem into a stagewise decision-theoretic problem. we adopt Gaussian Process to model spatial phenomena. Maximum entropy criterion and maximum mutual information criterion are used to measure the informativeness of the observation paths. It is found that for many GPs, correlation of two points exponentially decreases with the distance between the two points. With this property, for maximum entropy criterion, we propose a polynomial-time approximation algorithm, MEPP, to find the maximum entropy paths. We also provide a theoretical performance guarantee for this algorithm. For maximum mutual information criterion, we propose another polynomial-time approximation algorithm, M2IPP. Similar to the MEPP, a performance guarantee is also provided for this algorithm. We demonstrate the performance advantages of our algorithms on two real data sets. To get lower prediction error, three priciples have also been proposed to select the criterion for different environmental fields.
- Multi-Robot Adaptive Exploration and Mapping for Environmental Sensing Applications.
Kian Hsiang Low.
Ph.D. Thesis, Technical Report CMU-ECE-2009-024, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, Aug 2009.
Abstract.
Recent research in robot exploration and mapping has focused on sampling hotspot fields, which often arise in environmental and ecological sensing applications. Such a hotspot field is characterized by continuous, positively skewed, spatially correlated measurements with the hotspots exhibiting extreme measurements and much higher spatial variability than the rest of the field.
To map a hotspot field of the above characterization, we assume that it is realized from non-parametric probabilistic models such as the Gaussian and log-Gaussian processes (respectively, GP and lGP), which can provide formal measures of map uncertainty. To learn a hotspot field map, the exploration strategy of a robot team then has to plan resource-constrained observation paths that minimize the uncertainty of a spatial model of the hotspot field. This exploration problem is formalized in a sequential decision-theoretic planning under uncertainty framework called the multi-robot adaptive sampling problem (MASP). So, MASP can be viewed as a sequential, non-myopic version of active learning. In contrast to finite-state Markov decision problems, MASP adopts a more complex but realistic continuous-state, non-Markovian problem structure so that its induced exploration policy can be informed by the complete history of continuous, spatially correlated observations for selecting paths. It is unique in unifying formulations of non-myopic exploration problems along the entire adaptivity spectrum, thus subsuming existing non-adaptive formulations and allowing the performance advantage of a more adaptive policy to be theoretically realized. Through MASP, it is demonstrated that a more adaptive strategy can exploit clustering phenomena in a hotspot field to produce lower expected map uncertainty. By measuring map uncertainty using the mean-squared error criterion, a MASP-based exploration strategy consequently plans adaptive observation paths that minimize the expected posterior map error or equivalently, maximize the expected map error reduction.
The time complexity of solving MASP (approximately) depends on the map resolution, which limits its practical use in large-scale, high-resolution exploration and mapping. This computational difficulty is alleviated through an information-theoretic approach to MASP (iMASP), which measures map uncertainty based on the entropy criterion instead. As a result, an iMASP-based exploration strategy plans adaptive observation paths that minimize the expected posterior map entropy or equivalently, maximize the expected entropy of observation paths. Unlike MASP, reformulating the cost-minimizing iMASP as a reward-maximizing dual problem causes its time complexity of being solved approximately to be independent of the map resolution and less sensitive to larger robot team size as demonstrated both analytically and empirically. Furthermore, this reward-maximizing dual transforms the widely-used non-adaptive maximum entropy sampling problem into a novel adaptive variant, thus improving the performance of the induced exploration policy.
One advantage stemming from the reward-maximizing dual formulations of MASP and iMASP is that they allow observation selection properties of the induced exploration policies to be realized for sampling the hotspot field. These properties include adaptivity, hotspot sampling, and wide-area coverage. We show that existing GP-based exploration strategies may not explore and map the hotspot field well with the selected observations because they are non-adaptive and perform only wide-area coverage. In contrast, the lGP-based exploration policies can learn a high-quality hotspot field map because they are adaptive and perform both wide-area coverage and hotspot sampling.
The other advantage is that even though MASP and iMASP are non-trivial to solve due to their continuous state components, the convexity of their reward-maximizing duals can be exploited to derive, in a computationally tractable manner, discrete-state monotone-bounding approximations and subsequently, approximately optimal exploration policies with theoretical performance guarantees. Anytime algorithms based on approximate MASP and iMASP are then proposed to alleviate the computational difficulty that arises from their non-Markovian structure.
It is of practical interest to be able to quantitatively characterize the "hotspotness" of an environmental field. We propose a novel "hotspotness" index, which is defined in terms of the spatial correlation properties of the hotspot field. As a result, this index can be related to the intensity, size, and diffuseness of the hotspots in the field.
We also investigate how the spatial correlation properties of the hotspot field affect the performance advantage of adaptivity. In particular, we derive sufficient and necessary conditions of the spatial correlation properties for adaptive exploration to yield no performance advantage.
Lastly, we develop computationally efficient approximately optimal exploration strategies for sampling the GP by assuming the Markov property in iMASP planning. We provide theoretical guarantees on the performance of the Markov-based policies, which improve with decreasing spatial correlation. We evaluate empirically the effects of varying spatial correlations on the mapping performance of the Markov-based policies as well as whether these Markov-based path planners are time-efficient for the transect sampling task.
Through the abovementioned work, this thesis establishes the following two claims: (1) adaptive, non-myopic exploration strategies can exploit clustering phenomena to plan observation paths that produce lower map uncertainty than non-adaptive, greedy methods; and (2) Markov-based exploration strategies can exploit small spatial correlation to plan observation paths which achieve map uncertainty comparable to that of non-Markovian policies using significantly less planning time.
- Adaptive Sampling for Multi-Robot Wide Area Prospecting.
Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep Khosla.
In Technical Report CMU-RI-TR-05-51, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Oct 2005.
Abstract. Prospecting for in situ mineral resources is essential for establishing settlements on the Moon and Mars. To reduce human effort and risk, it is desirable to build robotic systems to perform this prospecting. An important issue in designing such systems is the sampling strategy: how do the robots choose where to prospect next? This paper argues that a strategy called Adaptive Cluster Sampling (ACS) has a number of desirable properties: compared to conventional strategies, (1) it reduces the total mission time and energy consumption of a team of robots, and (2) returns a higher mineral yield and more information about the prospected region by directing exploration towards areas of high mineral density, thus providing detailed maps of the boundaries of such areas. Due to the adaptive nature of the sampling scheme, it is not immediately obvious how the resulting sampled data can be used to provide an unbiased, low-variance estimate of the regional mineral density. This paper therefore investigates new mineral density estimators, which have lower error than previously-developed estimators; they are derived from the older estimators via a process called Rao-Blackwellization. Since the efficiency of estimators depends on the type of mineralogical population sampled, the population characteristics that favor ACS estimators are also analyzed. The ACS scheme and our new estimators are evaluated empirically in a detailed simulation of the prospecting task, and the quantitative results show that our approach can yield more minerals with less resources and provide more accurate mineral density estimates than previous methods.
- Integrated Robot Planning and Control with Extended Kohonen Maps.
Kian Hsiang Low.
M.Sc. Thesis, Department of Computer Science, School of Computing, National University of Singapore, Jul 2002.
Singapore Computer Society Prize for best M.Sc. Thesis 2002-2003
Abstract. The problem of goal-directed, collision-free motion in a complex, unpredictable environment can be solved by tightly integrating high-level deliberative planning with low-level reactive control. This thesis presents two such architectures for a nonholonomic mobile robot. To achieve real-time performance, reactive control capabilities have to be fully realized so that the deliberative planner can be simplified. These architectures are enriched with reactive target reaching and obstacle avoidance modules. Their target reaching modules use indirect-mapping Extended Kohonen Map to provide finer and smoother motion control than direct-mapping methods. While one architecture fuses these modules indirectly via command fusion, the other one couples them directly using cooperative Extended Kohonen Maps, enabling the robot to negotiate unforeseen concave obstacles. The planner for both architectures use a slippery cells technique to decompose the free workspace into fewer cells, thus reducing search time. Any two points in the cell can still be traversed by reactive motion.
- Mobile Robots That Learn to Navigate.
Kian Hsiang Low.
Honors Thesis, Department of Computer Science,
School of Computing, National University of Singapore, Apr 2001.
Abstract. A sensorimotor controller has been implemented to enable a mobile robot to learn its motion control autonomously and perform simple target-reaching movements. This controller is able to perform fine motion by reducing its self-positioning error and also, reach a designated target location with minimum delay. The control architecture is in the form of a neural network known as the Self-Organizing Map. Besides implementing the motor control and the online learning algorithms, the essentiality of a pre-learning phase is also evaluated. Then, we explore the possibility of incorporating a novel concept known as Local Linear Smoothing into our batch training algorithm; this notion allows the elimination of the boundary bias phenomenon. Lastly, we suggest a simple approach to learning in an obstacle-ridden environment.
This document, research.html, has been accessed 58560 times since 22-Mar-15 23:49:40 SGT.
This is the 6th time it has been accessed today.
A total of 21021 different hosts have accessed this document in the
last 1432 days; your host, ec2-18-212-222-217.compute-1.amazonaws.com, has accessed it 1 times.
If you're interested, complete statistics for
this document are also available, including breakdowns by top-level
domain, host name, and date.
|
AUTOMATIC MACHINE LEARNING : BAYESIAN OPTIMIZATION
PROJECT DURATION : Feb 2016 - Present
PROJECT AFFILIATION
-
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborator: Patrick Jaillet, MIT)
-
Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
PROJECT FUNDING
- MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data, SGD $637,884, Jul 2017 - Jul 2020
- SMART Subaward Agreement - FM IRG :
Automatic Probabilistic Machine Learning for Traffic Modeling and Prediction,
SGD $90,000, Apr 2017 - Mar 2019
- Research Collaboration Agreement with Panasonic R&D Center Singapore : Hyperparameters Tuning using Bayesian Optimization, SGD $69,336, Mar 2016 - Mar 2017
PROBLEM MOTIVATION
- Batch Bayesian Optimization. Bayesian optimization (BO) has recently gained considerable traction due to its capability of finding the global maximum of a highly complex (e.g., non-convex, no closed-form expression nor derivative), noisy black-box objective function with a limited budget of (often costly) function evaluations, consequently witnessing its use in an increasing diversity of application domains such as robotics, environmental sensing/monitoring, automatic machine learning, among others.
A number of acquisition functions (e.g., probability of improvement or expected improvement over the currently found maximum, entropy-based, and upper confidence bound (UCB)) have been devised to perform BO: They repeatedly select an input for evaluating/querying the black-box function (i.e., until the budget is depleted) that intuitively trades off between sampling where the maximum is likely to be given the current, possibly imprecise belief of the function modeled by a Gaussian process (GP) (i.e., exploitation) vs. improving the GP belief of the function over the entire input domain (i.e., exploration) to guarantee finding the global maximum.
The rapidly growing affordability and availability of hardware resources (e.g., computer clusters, sensor networks, robot teams/swarms) have motivated the recent development of BO algorithms that can repeatedly select a batch of inputs for querying the black-box function in parallel instead. Such batch/parallel BO algorithms can be classified into two types: On one extreme, batch BO algorithms like multi-points expected improvement, parallel predictive entropy search, and the parallel knowledge gradient method jointly optimize the batch of inputs and hence scale poorly in the batch size.
On the other extreme, greedy batch BO algorithms boost the scalability by selecting the inputs of the batch one at a time. We argue that such a highly suboptimal approach to gain scalability is an overkill: In practice, each function evaluation is often much more computationally and/or economically costly (e.g., hyperparameter tuning for deep learning, drug testing on human subjects), which justifies dedicating more time to obtain better BO performance.
- Nonmyopic Bayesian Optimization. The fundamental challenge of integrated planning and learning is to design an autonomous agent that can plan its actions to maximize its expected total rewards while interacting with an unknown task environment. Recent research efforts tackling this challenge have progressed from the use of simple Markov models assuming discrete-valued, independent observations to that of a rich class of Bayesian nonparametric Gaussian process (GP) models characterizing continuous-valued, correlated observations in order to represent the latent structure of complex, possibly noisy task environments with higher fidelity. Such a challenge is posed by the problem of Bayesian optimization (BO).
Its objective is to select and gather the most informative (possibly noisy) observations for finding the global maximum of an unknown, highly complex (e.g., non-convex, no closed-form expression nor derivative) objective function (i.e., task environment) modeled by a GP given a sampling budget (e.g., number of costly function evaluations). The rewards of a BO agent are defined using an improvement-based (e.g., probability of improvement or expected improvement over currently found maximum), entropy-based, or upper confidence bound (UCB) acquisition function. A limitation of most BO algorithms is that they are myopic. To overcome this limitation, approximation algorithms for nonmyopic adaptive BO have been proposed, but their performances are not theoretically guaranteed.
PROPOSED METHODOLOGY
- Batch Bayesian Optimization. To tackle the first problem, we show that it is in fact possible to jointly optimize the batch of inputs and still preserve scalability in the batch size by giving practitioners the flexibility to trade off BO performance for time efficiency.
To achieve this, we first observe that, interestingly, batch BO can be perceived as a cooperative multi-agent decision making problem whereby each agent optimizes a separate input of the batch while coordinating with the other agents doing likewise.
To the best of our knowledge, this has not been considered in the BO literature.
In particular, if batch BO can be framed as some known class of multi-agent decision making problems, then it can be solved efficiently and scalably by the latter's state-of-the-art solvers.
The key technical challenge would therefore be to investigate how batch BO can be cast as one of such to exploit its advantage of scalability in the number of agents (hence, batch size) while at the same time theoretically guaranteeing the resulting BO performance.
To tackle the above challenge, our first work presents a novel distributed batch BO algorithm that, in contrast to greedy batch BO algorithms,
can jointly optimize a batch of inputs and, unlike the batch BO algorithms, still preserve scalability in the batch size.
To realize this, we generalize GP-UCB to a new batch variant amenable to a Markov approximation, which can then be naturally formulated as a multi-agent distributed constraint optimization problem (DCOP) in order to fully exploit the efficiency of its state-of-the-art solvers for achieving linear time in the batch size.
Our proposed distributed batch GP-UCB (DB-GP-UCB) algorithm offers practitioners the flexibility to trade off between the approximation quality and time efficiency by varying the Markov order. We provide a theoretical guarantee for the convergence rate of our DB-GP-UCB algorithm via bounds on its cumulative regret. We empirically evaluate the cumulative regret incurred by our DB-GP-UCB algorithm and its scalability in the batch size on synthetic benchmark objective functions and a real-world optimization problem.
- Nonmyopic Bayesian Optimization. To address the second problem, our second work presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning and BO criteria (e.g., UCB) and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off, consequently allowing planning and learning to be integrated seamlessly and performed simultaneously instead of separately. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive ε-optimal GPP (ε-GPP) policy given an arbitrarily user-specified loss bound ε. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of ε-GPP with performance guarantee. Finally, we empirically evaluate the performances of our ε-GPP policy and its anytime variant in BO and an energy harvesting task on simulated and real-world environmental fields.
PUBLICATIONS
- Decentralized High-Dimensional Bayesian Optimization with Factor Graphs.
Trong Nghia Hoang, Quang Minh Hoang, Ruofei Ouyang & Kian Hsiang Low.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), pages 3231-3238, New Orleans, LA, Feb 2-8, 2018.
24.55% acceptance rate
Abstract. This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee (asymptotic) no-regret performance. Empirical evaluation on synthetic and real-world experiments shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.
- Distributed Batch Gaussian Process Optimization.
Erik Daxberger & Kian Hsiang Low.
In Proceedings of the 34th International Conference on Machine Learning (ICML-17), pages 951-960, Sydney, Australia, Aug 6-11, 2017.
25.9% acceptance rate
Abstract. This paper presents a novel distributed batch Gaussian process upper confidence bound (DB-GP-UCB) algorithm for performing batch Bayesian optimization (BO) of highly complex, costly-to-evaluate black-box objective functions. In contrast to existing batch BO algorithms, DB-GP-UCB can jointly optimize a batch of inputs (as opposed to selecting the inputs of a batch one at a time) while still preserving scalability in the batch size. To realize this, we generalize GP-UCB to a new batch variant amenable to a Markov approximation, which can then be naturally formulated as a multi-agent distributed constraint optimization problem in order to fully exploit the efficiency of its state-of-the-art solvers for achieving linear time in the batch size. Our DB-GP-UCB algorithm offers practitioners the flexibility to trade off between the approximation quality and time efficiency by varying the Markov order. We provide a theoretical guarantee for the convergence rate of DB-GP-UCB via bounds on its cumulative regret. Empirical evaluation on synthetic benchmark objective functions and a real-world optimization problem shows that DB-GP-UCB outperforms the state-of-the-art batch BO algorithms.
- Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond.
Chun Kai Ling, Kian Hsiang Low & Patrick Jaillet.
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), pages 1860-1866, Phoenix, AZ, Feb 12-17, 2016.
25.75% acceptance rate
Abstract. This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive ϵ-optimal GPP (ϵ-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of ϵ-GPP with performance guarantee. We empirically demonstrate the effectiveness of our ϵ-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.
PRESENTATIONS
- Informative Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Adaptive Sampling, Bayesian Optimization, Active Learning, and Beyond.
Kian Hsiang Low.
Invited speaker at the ICRA'18 Workshop on Informative Path Planning and Adaptive Sampling, Brisbane, Australia, May 21, 2018.
Invited speaker at the Symposium on Oceanographic Data Analytics, Norwegian University of Science and Technology, Trondheim, Norway, Nov 27-30, 2018.
EXPLORATION-EXPLOITATION DILEMMA IN ACTIVE LEARNING OF GAUSSIAN PROCESSES
PROJECT DURATION : Aug 2013 - Present
PROJECT AFFILIATION
-
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborator: Patrick Jaillet, MIT)
-
Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
PROJECT FUNDING
- MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data, SGD $637,884, Jul 2017 - Jul 2020
- SMART Subaward Agreements - FM IRG : Spatiotemporal Modeling and Prediction of Traffic Patterns,
SGD $361,456.17, Oct 2011 - Mar 2017
PROBLEM MOTIVATION
The exploration-exploitation dilemma arises in the following three problems of active learning of Gaussian processes:
- Non-Stationary Gaussian Processes. A key challenge of environmental sensing and monitoring is that of sensing, modeling, and predicting complex urban and natural environmental phenomena, which are typically characterized by spatially correlated measurements. To tackle this challenge, recent research efforts in the robotics community have focused on developing multi-robot active sensing (MAS) algorithms: Their objective is to coordinate the exploration of a team of mobile robots to actively
gather the most informative observations for predicting a spatially varying phenomenon of interest while being subject to resource cost constraints (e.g., number of deployed robots, energy consumption, mission time). To achieve this, a number of MAS algorithms have modeled the phenomenon as a Gaussian process (GP), which allows its spatial correlation structure to be formally characterized and its predictive uncertainty to be formally quantified (e.g., based on mean-squared error, entropy, or mutual information criterion) and subsequently exploited for directing the robots to explore its highly uncertain areas. In order not to incur high computational expense, these algorithms have assumed the spatial correlation structure to be known (or estimated crudely using sparse prior data) and stationary (i.e., degree of smoothness in the spatial variation of the measurements is the same across the entire phenomenon), properties of which are often violated in real-world environmental sensing applications and limited to small-scale phenomena.
In practice, the spatial correlation structure of possibly large-scale environmental phenomena is usually not known and non-stationary (i.e., separate areas of a phenomenon exhibit different local degrees of smoothness in the spatial variation of the measurements). For example, in some ocean phenomena (e.g., temperature, salinity, sea surface height), their measurements far offshore are more smoothly varying (i.e., more spatially correlated) in the cross-shore direction than nearshore. Urban traffic networks also display non-stationary phenomena (e.g., traffic speeds, taxi demands), which pose important considerations to traffic routing and signal control.

Existing MAS algorithms can still be used for sampling a non-stationary phenomenon by assuming, albeit incorrectly, its spatial correlation structure to be known and stationary in order to preserve time efficiency. So, though they can gather the most informative observations under an assumed stationary correlation structure, they will perform sub-optimally with respect to the true non-stationary correlation structure.
A more desirable MAS algorithm should instead be designed to consider the informativeness of its selected observations for both estimating the unknown spatial correlation structure of a phenomenon (i.e., exploration) as well as predicting the phenomenon given the true correlation structure (i.e., exploitation). According to previous geostatistical studies, the most informative observations that are gathered for achieving the former active sensing criterion are not necessarily as informative for satisfying the latter. This raises a fundamental issue faced by active sensing: How can a MAS algorithm trade off between these two possibly conflicting criteria?
- Nonmyopic Active Learning. Active learning has become an increasingly important focal theme in many environmental sensing and monitoring applications (e.g., precision agriculture, mineral prospecting, monitoring of ocean and freshwater phenomena like harmful algal blooms, forest ecosystems, or pollution) where a high-resolution in situ sampling of the spatial phenomenon of interest is impractical due to prohibitively costly sampling budget requirements (e.g., number of deployed sensors, energy consumption, mission time): For such applications, it is thus desirable to select and gather the most informative observations/data for modeling and predicting the spatially varying phenomenon subject to some budget constraints, which is the goal of active learning and also known as the active sensing problem.
To elaborate, solving the active sensing problem amounts to deriving an optimal sequential policy that plans/decides the most informative locations to be observed for minimizing the predictive uncertainty of the unobserved areas of a phenomenon given a sampling budget. To achieve this, many existing active sensing algorithms have modeled the phenomenon as a Gaussian process (GP), which allows its spatial correlation structure to be formally characterized and its predictive uncertainty to be formally quantified (e.g., based on mean-squared error, entropy, or mutual information criterion). However, they have assumed the spatial correlation structure (specifically, the parameters defining it) to be known, which is often violated in real-world applications, or estimated crudely using sparse prior data. So, though they aim to select sampling locations that are optimal with respect to the assumed or estimated parameters, these locations tend to be sub-optimal with respect to the true parameters, thus degrading the predictive performance of the learned GP model.
In practice, the spatial correlation structure of a phenomenon is usually not known. Then, the predictive performance of the GP modeling the phenomenon depends on how informative the gathered observations/data are for both parameter estimation as well as spatial prediction given the true parameters. Interestingly, as revealed in previous geostatistical studies, policies that are efficient for parameter estimation are not necessarily efficient for spatial prediction with respect to the true model. Thus, the active sensing problem involves a potential trade-off between sampling the most informative locations for spatial prediction given the current, possibly incomplete knowledge of the model parameters (i.e., exploitation) vs. observing locations that gain more information about the parameters (i.e., exploration): How then does an active sensing algorithm trade off between these two possibly conflicting sampling objectives?

To tackle this question, one principled approach is to frame active sensing as a sequential decision problem that jointly and naturally optimizes the above exploration-exploitation trade-off while maintaining a Bayesian belief over the model parameters. This intuitively means a policy that biases towards observing informative locations for spatial prediction given the current model prior may be penalized if it entails a highly dispersed posterior over the model parameters. So, the resulting induced policy is guaranteed to be optimal in the expected active sensing performance. Unfortunately, such a nonmyopic Bayes-optimal policy cannot be derived exactly due to an uncountable set of candidate observations and unknown model parameters. As a result, most existing works have circumvented the trade-off by resorting to the use of myopic/greedy (hence, sub-optimal) policies. To the best of our knowledge, the only notable nonmyopic active sensing algorithm for GPs advocates tackling exploration and exploitation separately, instead of jointly and naturally optimizing their trade-off, to sidestep the difficulty of solving the Bayesian sequential decision problem. Specifically, it performs a probably approximately correct (PAC)-style exploration until it can verify that the performance loss of greedy exploitation lies within a user-specified threshold. But, such an algorithm is sub-optimal in the presence of budget con- straints due to the following limitations: (a) It is unclear how an optimal threshold for exploration can be determined given a sampling budget, and (b) even if such a threshold is available, the PAC-style exploration is typically designed to satisfy a worst-case sample complexity rather than to be optimal in the expected active sensing performance, thus resulting in an overly-aggressive exploration.
- Multi-Output Gaussian Processes. For many budget-constrained environmental sensing and monitoring applications in the real world, active learning/sensing is an attractive, frugal alternative to passive high-resolution (hence, prohibitively costly) sampling of the spatially varying target phenomenon of interest. Different from the latter, active learning aims to select and gather the most informative observations for modeling and predicting the spatially varying phenomenon given some sampling budget constraints (e.g., quantity of deployed sensors, energy consumption, mission time).
In practice, the target phenomenon often coexists and correlates well with some auxiliary type(s) of phenomena whose measurements may be more spatially correlated, less noisy (e.g., due to higher-quality sensors), and/or less tedious to sample (e.g., due to greater availability/quantity, higher sampling rate, and/or lower sampling cost of deployed sensors of these type(s)) and can consequently be exploited for improving its prediction. For example, to monitor soil pollution by some heavy metal (e.g., Cadmium), its complex and time-consuming extraction from soil samples can be alleviated by supplementing its prediction with correlated auxiliary types of soil measurements (e.g., pH) that are easier to sample. Similarly, to monitor algal bloom in the coastal ocean, plankton abundance correlates well with auxiliary types of ocean measurements (e.g., chlorophyll a, temperature, and salinity) that can be sampled more readily. Other examples of real-world applications include remote sensing, traffic monitoring, monitoring of groundwater and indoor environmental quality, and precision agriculture, among others. All of the above practical examples motivate the need to design and develop an active learning algorithm that selects not just the most informative sampling locations to be observed but also the types of measurements (i.e., target and/or auxiliary) at each selected location for minimizing the predictive uncertainty of unobserved areas of a target phenomenon given a sampling budget, which is the focus of our work here.

To achieve this, we model all types of coexisting phenomena (i.e., target and auxiliary) jointly as a multi-output Gaussian process (MOGP), which allows the spatial correlation structure of each type of phenomenon and the cross-correlation structure between different types of phenomena to be formally characterized. More importantly, unlike the non-probabilistic multivariate regression methods, the probabilistic MOGP regression model allows the predictive uncertainty of the target phenomenon (as well as the auxiliary phenomena) to be formally quantified (e.g., based on entropy or mutual information criterion) and consequently exploited for deriving the active learning criterion.
PROPOSED METHODOLOGY
- Non-Stationary Gaussian Processes. To address the first problem, our first work presents a decentralized multi-robot active sensing (DEC-MAS) algorithm that can efficiently coordinate the exploration of multiple robots to jointly optimize the above trade-off for sampling unknown, non-stationary environmental phenomena. Our DEC-MAS algorithm models a non-stationary phenomenon as a Dirichlet process mixture of Gaussian processes (DPM-GPs): Using the gathered observations, DPM-GPs can learn to automatically partition the phenomenon into separate local areas, each of which comprises measurements that vary according to a stationary spatial correlation structure and can thus be modeled by a locally stationary GP. The main contributions of our work here are novel in demonstrating how DPM-GPs and its structural properties can be exploited to (a) formalize an active sensing criterion that trades off between gathering the most informative observations for estimating the unknown partition (i.e., a key component of the non-stationary correlation structure) vs. that for predicting the phenomenon given the current, possibly imprecise estimate of the partition, and (b) support effective and efficient decentralized coordination. We also provide a theoretical performance guarantee for DEC-MAS and analyze its time complexity. Finally, we empirically demonstrate using two real-world datasets that DEC-MAS outperforms the state-of-the-art MAS algorithms.
- Nonmyopic Active Learning. To tackle the second problem, our second work presents an efficient decision-theoretic planning approach to nonmyopic active sensing/learning that can still preserve and exploit the principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off and consequently does not incur the limitations of the algorithm of Krause & Guestrin (2007). In particular, although the exact Bayes-optimal policy to the active sensing problem cannot be derived, we show that it is in fact possible to solve for a nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) policy given a user-defined bound ϵ, which is the main contribution of our work here. In other words, our proposed ϵ-BAL policy can approximate the optimal expected active sensing performance arbitrarily closely (i.e., within an arbitrary loss bound ϵ). In contrast, the algorithm of Krause & Guestrin (2007) can only yield a sub-optimal performance bound. To meet the real-time requirement in time-critical applications, we then propose an asymptotically ϵ-optimal, branch-and-bound anytime algorithm based on ϵ-BAL with performance guarantee. We empirically demonstrate using both synthetic and real-world datasets that, with limited budget, our proposed approach outperforms state-of-the-art algorithms.
- Multi-Output Gaussian Processes. To solve the third problem,
our third work is the first to present an efficient algorithm for active learning of a MOGP model. We consider utilizing the entropy criterion to measure the predictive uncertainty of a target phenomenon, which is widely used for active learning of a single-output GP model. Unfortunately, for the MOGP model, such a criterion scales poorly in the number of candidate sampling locations of the target phenomenon and even more so in the number of selected observations (i.e., sampling budget) when optimized. To resolve this scalability issue, we first exploit a structure common to a unifying framework of sparse MOGP models for deriving a novel active learning criterion.
Our novel active learning criterion exhibits an interesting exploration-exploitation trade-off between
selecting locations with the most uncertain measurements of the target phenomenon to be observed given the latent structure of the sparse MOGP model (i.e., exploitation) vs. selecting locations to be observed (i.e., possibly of auxiliary types of phenomena) so as to rely less on measurements at the remaining unobserved locations (i.e., won't be sampled) of the target phenomenon to infer the latent model structure (i.e., exploration).
Then, we define a relaxed notion of submodularity called ϵ-submodularity and exploit the ϵ-submodularity property of our new criterion for devising a polynomial-time approximation algorithm that guarantees a constant-factor approximation of that achieved by the optimal set of selected observations. We empirically evaluate the performance of our proposed algorithm using three real-world datasets.
PUBLICATIONS
- Near-Optimal Active Learning of Multi-Output Gaussian Processes.
Yehong Zhang, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), pages 2351-2357, Phoenix, AZ, Feb 12-17, 2016.
25.75% acceptance rate
Abstract. This paper addresses the problem of active learning of a multi-output Gaussian process (MOGP) model representing multiple types of coexisting correlated environmental phenomena. In contrast to existing works, our active learning problem involves selecting not just the most informative sampling locations to be observed but also the types of measurements at each selected location for minimizing the predictive uncertainty (i.e., posterior joint entropy) of a target phenomenon of interest given a sampling budget. Unfortunately, such an entropy criterion scales poorly in the numbers of candidate sampling locations and selected observations when optimized. To resolve this issue, we first exploit a structure common to sparse MOGP models for deriving a novel active learning criterion. Then, we exploit a relaxed form of sub-modularity property of our new criterion for devising a polynomial-time approximation algorithm that guarantees a constant-factor approximation of that achieved by the optimal set of selected observations. Empirical evaluation on real-world datasets shows that our proposed approach outperforms existing algorithms for active learning of MOGP and single-output GP models.
- New Advances on Bayesian and Decision-Theoretic Approaches for Interactive Machine Learning.
Trong Nghia Hoang.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Feb 2015.
Abstract.
The exploration-exploitation trade-off is a fundamental dilemma in many interactive learning scenarios which include both aspects of reinforcement learning (RL) and active learning (AL): An autonomous agent, situated in an unknown environment, has to actively extract knowledge from the environment by taking actions (or conducting experiments) based on its previously collected information to make accurate predictions or to optimize some utility functions. Thus, to make the most effective use of their resource-constrained budget (e.g., processing time, experimentation cost), the agent must choose carefully between (a) exploiting options (e.g., actions, experiments) which are recommended by its current, possibly incomplete model of the environment, and (b) exploring the other ostensibly sub-optimal choices to gather more information.
For example, an RL agent has to face a dilemma between (a) exploiting the most-rewarding action according to the current statistical model of the environment at the risk of running into catastrophic situations if the model is not accurate, and (b) exploring a sub-optimal action to gather more information so as to improve the models accuracy at the potential price of losing the short-term reward. Similarly, an AL algorithm/agent has to consider between (a) conducting the most informative experiments according to its current estimation of the environment models parameters (i.e., exploitation), and (b) running experiments that help improving the estimation accuracy of these parameters (i.e., exploration).
More often, learning strategies that ignore exploration will likely exhibit sub-optimal performance due to their imperfect knowledge while, conversely, those that entirely focus on exploration might suffer the cost of learning without benefitting from it. Therefore, a good exploration-exploitation trade-off is critical to the success of those interactive learning agents: In order to perform well, they must strike the right balance between these two conflicting objectives. Unfortunately, while this trade-off has been well-recognized since the early days of RL, the studies of exploration-exploitation have mostly been developed for theoretical settings in the respective field of RL and, perhaps surprisingly, glossed over in the existing AL literature. From a practical point of view, we see three limiting factors:
- Previous works addressing the exploration-exploitation trade-off in RL have largely focused on simple choices of the environment model and consequently, are not practical enough to accommodate real-world applications that have far more complicated environment structures. In fact, we find that most recent advances in Bayesian reinforcement learning (BRL) have only been able to analytically trade off between exploration and exploitation under a simple choice of models such as Flat-Dirichlet-Multinomial (FDM) whose independence and modeling assumptions do not hold for many real-world applications.
- Nearly all of the notable works in the AL literature primarily advocate the use of greedy/myopic algorithms whose rates of convergence (i.e., the number of experiments required by the learning algorithm to achieve a desired performance in the worst case) are provably minimax optimal for simple classes of learning tasks (e.g., threshold learning). While these results have greatly ad- vanced our understanding about the limit of myopic AL in worst-case scenarios, significantly less is presently known about whether it is possible to devise nonmyopic AL strategies which optimize the exploration-exploitation trade-off to achieve the best expected performance in budgeted learning scenarios.
- The issue of scalability of the existing predictive models (e.g., Gaussian processes) used in AL has generally been underrated since the majority of literature considers small-scale environments which only consist of a few thousand candidate experiments to be selected by single-mode AL algorithms one at a time prior to retraining the model. In contrast, large-scale environments usually have a massive set of million candidate experiments among which tens or hundreds of thousands should be actively selected for learning. For such data-intensive problems, it is often more cost-effective to consider batch-mode AL algorithms which select and conduct multiple experiments in parallel at each stage to collect observations in batch. Retraining the predictive model after incorporating each batch of observations then becomes a computational bottleneck as the collected dataset at each stage quickly grows up to tens or even hundreds of thousand data points.
This thesis outlines some recent progresses that we have been able to make while working toward satisfactory answers to the above challenges, along with practical algorithms that achieve them:
- In particular, in order to put BRL into practice for more complicated and practical problems, we propose a novel framework called Interactive Bayesian Reinforcement Learning (I-BRL) to integrate the general class of parametric models and model priors, thus allowing the practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the environment as often required in many real-world applications. Interestingly, we show how the nonmyopic Bayes-optimal policy can be derived analytically by solving I-BRL exactly and propose an approximation algorithm to compute it efficiently in polynomial time. Our empirical studies show that the proposed approach performs competitively with the existing state-of-the-art algorithms.
- Then, to establish a theoretical foundation for the exploration-exploitation trade-off in single-mode active learning scenarios with resource-constrained budgets, we present a novel ϵ-Bayes-optimal Decision-Theoretic Active Learning (ϵ-BAL) framework which advocates the use of differential entropy as a performance measure and consequently, derives a learning policy that can approximate the optimal expected performance arbitrarily closely (i.e., within an arbitrary loss bound ϵ). To meet the real-time requirement in time-critical applications, we then propose an asymptotically ϵ-optimal, branch-and-bound anytime algorithm based on ϵ-BAL with performance guarantees. In practice, we empirically demonstrate with both synthetic and real-world datasets that the proposed approach outperforms the state-of-the-art algorithms in budgeted scenarios.
- Lastly, to facilitate the future developments of large-scale, nonmyopic AL applications, we further introduce a highly scalable family of anytime predictive models for AL which provably converge toward a well-known class of sparse Gaussian processes (SGPs). Unlike the existing predictive models of AL which cannot be updated incrementally and are only capable of processing middle-sized datasets (i.e., a few thousands of data points), our proposed models can process massive datasets in an anytime fashion, thus providing a principled trade-off between the processing time and the predictive accuracy. The efficiency of our framework is then demonstrated empirically on a variety of large-scale real-world datasets which contains hundreds of thousand data points.
- Nonmyopic ϵ-Bayes-Optimal Active Learning of Gaussian Processes.
Trong Nghia Hoang, Kian Hsiang Low, Patrick Jaillet and Mohan Kankanhalli.
In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 739-747, Beijing, China, Jun 21-26, 2014.
22.4% acceptance rate (cycle 2)
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. A fundamental issue in active learning of Gaussian processes is that of the exploration-exploitation trade-off.
This paper presents a novel nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) approach that jointly and naturally optimizes the trade-off.
In contrast, existing works have primarily developed myopic/greedy algorithms or performed exploration and exploitation separately.
To perform active learning in real time, we then propose an anytime algorithm based on ϵ-BAL with performance guarantee and empirically demonstrate using synthetic and real-world datasets that, with limited budget, it outperforms the state-of-the-art algorithms.
- Active Learning is Planning: Nonmyopic ϵ-Bayes-Optimal Active Learning of Gaussian Processes.
Trong Nghia Hoang, Kian Hsiang Low, Patrick Jaillet and Mohan Kankanhalli.
In T. Calders, F. Esposito, E. Hüllermeier, R. Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML/PKDD-14 Nectar (New Scientific and Technical Advances in Research) Track, Part III, LNCS 8726, pages 494-498, Springer Berlin Heidelberg, Nancy, France, Sep 15-19, 2014.
Abstract. A fundamental issue in active learning of Gaussian processes is that of the exploration-exploitation trade-off. This paper presents a novel nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) approach that jointly optimizes the trade-off. In contrast, existing works have primarily developed greedy algorithms or performed exploration and exploitation separately. To perform active learning in real time, we then propose an anytime algorithm based on ϵ-BAL with performance guarantee and empirically demonstrate using a real-world dataset that, with limited budget, it outperforms the state-of-the-art algorithms.
- Multi-Robot Active Sensing of Non-Stationary Gaussian Process-Based Environmental Phenomena.
Ruofei Ouyang, Kian Hsiang Low, Jie Chen & Patrick Jaillet.
In Proceedings of the
13th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-14), pages 573-580, Paris, France, May 5-9, 2014.
23.8% acceptance rate
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. A key challenge of environmental sensing and monitoring is that of sensing, modeling, and predicting large-scale, spatially correlated environmental phenomena, especially when they are unknown and non-stationary.
This paper presents a decentralized multi-robot active sensing (DEC-MAS) algorithm that can efficiently coordinate the exploration of multiple robots to gather the most informative observations for predicting an unknown, non-stationary phenomenon.
By modeling the phenomenon using a Dirichlet process mixture of Gaussian processes (DPM-GPs), our work here is novel in demonstrating how DPM-GPs and its structural properties can be exploited to (a) formalize an active sensing criterion that trades off between gathering the most informative observations for estimating the unknown, non-stationary spatial correlation structure vs. that for predicting the phenomenon given the current, imprecise estimate of the correlation structure, and (b) support efficient decentralized coordination.
We also provide a theoretical performance guarantee for DEC-MAS and analyze its time complexity.
We empirically demonstrate using two real-world datasets that DEC-MAS outperforms state-of-the-art MAS algorithms.
ONLINE AND ANYTIME SPARSE GAUSSIAN PROCESSES FOR BIG DATA
PROJECT DURATION : Aug 2013 - Present
PROJECT AFFILIATION
-
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborators: Emilio Frazzoli, MIT; Daniela Rus, MIT)
-
Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
PROJECT FUNDING
- MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data, SGD $637,884, Jul 2017 - Jul 2020
- SMART Subaward Agreements - FM IRG :
Autonomy in Mobility-On-Demand Systems,
SGD $1,348,638.22, Aug 2010 - Dec 2015
- Research Collaboration Agreements with Panasonic R&D Center Singapore : Sonar Data Fusion Algorithm for Object Distance Estimation, SGD $84,230.40, Feb 2016 - Jul 2016, Dec 2016 - Jul 2017
PROBLEM MOTIVATION
A Gaussian process regression (GPR) model is a Bayesian nonparametric model for performing nonlinear regression that provides a Gaussian predictive distribution with formal measures of predictive uncertainty. The expressivity of a full-rank GPR (FGPR) model, however, comes at a cost of cubic time in the size of the data, thus rendering it computationally impractical for training with massive datasets. To improve its scalability, a number of sparse GPR (SGPR) models
exploiting low-rank approximate representations have been proposed, many of which share a similar structural assumption of conditional independence (albeit of varying degrees) based on the notion of inducing variables and consequently incur only linear time in the data size. The work of Quinonero-Candela & Rasmussen (2005) has in fact presented a unifying view of such SGPR models, which include the subset of regressors (SoR), deterministic training conditional (DTC), fully independent training conditional (FITC), fully independent conditional (FIC), partially independent training conditional (PITC), and partially independent conditional (PIC) approximations.
To scale up these SGPR models further for performing real-time predictions necessary in many time-critical applications and decision support systems (e.g., ocean sensing, traffic monitoring), the work of Gal et al. (2014) has parallelized DTC while that of Chen et al. (2013) has parallelized FITC, FIC, PITC, and PIC to be run on multiple machines. The recent work of Low et al. (2015) has produced a spectrum of SGPR models with PIC and FGPR at the two extremes that are also amenable to parallelization on multiple machines. Ideally, these parallel SGPR models can reduce the incurred time of their centralized counterparts by a factor close to the number of machines. In practice, since the number of machines is limited due to budget constraints, their incurred time will still grow with an increasing size of data. Like their centralized counterparts, they can be trained using all the data.
When the data is expected to stream in over a (possibly indefinitely) long time, it is also computationally impractical to repeatedly use these existing offline sparse GP approximation methods or even the online GP model (i.e., quadratic time in the data size) for training at each time step.
PROPOSED METHODOLOGY
A more affordable alternative is to instead train a SGPR model in either an (1) online or (2) anytime fashion with a small, randomly sampled subset of the data at each iteration, which requires only a single machine:
- Our first work presents a novel online sparse GP approximation method that, in contrast to existing works mentioned above, is capable of achieving constant time and memory (i.e., independent of the size of the data/observations) per time step. We provide a theoretical guarantee on its predictive performance to be equivalent to that of the offline sparse PITC approximation method. Our proposed method generalizes the sparse online GP model of Csato & Opper (2002) by relaxing its conditional independence assumption significantly, hence potentially improving the predictive performance. We empirically demonstrate the practical feasibility of using our generalized online sparse GP approximation method through a real-world persistent mobile robot localization experiment.

- To the best of our knowledge, the only notable anytime SGPR model exploits a result of Titsias (2009) that DTC can alternatively be obtained using variational inference by minimizing the Kullback-Leibler (KL) distance between the variational approximation and the GP posterior distribution of some latent variables given the data, from which a stochastic natural gradient ascent (SNGA) method can be derived to achieve an asymptotic convergence of its predictive performance to that of DTC while incurring constant time per iteration.
This anytime variant of DTC promises a huge speedup if the number of sampled subsets of data needed for convergence is much smaller than the total number of possible disjoint subsets that can be formed and sampled from all the data. But, it can be observed in our experiments that DTC often does not predict as well as the other SGPR models (except SoR) encompassed by the unifying view of Quinonero-Candela & Rasmussen (2005) because it imposes the most restrictive structural assumption. This motivates us to consider the possibility of constructing an anytime variant of any SGPR model of our choice whose derived SNGA method can achieve an asymptotic convergence of its predictive performance to that of the chosen SGPR model while preserving constant time per iteration.
However, no alternative formulation based on variational inference exists for any SGPR model other than DTC in order to derive such a SNGA method.
To address the above challenge, our second work presents a novel unifying framework of anytime SGPR models that can produce good predictive performance fast and improve their predictive performance over time. Our proposed unifying framework, perhaps surprisingly, reverses the variational inference procedure to theoretically construct a non-trivial, concave functional (i.e., of distributions) that is maximized at the predictive distribution of any SGPR model of our choice. Consequently, a SNGA method can be derived that involves iteratively following the stochastic natural gradient of the functional to improve its estimate of the predictive distribution of the chosen SGPR model and is guaranteed to achieve asymptotic convergence to it.
Interestingly, we show that if the predictive distribution of the chosen SGPR model satisfies certain decomposability conditions (e.g., DTC, FITC, PIC), then the stochastic natural gradient is an unbiased estimator of the exact natural gradient and can be computed in constant time (i.e., independent of data size) at each iteration. We empirically evaluate the trade-off between the predictive performance vs. time efficiency of the anytime SGPR models spanned by our unifying framework (i.e., including state-of-the-art anytime variant of DTC) on two real-world million-sized datasets.
PUBLICATIONS
- A Generalized Stochastic Variational Bayesian Hyperparameter Learning Framework for Sparse Spectrum Gaussian Process Regression.
Quang Minh Hoang, Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), pages 2007-2014, San Francisco, CA, Feb 4-9, 2017.
24.6% acceptance rate (oral presentation)
Abstract. While much research effort has been dedicated to scaling up sparse Gaussian process (GP) models based on inducing variables for big data, little attention is afforded to the other less explored class of low-rank GP approximations that exploit the sparse spectral representation of a GP kernel. This paper presents such an effort to advance the state of the art of sparse spectrum GP models to achieve competitive predictive performance for massive datasets. Our generalized framework of stochastic variational Bayesian sparse spectrum GP (sVBSSGP) models addresses their shortcomings by adopting a Bayesian treatment of the spectral frequencies to avoid overfitting, modeling these frequencies jointly in its variational distribution to enable their interaction a posteriori, and exploiting local data for boosting the predictive performance. However, such structural improvements result in a variational lower bound that is intractable to be optimized. To resolve this, we exploit a variational parameterization trick to make it amenable to stochastic optimization. Interestingly, the resulting stochastic gradient has a linearly decomposable structure that can be exploited to refine our stochastic optimization method to incur constant time per iteration while preserving its property of being an unbiased estimator of the exact gradient of the variational lower bound. Empirical evaluation on real-world datasets shows that sVBSSGP outperforms state-of-the-art stochastic implementations of sparse GP models.
- A Unifying Framework of Anytime Sparse Gaussian Process Regression Models with Stochastic Variational Inference for Big Data.
Trong Nghia Hoang, Quang Minh Hoang & Kian Hsiang Low.
In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 569-578, Lille, France, Jul 6-11, 2015.
26.0% acceptance rate
Abstract. This paper presents a novel unifying framework of anytime sparse Gaussian process regression (SGPR) models that can produce good predictive performance fast and improve their predictive performance over time. Our proposed unifying framework reverses the variational inference procedure to theoretically construct a non-trivial, concave functional that is maximized at the predictive distribution of any SGPR model of our choice.
As a result, a stochastic natural gradient ascent method can be derived that involves iteratively following the stochastic natural gradient of the functional to improve its estimate of the predictive distribution of the chosen SGPR model
and is guaranteed to achieve asymptotic convergence to it. Interestingly, we show that if the predictive distribution of the chosen SGPR model
satisfies certain decomposability conditions, then the stochastic natural gradient is an unbiased estimator of the exact natural gradient and can be computed in constant time (i.e., independent of data size) at each iteration. We empirically evaluate the trade-off between the predictive performance vs. time efficiency of the anytime SGPR models on two real-world million-sized datasets.
- New Advances on Bayesian and Decision-Theoretic Approaches for Interactive Machine Learning.
Trong Nghia Hoang.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Feb 2015.
Abstract.
The exploration-exploitation trade-off is a fundamental dilemma in many interactive learning scenarios which include both aspects of reinforcement learning (RL) and active learning (AL): An autonomous agent, situated in an unknown environment, has to actively extract knowledge from the environment by taking actions (or conducting experiments) based on its previously collected information to make accurate predictions or to optimize some utility functions. Thus, to make the most effective use of their resource-constrained budget (e.g., processing time, experimentation cost), the agent must choose carefully between (a) exploiting options (e.g., actions, experiments) which are recommended by its current, possibly incomplete model of the environment, and (b) exploring the other ostensibly sub-optimal choices to gather more information.
For example, an RL agent has to face a dilemma between (a) exploiting the most-rewarding action according to the current statistical model of the environment at the risk of running into catastrophic situations if the model is not accurate, and (b) exploring a sub-optimal action to gather more information so as to improve the models accuracy at the potential price of losing the short-term reward. Similarly, an AL algorithm/agent has to consider between (a) conducting the most informative experiments according to its current estimation of the environment models parameters (i.e., exploitation), and (b) running experiments that help improving the estimation accuracy of these parameters (i.e., exploration).
More often, learning strategies that ignore exploration will likely exhibit sub-optimal performance due to their imperfect knowledge while, conversely, those that entirely focus on exploration might suffer the cost of learning without benefitting from it. Therefore, a good exploration-exploitation trade-off is critical to the success of those interactive learning agents: In order to perform well, they must strike the right balance between these two conflicting objectives. Unfortunately, while this trade-off has been well-recognized since the early days of RL, the studies of exploration-exploitation have mostly been developed for theoretical settings in the respective field of RL and, perhaps surprisingly, glossed over in the existing AL literature. From a practical point of view, we see three limiting factors:
- Previous works addressing the exploration-exploitation trade-off in RL have largely focused on simple choices of the environment model and consequently, are not practical enough to accommodate real-world applications that have far more complicated environment structures. In fact, we find that most recent advances in Bayesian reinforcement learning (BRL) have only been able to analytically trade off between exploration and exploitation under a simple choice of models such as Flat-Dirichlet-Multinomial (FDM) whose independence and modeling assumptions do not hold for many real-world applications.
- Nearly all of the notable works in the AL literature primarily advocate the use of greedy/myopic algorithms whose rates of convergence (i.e., the number of experiments required by the learning algorithm to achieve a desired performance in the worst case) are provably minimax optimal for simple classes of learning tasks (e.g., threshold learning). While these results have greatly ad- vanced our understanding about the limit of myopic AL in worst-case scenarios, significantly less is presently known about whether it is possible to devise nonmyopic AL strategies which optimize the exploration-exploitation trade-off to achieve the best expected performance in budgeted learning scenarios.
- The issue of scalability of the existing predictive models (e.g., Gaussian processes) used in AL has generally been underrated since the majority of literature considers small-scale environments which only consist of a few thousand candidate experiments to be selected by single-mode AL algorithms one at a time prior to retraining the model. In contrast, large-scale environments usually have a massive set of million candidate experiments among which tens or hundreds of thousands should be actively selected for learning. For such data-intensive problems, it is often more cost-effective to consider batch-mode AL algorithms which select and conduct multiple experiments in parallel at each stage to collect observations in batch. Retraining the predictive model after incorporating each batch of observations then becomes a computational bottleneck as the collected dataset at each stage quickly grows up to tens or even hundreds of thousand data points.
This thesis outlines some recent progresses that we have been able to make while working toward satisfactory answers to the above challenges, along with practical algorithms that achieve them:
- In particular, in order to put BRL into practice for more complicated and practical problems, we propose a novel framework called Interactive Bayesian Reinforcement Learning (I-BRL) to integrate the general class of parametric models and model priors, thus allowing the practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the environment as often required in many real-world applications. Interestingly, we show how the nonmyopic Bayes-optimal policy can be derived analytically by solving I-BRL exactly and propose an approximation algorithm to compute it efficiently in polynomial time. Our empirical studies show that the proposed approach performs competitively with the existing state-of-the-art algorithms.
- Then, to establish a theoretical foundation for the exploration-exploitation trade-off in single-mode active learning scenarios with resource-constrained budgets, we present a novel ϵ-Bayes-optimal Decision-Theoretic Active Learning (ϵ-BAL) framework which advocates the use of differential entropy as a performance measure and consequently, derives a learning policy that can approximate the optimal expected performance arbitrarily closely (i.e., within an arbitrary loss bound ϵ). To meet the real-time requirement in time-critical applications, we then propose an asymptotically ϵ-optimal, branch-and-bound anytime algorithm based on ϵ-BAL with performance guarantees. In practice, we empirically demonstrate with both synthetic and real-world datasets that the proposed approach outperforms the state-of-the-art algorithms in budgeted scenarios.
- Lastly, to facilitate the future developments of large-scale, nonmyopic AL applications, we further introduce a highly scalable family of anytime predictive models for AL which provably converge toward a well-known class of sparse Gaussian processes (SGPs). Unlike the existing predictive models of AL which cannot be updated incrementally and are only capable of processing middle-sized datasets (i.e., a few thousands of data points), our proposed models can process massive datasets in an anytime fashion, thus providing a principled trade-off between the processing time and the predictive accuracy. The efficiency of our framework is then demonstrated empirically on a variety of large-scale real-world datasets which contains hundreds of thousand data points.
- GP-Localize: Persistent Mobile Robot Localization using Online Sparse Gaussian Process Observation Model.
Nuo Xu, Kian Hsiang Low, Jie Chen, Keng Kiat Lim & Etkin Baris Ozgul.
In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI-14), pages 2585-2592, Quebec City, Canada, Jul 27-31, 2014.
16.6% acceptance rate (oral presentation)
Also appeared in
RSS-14 Workshop on Non-Parametric Learning in Robotics, Berkeley, CA, Jul 12, 2014.
Abstract. Central to robot exploration and mapping is the task of persistent localization in environmental fields characterized by spatially correlated measurements. This paper presents a Gaussian process localization (GP-Localize) algorithm that, in contrast to existing works, can exploit the spatially correlated field measurements taken during a robot's exploration (instead of relying on prior training data) for efficiently and scalably learning the GP observation model online through our proposed novel online sparse GP. As a result, GP-Localize is capable of achieving constant time and memory (i.e., independent of the size of the data) per filtering step, which demonstrates the practical feasibility of using GPs for persistent robot localization and autonomy. Empirical evaluation via simulated experiments with real-world datasets and a real robot experiment shows that GP-Localize outperforms existing GP localization algorithms.
- Generalized Online Sparse Gaussian Processes with Application to Persistent Mobile Robot Localization.
Kian Hsiang Low, Nuo Xu, Jie Chen, Keng Kiat Lim & Etkin Baris Ozgul.
In T. Calders, F. Esposito, E. Hüllermeier, R. Meo, editors, Machine Learning and Knowledge Discovery in Databases - European Conference, ECML/PKDD-14 Nectar (New Scientific and Technical Advances in Research) Track, Part III, LNCS 8726, pages 499-503, Springer Berlin Heidelberg, Nancy, France, Sep 15-19, 2014.
Abstract. This paper presents a novel online sparse Gaussian process (GP) approximation method that is capable of achieving constant time and memory (i.e., independent of the size of the data) per time step. We theoretically guarantee its predictive performance to be equivalent to that of a sophisticated offline sparse GP approximation method. We empirically demonstrate the practical feasibility of using our online sparse GP approximation method through a real-world persistent mobile robot localization experiment.
PARALLEL AND DISTRIBUTED SPARSE GAUSSIAN PROCESSES FOR BIG DATA

PROJECT DURATION : Aug 2010 - Present
PROJECT AFFILIATION
-
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborator: Patrick Jaillet, MIT)
-
Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
PROJECT FUNDING
- MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data, SGD $637,884, Jul 2017 - Jul 2020
- SMART Subaward Agreements - FM IRG :
Spatiotemporal Modeling and Prediction of Traffic Patterns,
SGD $361,456.17, Oct 2011 - Mar 2017
- Research Collaboration Agreement with Sumitomo Electric Industries, Ltd. : Estimation/Prediction Algorithm for Traffic Volume without Rich Installation of Detectors, JPY $3,000,000, Sep 2013 - Nov 2014
PROBLEM MOTIVATION
Gaussian process (GP) models are a rich class of Bayesian non-parametric models that can perform probabilistic regression by providing Gaussian predictive distributions with formal measures of the predictive uncertainty.
Unfortunately, a GP model is handicapped by its poor scalability in the size of the data, hence limiting its practical use to small data. To improve its scalability, two families of sparse GP regression methods have been proposed: (a) Low-rank approximate representations
of the full-rank GP (FGP) model are well-suited for modeling slowly-varying functions with large correlation and can use all the data for predictions. But, they require a relatively high rank to capture small-scale features/patterns (i.e., of small correlation) with high fidelity, thus losing their computational advantage. (b) In contrast, localized regression and covariance tapering methods (e.g., local GPs and compactly supported covariance functions) are particularly useful for modeling rapidly-varying functions with small correlation. However, they can only utilize local data for predictions, thereby performing poorly in input regions with little/no data. Furthermore, to accurately represent large-scale features/patterns (i.e., of large correlation), the locality/tapering range has to be increased considerably, thus sacrificing their time efficiency.
Recent sparse GP regression methods have unified approaches from the two families described above to harness their complementary modeling and predictive capabilities (hence, eliminating their deficiencies) while retaining their computational advantages. Specifically, after approximating the FGP (in particular, its covariance matrix) with a low-rank representation based on the notion of inducing variables, a sparse covariance matrix approximation of the resulting residual process is made. However, this sparse residual covariance matrix approximation imposes a fairly strong conditional independence assumption given the inducing variables since the number of inducing variables cannot be too large to preserve time efficiency. We argue in this work that such a strong assumption is an overkill: It is in fact possible to construct a more refined, dense residual covariance matrix approximation by exploiting a Markov assumption and, perhaps surprisingly, still achieve scalability, which distinguishes our work here from existing sparse GP regression methods utilizing low-rank representations (i.e., including the unified approaches) described earlier.
As a result, our proposed residual covariance matrix approximation can significantly relax the conditional independence assumption (especially with larger data), hence potentially improving the predictive performance.
PROPOSED METHODOLOGY
This work presents a low-rank-cum-Markov approximation (LMA) of the FGP model that is novel in leveraging the dual computational advantages stemming from complementing the reduced-rank covariance matrix approximation based on the inducing variables with the residual covariance matrix approximation due to the Markov assumption;
the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint. Consequently, our proposed LMA method can trade off between the number of inducing variables and the order of the Markov property to (a) incur lower computational cost than sparse GP regression methods utilizing low-rank representations with only the number of inducing variables or spectral points as the varying parameter while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale.
Interestingly, varying the Markov order produces a spectrum of LMAs with the partially independent conditional (PIC) approximation and FGP at the two extremes. An important advantage of LMA over most existing sparse GP regression methods is that it is amenable to parallelization on multiple machines/cores, thus gaining greater scalability for performing real-time predictions necessary in many time-critical applications and decision support systems (e.g., ocean sensing, traffic monitoring). Our parallel LMA method is implemented using the message passing interface (MPI) framework to run in clusters of up to 32 computing nodes and its predictive performance, scalability, and speedup are empirically evaluated on three real-world datasets (i.e., including a million-sized dataset).
PUBLICATIONS
- A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models.
Trong Nghia Hoang, Quang Minh Hoang & Kian Hsiang Low.
In Proceedings of the 33rd International Conference on Machine Learning (ICML-16), pages 382-391, New York City, NY, Jun 19-24, 2016.
24.3% acceptance rate
Abstract. This paper presents a novel distributed variational inference framework that unifies many parallel sparse Gaussian process regression (SGPR) models for scalable hyperparameter learning with big data. To achieve this, our framework exploits a structure of correlated noise process model that represents the observation noises as a finite realization of a high-order Gaussian Markov random process. By varying the Markov order and covariance function for the noise process model, different variational SGPR models result. This consequently allows the correlation structure of the noise process model to be characterized for which a particular variational SGPR model is optimal. We empirically evaluate the predictive performance and scalability of the distributed variational SGPR models unified by our framework on two real-world datasets.
- Parallel Gaussian Process Regression for Big Data: Low-Rank Representation Meets Markov Approximation.
Kian Hsiang Low, Jiangbo Yu, Jie Chen & Patrick Jaillet.
In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI-15), pages 2821-2827, Austin, TX, Jan 25-29, 2015.
26.67% acceptance rate
Abstract. The expressive power of a Gaussian process (GP) model comes at a cost of poor scalability in the data size.
To improve its scalability, this paper presents a low-rank-cum-Markov approximation (LMA) of the GP model that is novel in leveraging the dual computational advantages stemming from complementing a low-rank approximate representation of the full-rank GP based on a support set of inputs with a Markov approximation of the resulting residual process; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint
and is considerably more refined than that of existing sparse GP models utilizing low-rank representations due to its more relaxed conditional independence assumption (especially with larger data).
As a result, our LMA method can trade off between the size of the support set and the order of the Markov property to (a) incur lower computational cost than such sparse GP models while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale.
Interestingly, varying the Markov order produces a spectrum of LMAs
with PIC approximation and full-rank GP at the two extremes.
An advantage of our LMA method is that it is amenable to parallelization on multiple machines/cores, thereby gaining greater scalability.
Empirical evaluation on three real-world datasets in clusters of up to 32 computing nodes shows that our centralized and parallel LMA methods are significantly more time-efficient and scalable than state-of-the-art sparse and full-rank GP regression methods
while achieving comparable predictive performances.
- Parallel Gaussian Process Regression with Low-Rank Covariance Matrix Approximations.
Jie Chen, Nannan Cao, Kian Hsiang Low, Ruofei Ouyang, Colin Keng-Yan Tan & Patrick Jaillet.
In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13), pages 152-161, Bellevue, WA, Jul 11-15, 2013.
31.3% acceptance rate
Abstract. Gaussian processes (GP) are Bayesian non-parametric models that are widely used for probabilistic regression. Unfortunately, it cannot scale well with large data nor perform real-time predictions due to its cubic time cost in the data size. This paper presents two parallel GP regression methods that exploit low-rank covariance matrix approximations for distributing the computational load among parallel machines to achieve time efficiency and scalability. We theoretically guarantee the predictive performances of our proposed parallel GPs to be equivalent to that of some centralized approximate GP regression methods: The computation of their centralized counterparts can be distributed among parallel machines, hence achieving greater time efficiency and scalability. We analytically compare the properties of our parallel GPs such as time, space, and communication complexity. Empirical evaluation on two real-world datasets in a cluster of 20 computing nodes shows that our parallel GPs are significantly more time-efficient and scalable than their centralized counterparts and exact/full GP while achieving predictive performances comparable to full GP.
- Gaussian Process-Based Decentralized Data Fusion and Active Sensing Agents: Towards Large-Scale Modeling and Prediction of Spatiotemporal Traffic Phenomena.
Jie Chen.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Dec 2013.
Abstract.
Knowing and understanding the environmental phenomena is important to many real world applications. This thesis is devoted to study large-scale modeling and prediction of spatiotemporal environmental phenomena (i.e., urban traffic phenomena). Towards this goal, our proposed approaches rely on a class of Bayesian non-parametric models: Gaussian processes (GP).
To accurately model spatiotemporal urban traffic phenomena in real world situation, a novel relational GP taking into account both the road segment features and road network topology information is proposed to model real world traffic conditions over road network. Additionally, a GP variant called log-Gaussian process (lGP) is exploited to model an urban mobility demand pattern which contains skewness and extremity in demand measurements.
To achieve efficient and scalable urban traffic phenomenon prediction given a large phenomenon data, we propose three novel parallel GPs: parallel partially independent training conditional (pPITC), parallel partially independent conditional(pPIC) and parallel incomplete Cholesky factorization (pICF)-based approximations of GP model, which can distribute their computational load into a cluster of parallel/multi-core machines, thereby achieving time efficiency. The predictive performances of such parallel GPs are theoretically guaranteed to be equivalent to that of some centralized approaches to approximate full/exact GP regression. The proposed parallel GPs are implemented using the message passing interface (MPI) framework and tested on two large real world datasets. The theoretical and empirical results show that our parallel GPs achieve significantly better time efficiency and scalability than that of full GP, while achieving comparable accuracy. They also achieve fine speedup performance that is the ratio of time required by the parallel algorithms and their centralized counterparts.
To exploit active mobile sensors to perform decentralized perception of the spatiotemporal urban traffic phenomenon, we propose a decentralized algorithm framework: Gaussian process-based decentralized data fusion and active sensing (D2FAS) which is composed of a decentralized data fusion (DDF) component and a decentralized active sensing (DAS) component. The DDF component includes a novel Gaussian process-based decentralized data fusion (GP-DDF) algorithm that can achieve remarkably efficient and scalable prediction of phenomenon and a novel Gaussian process-based decentralized data fusion with local augmentation (GP-DDF+) algorithm that can achieve better predictive accuracy while preserving time efficiency of GP-DDF. The predictive performances of both GP-DDF and GP-DDF+ are theoretically guaranteed to be equivalent to that of some sophisticated centralized sparse approximations of exact/full GP. For the DAS component, we propose a novel partially decentralized active sensing (PDAS) algorithm that exploits property in correlation structure of GP-DDF to enable mobile sensors cooperatively gathering traffic phenomenon data along a near-optimal joint walk with theoretical guarantee, and a fully decentralized active sensing (FDAS) algorithm that guides each mobile sensor gather phenomenon data along its locally optimal walk.
Lastly, to justify the practicality of the D2FAS framework, we develop and test D2FAS algorithms running with active mobile sensors on real world datasets for monitoring traffic conditions and sensing/servicing urban mobility demands. Theoretical and empirical results show that the proposed algorithms are significantly more time-efficient, more scalable in the size of data and in the number of sensors than the state-of-the-art centralized approaches, while achieving comparable predictive accuracy.
PRESENTATIONS
- Gaussian Process-Based Decentralized Data Fusion
and Active Sensing Agents:
Towards Large-Scale Modeling & Prediction of Spatiotemporal Traffic Phenomena.
Kian Hsiang Low.
Invited speaker at the RSS-13 Workshop on Robotic Exploration, Monitoring, and Information Collection: Nonparametric Modeling, Information-based Control, and Planning under Uncertainty, Berlin, Germany, Jun 27-28, 2013.
INTENTION-AWARE PLANNING UNDER UNCERTAINTY FOR INTERACTING OPTIMALLY WITH SELF-INTERESTED AGENTS
PROJECT DURATION : May 2011 - Present
PROJECT AFFILIATION :
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborators: Emilio Frazzoli, MIT; Daniela Rus, MIT)
PROJECT FUNDING : SMART Subaward Agreements - FM IRG :
Autonomy in Mobility-On-Demand Systems,
SGD $1,348,638.22, Aug 2010 - Dec 2015
PROBLEM MOTIVATION
Designing and developing efficient planning algorithms for intelligent agents to interact and perform effectively among other self-interested agents has recently emerged as a grand challenge in non-cooperative multi-agent systems. Such a challenge is posed by many real-world applications, which include automated electronic trading markets where software agents interact, and traffic intersections where autonomous cars have to negotiate with human-driven vehicles to cross them, among others. Modeling, predicting, and learning the other agents' intentions efficiently is therefore critical to overcoming this challenge.
In practice, it is highly non-trivial to model and predict the other agents' intentions efficiently. Existing works addressing this challenge are often undermined due to either the restrictive assumptions on the agents' behaviors or the prohibitively expensive cost of modeling and predicting their intentions:
- Game-theoretic approaches tend to assume the agents' behaviors to be perfectly rational using the well-founded solution concepts of classical game theory such as Nash equilibrium that suffers from the following drawbacks: (a) Multiple equilibria may exist, (b) only the optimal actions corresponding to the equilibria are specified, and (c) they assume that the agents do not collaborate to beneficially deviate from the equilibrium, which is often violated by human agents.
- In contrast, decision-theoretic approaches propose to extend single-agent sequential decision making frameworks under partial observability such as POMDP to explicitly characterize the bounded rationality of self-interested agents. In particular, Interactive POMDP (I-POMDP) replaces a POMDP's flat belief over physical states with an interactive belief of k levels of hierarchy over both the physical state space and the other agent's beliefs, the latter of which are recursively defined as interactive beliefs of k-1 levels of hierarchy.
As a consequence, the agent's "optimal" behavior computed at hierarchical level k is expected to be the best response to the other agent's expected behavior at hierarchical level k-1. This surprisingly coincides with the well-founded cognitive hierarchy model of games where k is referred to as the reasoning depth. Here, the bounded rationality of the agents is explicitly accounted for by making k finite and defining their expected behaviors at level 0 as uniformly random. Empowered by such enriched and highly expressive belief space, I-POMDP can explicitly model and predict the other agent's intention. Unfortunately, solving I-POMDP (e.g., solving for the agent's expected behavior at level k) is fraught with computational curses of dimensionality, history, and nested reasoning due to its highly sophisticated structure.
Furthermore, there is another practical concern for these decision-theoretic approaches: They often require the behavioral model's parameters (e.g., hierarchical level k) to be completely specified by the practitioners, which can be very impractical in many situations where it is non-trivial to do so or the prior knowledge is insufficient to reliably derive these parameters. This essentially boils down to the need of learning while interacting with the other self-interested agents and, interestingly, an exploitation-exploration trade-off to be made while doing so: Should an agent exploit the "best" action based on its (possibly misleading) knowledge to maximize the payoff or explore a "sub-optimal" action to refine its knowledge?
Naively, one may attempt to solve it by directly tapping into the huge body of existing works in Bayesian Reinforcement Learning (BRL), which offers a broad range of principled treatments of this issue under single-agent contexts. However, most of these works often assume very simple and specific parameterizations of the unknown environments, thus rendering them inapplicable to the context where the other agent's behavior has a far more complicated parameterization. More importantly, the other agent's behavior often needs to be modeled differently depending on the specific context. Grounding in the context of existing BRL frameworks, either the domain expert struggles to best fit his prior knowledge to the supported set of parameterizations or the agent developer has to re-design the framework to incorporate a new modeling scheme. Arguably, there is no free lunch when it comes to modeling the agent's behavior across various applications.
The main focus of our work here is thus to investigate and address the following questions:
- How can intention prediction be efficiently exploited and made practical in planning under partial observability? In particular, how can the bounded rationality of the other agents be explicitly modeled without incurring prohibitive computational cost?
- How can existing BRL frameworks be refined to allow a domain expert to freely incorporate his choice of design in modeling the other agents' behaviors?
This question is signficant in putting theory into practice and, when answered, can potentially bridge the gap between learning in single- and (self-interested) multi-agent systems.
PROPOSED METHODOLOGY
- To answer the first question, we first develop a novel intention-aware nested MDP framework for planning in fully observable multi-agent environments. Inspired by the cognitive hierarchy model of games, nested MDP constitutes a recursive reasoning formalism to predict the other agent's intention and then exploit it to plan our agent's optimal interaction policy. We show that nested MDP incurs linear time in the planning horizon length and reasoning depth. Then, we propose an intention-aware I-POMDP Lite framework for planning in partially observable multi-agent environments that, in particular, exploits a practical structural assumption: The intention of the other agent is driven by nested MDP, which is demonstrated theoretically to be an effective surrogate of its true intention when the agents have fine sensing and actuation capabilities. This assumption will allow the other agent's intention to be predicted efficiently and, consequently, I-POMDP Lite to be solved effectively, as demonstrated theoretically and empirically in our work.
- To tackle the second question, we present a novel generalization of BRL called Interactive BRL (I-BRL) to integrate any parametric model and model prior of the other agent's behavior specified by domain experts, thus effectively allowing the other agent's sophisticated behavior to be represented in a fine-grained manner based on the practitioners' prior knowledge. In particular, we show how the non-myopic Bayes-optimal policy can be derived analytically by solving I-BRL exactly and propose an approximation algorithm to compute it efficiently in polynomial time. Empirically, we demonstrate I-BRL's performance using an interesting traffic problem modeled after a real-world situation.
PUBLICATIONS
- New Advances on Bayesian and Decision-Theoretic Approaches for Interactive Machine Learning.
Trong Nghia Hoang.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Feb 2015.
Abstract.
The exploration-exploitation trade-off is a fundamental dilemma in many interactive learning scenarios which include both aspects of reinforcement learning (RL) and active learning (AL): An autonomous agent, situated in an unknown environment, has to actively extract knowledge from the environment by taking actions (or conducting experiments) based on its previously collected information to make accurate predictions or to optimize some utility functions. Thus, to make the most effective use of their resource-constrained budget (e.g., processing time, experimentation cost), the agent must choose carefully between (a) exploiting options (e.g., actions, experiments) which are recommended by its current, possibly incomplete model of the environment, and (b) exploring the other ostensibly sub-optimal choices to gather more information.
For example, an RL agent has to face a dilemma between (a) exploiting the most-rewarding action according to the current statistical model of the environment at the risk of running into catastrophic situations if the model is not accurate, and (b) exploring a sub-optimal action to gather more information so as to improve the models accuracy at the potential price of losing the short-term reward. Similarly, an AL algorithm/agent has to consider between (a) conducting the most informative experiments according to its current estimation of the environment models parameters (i.e., exploitation), and (b) running experiments that help improving the estimation accuracy of these parameters (i.e., exploration).
More often, learning strategies that ignore exploration will likely exhibit sub-optimal performance due to their imperfect knowledge while, conversely, those that entirely focus on exploration might suffer the cost of learning without benefitting from it. Therefore, a good exploration-exploitation trade-off is critical to the success of those interactive learning agents: In order to perform well, they must strike the right balance between these two conflicting objectives. Unfortunately, while this trade-off has been well-recognized since the early days of RL, the studies of exploration-exploitation have mostly been developed for theoretical settings in the respective field of RL and, perhaps surprisingly, glossed over in the existing AL literature. From a practical point of view, we see three limiting factors:
- Previous works addressing the exploration-exploitation trade-off in RL have largely focused on simple choices of the environment model and consequently, are not practical enough to accommodate real-world applications that have far more complicated environment structures. In fact, we find that most recent advances in Bayesian reinforcement learning (BRL) have only been able to analytically trade off between exploration and exploitation under a simple choice of models such as Flat-Dirichlet-Multinomial (FDM) whose independence and modeling assumptions do not hold for many real-world applications.
- Nearly all of the notable works in the AL literature primarily advocate the use of greedy/myopic algorithms whose rates of convergence (i.e., the number of experiments required by the learning algorithm to achieve a desired performance in the worst case) are provably minimax optimal for simple classes of learning tasks (e.g., threshold learning). While these results have greatly advanced our understanding about the limit of myopic AL in worst-case scenarios, significantly less is presently known about whether it is possible to devise nonmyopic AL strategies which optimize the exploration-exploitation trade-off to achieve the best expected performance in budgeted learning scenarios.
- The issue of scalability of the existing predictive models (e.g., Gaussian processes) used in AL has generally been underrated since the majority of literature considers small-scale environments which only consist of a few thousand candidate experiments to be selected by single-mode AL algorithms one at a time prior to retraining the model. In contrast, large-scale environments usually have a massive set of million candidate experiments among which tens or hundreds of thousands should be actively selected for learning. For such data-intensive problems, it is often more cost-effective to consider batch-mode AL algorithms which select and conduct multiple experiments in parallel at each stage to collect observations in batch. Retraining the predictive model after incorporating each batch of observations then becomes a computational bottleneck as the collected dataset at each stage quickly grows up to tens or even hundreds of thousand data points.
This thesis outlines some recent progresses that we have been able to make while working toward satisfactory answers to the above challenges, along with practical algorithms that achieve them:
- In particular, in order to put BRL into practice for more complicated and practical problems, we propose a novel framework called Interactive Bayesian Reinforcement Learning (I-BRL) to integrate the general class of parametric models and model priors, thus allowing the practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the environment as often required in many real-world applications. Interestingly, we show how the nonmyopic Bayes-optimal policy can be derived analytically by solving I-BRL exactly and propose an approximation algorithm to compute it efficiently in polynomial time. Our empirical studies show that the proposed approach performs competitively with the existing state-of-the-art algorithms.
- Then, to establish a theoretical foundation for the exploration-exploitation trade-off in single-mode active learning scenarios with resource-constrained budgets, we present a novel ϵ-Bayes-optimal Decision-Theoretic Active Learning (ϵ-BAL) framework which advocates the use of differential entropy as a performance measure and consequently, derives a learning policy that can approximate the optimal expected performance arbitrarily closely (i.e., within an arbitrary loss bound ϵ). To meet the real-time requirement in time-critical applications, we then propose an asymptotically ϵ-optimal, branch-and-bound anytime algorithm based on ϵ-BAL with performance guarantees. In practice, we empirically demonstrate with both synthetic and real-world datasets that the proposed approach outperforms the state-of-the-art algorithms in budgeted scenarios.
- Lastly, to facilitate the future developments of large-scale, nonmyopic AL applications, we further introduce a highly scalable family of anytime predictive models for AL which provably converge toward a well-known class of sparse Gaussian processes (SGPs). Unlike the existing predictive models of AL which cannot be updated incrementally and are only capable of processing middle-sized datasets (i.e., a few thousands of data points), our proposed models can process massive datasets in an anytime fashion, thus providing a principled trade-off between the processing time and the predictive accuracy. The efficiency of our framework is then demonstrated empirically on a variety of large-scale real-world datasets which contains hundreds of thousand data points.
- Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pages 2298-2305, Beijing, China, Aug 3-9, 2013.
13.2% acceptance rate (oral presentation)
Abstract. A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is being undermined due to either the restrictive assumptions of the other agents' behavior, the failure in accounting for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than the state-of-the-art algorithms.
- A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pages 1394-1400, Beijing, China, Aug 3-9, 2013.
28.0% acceptance rate
Abstract. Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior for which FDM's independence and modeling assumptions do not hold. As a result, FDM does not allow the other agent's behavior to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.
- Intention-Aware Planning under Uncertainty for Interacting with Self-Interested, Boundedly Rational Agents.
Trong Nghia Hoang & Kian Hsiang Low.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 1233-1234, Valencia, Spain, June 4-8, 2012.
Abstract. A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to perform effectively among boundedly rational, self-interested (i.e., non-cooperative) agents (e.g., humans). To address this challenge, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions.
GAUSSIAN PROCESS DECENTRALIZED DATA FUSION & ACTIVE SENSING AGENTS FOR MOBILITY-ON-DEMAND SYSTEMSTowards Large-Scale Spatiotemporal Traffic Modeling and Prediction
PROJECT DURATION : Aug 2010 - Present
PROJECT AFFILIATION :
Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborator: Patrick Jaillet, MIT)
PROJECT FUNDING
- SMART Subaward Agreements - FM IRG :
Spatiotemporal Modeling and Prediction of Traffic Patterns,
SGD $361,456.17, Oct 2011 - Mar 2017
- Research Collaboration Agreement with Sumitomo Electric Industries, Ltd. : Estimation/Prediction Algorithm for Traffic Volume without Rich Installation of Detectors, JPY $3,000,000, Sep 2013 - Nov 2014
PROBLEM MOTIVATION
PRIVATE automobiles are becoming unsustainable personal mobility solutions in densely populated urban cities because the addition of parking and road spaces cannot keep pace with their escalating numbers due to limited urban land. For example, Hong Kong and Singapore have, respectively, experienced 27.6% and 37% increase in private vehicles from 2003 to 2011. However, their road networks have only expanded less than 10% in size. Without implementing sustainable measures, traffic congestions and delays will grow more severe and frequent, especially during peak hours.
Mobility-on-demand (MoD) systems (e.g., Velib system of over 20000 shared bicycles in Paris, experimental carsharing systems) have recently emerged as a promising paradigm of one-way vehicle sharing for sustainable personal urban mobility, specifically, to tackle the problems of low vehicle utilization rate and parking space caused by private automobiles. Conventionally, a MoD system provides stacks and racks of light electric vehicles distributed throughout a city: When a user wants to go somewhere, he simply walks to the nearest rack, swipes a card to pick up a vehicle, drives it to the rack nearest to his destination, and drops it off. In this work, we assume the capability of a MoD system to be enhanced by deploying robotic shared vehicles (e.g., General Motors Chevrolet EN-V 2.0 prototype) that can autonomously drive and cruise the streets of a densely populated urban city to be hailed by users (like taxis) instead of just waiting at the racks to be picked up. Compared to the conventional MoD system, the fleet of autonomous robotic vehicles provides greater accessibility to users who can be picked up and dropped off at any location in the road network. As a result, it can service regions of high mobility demand but with poor coverage of stacks and racks due to limited space for their installation.
The key factors in the success of a MoD system are the costs to the users and system latencies, which can be minimized by managing the MoD system effectively. To achieve this, two main technical challenges need to be addressed: (a) Real-time, fine-grained mobility demand sensing and prediction, and (b) real-time active fleet management to balance vehicle supply and demand and satisfy latency requirements at sustainable operating costs. Existing works on load balancing in MoD systems, dynamic traffic assignment problems, dynamic one-to-one pickup and delivery problems, and location recommendation and dispatch for cruising taxis have tackled variants of the second challenge by assuming the necessary inputs of mobility demand and traffic flow information to be perfectly or accurately known using prior knowledge or offline processing of historic data. Such an assumption does not hold for densely populated urban cities because their mobility demand patterns and traffic flow are often subject to short-term random fluctuations and perturbations due to frequent special events (e.g., storewide sales, exhibitions), unpredictable weather conditions, unforeseen emergencies (e.g., breakdowns in public transport services), or traffic incidents (e.g., accidents, vehicle breakdowns, roadworks). So, in order for the active fleet management strategies to perform well in fleet rebalancing and route planning to service the mobility demands, they require accurate, fine-grained predictive information of the spatiotemporally varying mobility demand patterns and traffic flow in real time, the former of which is the desired outcome of addressing the first challenge. To the best of our knowledge, there is little progress in the algorithm design and development to take on the first challenge, which will be a focus of our work here.
In practice, it is non-trivial to achieve real-time, accurate prediction of spatiotemporally varying traffic phenomena such as mobility demand patterns and traffic flow because the quantity of sensors that can be deployed to observe an entire road network is cost-constrained. For example, static sensors such as loop detectors are traditionally placed at designated locations in a road network to collect data for predicting the traffic flow. However, they provide sparse coverage (i.e., many road segments are not observed, thus leading to data sparsity), incur high installation and maintenance costs, and cannot reposition by themselves in response to changes in the traffic flow. Low-cost GPS technology allows the collection of traffic flow data using passive mobile probes (e.g., taxis/cabs). Unlike static sensors, they can directly measure the travel times along road segments. But, they provide fairly sparse coverage due to low GPS sampling frequency (i.e., often imposed by taxi/cab companies) and no control over their routes, incur high initial implementation cost, pose privacy issues, and produce highly-varying speeds and travel times while traversing the same road segment due to inconsistent driving behaviors. A critical mass of probes is needed on each road segment to ease the severity of the last drawback but is often hard to achieve on non-highway segments due to sparse coverage. In contrast, we propose using the autonomous robotic vehicles as active mobile probes to overcome the limitations of static and passive mobile probes. In particular, they can be directed to explore any segments of a road network to gather real-time mobility demand data (e.g., pickup counts of different regions) and traffic flow data (e.g., speeds and travel times along road segments) at a desired GPS sampling rate while enforcing consistent driving behavior.
How then can the vacant autonomous robotic vehicles in a MoD system actively cruise a road network to gather and assimilate the most informative data for predicting a spatiotemporally varying traffic phenomenon like a mobility demand pattern or traffic flow? To solve this problem, a centralized approach to data fusion and active sensing is poorly suited because it suffers from a single point of failure and incurs huge communication, space, and time overheads with large data and fleet.
PROPOSED METHODOLOGY
Our work proposes novel decentralized data fusion and active sensing algorithms for real-time, fine-grained traffic sensing, modeling, and prediction with a fleet of autonomous robotic vehicles in a MoD system. Note that the decentralized data fusion component of our proposed algorithms can also be used for static sensors and passive mobile probes.
The specific contributions of our work here include:
- Modeling and predicting a mobility demand pattern and traffic flow using, respectively, rich classes of Bayesian nonparametric models called a log-Gaussian process (lGP) model and a relational Gaussian process model, the latter of whose spatiotemporal correlation structure can exploit both the road segment features and road network topology information;
- Developing novel Gaussian process decentralized data fusion algorithms for cooperative perception of traffic phenomena called GP-DDF and GP-DDF+ whose predictive performance are theoretically guaranteed to be equivalent to that of sophisticated centralized sparse approximations of the full-rank Gaussian process (full GP) model: The computation of such sparse approximate GP models can thus be distributed among the MoD vehicles, thereby achieving efficient and scalable probabilistic prediction;
- Deriving consensus filtering variants of GP-DDF and GP-DDF+ that require only local communication between neighboring MoD vehicles instead of assuming all-to-all communication between MoD vehicles;
- Devising decentralized active sensing algorithms (a) whose performance, when coupled with GP-DDF, can be theoretically guaranteed to realize the effect of the spatiotemporal correlation structure of the traffic phenomenon and various parameter settings of the MoD system, and (b) that, when used for sampling a mobility demand pattern, can be analytically shown to exhibit, interestingly, a cruising behavior of simultaneously exploring demand hotspots and sparsely sampled regions that have higher likelihood of picking up users, hence achieving a dual effect of fleet rebalancing to service the mobility demands;

- Analyzing the time and communication overheads of our proposed algorithms: We prove that our algorithms can scale better than existing state-of-the-art centralized algorithms in the size of the data and fleet;
- Empirically evaluating the predictive accuracy, time efficiency, scalability, and performance of servicing mobility demands (i.e., average cruising length of vehicles, average waiting time of users, total number of pickups) of our proposed algorithms on two datasets featuring real-world traffic phenomena such as a mobility demand pattern over the central business district of Singapore and speeds of road segments over an urban road network in Singapore.
PUBLICATIONS
- Collective Online Learning of Gaussian Processes in Massive Multi-Agent Systems.
Trong Nghia Hoang, Quang Minh Hoang, Kian Hsiang Low & Jonathan P. How.
In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, Jan 27 - Feb 1, 2019.
16.2% acceptance rate
Abstract. This paper presents a novel Collective Online Learning of Gaussian Processes (COOL-GP) framework for enabling a massive number of GP inference agents to simultaneously perform (a) efficient online updates of their GP models using their local streaming data with varying correlation structures and (b) decentralized fusion of their resulting online GP models with different learned hyperparameter settings and inducing inputs. To realize this, we exploit the notion of a common encoding structure to encapsulate the local streaming data gathered by any GP inference agent into summary statistics based on our proposed representation, which is amenable to both an efficient online update via an importance sampling trick as well as multi-agent model fusion via decentralized message passing that can exploit sparse connectivity among agents for improving efficiency and enhance the robustness of our framework against transmission loss. We provide a rigorous theoretical analysis of the approximation loss arising from our proposed representation to achieve efficient online updates and model fusion. Empirical evaluations show that COOL-GP is highly effective in model fusion, resilient to information disparity between agents, robust to transmission loss, and can scale to thousands of agents.
- Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception.
Ruofei Ouyang & Kian Hsiang Low.
In Autonomous Robots (Special Issue on Multi-Robot and Multi-Agent Systems), 2019.
Extended version of our
AAAI-18 paper
Abstract. This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to utilize a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanism for a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on three real-world datasets for up to 128 agents show that our algorithms outperform the state-of-the-art methods.
- Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception.
Ruofei Ouyang & Kian Hsiang Low.
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), pages 3876-3883, New Orleans, LA, Feb 2-8, 2018.
24.55% acceptance rate
Abstract. This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to choose a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanism for a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on real-world datasets show that our algorithms outperform the state-of-the-art methods.
- Gaussian Process Decentralized Data Fusion and Active Sensing for Spatiotemporal Traffic Modeling and Prediction in Mobility-on-Demand Systems.
Jie Chen, Kian Hsiang Low, Patrick Jaillet & Yujian Yao.
In IEEE Transactions on Automation Science and Engineering
(Special Issue on Networked Cooperative Autonomous Systems), volume 12, issue 3, pages 901-921, Jul 2015.
Abstract. Mobility-on-demand (MoD) systems have recently emerged as a
promising paradigm of one-way vehicle sharing for sustainable personal
urban mobility in densely populated cities. We assume the capability of
a MoD system to be enhanced by deploying robotic shared vehicles that
can autonomously cruise the streets to be hailed by users. A key
challenge of the MoD system is that of real-time, fine-grained mobility
demand and traffic flow sensing and prediction. This paper presents
novel Gaussian process (GP) decentralized data fusion and active
sensing algorithms for real-time, fine-grained traffic modeling and
prediction with a fleet of MoD vehicles. The predictive performance of
our decentralized data fusion algorithms are theoretically guaranteed to
be equivalent to that of sophisticated centralized sparse GP
approximations. We derive consensus filtering variants requiring only
local communication between neighboring vehicles. We theoretically
guarantee the performance of our decentralized active sensing
algorithms. When they are used to gather informative data for mobility
demand prediction, they can achieve a dual effect of fleet rebalancing
to service mobility demands. Empirical evaluation on real-world datasets
shows that our algorithms are significantly more time-efficient and
scalable in the size of data and fleet while achieving predictive
performance comparable to that of state-of-the-art algorithms.
- Gaussian Process-Based Decentralized Data Fusion and Active Sensing Agents: Towards Large-Scale Modeling and Prediction of Spatiotemporal Traffic Phenomena.
Jie Chen.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Dec 2013.
Abstract.
Knowing and understanding the environmental phenomena is important to many real world applications. This thesis is devoted to study large-scale modeling and prediction of spatiotemporal environmental phenomena (i.e., urban traffic phenomena). Towards this goal, our proposed approaches rely on a class of Bayesian non-parametric models: Gaussian processes (GP).
To accurately model spatiotemporal urban traffic phenomena in real world situation, a novel relational GP taking into account both the road segment features and road network topology information is proposed to model real world traffic conditions over road network. Additionally, a GP variant called log-Gaussian process (lGP) is exploited to model an urban mobility demand pattern which contains skewness and extremity in demand measurements.
To achieve efficient and scalable urban traffic phenomenon prediction given a large phenomenon data, we propose three novel parallel GPs: parallel partially independent training conditional (pPITC), parallel partially independent conditional(pPIC) and parallel incomplete Cholesky factorization (pICF)-based approximations of GP model, which can distribute their computational load into a cluster of parallel/multi-core machines, thereby achieving time efficiency. The predictive performances of such parallel GPs are theoretically guaranteed to be equivalent to that of some centralized approaches to approximate full/exact GP regression. The proposed parallel GPs are implemented using the message passing interface (MPI) framework and tested on two large real world datasets. The theoretical and empirical results show that our parallel GPs achieve significantly better time efficiency and scalability than that of full GP, while achieving comparable accuracy. They also achieve fine speedup performance that is the ratio of time required by the parallel algorithms and their centralized counterparts.
To exploit active mobile sensors to perform decentralized perception of the spatiotemporal urban traffic phenomenon, we propose a decentralized algorithm framework: Gaussian process-based decentralized data fusion and active sensing (D2FAS) which is composed of a decentralized data fusion (DDF) component and a decentralized active sensing (DAS) component. The DDF component includes a novel Gaussian process-based decentralized data fusion (GP-DDF) algorithm that can achieve remarkably efficient and scalable prediction of phenomenon and a novel Gaussian process-based decentralized data fusion with local augmentation (GP-DDF+) algorithm that can achieve better predictive accuracy while preserving time efficiency of GP-DDF. The predictive performances of both GP-DDF and GP-DDF+ are theoretically guaranteed to be equivalent to that of some sophisticated centralized sparse approximations of exact/full GP. For the DAS component, we propose a novel partially decentralized active sensing (PDAS) algorithm that exploits property in correlation structure of GP-DDF to enable mobile sensors cooperatively gathering traffic phenomenon data along a near-optimal joint walk with theoretical guarantee, and a fully decentralized active sensing (FDAS) algorithm that guides each mobile sensor gather phenomenon data along its locally optimal walk.
Lastly, to justify the practicality of the D2FAS framework, we develop and test D2FAS algorithms running with active mobile sensors on real world datasets for monitoring traffic conditions and sensing/servicing urban mobility demands. Theoretical and empirical results show that the proposed algorithms are significantly more time-efficient, more scalable in the size of data and in the number of sensors than the state-of-the-art centralized approaches, while achieving comparable predictive accuracy.
- Gaussian Process-Based Decentralized Data Fusion and Active Sensing for Mobility-on-Demand System.
Jie Chen, Kian Hsiang Low & Colin Keng-Yan Tan.
In Proceedings of the
Robotics: Science and Systems Conference (RSS-13), Berlin, Germany, Jun 24-28, 2013.
30.1% acceptance rate
- Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena.
Jie Chen, Kian Hsiang Low, Colin Keng-Yan Tan, Ali Oran, Patrick Jaillet, John M. Dolan & Gaurav S. Sukhatme.
In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-12), pages 163-173, Catalina Island, CA, Aug 15-17, 2012.
31.6% acceptance rate
Also appeared in AAMAS-12 Workshop on Agents in Traffic and Transportation (ATT-12), Valencia, Spain, June 4-8, 2012.
Abstract. The problem of modeling and predicting spatiotemporal traffic phenomena over an urban road network is important to many traffic applications such as detecting and forecasting congestion hotspots. This paper presents a decentralized data fusion and active sensing (D2FAS) algorithm for mobile sensors to actively explore the road network to gather and assimilate the most informative data for predicting the traffic phenomenon. We analyze the time and communication complexity of D2FAS and demonstrate that it can scale well with a large number of observations and sensors. We provide a theoretical guarantee on its predictive performance to be equivalent to that of a sophisticated centralized sparse approximation for the Gaussian process (GP) model: The computation of such a sparse approximate GP model can thus be parallelized and distributed among the mobile sensors (in a Google-like MapReduce paradigm), thereby achieving efficient and scalable prediction. We also theoretically guarantee its active sensing performance that improves under various practical environmental conditions. Empirical evaluation on real-world urban road network data shows that our D2FAS algorithm is significantly more time-efficient and scalable than state-of-the-art centralized algorithms while achieving comparable predictive performance.
PRESENTATIONS
- Gaussian Process-Based Decentralized Data Fusion
and Active Sensing Agents:
Towards Large-Scale Modeling & Prediction of Spatiotemporal Traffic Phenomena.
Kian Hsiang Low.
Invited speaker at the RSS-13 Workshop on Robotic Exploration, Monitoring, and Information Collection: Nonparametric Modeling, Information-based Control, and Planning under Uncertainty, Berlin, Germany, Jun 27-28, 2013.
PLANNING UNDER UNCERTAINTY FOR LARGE-SCALE ACTIVE MULTI-CAMERA SURVEILLANCE

PROJECT DURATION : Mar 2010 - Nov 2014
PROJECT AFFILIATION :
Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
MEDIA NEWS :
TODAY's Science section (15 May 2015) - 'When CCTV cameras work together as one'
PROBLEM MOTIVATION
The problem of surveillance has grown to be a critical concern in many urban cities worldwide following a recent series of security threats like Mumbai terrorist attacks and London bomb blasts. Central to the problem of surveillance is that of monitoring, tracking, and observing multiple mobile targets of interest distributed over a large-scale obstacle-ridden environment (e.g., airport terminals, railway and subway stations, bus depots, shopping malls, school campuses, military bases). It is often necessary to acquire high-resolution videos/images of these targets for supporting real-world surveillance applications like activity/intention tracking and recognition, biometric analysis like target identification and face recognition, surveillance video mining, forensic video analysis/retrieval, among others.
Traditional surveillance systems consist of a large network of fixed/static CCTV (Closed Circuit Television) cameras that are placed to constantly focus at selected important locations in the buildings like entrance/exit, lobby, etc. Unfortunately, the maximum resolution of these cameras is limited to 720 x 480 pixels. So, they cannot capture high-resolution images/videos of the targets, especially when the targets are far away from the cameras. As a result, they perform poorly in acquiring the close-up views of the targets and their activities. HDTV/Megapixel cameras have recently been introduced to overcome this resolution issue. Similar to CCTV cameras, these fixed/static HDTV/megapixel cameras are placed to constantly focus at specific locations in the environment. A relatively large network of such cameras has to be installed in order to observe the targets in any region of the environment at high resolution, which is impractical in terms of equipment, installation, and maintenance costs.
The use of active PTZ (Pan/Tilt/Zoom) cameras is becoming an increasingly popular alternative to that of fixed/static cameras for surveillance because the active cameras are endowed with pan-tilt-zoom capabilities that can be exploited to focus on and observe the targets at high image/video resolution. Hence, fewer active cameras need to be deployed to be able to capture high-resolution images/videos of the targets in any region of the environment. In order to achieve effective real-time surveillance, an efficient automated mechanism is required to autonomously coordinate and control these cameras' actions.
The objective of this work is thus to address the following central surveillance problem: "How can a network of active cameras be coordinated and controlled to maximize the number of targets observed with a guaranteed image resolution?"
PROPOSED METHODOLOGY
This work presents a novel principled decision-theoretic planning under uncertainty approach to coordinating and controlling a large-scale network of active cameras for performing high-quality surveillance of large crowds of moving targets. In particular, our approach addresses the following practical issues affecting the surveillance problem:
(a) Multiple sources of uncertainty. A typical surveillance environment is fraught with multiple sources of uncertainty such as noisy cameras' observations, stochastic targets' motion, and unknown targets' locations, etc. These uncertainties make it difficult for the active cameras to know where to observe in order to keep the targets within their fields of view (fov). Consequently, they may lose track of the observed targets. To resolve this, our approach models a belief over the targets' states (i.e., locations, directions, and velocities) and updates the belief in a Bayesian paradigm based on probabilistic models of the targets' motion and the active cameras' observations;
(b) Camera-target ratio. In crowded environments, the number of targets to be observed is usually much greater than the number of available cameras. When the number of targets increases, a surveillance system, if poorly designed, tends to incur exponentially increasing computation time, which degrades the real-time performance of the entire surveillance system;
(c) Trade-off between maximizing the expected number of observed targets and the image resolution of observing them. Increasing the resolution of observing some targets through panning, tilting, or zooming may result in the loss of other targets being tracked. To address this trade-off, the cameras' actions are coordinated to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed pre-defined resolution;
(d) Scalability. By exploiting the inherent structure of the surveillance problem, our approach can scale linearly in the number of targets to be observed;
(e) Real-time requirement. The cameras' actions are computed in real time;
(f) Occlusions. Many real-world surveillance environments contain obstacles that occlude the fov of some or perhaps even all of the cameras, thus preventing the cameras from persistently tracking their observed targets. The regions where the targets cannot be observed by any of the cameras due to obstacles are said to be occluded. When the targets reside in these occluded regions or are not within the fov of any camera, the surveillance system loses track of them, thus degrading the surveillance performance. Such environments are called partially observable in the sense that the exact locations of the targets may not be observed directly by the cameras at all times.
As demonstrated empirically through simulations, our approach can achieve high-quality surveillance of a large number of targets in real time and its surveillance performance degrades gracefully with an increasing number of targets. The real-world experiments show the practicality of our decision-theoretic approach to coordinate and control cameras in surveillance systems.
PUBLICATIONS
- Scalable Decision-Theoretic Coordination and Control for Real-time Active Multi-Camera Surveillance.
Prabhu Natarajan, Trong Nghia Hoang, Yongkang Wong, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
8th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC'14) (Invited Paper to Special Session on Smart Cameras for Smart Environments), pages 115-120, Venezia, Italy, Nov 4-7, 2014.
Abstract. This paper presents an overview of our novel decision-theoretic multi-agent approach for controlling and coordinating multiple active cameras in surveillance. In this approach, a surveillance task is modeled as a stochastic optimization problem, where the active cameras are controlled and coordinated to achieve the desired surveillance goal in presence of uncertainties. We enumerate the practical issues in active camera surveillance and discuss how these issues are addressed in our decision-theoretic approach. We focus on two novel surveillance tasks: maximize the number of targets observed in active cameras with guaranteed image resolution and to improve the fairness in observation of multiple targets. We discuss the overview of our novel decision-theoretic frameworks: Markov decision process and partially observable Markov decision process frameworks for coordinating active cameras in uncertain and partially occluded environments.
- No One is Left "Unwatched": Fairness in Observation of Crowds of Mobile Targets in Active Camera Surveillance.
Prabhu Natarajan, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
21st European Conference on Artificial Intelligence (ECAI-14), including Prestigious Applications of Intelligent Systems (PAIS-14), pages 1155-1160, Prague, Czech Republic, Aug 18-22, 2014.
Abstract. Central to the problem of active multi-camera surveillance is the fundamental issue of fairness in the observation of crowds of targets such that no target is "starved" of observation by the cameras for a long time. This paper presents a principled decision-theoretic multi-camera coordination and control (MC2) algorithm called fair-MC2 that can coordinate and control the active cameras to achieve max-min fairness in the observation of crowds of targets moving stochastically. Our fair-MC2 algorithm is novel in demonstrating how (a) the uncertainty in the locations, directions, speeds, and observation times of the targets arising from the stochasticity of their motion can be modeled probabilistically, (b) the notion of fairness in observing targets can be formally realized in the domain of multi-camera surveillance for the first time by exploiting the max-min fairness metric to formalize our surveillance objective, that is, to maximize the expected minimum observation time over all targets while guaranteeing a predefined image resolution of observing them, and (c) a structural assumption in the state transition dynamics of a surveillance environment can be exploited to improve its scalability to linear time in the number of targets to be observed during surveillance. Empirical evaluation through extensive simulations in realistic surveillance environments shows that fair-MC2 outperforms the state-of-the-art and baseline MC2 algorithms. We have also demonstrated the feasibility of deploying our fair-MC2 algorithm on real AXIS 214 PTZ cameras.
- Decision-Theoretic Approach to Maximizing Fairness in Multi-Target Observation in Multi-Camera Surveillance.
Prabhu Natarajan, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
13th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-14), pages 1521-1522, Paris, France, May 5-9, 2014.
Abstract. Central to the problem of active multi-camera surveillance is the fundamental issue of fairness in the observation of multiple targets such that no target is left unobserved by the cameras for a long time. To address this important issue, we propose a novel principled decision-theoretic approach to control and coordinate multiple active cameras to achieve fairness in the observation of multiple moving targets.
- A Decision-Theoretic Approach for Controlling and Coordinating Multiple Active Cameras in Surveillance.
Prabhu Natarajan.
Ph.D. Thesis, Department of Computer Science, National University of Singapore, Dec 2013.
Abstract.
The use of active cameras in surveillance is becoming increasingly popular as they try to meet the demands of capturing high-resolution images/videos of targets in surveillance for face recognition, target identification, forensic video analysis, etc. These active cameras are endowed with pan, tilt, and zoom capabilities, which can be exploited to provide high-quality surveillance. In order to achieve effective, real-time surveillance, an efficient collaborative mechanism is needed to control and coordinate these cameras' actions, which is the focus of this thesis. The central problem in surveillance is to monitor a set of targets with guaranteed image resolution. Controlling and coordinating multiple active cameras to achieve this surveillance task is non-trivial and challenging because: (a) presence of inherent uncertainties in the surveillance environment (targets motion, location, and noisy camera observation); (b) there exists a non-trivial trade-off between number of targets and the resolution of observing these targets; and (c) more importantly, the coordination framework should be scalable with increasing number of targets and cameras.
In this thesis, we formulate a novel decision-theoretic multi-agent planning approach for controlling and coordinating multiple active cameras in surveillance. Our decision-theoretic approach offers advantages of (a) accounting the uncertainties using probabilistic models; (b) the non-trivial trade-off is addressed by coordinating the active cameras' actions to maximize the number of targets with guaranteed resolution; and (c) the scalability in number of targets and cameras is achieved by exploiting the structures and properties that are present in our surveillance problem. We focus on two novel problems in active camera surveillance: (a) maximizing observations of multiple targets (MOMT), i.e., maximizing the number of targets observed in active cameras with guaranteed image resolution; and (b) improving fairness in observation of multiple targets (FOMT), i.e., no target is "starved" of observation by active cameras for long duration of time.
We propose two formal decision-theoretic frameworks (a) Markov Decision Process (MDP) and (b) Partially Observable Markov Decision Process (POMDP) frameworks for coordinating active cameras in surveillance. MDP framework controls active cameras in fully observable surveillance environments where the active cameras are supported by one or more wide-view static/fixed cameras to observe the entire surveillance environment at low-resolution. POMDP framework controls active cameras in partially observable surveillance environments where it is impractical to observe the entire surveillance environment using static/fixed cameras due to occlusions caused by physical infrastructures. Hence the POMDP framework do not have a complete view of the surveillance environment.
Specifically, we propose (a) MDP frameworks to solve MOMT problem and FOMT problem in fully observable surveillance environment; and (b) POMDP framework to solve MOMT problem in partially observable surveillance environment. As proven analytically, our MDP and POMDP frameworks incurs time that is linear in number of targets to be observed during surveillance. We have used max-plus algorithm with our MDP framework to improve its scalability in number of cameras for MOMT problem. Empirical evaluation through simulations in realistic surveillance environment reveals that our proposed approach can achieve high-quality surveillance in real time. We also demonstrate our pro- posed approach with real Axis 214 PTZ cameras to show the practicality of our approach in real world surveillance. Both the simulations and real camera experiments show that our decision-theoretic approach can control and coordinate active cameras efficiently and hence contributes significantly towards improving the active camera surveillance research.
- Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments.
Prabhu Natarajan, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
6th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC'12), pages 1-6, Hong Kong, Oct 30 - Nov 2, 2012.
Abstract. A central problem of surveillance is to monitor multiple targets moving in a large-scale, obstacle-ridden environment with occlusions. This paper presents a novel principled Partially Observable Markov Decision Process-based approach to coordinating and controlling a network of active cameras for tracking and observing multiple mobile targets at high resolution in such surveillance environments. Our proposed approach is capable of (a) maintaining a belief over the targets' states (i.e., locations, directions, and velocities) to track them, even when they may not be observed directly by the cameras at all times, (b) coordinating the cameras' actions to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed resolution, and (c) exploiting the inherent structure of our surveillance problem to improve its scalability (i.e., linear time) in the number of targets to be observed. Quantitative comparisons with state-of-the-art multi-camera coordination and control techniques show that our approach can achieve higher surveillance quality in real time. The practical feasibility of our approach is also demonstrated using real AXIS 214 PTZ cameras.
- PhD Forum: Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance.
Prabhu Natarajan.
In Proceedings of the
6th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC'12), pages 1-2, Hong Kong, Oct 30 - Nov 2, 2012.
Best PhD Forum Paper Award
Abstract. In this thesis, we present novel decision-theoretic
multi-agent approaches for controlling and coordinating multiple
active cameras in surveillance. Decision-theoretic approaches
models the interaction between active camera network and the
uncertain surveillance environment effectively. The goal of the
surveillance is to maximize the number of targets observed in
active cameras with guaranteed image resolution. We enumerate
the practical issues in active camera surveillance and discuss how
these issues are addressed in our decision-theoretic approaches.
The existing camera control approaches have serious limitations
in terms of scalability in number of targets. Where as in
our approaches, the scalability in number of targets has been
improved by exploiting the structure and properties that are
present in our surveillance problem. We proposed two novel
decision-theoretic frameworks: Markov Decision Process (MDP)
and Partially Observable Markov Decision Process (POMDP)
frameworks for coordinating active cameras in fully observable
and partially observable surveillance settings.
- Decision-Theoretic Approach to Maximizing Observation of Multiple Targets in Multi-Camera Surveillance.
Prabhu Natarajan, Trong Nghia Hoang, Kian Hsiang Low & Mohan Kankanhalli.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 155-162, Valencia, Spain, June 4-8, 2012.
20.4% acceptance rate
Abstract. This paper presents a novel decision-theoretic approach to control and coordinate multiple active cameras for observing a number of moving targets in a surveillance system. This approach offers the advantages of being able to (a) account for the stochasticity of targets' motion via probabilistic modeling, and (b) address the trade-off between maximizing the expected number of observed targets and the resolution of the observed targets through stochastic optimization. One of the key issues faced by existing approaches in multi-camera surveillance is that of scalability with increasing number of targets. We show how its scalability can be improved by exploiting the problem structure: as proven analytically, our decision-theoretic approach incurs time that is linear in the number of targets to be observed during surveillance. As demonstrated empirically through simulations, our proposed approach can achieve high-quality surveillance of up to 50 targets in real time and its surveillance performance degrades gracefully with increasing number of targets. We also demonstrate our proposed approach with real AXIS 214 PTZ cameras in maximizing the number of Lego robots observed at high resolution over a surveyed rectangular area. The results are promising and clearly show the feasibility of our decision-theoretic approach in controlling and coordinating the active cameras in real surveillance system.
- Decision-Theoretic Approach for Controlling and
Coordinating Multiple Active Cameras in Surveillance.
Prabhu Natarajan.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), Valencia, Spain, June 4-8, 2012.
Doctoral consortium abstract
Abstract. This paper presents a novel decision-theoretic approach to control and coordinate multiple active cameras for observing a number of moving targets in a surveillance system. This approach offers the advantages of being able to (a) account for the stochasticity of targets' motion via probabilistic modeling, and (b) address the trade-off between maximizing the expected number of observed targets and the resolution of the observed targets through stochastic optimization. One of the key issues faced by existing approaches in multi-camera surveillance is that of scalability with increasing number of targets. We show how its scalability can be improved by exploiting the problem structure: as proven analytically, our decision-theoretic approach incurs time that is linear in the number of targets to be observed during surveillance. As demonstrated empirically through simulations, our proposed approach can achieve high-quality surveillance of up to 50 targets in real time and its surveillance performance degrades gracefully with increasing number of targets. We also demonstrate our proposed approach with real AXIS 214 PTZ cameras in maximizing the number of Lego robots observed at high resolution over a surveyed rectangular area. The results are promising and clearly show the feasibility of our decision-theoretic approach in controlling and coordinating the active cameras in real surveillance system.
MULTI-ROBOT INFORMATIVE PATH PLANNING FOR ACTIVE SENSING OF SPATIOTEMPORAL ENVIRONMENTAL PHENOMENA
PROJECT DURATION : Jan 2010 - May 2013
PROJECT AFFILIATION :
Collaborative Multi-Robot Exploration of the Coastal Ocean (Collaborators: John M. Dolan, CMU; Gaurav S. Sukhatme, USC; Kanna Rajan, MBARI)
PROJECT FUNDING : MOE AcRF Tier 1 Grant :
Active Robotic Exploration and Mapping for Environmental Sensing Applications,
SGD $165,377, Apr 2010 - Mar 2013
PROBLEM MOTIVATION
Research in environmental sensing and monitoring has recently gained significant attention and practical interest, especially in supporting environmental sustainability efforts worldwide. A key direction of this research aims at sensing, modeling, and predicting the various types of environmental phenomena spatially distributed over our natural and built-up habitats so as to improve our knowledge and understanding of their economic, environmental, and health impacts and implications. This is non-trivial to achieve due to a trade-off between the quantity of sensing resources (e.g., number of deployed sensors, energy consumption, mission time) and the uncertainty in predictive modeling. In the case of deploying a limited number of mobile robotic sensing assets, such a trade-off motivates the need to plan the most informative resource-constrained observation paths to minimize the uncertainty in modeling and predicting a spatially varying environmental phenomenon, which constitutes the active sensing problem to be addressed in this work.
A wide multitude of natural and urban environmental phenomena is characterized by spatially correlated field measurements, which raises the following fundamental issue faced by the active sensing problem:
How can the spatial correlation structure of an environmental phenomenon be exploited to improve the active sensing performance and computational efficiency of robotic path planning?
In this work, we will investigate the above issue for an important broad class of environmental phenomena called anisotropic fields that exhibit a (often much) higher spatial correlation along one direction than along its per- pendicular direction. Such fields occur widely in natural and built-up environments and some of them include (a) ocean and freshwater phenomena like plankton density, fish abundance, temperature and salinity; (b) soil and atmospheric phenomena like peat thickness, surface soil moisture, rainfall; (c) mineral deposits like radioactive ore; (d) pollutant and contaminant concentration like air, heavy metals; and (e) ecological abundance like vegetation density.
PROPOSED METHODOLOGY
This work presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of environmental phenomena. In contrast to the existing methods described above, our proposed path planning algorithms are novel in addressing a trade-off between active sensing performance and computational efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance.
PUBLICATIONS
- Multi-Robot Informative Path Planning for Active Sensing of Environmental Phenomena: A Tale of Two Algorithms.
Nannan Cao, Kian Hsiang Low & John M. Dolan.
In Proceedings of the
12th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-13), pages 7-14, Saint Paul, MN, May 6-10, 2013.
22.9% acceptance rate
Abstract. A key problem of robotic environmental sensing and monitoring is that of active sensing: How can a team of robots plan the most informative observation paths to minimize the uncertainty in modeling and predicting an environmental phenomenon? This paper presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of an important broad class of widely-occurring environmental phenomena called anisotropic fields. Our proposed algorithms are novel in addressing a trade-off between active sensing performance and time efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of Gaussian process-based anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance. We analyze the time complexity of our algorithms and prove analytically that they scale better than state-of-the-art algorithms with increasing planning horizon length. We provide theoretical guarantees on the active sensing performance of our algorithms for a class of exploration tasks called transect sampling, which, in particular, can be improved with longer planning time and/or lower spatial correlation along the transect. Empirical evaluation on real-world anisotropic field data shows that our algorithms can perform better or at least as well as the state-of-the-art algorithms while often incurring a few orders of magnitude less computational time, even when the field conditions are less favorable.
- Information-Theoretic Multi-Robot Path Planning.
Nannan Cao.
M.Sc. Thesis, Department of Computer Science, National University of Singapore, Sep 2012.
Abstract.
Research in environmental sensing and monitoring is especially important in supporting environmental sustainability efforts worldwide, and has recently attracted significant attention and interest. A key direction of this research lies in modeling and predicting the spatiotemporally varying environmental phenomena. One approach is to use a team of robots to sample the area and model the measurement values at unobserved points. For smoothly varying and hotspot fields, there is some work which has been done to model the fields well. However, there is still a class of common environmental fields called anisotropic fields in which the spatial phenomena are highly correlated along one direction and less correlated along the perpendicular direction. We exploit the environmental structure to improve the sampling performance and time efficiency of planning for anisotropic fields.
In this thesis, we cast the planning problem into a stagewise decision-theoretic problem. we adopt Gaussian Process to model spatial phenomena. Maximum entropy criterion and maximum mutual information criterion are used to measure the informativeness of the observation paths. It is found that for many GPs, correlation of two points exponentially decreases with the distance between the two points. With this property, for maximum entropy criterion, we propose a polynomial-time approximation algorithm, MEPP, to find the maximum entropy paths. We also provide a theoretical performance guarantee for this algorithm. For maximum mutual information criterion, we propose another polynomial-time approximation algorithm, M2IPP. Similar to the MEPP, a performance guarantee is also provided for this algorithm. We demonstrate the performance advantages of our algorithms on two real data sets. To get lower prediction error, three priciples have also been proposed to select the criterion for different environmental fields.
- Active Markov Information-Theoretic Path Planning for Robotic Environmental Sensing.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the
10th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-11), pages 753-760, Taipei, Taiwan, May 2-6, 2011.
22.1% acceptance rate
Abstract. Recent research in multi-robot exploration and mapping has focused on sampling environmental fields, which are typically modeled using the Gaussian process (GP). Existing information-theoretic exploration strategies for learning GP-based environmental field maps adopt the non-Markovian problem structure and consequently scale poorly with the length of history of observations. Hence, it becomes computationally impractical to use these strategies for in situ, real-time active sampling. To ease this computational burden, this paper presents a Markov-based approach to efficient information-theoretic path planning for active sampling of GP-based fields. We analyze the time complexity of solving the Markov-based path planning problem, and demonstrate analytically that it scales better than that of deriving the non-Markovian strategies with increasing length of planning horizon. For a class of exploration tasks called the transect sampling task, we provide theoretical guarantees on the active sampling performance of our Markov-based policy, from which ideal environmental field conditions and sampling task settings can be established to limit its performance degradation due to violation of the Markov assumption. Empirical evaluation on real-world temperature and plankton density field data shows that our Markov-based policy can generally achieve active sampling performance comparable to that of the widely-used non-Markovian greedy policies under less favorable realistic field conditions and task settings while enjoying significant computational gain over them.
ENVIRONMENTAL BOUNDARY TRACKING & ESTIMATION WITH MULTIPLE ROBOTS

PROJECT DURATION : Jan 2010 - May 2012
PROJECT COLLABORATORS : John M. Dolan, CMU; Steve Chien, JPL, Caltech
PROJECT FUNDING : MOE AcRF Tier 1 Grant :
Active Robotic Exploration and Mapping for Environmental Sensing Applications,
SGD $165,377, Apr 2010 - Mar 2013
PROBLEM MOTIVATION
A fundamental problem in environmental sensing and monitoring is to identify and delineate the hotspot regions in a large-scale environmental field. It involves partitioning the area spanned by the field into one class of regions called the hotspot regions in which the field measurements exceed a predefined threshold, and the other class of regions where they do not. Such a problem arises in many real-world applications such as precision agriculture, monitoring of ocean and freshwater phenomena (e.g., plankton bloom), forest ecosystems, rare species, pollution (e.g., oil spill), or contamination (e.g., radiation leak). In these applications, it is necessary to assess the spatial extent and shape of the hotspot regions accurately due to severe economic, environmental, and health implications. In practice, this aim is non-trivial to achieve because the constraints on the sampling assets' resources (e.g., energy consumption, mission time, sensing range) limit the number and coverage of in situ observations over the large field that can be used to infer the hotspot regions. Subject to limited observations, the most informative ones should therefore be selected in order to minimize the uncertainty of estimating the hotspot regions (or, equivalently, classifying/labeling the hotspots) in the large field, which motivates our adaptive sampling work in this work.
Mobile robot teams are particularly desirable for performing the above environmental sensing task because they can actively explore to map the hotspot regions at high resolution. On the other hand, static sensors lack mobility and are therefore not capable of doing this well unless a large quantity is deployed. While research in multi-robot exploration and mapping have largely focused on the conventional task of building occupancy grids, some recent efforts are put into the more complex, general task of sampling spatially distributed environmental fields. In contrast to occupancy grids that assume discrete, independent cell occupancies, environmental fields are characterized by continuous-valued, spatially correlated measurements, properties of which cannot be exploited by occupancy grid mapping strategies to select the most informative observation paths. To exploit such properties, exploration strategies for learning environmental field maps have recently been developed and can be classified into two regimes: (a) wide-area coverage strategies consider sparsely sampled (i.e., largely unexplored) areas to be of high uncertainty and consequently spread observations evenly across the field; (b) hotspot sampling strategies assume areas of high uncertainty and interest to contain extreme, highly-varying measurements and hence produce clustered observations.
Formal, principled approaches of exploration have also been devised to simultaneously perform hotspot sampling when a hotspot region is found as well as wide-area coverage to search for new hotspot regions in sparsely sampled areas. These strategies optimize their observation paths to minimize the uncertainty (e.g., in terms of mean-squared error or entropy) of mapping the entire continuous-valued field. They are, however, suboptimal for classifying/labeling the hotspots in the field, which we will discuss and demonstrate theoretically and empirically in this work.
PROPOSED METHODOLOGY
This work presents a novel decentralized active robotic exploration (DARE) strategy for probabilistic classification/labeling of hotspots in a large-scale environmental field. The environmental field is assumed to be realized from a rich class of probabilistic spatial models called Gaussian process (GP) that can formally characterize its spatial correlation structure. More importantly, it can provide formal measures of classification/labeling uncertainty (i.e., in the form of cost functions) such as the misclassification and entropy criteria for directing the robots to explore highly uncertain areas of the field. The chief impediment to using these formal criteria is that they result in cost-minimizing exploration strategies, which cannot be solved in closed form. To resolve this, they are reformulated as reward-maximizing dual strategies, from which we can then derive the approximate DARE strategy to be solved in closed form efficiently. The specific contributions of our work include:
- Analyzing the time complexity of solving the DARE strategy: We prove that its incurred time is independent of the map resolution and the number of robots, thus making it practical for in situ, real-time active sampling. In contrast, existing state-of-the-art exploration strategies for learning environmental field maps scale poorly with increasing map resolution and/or number of robots;
- Analyzing the exploration behavior of the DARE strategy through its formulation: It exhibits an interesting formal trade-off between that of boundary tracking until the hotspot region boundary can be accurately predicted and wide-area coverage to find new boundaries in sparsely sampled areas to be tracked. In contrast, ad hoc, reactive boundary tracking strategies typically require a hotspot region boundary to be located manually or via random exploration and are not driven by the need to maximize the fidelity of estimating multiple hotspot regions given limited observations;
- Providing theoretical guarantee on the active exploration performance of the DARE strategy: We prove that, under a reasonable conditional independence assumption, it produces the same optimal observation paths as that of the centralized cost-minimizing strategies, the latter of which otherwise cannot be solved in closed form. This result has a simple but important implication: The uncertainty of labeling the hotspots in a GP-based field is greatest at or close to the hotspot region boundaries;
- Empirically evaluating the active exploration performance and time efficiency of the DARE strategy on real-world plankton density and temperature field data: Subject to limited observations, the DARE strategy can achieve better classification of the hotspots than state-of-the-art active exploration strategies while being significantly more time-efficient than those performing wide-area coverage and hotspot sampling.
PUBLICATIONS
- Decentralized Active Robotic Exploration and Mapping for Probabilistic Field Classification in Environmental Sensing.
Kian Hsiang Low, Jie Chen, John M. Dolan, Steve Chien & David R. Thompson.
In Proceedings of the
11th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-12), pages 105-112, Valencia, Spain, June 4-8, 2012.
20.4% acceptance rate
Also appeared in
IROS'11 Workshop on Robotics for Environmental Monitoring (WREM-11), San Francisco, CA, Sep 30, 2011.
Abstract. A central problem in environmental sensing and monitoring is to classify/label the hotspots in a large-scale environmental field. This paper presents a novel decentralized active robotic exploration (DARE) strategy for probabilistic classification/labeling of hotspots in a Gaussian process (GP)-based field. In contrast to existing state-of-the-art exploration strategies for learning environmental field maps, the time needed to solve the DARE strategy is independent of the map resolution and the number of robots, thus making it practical for in situ, real-time active sampling. Its exploration behavior exhibits an interesting formal trade-off between that of boundary tracking until the hotspot region boundary can be accurately predicted and wide-area coverage to find new boundaries in sparsely sampled areas to be tracked. We provide a theoretical guarantee on the active exploration performance of the DARE strategy: under reasonable conditional independence assumption, we prove that it can optimally achieve two formal cost-minimizing exploration objectives based on the misclassification and entropy criteria. Importantly, this result implies that the uncertainty of labeling the hotspots in a GP-based field is greatest at or close to the hotspot region boundaries. Empirical evaluation on real-world plankton density and temperature field data shows that, subject to limited observations, DARE strategy can achieve more superior classification of hotspots and time efficiency than state-of-the-art active exploration strategies.
MULTI-ROBOT ADAPTIVE SAMPLING FOR ENVIRONMENTAL SENSING & MONITORING
PROJECT DURATION : Jul 2005 - Mar 2010
PROJECT LAB : T-SAR
PROBLEM MOTIVATION
Recent research in multi-robot exploration and mapping has focused on sampling environmental fields, some of which typically feature a few small hotspots in a large region.
Such a hotspot field often arises in two real-world applications:
(1) planetary exploration such as geologic reconnaissance and prospecting for mineral deposits or natural gases, and
(2) environment and ecological sensing such as precision agriculture, and monitoring of ocean phenomena (e.g., plankton bloom, anoxic zones), forest ecosystems, rare species, pollution (e.g., oil spill), or contamination (e.g., radiation leak).
In particular, the hotspot field is characterized by continuous-valued, spatially correlated measurements with the hotspots exhibiting extreme measurements and much higher spatial variability than the rest of the field.
With limited (e.g., point-based) robot sensing range, a complete coverage becomes impractical in terms of resource costs (e.g., resource consumption).
So, to accurately map the field, the hotspots have to be sampled at a higher resolution.
The hotspot field discourages static sensor placement because a large number of sensors has to be positioned to detect and refine the sampling of hotspots. If these static sensors are not placed in any hotspot initially, they cannot reposition by themselves to locate one. In contrast, a robot team is capable of performing high-resolution hotspot sampling due to its mobility. Hence, it is desirable to build a mobile robot team that can actively explore to map a hotspot field.
PROPOSED METHODOLOGY
To learn a hotspot field map, the exploration strategy of the robot team has to plan the most informative resource-constrained observation paths that minimize the uncertainty of mapping the hotspot field.
By representing the hotspot field using rich classes of Bayesian non-parametric models such as the Gaussian process or log-Gaussian process,
formal measures of mapping uncertainty (e.g., based on mean-squared error [ AAMAS-08] or entropy [ ICAPS-09] criterion) can be defined and subsequently exploited by our proposed adaptive sampling algorithms for directing the robot team to explore highly uncertain areas of the field.
In contrast to non-adaptive sampling strategies that only perform well with smoothly-varying fields,
our non-myopic adaptive sampling algorithms can exploit clustering phenomena (i.e., hotspots) to plan observation paths that produce lower mapping uncertainty.
PUBLICATIONS
- Adaptive Sampling of Time Series with Application to Remote Exploration.
David R. Thompson, Nathalie Cabrol, Michael Furlong, Craig Hardgrove, Kian Hsiang Low, Jeffrey Moersch & David Wettergreen.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'13), pages 3463-3468, Karlsruhe, Germany, May 6-10, 2013.
Abstract. We address the problem of adaptive information-optimal data collection in time series. Here a remote sensor or explorer agent throttles its sampling rate in order to track anomalous events while obeying constraints on time and power. This problem is challenging because the agent has limited visibility - all collected datapoints lie in the past, but its resource allocation decisions require predicting far into the future. Our solution is to continually fit a Gaussian process model to the latest data and optimize the sampling plan on line to maximize information gain. We compare the performance characteristics of stationary and nonstationary Gaussian process models. We also describe an application based on geologic analysis during planetary rover exploration. Here adaptive sampling can improve coverage of localized anomalies and potentially benefit mission science yield of long autonomous traverses.
- Telesupervised Remote Surface
Water Quality Sensing.
Gregg Podnar, John M. Dolan, Kian Hsiang Low & Alberto Elfes.
In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, Mar 6-13, 2010.
Abstract. We present a fleet of autonomous Robot Sensor Boats (RSBs) developed for lake and river fresh water quality assessment and controlled by our Multilevel Autonomy Robot Telesupervision Architecture (MARTA). The RSBs are low cost, highly maneuverable, shallow draft sensor boats, developed as part of the Sensor Web program supported under the Advanced Information Systems Technology program of NASA's Earth Systems Technology Office. They can scan large areas of lakes, and navigate up tributaries to measure water quality near outfalls that larger research vessels cannot reach. The MARTA telesupervision architecture has been applied to a number of domains from multi-platform autonomous wide area planetary mineral prospecting, to multi-platform ocean monitoring. The RSBs are a complementary expansion of a fleet of NOAA/NASA-developed extended-deployment surface autonomous vehicles that enable in-situ study of meteorological factors of the ocean/atmosphere interface, and which have been adapted to investigate harmful algal blooms under this program. The flexibility of the MARTA telesupervision architecture was proven as it supported simultaneous operation of these heterogenous autonomous sensor platforms while geographically widely separated. Results and analysis are presented of multiple tests carried out over three months using a multi-sensor water sonde to assess water quality in a small recreational lake. Inference Grids were used to produce maps representing temperature, pH, and dissolved oxygen. The tests were performed under various water conditions (clear vs. hair algae-laden) and both before and after heavy rains. Data from each RSB was relayed to a data server in our lab in Pittsburgh, Pennsylvania, and made available over the World Wide Web where it was acquired by team members at the Jet Propulsion Laboratory of NASA in Pasadena, California who monitored the boats and their sensor readings in real time, as well as using these data to model the water quality by producing Inference Grid-based maps.
- Multi-Robot Adaptive Exploration and Mapping for Environmental Sensing Applications.
Kian Hsiang Low.
Ph.D. Thesis, Technical Report CMU-ECE-2009-024, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, Aug 2009.
Abstract.
Recent research in robot exploration and mapping has focused on sampling hotspot fields, which often arise in environmental and ecological sensing applications. Such a hotspot field is characterized by continuous, positively skewed, spatially correlated measurements with the hotspots exhibiting extreme measurements and much higher spatial variability than the rest of the field.
To map a hotspot field of the above characterization, we assume that it is realized from non-parametric probabilistic models such as the Gaussian and log-Gaussian processes (respectively, GP and lGP), which can provide formal measures of map uncertainty. To learn a hotspot field map, the exploration strategy of a robot team then has to plan resource-constrained observation paths that minimize the uncertainty of a spatial model of the hotspot field. This exploration problem is formalized in a sequential decision-theoretic planning under uncertainty framework called the multi-robot adaptive sampling problem (MASP). So, MASP can be viewed as a sequential, non-myopic version of active learning. In contrast to finite-state Markov decision problems, MASP adopts a more complex but realistic continuous-state, non-Markovian problem structure so that its induced exploration policy can be informed by the complete history of continuous, spatially correlated observations for selecting paths. It is unique in unifying formulations of non-myopic exploration problems along the entire adaptivity spectrum, thus subsuming existing non-adaptive formulations and allowing the performance advantage of a more adaptive policy to be theoretically realized. Through MASP, it is demonstrated that a more adaptive strategy can exploit clustering phenomena in a hotspot field to produce lower expected map uncertainty. By measuring map uncertainty using the mean-squared error criterion, a MASP-based exploration strategy consequently plans adaptive observation paths that minimize the expected posterior map error or equivalently, maximize the expected map error reduction.
The time complexity of solving MASP (approximately) depends on the map resolution, which limits its practical use in large-scale, high-resolution exploration and mapping. This computational difficulty is alleviated through an information-theoretic approach to MASP (iMASP), which measures map uncertainty based on the entropy criterion instead. As a result, an iMASP-based exploration strategy plans adaptive observation paths that minimize the expected posterior map entropy or equivalently, maximize the expected entropy of observation paths. Unlike MASP, reformulating the cost-minimizing iMASP as a reward-maximizing dual problem causes its time complexity of being solved approximately to be independent of the map resolution and less sensitive to larger robot team size as demonstrated both analytically and empirically. Furthermore, this reward-maximizing dual transforms the widely-used non-adaptive maximum entropy sampling problem into a novel adaptive variant, thus improving the performance of the induced exploration policy.
One advantage stemming from the reward-maximizing dual formulations of MASP and iMASP is that they allow observation selection properties of the induced exploration policies to be realized for sampling the hotspot field. These properties include adaptivity, hotspot sampling, and wide-area coverage. We show that existing GP-based exploration strategies may not explore and map the hotspot field well with the selected observations because they are non-adaptive and perform only wide-area coverage. In contrast, the lGP-based exploration policies can learn a high-quality hotspot field map because they are adaptive and perform both wide-area coverage and hotspot sampling.
The other advantage is that even though MASP and iMASP are non-trivial to solve due to their continuous state components, the convexity of their reward-maximizing duals can be exploited to derive, in a computationally tractable manner, discrete-state monotone-bounding approximations and subsequently, approximately optimal exploration policies with theoretical performance guarantees. Anytime algorithms based on approximate MASP and iMASP are then proposed to alleviate the computational difficulty that arises from their non-Markovian structure.
It is of practical interest to be able to quantitatively characterize the "hotspotness" of an environmental field. We propose a novel "hotspotness" index, which is defined in terms of the spatial correlation properties of the hotspot field. As a result, this index can be related to the intensity, size, and diffuseness of the hotspots in the field.
We also investigate how the spatial correlation properties of the hotspot field affect the performance advantage of adaptivity. In particular, we derive sufficient and necessary conditions of the spatial correlation properties for adaptive exploration to yield no performance advantage.
Lastly, we develop computationally efficient approximately optimal exploration strategies for sampling the GP by assuming the Markov property in iMASP planning. We provide theoretical guarantees on the performance of the Markov-based policies, which improve with decreasing spatial correlation. We evaluate empirically the effects of varying spatial correlations on the mapping performance of the Markov-based policies as well as whether these Markov-based path planners are time-efficient for the transect sampling task.
Through the abovementioned work, this thesis establishes the following two claims: (1) adaptive, non-myopic exploration strategies can exploit clustering phenomena to plan observation paths that produce lower map uncertainty than non-adaptive, greedy methods; and (2) Markov-based exploration strategies can exploit small spatial correlation to plan observation paths which achieve map uncertainty comparable to that of non-Markovian policies using significantly less planning time.
- Information-Theoretic Approach to Efficient Adaptive Path Planning for Mobile Robotic Environmental Sensing.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS-09), pages 233-240, Thessaloniki, Greece, Sep 19-23, 2009.
33.9% acceptance rate
Also appeared in IPSN-09 Workshop on Sensor Networks for Earth and Space Science Applications (ESSA-09), San Francisco, CA, Apr 16, 2009.
Also orally presented in RSS-09 Workshop on Aquatic Robots and Ocean Sampling, Seattle, WA, Jun 29, 2009.
Abstract. Recent research in robot exploration and mapping has focused on sampling environmental hotspot fields. This exploration task is formalized by Low, Dolan, and Khosla (2008) in a sequential decision-theoretic planning under uncertainty framework called MASP. The time complexity of solving MASP approximately depends on the map resolution, which limits its use in large-scale, high-resolution exploration and mapping. To alleviate this computational difficulty, this paper presents an information-theoretic approach to MASP (iMASP) for efficient adaptive path planning; by reformulating the cost-minimizing iMASP as a reward-maximizing problem, its time complexity becomes independent of map resolution and is less sensitive to increasing robot team size as demonstrated both theoretically and empirically. Using the reward-maximizing dual, we derive a novel adaptive variant of maximum entropy sampling, thus improving the induced exploration policy performance. It also allows us to establish theoretical bounds quantifying the performance advantage of optimal adaptive over non-adaptive policies and the performance quality of approximately optimal vs. optimal adaptive policies. We show analytically and empirically the superior performance of iMASP-based policies for sampling the log-Gaussian process to that of policies for the widely-used Gaussian process in mapping the hotspot field. Lastly, we provide sufficient conditions that, when met, guarantee adaptivity has no benefit under an assumed environment model.
- Cooperative Aquatic Sensing using the Telesupervised Adaptive Ocean Sensor Fleet.
John M. Dolan, Gregg W. Podnar, Stephen Stancliff, Kian Hsiang Low, Alberto Elfes, John Higinbotham, Jeffrey C. Hosler, Tiffany A. Moisan & John Moisan.
In Proceedings of the SPIE Conference on Remote Sensing of the Ocean, Sea Ice, and Large Water Regions, volume 7473, Berlin, Germany, Aug 31 - Sep 3, 2009.
Abstract. Earth science research must bridge the gap between the atmosphere and the ocean to foster understanding of Earth's climate and ecology. Typical ocean sensing is done with satellites or in situ buoys and research ships which are slow to reposition. Cloud cover inhibits study of localized transient phenomena such as Harmful Algal Blooms (HAB). A fleet of extended-deployment surface autonomous vehicles will enable in situ study of characteristics of HAB, coastal pollutants, and related phenomena. We have developed a multiplatform telesupervision architecture that supports adaptive reconfiguration based on environmental sensor inputs. Our system allows the autonomous repositioning of smart sensors for HAB study by networking a fleet of NOAA OASIS (Ocean Atmosphere Sensor Integration System) surface autonomous vehicles. In situ measurements intelligently modify the search for areas of high concentration. Inference Grid and complementary information-theoretic techniques support sensor fusion and analysis. Telesupervision supports sliding autonomy from high-level mission tasking, through vehicle and data monitoring, to teleoperation when direct human interaction is appropriate. This paper reports on experimental results from multi-platform tests conducted in the Chesapeake Bay and in Pittsburgh, Pennsylvania waters using OASIS platforms, autonomous kayaks, and multiple simulated platforms to conduct cooperative sensing of chlorophyll-a and water quality.
- Robot Boats as a Mobile Aquatic Sensor Network.
Kian Hsiang Low, Gregg Podnar, Stephen Stancliff, John M. Dolan & Alberto Elfes.
In Proceedings of the IPSN-09 Workshop on Sensor Networks for Earth and Space Science Applications (ESSA-09), San Francisco, CA, Apr 16, 2009.
Abstract. This paper describes the Multilevel Autonomy Robot Telesupervision Architecture (MARTA), an architecture for supervisory control of a heterogeneous fleet of net-worked unmanned autonomous aquatic surface vessels carrying a payload of environmental science sensors. This architecture allows a land-based human scientist to effectively supervise data gathering by multiple robotic assets that implement a web of widely dispersed mobile sensors for in situ study of physical, chemical or bio-logical processes in water or in the water/atmosphere interface.
- Adaptive Multi-Robot Wide-Area Exploration And Mapping.
Kian Hsiang Low, John M. Dolan & Pradeep Khosla.
In Proceedings of the
7th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS-08), pages 23-30, Estoril, Portugal, May 12-16, 2008.
22.2% acceptance rate
Also presented as a poster in RSS-09 Workshop on Aquatic Robots and Ocean Sampling, Seattle, WA, Jun 29, 2009.
Abstract. The exploration problem is a central issue in mobile robotics. A complete terrain coverage is not practical if the environment is large with only a few small hotspots. This paper presents an adaptive multi-robot exploration strategy that is novel in performing both wide-area coverage and hotspot sampling using non-myopic path planning. As a result, the environmental phenomena can be accurately mapped. It is based on a dynamic programming formulation, which we call the Multi-robot Adaptive Sampling Problem (MASP). A key feature of MASP is in covering the entire adaptivity spectrum, thus allowing strategies of varying adaptivity to be formed and theoretically analyzed in their performance; a more adaptive strategy improves mapping accuracy. We apply MASP to sampling the Gaussian and log-Gaussian processes, and analyze if the resulting strategies are adaptive and maximize wide-area coverage and hotspot sampling. Solving MASP is non-trivial as it comprises continuous state components. So, it is reformulated for convex analysis, which allows discrete-state monotone-bounding approximation to be developed. We provide a theoretical guarantee on the policy quality of the approximate MASP (aMASP) for using in MASP. Although aMASP can be solved exactly, its state size grows exponentially with the number of stages. To alleviate this computational difficulty, anytime algorithms are proposed based on aMASP, one of which can guarantee its policy quality for MASP in real time.
- Adaptive Sampling for Multi-Robot Wide-Area Exploration.
Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan & Pradeep Khosla.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'07), pages 755-760, Rome, Italy, Apr 10-14, 2007.
Abstract. The exploration problem is a central issue in mobile robotics. A complete coverage is not practical if the environment is large with a few small hotspots, and the sampling cost is high. So, it is desirable to build robot teams that can coordinate to maximize sampling at these hotspots while minimizing resource costs, and consequently learn more accurately about properties of such environmental phenomena. An important issue in designing such teams is the exploration strategy. The contribution of this paper is in the evaluation of an adaptive exploration strategy called adaptive cluster sampling (ACS), which is demonstrated to reduce the resource costs (i.e., mission time and energy consumption) of a robot team, and yield more information about the environment by directing robot exploration towards hotspots. Due to the adaptive nature of the strategy, it is not obvious how the sampled data can be used to provide unbiased, low-variance estimates of the properties. This paper therefore discusses how estimators that are Rao-Blackwellized can be used to achieve low error. This paper also presents the first analysis of the characteristics of the environmental phenomena that favor the ACS strategy and estimators. Quantitative experimental results in a mineral prospecting task simulation show that our approach is more efficient in exploration by yielding more minerals and information with fewer resources and providing more precise mineral density estimates than previous methods.
- Adaptive Sampling for Multi-Robot Wide Area Prospecting.
Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep Khosla.
In Technical Report CMU-RI-TR-05-51, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Oct 2005.
Abstract. Prospecting for in situ mineral resources is essential for establishing settlements on the Moon and Mars. To reduce human effort and risk, it is desirable to build robotic systems to perform this prospecting. An important issue in designing such systems is the sampling strategy: how do the robots choose where to prospect next? This paper argues that a strategy called Adaptive Cluster Sampling (ACS) has a number of desirable properties: compared to conventional strategies, (1) it reduces the total mission time and energy consumption of a team of robots, and (2) returns a higher mineral yield and more information about the prospected region by directing exploration towards areas of high mineral density, thus providing detailed maps of the boundaries of such areas. Due to the adaptive nature of the sampling scheme, it is not immediately obvious how the resulting sampled data can be used to provide an unbiased, low-variance estimate of the regional mineral density. This paper therefore investigates new mineral density estimators, which have lower error than previously-developed estimators; they are derived from the older estimators via a process called Rao-Blackwellization. Since the efficiency of estimators depends on the type of mineralogical population sampled, the population characteristics that favor ACS estimators are also analyzed. The ACS scheme and our new estimators are evaluated empirically in a detailed simulation of the prospecting task, and the quantitative results show that our approach can yield more minerals with less resources and provide more accurate mineral density estimates than previous methods.
DISTRIBUTED LAYERED ARCHITECTURE FOR SELF-ORGANIZING MOBILE SENSOR NETWORKS
PROJECT DURATION : Nov 2002 - Jun 2005
PROBLEM MOTIVATION
One of the fundamental issues that arises in a sensor network is coverage.
Traditionally, network coverage is maximized by determining the optimal placement of static sensors in a centralized manner,
which can be related to the class of art gallery problems.
However, recent investigations in sensor network mobility reveal that
mobile sensors can self-organize to provide better coverage than static placement.
Existing applications have only utilized uninformed mobility (i.e., random motion or patrol).
In contrast, our work here focuses on informed, intelligent mobility to further improve coverage.
Our network coverage problem is motivated by the following constraints
that discourage static sensor placement or uninformed mobility:
(a) no prior information about the exact target locations, population densities or motion pattern,
(b) limited sensory range, and
(c) very large area to be observed.
All these conditions may cause the sensors to be unable to cover the entire region of interest.
Hence, fixed sensor locations or uninformed mobility will not be adequate in general.
Rather, the sensors have to move dynamically in response to the motion and distribution of targets and other sensors
to maximize coverage.
Inspired by robotics, the above problem may be regarded as that of low-level motion control to coordinate the sensors'
target tracking movements in the continuous workspace.
Alternatively, it can be cast as a high-level task allocation problem by segmenting the workspace into discrete regions
such that each region is assigned a group or coalition of sensors to track the targets within.
PROPOSED METHODOLOGY
This work presents a reactive layered multi-robot architecture
for distributed mobile sensor network coverage in complex, dynamic environments.
At the lower layer, each robot uses a reactive motion control strategy
known as Cooperative Extended Kohonen Maps
to coordinate their target tracking within a region without the need of communication.
This strategy is also responsible for obstacle avoidance, robot separation to minimize task interference,
and navigation between regions via beacons or checkpoints plotted by a motion planner.
At the higher layer, the robots use a dynamic ant-based task allocation scheme to cooperatively self-organize
their coalitions in a decentralized manner according to the target distributions across the regions.
This scheme addresses the following issues, which distinguish it from the other task allocation mechanisms:
Task Allocation for Multi-Robot Tasks: Existing algorithms (e.g., auction-and behavior-based) assume a multi-robot
task can be partitioned into single-robot tasks. But this may not be always possible or the multi-robot task
can be more efficiently performed by coalitions of robots.
Coalition Formation for Minimalist Robots: Existing coalition formation schemes require complex planning, explicit
negotiation, and precise estimation of coalitional cost. Hence, they do not perform well in dynamic, real-time scenarios.
Cooperation of Resource-Limited Robots: Robots with limited communication and sensing capabilities (i.e., partial
observability) can only obtain local, uncertain information of the dynamic environment. With limited computational power,
their cooperative strategies cannot involve complex planning or negotiations.
PUBLICATIONS
- Autonomic Mobile Sensor Network with Self-Coordinated Task Allocation and Execution.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews
(Special Issue on Engineering Autonomic Systems), volume 36, issue 3, pages 315-327, May 2006.
Andrew P. Sage Best Transactions Paper Award for the best paper published in IEEE Trans. SMC - Part A, B, and C in 2006
Abstract. This paper describes a distributed layered architecture for resource-constrained multirobot cooperation, which is utilized in autonomic mobile sensor network coverage. In the upper layer, a dynamic task allocation scheme self-organizes the robot coalitions to track efficiently across regions. It uses concepts of ant behavior to self-regulate the regional distributions of robots in proportion to that of the moving targets to be tracked in a nonstationary environment. As a result, the adverse effects of task interference between robots are minimized and network coverage is improved. In the lower task execution layer, the robots use self-organizing neural networks to coordinate their target tracking within a region. Both layers employ self-organization techniques, which exhibit autonomic properties such as self-configuring, self-optimizing, self-healing, and self-protecting. Quantitative comparisons with other tracking strategies such as static sensor placements, potential fields, and auction-based negotiation show that our layered approach can provide better coverage, greater robustness to sensor failures, and greater flexibility to respond to environmental changes.
- Task Allocation via Self-Organizing Swarm Coalitions in Distributed Mobile Sensor Network.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), pages 28-33, San Jose, CA, Jul 25-29, 2004.
26.7% acceptance rate
Abstract. This paper presents a task allocation scheme via self-organizing swarm coalitions for distributed mobile sensor network coverage. Our approach uses the concepts of ant behavior to self-regulate the regional distributions of sensors in proportion to that of the moving targets to be tracked in a non-stationary environment. As a result, the adverse effects of task interference between robots are minimized and sensor network coverage is improved. Quantitative comparisons with other tracking strategies such as static sensor placement, potential fields, and auction-based negotiation show that our approach can provide better coverage and greater flexibility to respond to environmental changes.
- Reactive, Distributed Layered Architecture for Resource-Bounded Multi-Robot Cooperation: Application to Mobile Sensor Network Coverage.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'04), pages 3747-3752, New Orleans, LA, Apr 26 - May 1, 2004.
Abstract. This paper describes a reactive, distributed layered architecture for cooperation of multiple resource-bounded robots, which is utilized in mobile sensor network coverage. In the upper layer, a dynamic task allocation scheme self-organizes the robot coalitions to track efficiently in separate regions. It uses the concepts of ant behavior to self-regulate the regional distributions of robots in proportion to that of the targets to be tracked in the changing environment. As a result, the adverse effects of task interference between robots are minimized and sensor network coverage is improved. In the lower layer, the robots use self-organizing neural networks to coordinate their target tracking within a region. Quantitative comparisons with other tracking strategies such as static sensor placements, potential fields, and auction-based negotiation show that our approach can provide better coverage and greater flexibility in responding to environmental changes.
PRESENTATIONS
- Task Allocation via Self-Organizing Swarm Coalitions in Distributed Mobile Sensor Network.
Kian Hsiang Low.
Presented in 8th National IT Awareness Project Competition (NITA-04), National University of Singapore, Mar 13, 2004
(Overall Best Project, Postgraduate Category).
VIDEO DEMOS
Coverage of 30 targets (green) with 15 ant robots (white)
- Self-organization of swarm coalitions to unknown, time-varying target distribution after
- Robot switching to region of higher task demand (i.e., targets to robots ratio)
ACTION SELECTION MECHANISM FOR MULTI-ROBOT TASKS
PROJECT DURATION : Sep 2002 - Nov 2002
PROBLEM MOTIVATION
A central issue in the design of behavior-based control architectures for autonomous mobile robots
is the formulation of effective mechanisms to coordinate the behaviors.
These mechanisms determine the policy of conflict resolution between behaviors,
which involves behavioral cooperation and competition to select the most appropriate action.
The actions are selected so as to optimize the achievement of the goals or behavioral objectives.
Developing such an action selection methodology is non-trivial
due to realistic constraints such as environmental complexity and unpredictability,
and resource limitations, which include computational and cognitive capabilities of the robot,
incomplete knowledge of the environment, and time constraints.
As a result, action selection can never be absolutely optimal.
Given these constraints, the action selection scheme should be able to choose actions
that are good enough to satisfy multiple concurrent, possibly conflicting, behavioral objectives.
PROPOSED METHODOLOGY
Our motivation of the action selection mechanism is to develop a motion control strategy for autonomous non-holonomic mobile robots
that can perform distributed multi-robot surveillance in unknown, dynamic, complex, and unpredictable environments.
By implementing the action selection framework using an assemblage of self-organizing neural networks,
it induces the following key features that significantly enhance the agent's action selection capability:
self-organization of continuous state and action spaces to provide smooth, efficient and fine motion control,
and action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks:
(1) negotiation of unforeseen concave and narrowly spaced obstacles, and
(2) cooperative tracking of multiple mobile targets by a team of robots.
Qualitative and quantitative comparisons for single- and multi-robot tasks show that
our framework can provide better action selection than do potential fields method.
PUBLICATIONS
- An Ensemble of Cooperative Extended Kohonen Maps for Complex Robot Motion Tasks.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Neural Computation, volume 17, issue 6, pages 1411-1445, Jun 2005.
Abstract. Self-organizing feature maps such as extended Kohonen maps (EKMs) have been very successful at learning sensorimotor control for mobile robot tasks. This letter presents a new ensemble approach, cooperative EKMs with indirect mapping, to achieve complex robot motion. An indirect-mapping EKM self-organizes to map from the sensory input space to the motor control space indirectly via a control parameter space. Quantitative evaluation reveals that indirect mapping can provide finer, smoother, and more efficient motion control than does direct mapping by operating in a continuous, rather than discrete, motor control space. It is also shown to outperform basis function neural networks. Furthermore, training its control parameters with recursive least squares enables faster convergence and better performance compared to gradient descent. The cooperation and competition of multiple self-organized EKMs allow a nonholonomic mobile robot to negotiate unforeseen, concave, closely spaced, and dynamic obstacles. Qualitative and quantitative comparisons with neural network ensembles employing weighted sum reveal that our method can achieve more sophisticated motion tasks even though the weighted-sum ensemble approach also operates in continuous motor control space.
- Continuous-Spaced Action Selection for Single- and Multi-Robot Tasks Using Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the IEEE International Conference on Networking, Sensing and Control (ICNSC'04)
(Invited Paper to Special Session on Visual Surveillance), pages 198-203, Taipei, Taiwan, Mar 21-23, 2004.
Abstract. Action selection is a central issue in the design of behavior-based control architectures for autonomous mobile robots. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps so that more complex motion tasks can be achieved. Qualitative and quantitative comparisons for both single- and multi-robot motion tasks show that our framework can provide better action selection than do action superposition methods.
- Action Selection for Single- and Multi-Robot Tasks Using Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pages 1505-1506, Acapulco, Mexico, Aug 9-15, 2003.
27.6% acceptance rate
Abstract. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks. Qualitative and quantitative comparisons for single- and multi-robot tasks show our framework can provide better action selection than do potential fields method.
- Action Selection in Continuous State and Action Spaces by Cooperation and Competition of Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
2nd International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS-03), pages 1056-1057, Melbourne, Australia, Jul 14-18, 2003.
Abstract. This paper presents an action selection framework based on an assemblage of self-organizing neural networks called Cooperative Extended Kohonen Maps. This framework encapsulates two features that significantly enhance a robot's action selection capability: self-organization in the continuous state and action spaces to provide smooth, efficient and fine motion control; action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks. Qualitative tests demonstrate the capability of our action selection method for both single- and multi-robot motion tasks.
VIDEO DEMO
- Cooperative tracking of moving targets by robots
using cooperative Extended Kohonen Maps
INTEGRATED ROBOT PLANNING AND CONTROL
PROJECT DURATION : Jul 2001 - Sep 2002
PROBLEM MOTIVATION
Robot motion research has proceeded along two separate directions:
high-level deliberative planning and low-level reactive control.
Deliberative planning uses a world model to generate an optimal sequence of collision-free actions
that can achieve a globally specified goal in a complex static environment.
However, in a dynamic environment, unforeseen obstacles may obstruct the action sequence,
and replanning to react to these situations can be too computationally expensive.
On the other hand, reactive control directly couples sensed data to appropriate actions.
It allows the robot to respond robustly and timely to unexpected obstacles and environmental changes
but may be trapped by them.
PROPOSED METHODOLOGY
The problem of goal-directed, collision-free motion in a complex, unpredictable environment can be solved
by tightly integrating high-level deliberative planning with low-level reactive control.
This work presents two such architectures for a nonholonomic mobile robot.
To achieve real-time performance, reactive control capabilities have to be fully realized so that
the deliberative planner can be simplified.
These architectures are enriched with reactive target reaching and obstacle avoidance modules.
Their target reaching modules use indirect-mapping Extended Kohonen Map to provide finer and smoother motion control
than direct-mapping methods.
While one architecture fuses these modules indirectly via command fusion,
the other one couples them directly using cooperative Extended Kohonen Maps,
enabling the robot to negotiate unforeseen concave obstacles.
The planner for both architectures use a slippery cells technique to
decompose the free workspace into fewer cells, thus reducing search time.
Any two points in the cell can still be traversed by reactive motion.
PUBLICATIONS
- Enhancing the Reactive Capabilities of Integrated Planning and Control with Cooperative Extended Kohonen Maps.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA'03), pages 3428-3433, Taipei, Taiwan, May 12-17, 2003.
Abstract. Despite the many significant advances made in robot motion research, few works have focused on the tight integration of high-level deliberative planning with reactive control at the lowest level. In particular, the real-time performance of existing integrated planning and control architectures is still not optimal because the reactive control capabilities have not been fully realized. This paper aims to enhance the low-level reactive capabilities of integrated planning and control with Cooperative Extended Kohonen Maps for handling complex, unpredictable environments so that the work-load of the high-level planner can be consequently eased. The enhancements include fine, smooth motion control, execution of more complex motion tasks such as overcoming unforeseen concave obstacles and traversing between closely spaced obstacles, and asynchronous execution of behaviors.
- A Hybrid Mobile Robot Architecture with Integrated Planning and Control.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
1st International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS-02), pages 219-226, Bologna, Italy, Jul 15-19, 2002.
26% acceptance rate
Abstract. Research in the planning and control of mobile robots has received much attention in the past two decades. Two basic approaches have emerged from these research efforts: deliberative vs. reactive. These two approaches can be distinguished by their different usage of sensed data and global knowledge, speed of response, reasoning capability, and complexity of computation. Their strengths are complementary and their weaknesses can be mitigated by combining the two approaches in a hybrid architecture. This paper describes a method for goal-directed, collision-free navigation in unpredictable environments that employs a behavior-based hybrid architecture with asynchronously operating behavioral modules. It differs from existing hybrid architectures in two important ways: (1) the planning module produces a sequence of checkpoints instead of a conventional complete path, and (2) in addition to obstacle avoidance, the reactive module also performs target reaching under the control of a self-organizing neural network. The neural network is trained to perform fine, smooth motor control that moves the robot through the checkpoints. These two aspects facilitate a tight integration between high-level planning and low-level control, which permits real-time performance and easy path modification even when the robot is en route to the goal position.
- Integrated Planning and Control of Mobile Robot with Self-Organizing Neural Network.
Kian Hsiang Low, Wee Kheng Leow & Marcelo H. Ang, Jr.
In Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA'02), pages 3870-3875, Washington, DC, May 11-15, 2002.
Abstract. Despite the many significant advances made in robotics research, few works have focused on the tight integration of task planning and motion control. Most integration works involve the task planner providing discrete commands to the low-level controller, which performs kinematics and control computations to command the motor and joint actuators. This paper presents a framework of the integrated planning and control for mobile robot navigation. Unlike existing integrated approaches, it produces a sequence of checkpoints instead of a complete path at the planning level. At the motion control level, a neural network is trained to perform motor control that moves the robot from one checkpoint to the next. This method allows for a tight integration between high-level planning and low-level control, which permits real-time performance and easy modification of motion path while the robot is enroute to the goal position.
- Integrated Robot Planning and Control with Extended Kohonen Maps.
Kian Hsiang Low.
Master's Thesis, Department of Computer Science, School of Computing, National University of Singapore, Jul 2002.
Singapore Computer Society Prize for best M.Sc. Thesis 2002-2003
Abstract. The problem of goal-directed, collision-free motion in a complex, unpredictable environment can be solved by tightly integrating high-level deliberative planning with low-level reactive control. This thesis presents two such architectures for a nonholonomic mobile robot. To achieve real-time performance, reactive control capabilities have to be fully realized so that the deliberative planner can be simplified. These architectures are enriched with reactive target reaching and obstacle avoidance modules. Their target reaching modules use indirect-mapping Extended Kohonen Map to provide finer and smoother motion control than direct-mapping methods. While one architecture fuses these modules indirectly via command fusion, the other one couples them directly using cooperative Extended Kohonen Maps, enabling the robot to negotiate unforeseen concave obstacles. The planner for both architectures use a slippery cells technique to decompose the free workspace into fewer cells, thus reducing search time. Any two points in the cell can still be traversed by reactive motion.
VIDEO DEMOS
- Robot motion in an environment with unforeseen stationary obstacle
using command fusion
- Robot motion in an environment with unforeseen moving obstacle
using command fusion
- Robot motion in an environment that changes using command
fusion
- Robot motion in an environment with
unforeseen stationary concave and narrowly spaced convex obstacles using cooperative Extended Kohonen Maps
- Robot motion in an environment with
unforeseen moving obstacles using cooperative Extended Kohonen Maps
|