Multi-Agent Planning, Learning, and Coordination Group (MapleCG)
Towards bridging the gaps between planning, learning, and coordination
LOW, BRYAN KIAN HSIANG |劉謙雄|
Assistant Professor > CS > NUS
Ph.D. > ECE ◊ RI > CMU
MapleCG > CS > NUS
My Conception of a Swarm of Agents.
Courtesy of Cocoa Yeo (2008).
Invited to serve as a World Economic Forum Global Future Councils Fellow
for the Council on the Future of Artificial Intelligence and Robotics, Sep 2016
Congratulations to Quang Minh Hoang for receiving the Outstanding Undergraduate Researcher Prize, NUS!
Our ICML 2017 submission is accepted!
Our AAAI 2017 submission is accepted!
Our ICML 2016 submission is accepted!
Our IJCAI 2016 submission is accepted!
Source code for our parallel/distributed Gaussian processes (ICML 2016, AAAI 2015, UAI 2013) and
online/anytime Gaussian processes (ICML 2015, AAAI 2014) are now available!
MOE AcRF Tier 2 Grant : Scaling up Gaussian Process Predictive Models for Big Data,
SGD $637,884, Jul 2017 - Jul 2020
Research collaboration agreement with Panasonic R&D Singapore :
Sonar Data Fusion Algorithm for Object Distance Estimation,
SGD $48,792, Dec 2016 - Jul 2017
Invited to serve as senior program committee member of IJCAI 2015,
associate editor of IROS 2012 & ICRA 2011,
program committee member of AAAI 2016-2017, IJCAI CompSust Track 2015,
AAMAS 2011-2014, 2016, RSS 2014,
ICAPS 2010-2012, IJCAI 2011, 2017, AAAI 2010, and
reviewer of NIPS 2013-2016
Students interested in joining MapleCG: click here for more info
Appointed as a member of the CS Executive Committee,
Department of Computer Science, NUS, Jul 2014 - Jun 2017
CS3243 Introduction to Artificial Intelligence ◊ Sem 2 ◊ 2010-11 ◊ 2012-13 ◊ 2013-14 ◊ 2014-15 ◊ 2015-16 ◊ 2016-17
CS4246 AI Planning and Decision Making ◊ Sem 1 ◊ 2010-11 ◊ 2011-12 ◊ 2015-16 ◊ 2016-17
CS1231 Discrete Structures ◊ Sem 1 ◊ 2009-10 ◊ 2010-11 ◊ 2011-12 ◊ 2012-13 ◊ 2013-14 ◊ 2014-15
I am looking for talented undergraduate and graduate students in NUS to join my MapleCG research group. If you are really excited and motivated to be involved in novel research in the fields of artificial intelligence, planning under uncertainty (i.e., decision-theoretic, information-theoretic), robotics, multi-agent systems (i.e., multi-agent coordination, planning, and learning), game theory, statistical machine learning, optimization, and/or swarm intelligence, please email me and we can set up a time for discussion. Please also take some time to view the research projects on the left.
I am currently advising the following students and research staff:
AUTOMATIC MACHINE LEARNING : BAYESIAN OPTIMIZATION
PROJECT DURATION : Feb 2016 - Present
EXPLORATION-EXPLOITATION DILEMMA IN ACTIVE LEARNING OF GAUSSIAN PROCESSES
PROJECT DURATION : Aug 2013 - Present
The exploration-exploitation dilemma arises in the following three problems of active learning of Gaussian processes:
ONLINE AND ANYTIME SPARSE GAUSSIAN PROCESSES FOR BIG DATA
PROJECT DURATION : Aug 2013 - Present
A Gaussian process regression (GPR) model is a Bayesian nonparametric model for performing nonlinear regression that provides a Gaussian predictive distribution with formal measures of predictive uncertainty. The expressivity of a full-rank GPR (FGPR) model, however, comes at a cost of cubic time in the size of the data, thus rendering it computationally impractical for training with massive datasets. To improve its scalability, a number of sparse GPR (SGPR) models exploiting low-rank approximate representations have been proposed, many of which share a similar structural assumption of conditional independence (albeit of varying degrees) based on the notion of inducing variables and consequently incur only linear time in the data size. The work of Quinonero-Candela & Rasmussen (2005) has in fact presented a unifying view of such SGPR models, which include the subset of regressors (SoR), deterministic training conditional (DTC), fully independent training conditional (FITC), fully independent conditional (FIC), partially independent training conditional (PITC), and partially independent conditional (PIC) approximations. To scale up these SGPR models further for performing real-time predictions necessary in many time-critical applications and decision support systems (e.g., ocean sensing, traffic monitoring), the work of Gal et al. (2014) has parallelized DTC while that of Chen et al. (2013) has parallelized FITC, FIC, PITC, and PIC to be run on multiple machines. The recent work of Low et al. (2015) has produced a spectrum of SGPR models with PIC and FGPR at the two extremes that are also amenable to parallelization on multiple machines. Ideally, these parallel SGPR models can reduce the incurred time of their centralized counterparts by a factor close to the number of machines. In practice, since the number of machines is limited due to budget constraints, their incurred time will still grow with an increasing size of data. 
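The cubic-time bottleneck described above can be made concrete with a minimal full-rank GPR sketch (illustrative only; an RBF covariance and i.i.d. Gaussian noise are assumed, and all names are our own):

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between row-vector inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, Xstar, noise=0.1):
    """Full-rank GP regression: Gaussian predictive mean and variance."""
    K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
    L = np.linalg.cholesky(K)                      # O(n^3) in the data size n
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(Xstar, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(Xstar, Xstar).diagonal() - (v**2).sum(0)
    return mean, var
```

The single Cholesky factorization of the n-by-n covariance matrix is the cubic-time step that the SGPR models avoid by working with a small set of inducing variables instead.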
Like their centralized counterparts, they can be trained using all the data. When the data is expected to stream in over a (possibly indefinitely) long time, it is also computationally impractical to repeatedly use these existing offline sparse GP approximation methods or even the online GP model (i.e., quadratic time in the data size) for training at each time step.
A more affordable alternative is to instead train an SGPR model in either an (1) online or (2) anytime fashion with a small, randomly sampled subset of the data at each iteration, which requires only a single machine.
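As an illustration of why such online training can be affordable, consider a subset-of-regressors (SoR/DTC-style) model: its posterior over the m inducing variables depends on the data only through additive statistics, so each batch can be absorbed in O(m^2) time per point, independent of the total data size. This is a hypothetical sketch under those assumptions, not the authors' actual online/anytime algorithms:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

class OnlineSoR:
    """Subset-of-regressors GP updated online via additive statistics
    (A, b) w.r.t. a fixed set of inducing inputs Z."""
    def __init__(self, Z, noise=0.1):
        self.Z, self.noise = Z, noise
        self.A = noise**2 * rbf(Z, Z) + 1e-8 * np.eye(len(Z))
        self.b = np.zeros(len(Z))

    def absorb(self, X, y):
        """Fold a (mini-)batch into the statistics; old data can be discarded."""
        Kuf = rbf(self.Z, X)
        self.A += Kuf @ Kuf.T
        self.b += Kuf @ y

    def predict(self, Xstar):
        Ksu = rbf(Xstar, self.Z)
        mean = Ksu @ np.linalg.solve(self.A, self.b)
        var = self.noise**2 * (Ksu * np.linalg.solve(self.A, Ksu.T).T).sum(1)
        return mean, var
```

Absorbing the data in several batches yields exactly the same statistics, and hence the same predictions, as absorbing it all at once.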
PARALLEL AND DISTRIBUTED SPARSE GAUSSIAN PROCESSES FOR BIG DATA
Gaussian process (GP) models are a rich class of Bayesian non-parametric models that can perform probabilistic regression by providing Gaussian predictive distributions with formal measures of the predictive uncertainty. Unfortunately, a GP model is handicapped by its poor scalability in the size of the data, hence limiting its practical use to small data. To improve its scalability, two families of sparse GP regression methods have been proposed: (a) Low-rank approximate representations of the full-rank GP (FGP) model are well-suited for modeling slowly-varying functions with large correlation and can use all the data for predictions. But, they require a relatively high rank to capture small-scale features/patterns (i.e., of small correlation) with high fidelity, thus losing their computational advantage. (b) In contrast, localized regression and covariance tapering methods (e.g., local GPs and compactly supported covariance functions) are particularly useful for modeling rapidly-varying functions with small correlation. However, they can only utilize local data for predictions, thereby performing poorly in input regions with little/no data. Furthermore, to accurately represent large-scale features/patterns (i.e., of large correlation), the locality/tapering range has to be increased considerably, thus sacrificing their time efficiency.
Recent sparse GP regression methods have unified approaches from the two families described above to harness their complementary modeling and predictive capabilities (hence, eliminating their deficiencies) while retaining their computational advantages. Specifically, after approximating the FGP (in particular, its covariance matrix) with a low-rank representation based on the notion of inducing variables, a sparse covariance matrix approximation of the resulting residual process is made. However, this sparse residual covariance matrix approximation imposes a fairly strong conditional independence assumption given the inducing variables since the number of inducing variables cannot be too large to preserve time efficiency. We argue in this work that such a strong assumption is an overkill: It is in fact possible to construct a more refined, dense residual covariance matrix approximation by exploiting a Markov assumption and, perhaps surprisingly, still achieve scalability, which distinguishes our work here from existing sparse GP regression methods utilizing low-rank representations (i.e., including the unified approaches) described earlier. As a result, our proposed residual covariance matrix approximation can significantly relax the conditional independence assumption (especially with larger data), hence potentially improving the predictive performance.
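The decomposition described above (a low-rank covariance approximation based on inducing variables plus a residual process) can be sketched as a Nyström-style split, with the inducing inputs taken, for simplicity, as a subset of the training inputs; how the residual term is then approximated is precisely where the sparse GPR variants differ. Illustrative code with our own naming:

```python
import numpy as np

def nystrom_residual(K, idx):
    """Split a covariance matrix K into a low-rank part based on inducing
    columns idx, i.e. K[:, idx] K[idx, idx]^-1 K[idx, :], plus a residual R.
    Sparse GPR methods differ in how they approximate R (diagonal for FITC,
    block-diagonal for PITC, dense-but-Markov for LMA)."""
    Knm = K[:, idx]
    Kmm = K[np.ix_(idx, idx)]
    low_rank = Knm @ np.linalg.solve(Kmm + 1e-10 * np.eye(len(idx)), Knm.T)
    return low_rank, K - low_rank
```

Two properties worth noting: the residual vanishes on the inducing rows/columns, and it is itself a valid (positive semi-definite) covariance, being a Schur complement.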
This work presents a low-rank-cum-Markov approximation (LMA) of the FGP model that is novel in leveraging the dual computational advantages stemming from complementing the reduced-rank covariance matrix approximation based on the inducing variables with the residual covariance matrix approximation due to the Markov assumption; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint. Consequently, our proposed LMA method can trade off between the number of inducing variables and the order of the Markov property to (a) incur lower computational cost than sparse GP regression methods utilizing low-rank representations with only the number of inducing variables or spectral points as the varying parameter while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale. Interestingly, varying the Markov order produces a spectrum of LMAs with the partially independent conditional (PIC) approximation and FGP at the two extremes. An important advantage of LMA over most existing sparse GP regression methods is that it is amenable to parallelization on multiple machines/cores, thus gaining greater scalability for performing real-time predictions necessary in many time-critical applications and decision support systems (e.g., ocean sensing, traffic monitoring). Our parallel LMA method is implemented using the message passing interface (MPI) framework to run in clusters of up to 32 computing nodes and its predictive performance, scalability, and speedup are empirically evaluated on three real-world datasets (i.e., including a million-sized dataset).
INTENTION-AWARE PLANNING UNDER UNCERTAINTY FOR INTERACTING OPTIMALLY WITH SELF-INTERESTED AGENTS
PROJECT DURATION : May 2011 - Present
PROJECT AFFILIATION : Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborators: Emilio Frazzoli, MIT; Daniela Rus, MIT)
PROJECT FUNDING : SMART Subaward Agreements - FM IRG : Autonomy in Mobility-On-Demand Systems, SGD $1,348,638.22, Aug 2010 - Dec 2015
Designing and developing efficient planning algorithms for intelligent agents to interact and perform effectively among other self-interested agents has recently emerged as a grand challenge in non-cooperative multi-agent systems. Such a challenge is posed by many real-world applications, which include automated electronic trading markets where software agents interact, and traffic intersections where autonomous cars have to negotiate with human-driven vehicles in order to cross, among others. Modeling, predicting, and learning the other agents' intentions efficiently is therefore critical to overcoming this challenge.
In practice, it is highly non-trivial to model and predict the other agents' intentions efficiently. Existing works addressing this challenge are often undermined by either restrictive assumptions on the agents' behaviors or the prohibitively expensive cost of modeling and predicting their intentions:
Furthermore, there is another practical concern for these decision-theoretic approaches: They often require the behavioral model's parameters (e.g., hierarchical level k) to be completely specified by the practitioners, which can be very impractical in many situations where it is non-trivial to do so or the prior knowledge is insufficient to reliably derive these parameters. This essentially boils down to the need of learning while interacting with the other self-interested agents and, interestingly, an exploitation-exploration trade-off to be made while doing so: Should an agent exploit the "best" action based on its (possibly misleading) knowledge to maximize the payoff or explore a "sub-optimal" action to refine its knowledge?
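The exploitation-exploration trade-off above is the same one faced by a Bayesian bandit; Thompson sampling, for instance, resolves it by acting greedily on a *sample* from the current belief rather than on the belief's mean. The following is a toy Bernoulli-bandit sketch of that idea, not the interacting-agents setting studied here:

```python
import numpy as np

def thompson_bandit(true_means, horizon=2000, seed=0):
    """Thompson sampling on a Bernoulli bandit with Beta(1, 1) priors:
    each step samples a payoff estimate per action from the belief and
    plays the argmax, so uncertain actions still get explored."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    wins, losses = np.ones(k), np.ones(k)
    total = 0
    for _ in range(horizon):
        a = np.argmax(rng.beta(wins, losses))  # sample beliefs, act greedily
        r = rng.random() < true_means[a]       # stochastic payoff
        wins[a] += r
        losses[a] += 1 - r
        total += r
    return total, int(np.argmax(wins / (wins + losses)))
```

Early on the Beta posteriors are wide, so "sub-optimal" actions are tried; as evidence accumulates, play concentrates on the genuinely best action.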
Naively, one may attempt to address this trade-off by directly tapping into the huge body of existing works in Bayesian Reinforcement Learning (BRL), which offers a broad range of principled treatments of this issue under single-agent contexts. However, most of these works often assume very simple and specific parameterizations of the unknown environments, thus rendering them inapplicable to the context where the other agent's behavior has a far more complicated parameterization. More importantly, the other agent's behavior often needs to be modeled differently depending on the specific context. In the context of existing BRL frameworks, either the domain expert struggles to best fit his prior knowledge to the supported set of parameterizations or the agent developer has to re-design the framework to incorporate a new modeling scheme. Arguably, there is no free lunch when it comes to modeling the agent's behavior across various applications.
The main focus of our work here is thus to investigate and address the following questions:
GAUSSIAN PROCESS DECENTRALIZED DATA FUSION & ACTIVE SENSING AGENTS FOR MOBILITY-ON-DEMAND SYSTEMS
Towards Large-Scale Spatiotemporal Traffic Modeling and Prediction
PROJECT DURATION : Aug 2010 - Present
PROJECT AFFILIATION : Singapore-MIT Alliance for Research and Technology (SMART) Future Urban Mobility (FM) IRG (Collaborator: Patrick Jaillet, MIT)
PRIVATE automobiles are becoming unsustainable personal mobility solutions in densely populated urban cities because the addition of parking and road spaces cannot keep pace with their escalating numbers due to limited urban land. For example, Hong Kong and Singapore have, respectively, experienced 27.6% and 37% increases in private vehicles from 2003 to 2011. However, their road networks have only expanded less than 10% in size. Without implementing sustainable measures, traffic congestion and delays will grow more severe and frequent, especially during peak hours.
Mobility-on-demand (MoD) systems (e.g., Velib system of over 20000 shared bicycles in Paris, experimental carsharing systems) have recently emerged as a promising paradigm of one-way vehicle sharing for sustainable personal urban mobility, specifically, to tackle the problems of low vehicle utilization and parking space shortage caused by private automobiles. Conventionally, a MoD system provides stacks and racks of light electric vehicles distributed throughout a city: When a user wants to go somewhere, he simply walks to the nearest rack, swipes a card to pick up a vehicle, drives it to the rack nearest to his destination, and drops it off. In this work, we assume the capability of a MoD system to be enhanced by deploying robotic shared vehicles (e.g., General Motors Chevrolet EN-V 2.0 prototype) that can autonomously drive and cruise the streets of a densely populated urban city to be hailed by users (like taxis) instead of just waiting at the racks to be picked up. Compared to the conventional MoD system, the fleet of autonomous robotic vehicles provides greater accessibility to users who can be picked up and dropped off at any location in the road network. As a result, it can service regions of high mobility demand but with poor coverage of stacks and racks due to limited space for their installation.
The key factors in the success of a MoD system are the costs to the users and system latencies, which can be minimized by managing the MoD system effectively. To achieve this, two main technical challenges need to be addressed: (a) Real-time, fine-grained mobility demand sensing and prediction, and (b) real-time active fleet management to balance vehicle supply and demand and satisfy latency requirements at sustainable operating costs. Existing works on load balancing in MoD systems, dynamic traffic assignment problems, dynamic one-to-one pickup and delivery problems, and location recommendation and dispatch for cruising taxis have tackled variants of the second challenge by assuming the necessary inputs of mobility demand and traffic flow information to be perfectly or accurately known using prior knowledge or offline processing of historic data. Such an assumption does not hold for densely populated urban cities because their mobility demand patterns and traffic flow are often subject to short-term random fluctuations and perturbations due to frequent special events (e.g., storewide sales, exhibitions), unpredictable weather conditions, unforeseen emergencies (e.g., breakdowns in public transport services), or traffic incidents (e.g., accidents, vehicle breakdowns, roadworks). So, in order for the active fleet management strategies to perform well in fleet rebalancing and route planning to service the mobility demands, they require accurate, fine-grained predictive information of the spatiotemporally varying mobility demand patterns and traffic flow in real time, the former of which is the desired outcome of addressing the first challenge. To the best of our knowledge, there is little progress in the algorithm design and development to take on the first challenge, which will be a focus of our work here.
In practice, it is non-trivial to achieve real-time, accurate prediction of spatiotemporally varying traffic phenomena such as mobility demand patterns and traffic flow because the quantity of sensors that can be deployed to observe an entire road network is cost-constrained. For example, static sensors such as loop detectors are traditionally placed at designated locations in a road network to collect data for predicting the traffic flow. However, they provide sparse coverage (i.e., many road segments are not observed, thus leading to data sparsity), incur high installation and maintenance costs, and cannot reposition by themselves in response to changes in the traffic flow. Low-cost GPS technology allows the collection of traffic flow data using passive mobile probes (e.g., taxis/cabs). Unlike static sensors, they can directly measure the travel times along road segments. But, they provide fairly sparse coverage due to low GPS sampling frequency (i.e., often imposed by taxi/cab companies) and no control over their routes, incur high initial implementation cost, pose privacy issues, and produce highly-varying speeds and travel times while traversing the same road segment due to inconsistent driving behaviors. A critical mass of probes is needed on each road segment to ease the severity of the last drawback but is often hard to achieve on non-highway segments due to sparse coverage. In contrast, we propose using the autonomous robotic vehicles as active mobile probes to overcome the limitations of static and passive mobile probes. In particular, they can be directed to explore any segments of a road network to gather real-time mobility demand data (e.g., pickup counts of different regions) and traffic flow data (e.g., speeds and travel times along road segments) at a desired GPS sampling rate while enforcing consistent driving behavior. 
How then can the vacant autonomous robotic vehicles in a MoD system actively cruise a road network to gather and assimilate the most informative data for predicting a spatiotemporally varying traffic phenomenon like a mobility demand pattern or traffic flow? To solve this problem, a centralized approach to data fusion and active sensing is poorly suited because it suffers from a single point of failure and incurs huge communication, space, and time overheads with large data and fleet.
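A decentralized alternative is for each vehicle to compress its local observations into fixed-size statistics with respect to a shared set of support inputs and exchange only those: summing the exchanged summaries recovers exactly what a central server would compute from the pooled data. A minimal sketch with SoR-style statistics and hypothetical names (not the algorithms proposed in this work):

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def local_summary(Z, X, y):
    """Each vehicle compresses its own observations (X, y) into fixed-size
    statistics w.r.t. shared support inputs Z: cost is independent of the
    fleet size, and only these summaries need to be communicated."""
    Kuf = rbf(Z, X)
    return Kuf @ Kuf.T, Kuf @ y

def fused_predict(Z, summaries, Xstar, noise=0.1):
    """Summing the exchanged summaries yields the same support-input
    posterior a central server would compute from the pooled data."""
    A = noise**2 * rbf(Z, Z) + sum(s[0] for s in summaries)
    b = sum(s[1] for s in summaries)
    Ksu = rbf(Xstar, Z)
    return Ksu @ np.linalg.solve(A, b)
```

Because the summaries are additive, the fusion can proceed over any communication pattern (e.g., pairwise exchanges among vehicles) without a single point of failure.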
Our work proposes novel decentralized data fusion and active sensing algorithms for real-time, fine-grained traffic sensing, modeling, and prediction with a fleet of autonomous robotic vehicles in a MoD system. Note that the decentralized data fusion component of our proposed algorithms can also be used for static sensors and passive mobile probes. The specific contributions of our work here include:
PLANNING UNDER UNCERTAINTY FOR LARGE-SCALE ACTIVE MULTI-CAMERA SURVEILLANCE
PROJECT AFFILIATION : Sensor-Enhanced Social Media (SeSaMe) Centre (Collaborator: Mohan Kankanhalli)
MEDIA NEWS : TODAY's Science section (15 May 2015) - 'When CCTV cameras work together as one'
The problem of surveillance has grown to be a critical concern in many urban cities worldwide following a recent series of security threats like the Mumbai terrorist attacks and the London bombings. Central to the problem of surveillance is that of monitoring, tracking, and observing multiple mobile targets of interest distributed over a large-scale obstacle-ridden environment (e.g., airport terminals, railway and subway stations, bus depots, shopping malls, school campuses, military bases). It is often necessary to acquire high-resolution videos/images of these targets for supporting real-world surveillance applications like activity/intention tracking and recognition, biometric analysis like target identification and face recognition, surveillance video mining, and forensic video analysis/retrieval, among others.
Traditional surveillance systems consist of a large network of fixed/static CCTV (Closed Circuit Television) cameras that are placed to constantly focus on selected important locations in buildings such as entrances/exits and lobbies. Unfortunately, the maximum resolution of these cameras is limited to 720 x 480 pixels. So, they cannot capture high-resolution images/videos of the targets, especially when the targets are far away from the cameras. As a result, they perform poorly in acquiring the close-up views of the targets and their activities. HDTV/Megapixel cameras have recently been introduced to overcome this resolution issue. Similar to CCTV cameras, these fixed/static HDTV/megapixel cameras are placed to constantly focus on specific locations in the environment. A relatively large network of such cameras has to be installed in order to observe the targets in any region of the environment at high resolution, which is impractical in terms of equipment, installation, and maintenance costs.
The use of active PTZ (Pan/Tilt/Zoom) cameras is becoming an increasingly popular alternative to that of fixed/static cameras for surveillance because the active cameras are endowed with pan-tilt-zoom capabilities that can be exploited to focus on and observe the targets at high image/video resolution. Hence, fewer active cameras need to be deployed to be able to capture high-resolution images/videos of the targets in any region of the environment. In order to achieve effective real-time surveillance, an efficient automated mechanism is required to autonomously coordinate and control these cameras' actions.
The objective of this work is thus to address the following central surveillance problem: "How can a network of active cameras be coordinated and controlled to maximize the number of targets observed with a guaranteed image resolution?"
This work presents a novel principled decision-theoretic planning under uncertainty approach to coordinating and controlling a large-scale network of active cameras for performing high-quality surveillance of large crowds of moving targets. In particular, our approach addresses the following practical issues affecting the surveillance problem:
(a) Multiple sources of uncertainty. A typical surveillance environment is fraught with multiple sources of uncertainty such as noisy cameras' observations, stochastic targets' motion, and unknown targets' locations, etc. These uncertainties make it difficult for the active cameras to know where to observe in order to keep the targets within their fields of view (fov). Consequently, they may lose track of the observed targets. To resolve this, our approach models a belief over the targets' states (i.e., locations, directions, and velocities) and updates the belief in a Bayesian paradigm based on probabilistic models of the targets' motion and the active cameras' observations;
(b) Camera-target ratio. In crowded environments, the number of targets to be observed is usually much greater than the number of available cameras. When the number of targets increases, a surveillance system, if poorly designed, tends to incur exponentially increasing computation time, which degrades the real-time performance of the entire surveillance system;
(c) Trade-off between maximizing the expected number of observed targets and the image resolution of observing them. Increasing the resolution of observing some targets through panning, tilting, or zooming may result in the loss of other targets being tracked. To address this trade-off, the cameras' actions are coordinated to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed pre-defined resolution;
(d) Scalability. By exploiting the inherent structure of the surveillance problem, our approach can scale linearly in the number of targets to be observed;
(e) Real-time requirement. The cameras' actions are computed in real time;
(f) Occlusions. Many real-world surveillance environments contain obstacles that occlude the fov of some or perhaps even all of the cameras, thus preventing the cameras from persistently tracking their observed targets. The regions where the targets cannot be observed by any of the cameras due to obstacles are said to be occluded. When the targets reside in these occluded regions or are not within the fov of any camera, the surveillance system loses track of them, thus degrading the surveillance performance. Such environments are called partially observable in the sense that the exact locations of the targets may not be observed directly by the cameras at all times.
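The Bayesian belief update in (a) can be sketched as one predict-update cycle of a discrete Bayes filter over a gridded target state, with the motion and observation models below being purely illustrative:

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """One predict-update cycle of a discrete Bayes filter over target
    states: belief is a probability vector over states, transition[i, j]
    is the motion model P(next = j | current = i), and likelihood[j] is
    the observation model P(camera observation | state = j)."""
    predicted = belief @ transition          # predict with the motion model
    posterior = predicted * likelihood       # weight by the observation
    return posterior / posterior.sum()       # normalize to a distribution
```

When a target is occluded (case (f)), the likelihood is uninformative and the belief simply diffuses under the motion model until the target is re-observed.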
As demonstrated empirically through simulations, our approach can achieve high-quality surveillance of a large number of targets in real time and its surveillance performance degrades gracefully with an increasing number of targets. The real-world experiments show the practicality of our decision-theoretic approach to coordinate and control cameras in surveillance systems.
MULTI-ROBOT INFORMATIVE PATH PLANNING FOR ACTIVE SENSING OF SPATIOTEMPORAL ENVIRONMENTAL PHENOMENA
PROJECT DURATION : Jan 2010 - May 2013
PROJECT AFFILIATION : Collaborative Multi-Robot Exploration of the Coastal Ocean (Collaborators: John M. Dolan, CMU; Gaurav S. Sukhatme, USC; Kanna Rajan, MBARI)
PROJECT FUNDING : MOE AcRF Tier 1 Grant : Active Robotic Exploration and Mapping for Environmental Sensing Applications, SGD $165,377, Apr 2010 - Mar 2013
Research in environmental sensing and monitoring has recently gained significant attention and practical interest, especially in supporting environmental sustainability efforts worldwide. A key direction of this research aims at sensing, modeling, and predicting the various types of environmental phenomena spatially distributed over our natural and built-up habitats so as to improve our knowledge and understanding of their economic, environmental, and health impacts and implications. This is non-trivial to achieve due to a trade-off between the quantity of sensing resources (e.g., number of deployed sensors, energy consumption, mission time) and the uncertainty in predictive modeling. In the case of deploying a limited number of mobile robotic sensing assets, such a trade-off motivates the need to plan the most informative resource-constrained observation paths to minimize the uncertainty in modeling and predicting a spatially varying environmental phenomenon, which constitutes the active sensing problem to be addressed in this work.
A wide multitude of natural and urban environmental phenomena is characterized by spatially correlated field measurements, which raises the following fundamental issue faced by the active sensing problem:
How can the spatial correlation structure of an environmental phenomenon be exploited to improve the active sensing performance and computational efficiency of robotic path planning?
In this work, we will investigate the above issue for an important broad class of environmental phenomena called anisotropic fields that exhibit a (often much) higher spatial correlation along one direction than along its perpendicular direction. Such fields occur widely in natural and built-up environments and some of them include (a) ocean and freshwater phenomena like plankton density, fish abundance, temperature and salinity; (b) soil and atmospheric phenomena like peat thickness, surface soil moisture, rainfall; (c) mineral deposits like radioactive ore; (d) pollutant and contaminant concentration like air, heavy metals; and (e) ecological abundance like vegetation density.
This work presents two principled approaches to efficient information-theoretic path planning based on entropy and mutual information criteria for in situ active sensing of environmental phenomena. In contrast to the existing methods described above, our proposed path planning algorithms are novel in addressing a trade-off between active sensing performance and computational efficiency. An important practical consequence is that our algorithms can exploit the spatial correlation structure of anisotropic fields to improve time efficiency while preserving near-optimal active sensing performance.
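A simplified instance of entropy-based planning is greedy maximum-entropy selection of observation locations under a GP prior: since a Gaussian's entropy is monotone in its variance, each step picks the candidate location of largest posterior variance given the locations chosen so far. A sketch with unit prior variance and illustrative names (not the proposed path planning algorithms, which must additionally respect path constraints):

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def greedy_entropy_plan(candidates, budget, ls=1.0, noise=0.1):
    """Greedily select `budget` observation locations maximizing posterior
    entropy (equivalently, variance) under a unit-variance GP prior."""
    chosen = []
    for _ in range(budget):
        if not chosen:
            var = np.ones(len(candidates))
        else:
            S = candidates[chosen]
            K = rbf(S, S, ls) + noise**2 * np.eye(len(chosen))
            Ks = rbf(candidates, S, ls)
            var = 1.0 - (Ks * np.linalg.solve(K, Ks.T).T).sum(1)
        var[chosen] = -np.inf                 # never revisit a location
        chosen.append(int(np.argmax(var)))
    return chosen
```

On a one-dimensional grid the selected locations spread out, since nearby locations are highly correlated and thus add little entropy.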
ENVIRONMENTAL BOUNDARY TRACKING & ESTIMATION WITH MULTIPLE ROBOTS
PROJECT COLLABORATORS : John M. Dolan, CMU; Steve Chien, JPL, Caltech
PROJECT FUNDING : MOE AcRF Tier 1 Grant : Active Robotic Exploration and Mapping for Environmental Sensing Applications, SGD $165,377, Apr 2010 - Mar 2013
A fundamental problem in environmental sensing and monitoring is to identify and delineate the hotspot regions in a large-scale environmental field. It involves partitioning the area spanned by the field into one class of regions called the hotspot regions in which the field measurements exceed a predefined threshold, and the other class of regions where they do not. Such a problem arises in many real-world applications such as precision agriculture, monitoring of ocean and freshwater phenomena (e.g., plankton bloom), forest ecosystems, rare species, pollution (e.g., oil spill), or contamination (e.g., radiation leak). In these applications, it is necessary to assess the spatial extent and shape of the hotspot regions accurately due to severe economic, environmental, and health implications. In practice, this aim is non-trivial to achieve because the constraints on the sampling assets' resources (e.g., energy consumption, mission time, sensing range) limit the number and coverage of in situ observations over the large field that can be used to infer the hotspot regions. Subject to limited observations, the most informative ones should therefore be selected in order to minimize the uncertainty of estimating the hotspot regions (or, equivalently, classifying/labeling the hotspots) in the large field, which motivates the adaptive sampling approach taken in this work.
Mobile robot teams are particularly desirable for performing the above environmental sensing task because they can actively explore to map the hotspot regions at high resolution. On the other hand, static sensors lack mobility and are therefore not capable of doing this well unless a large quantity is deployed. While research in multi-robot exploration and mapping has largely focused on the conventional task of building occupancy grids, some recent efforts have been devoted to the more complex, general task of sampling spatially distributed environmental fields. In contrast to occupancy grids that assume discrete, independent cell occupancies, environmental fields are characterized by continuous-valued, spatially correlated measurements, properties of which cannot be exploited by occupancy grid mapping strategies to select the most informative observation paths. To exploit such properties, exploration strategies for learning environmental field maps have recently been developed and can be classified into two regimes: (a) wide-area coverage strategies consider sparsely sampled (i.e., largely unexplored) areas to be of high uncertainty and consequently spread observations evenly across the field; (b) hotspot sampling strategies assume areas of high uncertainty and interest to contain extreme, highly-varying measurements and hence produce clustered observations. Formal, principled approaches of exploration have also been devised to simultaneously perform hotspot sampling when a hotspot region is found as well as wide-area coverage to search for new hotspot regions in sparsely sampled areas. These strategies optimize their observation paths to minimize the uncertainty (e.g., in terms of mean-squared error or entropy) of mapping the entire continuous-valued field. They are, however, suboptimal for classifying/labeling the hotspots in the field, which we will discuss and demonstrate theoretically and empirically in this work.
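The distinction between mapping uncertainty and labeling uncertainty can be made concrete: given a GP posterior N(mean, std^2) at a location, the uncertainty of the hotspot label there is the Bernoulli entropy of the tail probability P(field > threshold), which peaks where the label is ambiguous (posterior mean near the threshold) rather than where the field value itself is most uncertain. An illustrative sketch:

```python
import math

def hotspot_entropy(mean, std, threshold):
    """Bernoulli entropy (in bits) of the hotspot label at a location
    whose field value has GP posterior N(mean, std^2)."""
    z = (threshold - mean) / std
    p = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))   # P(field > threshold)
    p = min(max(p, 1e-12), 1.0 - 1e-12)              # guard the logs
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
```

A location with posterior mean exactly at the threshold has one full bit of label uncertainty, however small its predictive variance; a location far above or below the threshold has almost none, however large its variance.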
This work presents a novel decentralized active robotic exploration (DARE) strategy for probabilistic classification/labeling of hotspots in a large-scale environmental field. The environmental field is assumed to be realized from a rich class of probabilistic spatial models called the Gaussian process (GP) that can formally characterize its spatial correlation structure. More importantly, the GP can provide formal measures of classification/labeling uncertainty (i.e., in the form of cost functions) such as the misclassification and entropy criteria for directing the robots to explore highly uncertain areas of the field. The chief impediment to using these formal criteria is that they result in cost-minimizing exploration strategies that cannot be solved in closed form. To resolve this, they are reformulated as reward-maximizing dual strategies, from which we can then derive the approximate DARE strategy that can be solved efficiently in closed form. The specific contributions of our work include:
MULTI-ROBOT ADAPTIVE SAMPLING FOR ENVIRONMENTAL SENSING & MONITORING
PROJECT DURATION : Jul 2005 - Mar 2010
PROJECT LAB : T-SAR
Recent research in multi-robot exploration and mapping has focused on sampling environmental fields, some of which typically feature a few small hotspots in a large region. Such a hotspot field often arises in two real-world applications: (1) planetary exploration such as geologic reconnaissance and prospecting for mineral deposits or natural gases, and (2) environmental and ecological sensing such as precision agriculture, and monitoring of ocean phenomena (e.g., plankton bloom, anoxic zones), forest ecosystems, rare species, pollution (e.g., oil spill), or contamination (e.g., radiation leak). In particular, the hotspot field is characterized by continuous-valued, spatially correlated measurements with the hotspots exhibiting extreme measurements and much higher spatial variability than the rest of the field. With limited (e.g., point-based) robot sensing range, complete coverage becomes impractical in terms of resource costs (e.g., energy consumption, mission time). So, to map the field accurately, the hotspots have to be sampled at a higher resolution.
The hotspot field discourages static sensor placement because a large number of sensors has to be positioned to detect and refine the sampling of hotspots. If these static sensors are not placed in any hotspot initially, they cannot reposition by themselves to locate one. In contrast, a robot team is capable of performing high-resolution hotspot sampling due to its mobility. Hence, it is desirable to build a mobile robot team that can actively explore to map a hotspot field.
To learn a hotspot field map, the exploration strategy of the robot team has to plan the most informative resource-constrained observation paths that minimize the uncertainty of mapping the hotspot field. By representing the hotspot field using rich classes of Bayesian non-parametric models such as the Gaussian process or log-Gaussian process, formal measures of mapping uncertainty (e.g., based on mean-squared error [AAMAS-08] or entropy [ICAPS-09] criterion) can be defined and subsequently exploited by our proposed adaptive sampling algorithms for directing the robot team to explore highly uncertain areas of the field. In contrast to non-adaptive sampling strategies that only perform well with smoothly-varying fields, our non-myopic adaptive sampling algorithms can exploit clustering phenomena (i.e., hotspots) to plan observation paths that produce lower mapping uncertainty.
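To make the entropy criterion concrete, here is a minimal one-step sketch (not the non-myopic algorithms of [AAMAS-08, ICAPS-09]): under a GP with a squared-exponential kernel, the Gaussian predictive entropy at a location is monotone in the predictive variance, so a greedy strategy simply observes the candidate location of largest posterior variance. Function names, kernel settings, and the 1-D input space are illustrative assumptions:

```python
import numpy as np

def sq_exp_kernel(X1, X2, lengthscale=1.0, signal_var=1.0):
    # Squared-exponential covariance between two sets of 1-D inputs.
    d = X1[:, None] - X2[None, :]
    return signal_var * np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_variance(X_obs, X_cand, noise_var=1e-4, **kw):
    # Standard GP regression posterior variance at candidate locations.
    K = sq_exp_kernel(X_obs, X_obs, **kw) + noise_var * np.eye(len(X_obs))
    k = sq_exp_kernel(X_obs, X_cand, **kw)
    prior = sq_exp_kernel(X_cand, X_cand, **kw)
    return np.diag(prior - k.T @ np.linalg.solve(K, k))

def next_observation(X_obs, X_cand, **kw):
    """Greedy one-step entropy criterion: Gaussian entropy is monotone in
    variance, so pick the candidate with the largest posterior variance."""
    v = posterior_variance(X_obs, X_cand, **kw)
    return X_cand[int(np.argmax(v))]
```

Note that this greedy, variance-only rule reproduces the wide-area coverage behavior described above; adaptive hotspot sampling additionally conditions on the observed measurements themselves.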
DISTRIBUTED LAYERED ARCHITECTURE FOR SELF-ORGANIZING MOBILE SENSOR NETWORKS
PROJECT DURATION : Nov 2002 - Jun 2005
One of the fundamental issues that arises in a sensor network is coverage. Traditionally, network coverage is maximized by determining the optimal placement of static sensors in a centralized manner, which can be related to the class of art gallery problems. However, recent investigations in sensor network mobility reveal that mobile sensors can self-organize to provide better coverage than static placement. Existing applications have only utilized uninformed mobility (i.e., random motion or patrol). In contrast, our work here focuses on informed, intelligent mobility to further improve coverage. Our network coverage problem is motivated by the following constraints that discourage static sensor placement or uninformed mobility: (a) no prior information about the exact target locations, population densities, or motion patterns, (b) limited sensory range, and (c) a very large area to be observed. Under these conditions, the sensors may be unable to cover the entire region of interest. Hence, fixed sensor locations or uninformed mobility will not be adequate in general. Rather, the sensors have to move dynamically in response to the motion and distribution of targets and other sensors to maximize coverage. Inspired by robotics, the above problem may be regarded as that of low-level motion control to coordinate the sensors' target tracking movements in the continuous workspace. Alternatively, it can be cast as a high-level task allocation problem by segmenting the workspace into discrete regions such that each region is assigned a group or coalition of sensors to track the targets within.
This work presents a reactive layered multi-robot architecture for distributed mobile sensor network coverage in complex, dynamic environments. At the lower layer, each robot uses a reactive motion control strategy known as Cooperative Extended Kohonen Maps to coordinate its target tracking within a region without the need for communication. This strategy is also responsible for obstacle avoidance, robot separation to minimize task interference, and navigation between regions via beacons or checkpoints plotted by a motion planner. At the higher layer, the robots use a dynamic ant-based task allocation scheme to cooperatively self-organize their coalitions in a decentralized manner according to the target distributions across the regions. This scheme addresses the following issues, which distinguish it from other task allocation mechanisms:
Task Allocation for Multi-Robot Tasks: Existing algorithms (e.g., auction- and behavior-based) assume that a multi-robot task can be partitioned into single-robot tasks. But this may not always be possible, or the multi-robot task may be performed more efficiently by coalitions of robots.
Coalition Formation for Minimalist Robots: Existing coalition formation schemes require complex planning, explicit negotiation, and precise estimation of coalitional cost. Hence, they do not perform well in dynamic, real-time scenarios.
Cooperation of Resource-Limited Robots: Robots with limited communication and sensing capabilities (i.e., partial observability) can only obtain local, uncertain information of the dynamic environment. With limited computational power, their cooperative strategies cannot involve complex planning or negotiations.
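The flavor of ant-style allocation under these constraints can be sketched minimally (an illustrative toy, not the actual scheme of this work): each robot compares the demand, i.e., targets per robot, of its current region against one randomly sampled region and migrates toward higher demand, using only cheap local comparisons rather than planning or negotiation. All names and the demand rule are assumptions for illustration:

```python
import random

def reallocate(assignment, targets):
    """One decentralized reallocation round: each robot compares the demand
    (targets per robot) of its region against one randomly sampled region
    and migrates toward higher demand, mimicking ant-style recruitment
    without global planning, negotiation, or precise coalitional costs."""
    regions = list(targets)
    counts = {r: 0 for r in regions}
    for r in assignment:
        counts[r] += 1
    new_assignment = []
    for r in assignment:
        other = random.choice(regions)          # local, stochastic probe
        demand_here = targets[r] / max(counts[r], 1)
        demand_there = targets[other] / (counts[other] + 1)
        if demand_there > demand_here:
            counts[r] -= 1
            counts[other] += 1
            new_assignment.append(other)
        else:
            new_assignment.append(r)
    return new_assignment
```

Repeating such rounds drifts the coalition sizes toward the target distribution across regions, with each decision using only local, uncertain information.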
Coverage of 30 targets (green) with 15 ant robots (white)
ACTION SELECTION MECHANISM FOR MULTI-ROBOT TASKS
PROJECT DURATION : Sep 2002 - Nov 2002
A central issue in the design of behavior-based control architectures for autonomous mobile robots is the formulation of effective mechanisms to coordinate the behaviors. These mechanisms determine the policy of conflict resolution between behaviors, which involves behavioral cooperation and competition to select the most appropriate action. The actions are selected so as to optimize the achievement of the goals or behavioral objectives. Developing such an action selection methodology is non-trivial due to realistic constraints such as environmental complexity and unpredictability, and resource limitations, which include computational and cognitive capabilities of the robot, incomplete knowledge of the environment, and time constraints. As a result, action selection can never be absolutely optimal. Given these constraints, the action selection scheme should be able to choose actions that are good enough to satisfy multiple concurrent, possibly conflicting, behavioral objectives.
Our motivation for the action selection mechanism is to develop a motion control strategy for autonomous non-holonomic mobile robots that can perform distributed multi-robot surveillance in unknown, dynamic, complex, and unpredictable environments. Implementing the action selection framework with an assemblage of self-organizing neural networks induces the following key features that significantly enhance the agent's action selection capability: self-organization of continuous state and action spaces to provide smooth, efficient, and fine motion control, and action selection via the cooperation and competition of Extended Kohonen Maps to achieve more complex motion tasks: (1) negotiation of unforeseen concave and narrowly spaced obstacles, and (2) cooperative tracking of multiple mobile targets by a team of robots. Qualitative and quantitative comparisons for single- and multi-robot tasks show that our framework can provide better action selection than the potential fields method.
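The self-organizing mapping from continuous sensory states to continuous actions can be sketched minimally (a toy of the basic Kohonen-map idea; it omits neighborhood cooperation and the extensions of the actual framework): each neuron pairs a sensory weight with a motor weight, the winner is the neuron whose sensory weight is nearest the input, and training pulls both weights of the winner toward the observed input-action pair. Class and parameter names are illustrative:

```python
import numpy as np

class KohonenMotionMap:
    """Toy indirect-mapping Kohonen map over 1-D sensory/motor spaces."""

    def __init__(self, n_neurons, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.sensory = rng.uniform(0.0, 1.0, size=n_neurons)  # input weights
        self.motor = np.zeros(n_neurons)                      # output weights

    def winner(self, x):
        # Competition: neuron whose sensory weight is closest to the input.
        return int(np.argmin(np.abs(self.sensory - x)))

    def act(self, x):
        # Indirect mapping: the winner's motor weight is the selected action.
        return self.motor[self.winner(x)]

    def train(self, x, target_action, lr=0.5):
        # Pull the winner's sensory and motor weights toward the sample.
        i = self.winner(x)
        self.sensory[i] += lr * (x - self.sensory[i])
        self.motor[i] += lr * (target_action - self.motor[i])
```

After training on sampled input-action pairs, nearby sensory inputs recruit nearby neurons, yielding the smooth, fine motion control that discrete lookup-table mappings lack.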
INTEGRATED ROBOT PLANNING AND CONTROL
PROJECT DURATION : Jul 2001 - Sep 2002
Robot motion research has proceeded along two separate directions: high-level deliberative planning and low-level reactive control. Deliberative planning uses a world model to generate an optimal sequence of collision-free actions that can achieve a globally specified goal in a complex static environment. However, in a dynamic environment, unforeseen obstacles may obstruct the action sequence, and replanning to react to these situations can be too computationally expensive. On the other hand, reactive control directly couples sensed data to appropriate actions. It allows the robot to respond robustly and in a timely manner to unexpected obstacles and environmental changes but may be trapped by them.
The problem of goal-directed, collision-free motion in a complex, unpredictable environment can be solved by tightly integrating high-level deliberative planning with low-level reactive control. This work presents two such architectures for a nonholonomic mobile robot. To achieve real-time performance, reactive control capabilities have to be fully realized so that the deliberative planner can be simplified. These architectures are enriched with reactive target reaching and obstacle avoidance modules. Their target reaching modules use an indirect-mapping Extended Kohonen Map to provide finer and smoother motion control than direct-mapping methods. While one architecture fuses these modules indirectly via command fusion, the other couples them directly using cooperative Extended Kohonen Maps, enabling the robot to negotiate unforeseen concave obstacles. The planner for both architectures uses a slippery cells technique to decompose the free workspace into fewer cells, thus reducing search time; any two points within a cell can still be traversed by reactive motion.
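The two-layer integration above can be sketched as follows (illustrative names throughout, with a simple potential-style rule standing in for the EKM-based reactive control): the deliberative layer searches a coarse cell-adjacency graph for a sequence of free-space cells, and the reactive layer steers toward the next waypoint while deflecting from any sensed obstacle.

```python
from collections import deque

def plan_cell_sequence(adjacency, start, goal):
    """Deliberative layer: breadth-first search over a coarse
    cell-adjacency graph for a cell sequence from start to goal."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk parents back to start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for nxt in adjacency.get(cell, ()):
            if nxt not in parent:
                parent[nxt] = cell
                frontier.append(nxt)
    return None                          # goal unreachable

def reactive_step(pos, waypoint, obstacle=None, gain=0.5, repulse=0.3):
    """Reactive layer: move a fraction of the way toward the current
    waypoint, deflecting away from a sensed obstacle if one is given."""
    step = [gain * (w - p) for p, w in zip(pos, waypoint)]
    if obstacle is not None:
        step = [s + repulse * (p - o) for s, p, o in zip(step, pos, obstacle)]
    return tuple(p + s for p, s in zip(pos, step))
```

Because the planner only reasons over a handful of large cells while the reactive layer handles everything unforeseen within and between them, replanning stays cheap enough for a dynamic environment.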