Database Group, School of Computing, National University of Singapore

 

 

Figure 1. Architecture of BestPeer++

The last two decades have witnessed a growing need for information sharing. With more affordable commodity hardware and higher communication bandwidth, data sharing has become an exciting killer application of the Internet.

The peer-to-peer (P2P) architecture aims to extend the current distributed computing design to accommodate dynamic resources such as information and computing power. In such environments, peers are autonomous, behave highly dynamically, and act as servers and consumers at the same time. These characteristics open up many opportunities but also pose technical challenges that must be solved.

About BestPeer++

Now in the latest stage of its evolution, BestPeer++ is enhanced with distributed access control, multiple types of indexes, and pay-as-you-go query processing to deliver elastic data sharing services in the cloud.

The software components of BestPeer++ are separated into two parts: core and adapter. The core contains all the data sharing functionality and is designed to be platform independent. The adapter part contains one abstract adapter, which defines the elastic infrastructure service interface, and a set of concrete adapter components, which implement that interface through APIs provided by specific cloud service providers (e.g., Amazon). We adopt this "two-level" design to achieve portability. With appropriate adapters, BestPeer++ can be ported to any cloud environment (public or private) or even to a non-cloud environment (e.g., an on-premise data center). The architecture of BestPeer++ is depicted in Figure 1.
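The two-level design can be illustrated with a minimal Python sketch. It is not taken from the BestPeer++ code base: the names CloudAdapter, InMemoryAdapter, Core, launch_instance, terminate_instance and scale_to are all hypothetical, and the concrete adapter makes no real provider calls. It only shows how a core written against an abstract interface stays unchanged when the adapter is swapped.

```python
from abc import ABC, abstractmethod


class CloudAdapter(ABC):
    """Abstract adapter: the elastic infrastructure interface the core depends on.

    Method names are assumptions made for this sketch.
    """

    @abstractmethod
    def launch_instance(self):
        """Start one database server and return its identifier."""

    @abstractmethod
    def terminate_instance(self, instance_id):
        """Shut down a previously launched server."""


class InMemoryAdapter(CloudAdapter):
    """Concrete adapter standing in for a non-cloud setup (no provider API used)."""

    def __init__(self):
        self.running = set()
        self._next_id = 0

    def launch_instance(self):
        instance_id = f"node-{self._next_id}"
        self._next_id += 1
        self.running.add(instance_id)
        return instance_id

    def terminate_instance(self, instance_id):
        self.running.discard(instance_id)


class Core:
    """Platform-independent part: it only talks to the abstract interface."""

    def __init__(self, adapter):
        self.adapter = adapter
        self.peers = []

    def scale_to(self, n):
        # Grow or shrink the set of peers through the adapter only.
        while len(self.peers) < n:
            self.peers.append(self.adapter.launch_instance())
        while len(self.peers) > n:
            self.adapter.terminate_instance(self.peers.pop())
```

Porting to another environment would then amount to writing one more subclass of the abstract adapter and passing it to the core, for example Core(InMemoryAdapter()).scale_to(3).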

Specifically, the highlights of BestPeer++ are:

       Amazon Cloud Adapter: The key idea of BestPeer++ is to use dedicated database servers to store data for each business and to organize those database servers through a P2P network for data sharing. The Amazon Cloud Adapter provides an elastic hardware infrastructure for BestPeer++ to operate on by using Amazon Cloud services.

       The BestPeer++ Core: The BestPeer++ core contains all platform-independent logic, including query processing and the P2P overlay. It runs on top of the adapter and consists of two software components: the bootstrap peer and the normal peer.

       Adaptive Query Processor: BestPeer++ employs a hybrid design to achieve high-performance query processing. The major workload of a corporate network consists of simple, low-overhead queries. Such queries typically involve only a very small number of business partners and can be processed in a short time. BestPeer++ is mainly optimized for these queries. For infrequent, time-consuming analytical tasks, we provide an interface for exporting data from BestPeer++ to Hadoop, which allows users to analyze the data using MapReduce.
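The hybrid idea behind the query processor can be sketched as a small dispatcher. This is a toy illustration only, under the assumption that the number of business partners a query touches is the deciding signal. The Query class, the choose_engine function and the MAX_PEERS_FOR_P2P cutoff are hypothetical and do not describe how BestPeer++ actually makes this decision.

```python
from dataclasses import dataclass

# Hypothetical cutoff chosen for illustration; the text above gives no number.
MAX_PEERS_FOR_P2P = 4


@dataclass
class Query:
    sql: str
    peers_involved: int       # business partners whose data the query touches
    analytical: bool = False  # marked by the user as a long-running analysis


def choose_engine(query):
    """Return "p2p" for simple queries and "mapreduce" for analytical tasks."""
    if query.analytical or query.peers_involved > MAX_PEERS_FOR_P2P:
        # Infrequent heavy work: export the data to Hadoop and run MapReduce.
        return "mapreduce"
    # Common case: few partners, low overhead, answered over the P2P network.
    return "p2p"
```

A lookup that touches two partners would be routed to "p2p", while a query flagged as analytical would be routed to "mapreduce" regardless of how many peers it involves.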

An open source version of BestPeer++ can be downloaded from here.