project has two stages. In the first stage, we built an
unstructured P2P network based on an agent model. Several
applications were tested on that platform, such as PeerDB (for database
applications), PeerIS (for information retrieval), and BuddyWeb (for
collaborative web caching).
An open source version of BestPeer can be
downloaded from here.
In the second stage, BestPeer is enhanced into a
scalable, sharable, and secure P2P-based data
management system with full functionality for building corporate
network applications such as
a national healthcare network. As shown in Figure 1, BestPeer is designed to support enterprise
applications. It builds a corporate network by linking companies via a
structured overlay (BATON). Each company acts as a node in BestPeer and
exports a portion of its local data for sharing with other companies. From
the view of a user, BestPeer can be considered as a new data sharing
platform for enterprise applications.
Specifically, BestPeer V2.0 supports:
Semi-Automatic Schema Mapping: To
share data with others, a company needs to map its schema to the global
schema. A machine learning algorithm is employed to help the manager
establish the mapping relations.
Incremental Data Integration: Once the mapping
relations are set up, BestPeer automatically and periodically exports data from the local
databases of participating companies to the BestPeer data sharing platform.
Efficient Query Processing: A
distributed query plan is generated and forwarded to multiple processing
nodes, where the query is processed in parallel.
In addition, to
support analytic queries that aim to provide timely summarized
statistics for decision making, a distributed online aggregation scheme
is developed to iteratively and progressively produce approximate
aggregate results for users.
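
The page does not spell out how the online aggregation scheme works, so the following Python sketch is only a rough illustration of the general idea of progressive approximate aggregation, not BestPeer's actual algorithm. It assumes a hypothetical coordinator that, in each round, pulls the next random slice of rows from every node (proportional to that node's size) and refines a running estimate of an average with a normal-approximation confidence interval. The function name `online_average` and all parameters are assumptions made for this example.

```python
import math
import random


def online_average(node_rows, rounds=10, z=1.96, seed=0):
    """Yield (estimate, half_width) for AVG after each sampling round.

    node_rows: one list of numeric values per (hypothetical) node.
    Illustrative sketch only; not taken from the BestPeer code base.
    """
    rng = random.Random(seed)
    # Each node shuffles its rows once, so consecutive slices are
    # random samples drawn without replacement.
    shuffled = [rng.sample(rows, len(rows)) for rows in node_rows]
    n, mean, m2 = 0, 0.0, 0.0  # running count, mean, squared deviations
    for r in range(rounds):
        for rows in shuffled:
            # Slice size is proportional to the node's size, so the pooled
            # sample reflects the global distribution of values.
            lo = len(rows) * r // rounds
            hi = len(rows) * (r + 1) // rounds
            for x in rows[lo:hi]:
                # Welford's update keeps the mean and variance incremental.
                n += 1
                delta = x - mean
                mean += delta / n
                m2 += delta * (x - mean)
        if n >= 2:
            # Approximate interval; ignores the finite-population correction.
            half_width = z * math.sqrt(m2 / (n - 1) / n)
            yield mean, half_width


if __name__ == "__main__":
    nodes = [[float((i * 7 + k) % 50) for i in range(1000)] for k in range(3)]
    for estimate, half_width in online_average(nodes):
        print(f"AVG ~ {estimate:.3f} +/- {half_width:.3f}")
```

Each round yields a tighter estimate, so a user could stop as soon as the interval is narrow enough for the decision at hand. After the last round every row has been read and the estimate equals the exact average.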
Data Security and Privacy:
Messages sent between nodes in BestPeer are
encrypted to increase the security level of the system. Furthermore,
access to the data shared in the BestPeer corporate network is controlled by
a distributed role-based access control scheme to
protect the local data of each node from malicious users.
Intelligent Replication: BestPeer
provides an always-on service, in that node failures do not affect
the availability of data. To achieve this goal, an intelligent replication
strategy driven by the system runtime workload is applied to replicate data across the nodes
for data availability and load balancing.
Analytic Tool:
The BestPeer software runs as a backend service on each node.
Users can access the service via web interfaces, which increases the
usability of the service. The interfaces also present query results as
graphs (e.g., bar graphs and pie graphs).
Cloud Support:
BestPeer is now
cloud-enabled. By integrating cloud computing, database, and P2P
technologies, BestPeer achieves its query processing efficiency in a
pay-as-you-go manner and is a promising approach for corporate network
applications. More details of our cloud solution can be found on