ES2: Elastic Storage System

Overview

A typical data management system has to handle both real-time updates by individual users and periodic large-scale analytical processing, indexing, and data extraction. We provide an elastic cloud data storage system, called ES2, which is designed to support both kinds of workload within the same storage. Its features include:

  • Elastic scaling
  • Hybrid storage - supporting both OLTP and OLAP
    • Flexible data partitioning based on the database workload
  • Load-adaptive replication
  • Transactional semantics for bundled updates
  • DBMS-like index functionality
    • Multiple indexes of different types: hash, range, multi-dimensional, bitmap indexes

System Architecture

Figure: Overview of the Elastic Storage System (ES2)

The architecture comprises three modules:
  • The data import control module supports efficient bulk-loading of data from external sources, such as databases stored in conventional DBMSes, plain or structured data files, and intermediate data generated by other cloud applications.
  • The physical storage module contains the major storage components: a distributed file system (DFS), a meta-data catalog, and distributed indexing. The DFS is where the imported data are actually stored. The meta-data catalog maintains both meta information about the tables in the storage and the fine-grained statistics required by the data access control module.
  • The data access control module serves data access requests from upper-layer applications. It has two sub-components: the data access interface and the data manipulator. The data access interface parses each request into the internal representation that the data manipulator operates on, and chooses a near-optimal data access plan (parallel sequential scan, index scan, or a hybrid of the two) for locating and operating on the target data stored in the physical storage module.
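
To make the plan-selection step concrete, the following is a minimal sketch of how a data access interface could pick among the three plan types using statistics from a meta-data catalog. It is an illustration only, not ES2's actual optimizer: the names TableStats, AccessRequest and choose_plan, the selectivity estimate, and both thresholds are hypothetical and do not come from the ES2 description.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass
class TableStats:
    """Hypothetical catalog entry: table size and which columns are indexed."""
    row_count: int
    indexed_columns: FrozenSet[str]


@dataclass
class AccessRequest:
    """Hypothetical internal representation of a parsed data access request."""
    table: str
    filter_column: Optional[str]  # None means the request has no predicate
    selectivity: float            # estimated fraction of rows matched, 0.0 to 1.0


def choose_plan(
    request: AccessRequest,
    catalog: Dict[str, TableStats],
    index_threshold: float = 0.01,   # assumed cut-off, not an ES2 parameter
    hybrid_threshold: float = 0.2,   # assumed cut-off, not an ES2 parameter
) -> str:
    """Return a plan name based on index availability and estimated selectivity."""
    stats = catalog[request.table]
    has_index = request.filter_column in stats.indexed_columns

    # Without a usable index, scanning all partitions in parallel is the only option.
    if request.filter_column is None or not has_index:
        return "parallel_sequential_scan"
    # Very selective predicates touch few rows, so index lookups are cheapest.
    if request.selectivity <= index_threshold:
        return "index_scan"
    # Moderately selective predicates combine index lookups with partial scans.
    if request.selectivity <= hybrid_threshold:
        return "hybrid"
    # Predicates matching most rows gain nothing from the index.
    return "parallel_sequential_scan"


if __name__ == "__main__":
    catalog = {"orders": TableStats(1_000_000, frozenset({"order_id"}))}
    point_lookup = AccessRequest("orders", "order_id", 0.0001)
    print(choose_plan(point_lookup, catalog))
```

In this sketch the catalog supplies only index availability, and the caller provides the selectivity estimate. A real system would presumably derive that estimate from the fine-grained statistics the catalog maintains.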