Falcon

New Project Site Available @ https://www.comp.nus.edu.sg/~dbsystem/fintech-Falcon


Falcon: A Secure and Interpretable Federated Learning Platform

Overview

Falcon is a federated learning platform with privacy protection. It allows multiple parties to train a model on their joint data and make predictions without disclosing their private data, so that they can comply with privacy regulations. It provides strong privacy protection, interpretable incentives, and parallelized executions.

Motivating Example

Falcon focuses on cross-silo data collaboration. A motivating example comes from digital banking, where a bank and a Fintech company aim to jointly build a machine learning model that evaluates credit card applications.

img1

The bank holds part of the information about the users (e.g., account balances), while the Fintech company holds other information (e.g., the users’ online transactions). Through data collaboration, the bank obtains a more accurate model, while the Fintech company can benefit from a pay-per-use model for its contribution to training and prediction.
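To make the vertical split concrete, here is a minimal plaintext sketch of how a logistic regression score decomposes across the two parties. The feature names, weights, and bias are hypothetical and only for illustration. The partial scores are combined in the clear here; in a privacy-preserving system such intermediate values would be protected rather than exchanged directly.

```python
import math

# Hypothetical, normalized feature values for one applicant, split by owner.
bank_features = {"account_balance": 0.8, "years_as_customer": 0.5}
fintech_features = {"monthly_online_txns": 0.3}

# Hypothetical model weights; each party holds weights only for its own features.
bank_weights = {"account_balance": 1.2, "years_as_customer": 0.4}
fintech_weights = {"monthly_online_txns": -0.6}
bias = -0.5


def partial_score(features, weights):
    """Weighted sum over the features that one party owns."""
    return sum(weights[name] * value for name, value in features.items())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Each party computes its share of the linear score locally.
bank_part = partial_score(bank_features, bank_weights)
fintech_part = partial_score(fintech_features, fintech_weights)

# The joint prediction only needs the sum of the partial scores.
approval_probability = sigmoid(bank_part + fintech_part + bias)
```

The point of the sketch is that neither party needs the other's raw features to produce the joint prediction, only an aggregate of the partial scores.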

System Architecture

The Falcon system consists of a Platform, a Coordinator, and a number of Parties. The architecture is illustrated as follows.

img2

The Platform provides the user interface and helps the parties manage collaboration metadata. The Coordinator manages job execution and monitors the execution status. The Parties execute each task in a decentralized manner with the help of the secure operators. The machine learning models to be supported include logistic regression, decision trees, random forests, gradient boosting decision trees, and neural networks, among others.

Main Features

Strong privacy protection. Falcon utilizes a set of cryptographic techniques, including partially homomorphic encryption, secure multiparty computation, and differential privacy, ensuring no intermediate information is disclosed during the whole machine learning pipeline.
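As an illustration of the secure multiparty computation idea mentioned above, the following is a toy sketch of additive secret sharing used to compute a sum without any party revealing its input. It is not Falcon's actual protocol: the modulus, function names, and the assumption of honest-but-curious parties are all choices made for this example.

```python
import secrets

# Assumed modulus for the example: a Mersenne prime much larger than the inputs.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; any proper subset of them looks uniformly random."""
    return sum(shares) % MODULUS


def secure_sum(private_values):
    """Sum the parties' private values using only exchanged shares."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return reconstruct(partial_sums)


total = secure_sum([10, 20, 12])
```

Each published partial sum is masked by random shares, so only the final total is learned. The result is correct as long as the true sum is smaller than the modulus.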

Interpretable incentives. Falcon provides secure explainable methods to calculate each party’s data contribution to the training process and interpret each feature’s contribution to the model predictions, making the collaboration more reliable and trustworthy.
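One widely used way to quantify each party's contribution is the Shapley value. The text above does not state which method Falcon uses, so the sketch below is an assumption made for illustration, computed in plaintext. The coalition accuracies are hypothetical numbers, not measured results.

```python
from itertools import combinations
from math import factorial


def shapley_values(parties, utility):
    """Exact Shapley value of each party for a coalition utility function."""
    n = len(parties)
    values = {}
    for p in parties:
        others = [q for q in parties if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical model accuracy obtained with each coalition's data.
accuracy = {
    frozenset(): 0.50,
    frozenset({"bank"}): 0.70,
    frozenset({"fintech"}): 0.60,
    frozenset({"bank", "fintech"}): 0.80,
}

contributions = shapley_values(["bank", "fintech"], lambda s: accuracy[s])
```

The contributions always add up to the gain of the full coalition over the empty one, which is what makes them usable as a basis for splitting a reward. Exact computation enumerates all coalitions, so it is practical only for a small number of parties.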

Parallelized executions. Falcon parallelizes the cryptographic operations using Docker Swarm services to speed up training and prediction, which makes the platform highly scalable and improves performance. The following figure shows an example of vertical logistic regression training using the Bank Marketing dataset. There are three parties, and each party has four machines. The speedup increases as the number of workers on each party increases.

img3
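The general pattern behind this kind of speedup can be sketched as splitting a batch of costly operations into chunks and handing each chunk to a worker. The example below is only a local stand-in: it uses Python threads instead of separate machines, modular exponentiation instead of a real encryption scheme, and hypothetical function names. Because of Python's global interpreter lock, it shows the partitioning logic rather than an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed parameters for the stand-in operation.
MODULUS = 2**127 - 1
EXPONENT = 65537


def expensive_op(x):
    """Stand-in for a costly cryptographic operation."""
    return pow(x, EXPONENT, MODULUS)


def split(items, n_workers):
    """Cut items into n_workers contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    return [expensive_op(x) for x in chunk]


def parallel_apply(items, n_workers):
    """Apply expensive_op to every item, one chunk per worker, keeping order."""
    chunks = split(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(process_chunk, chunks)
    return [y for chunk_result in results for y in chunk_result]


data = list(range(1, 11))
outputs = parallel_apply(data, n_workers=4)
```

Since the operations on different items are independent, the work per worker shrinks roughly in proportion to the number of workers, which is consistent with the trend described above.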

Future Work

The Falcon platform is still a work in progress. More information will be added as development continues.

NUS DBsystem

AI- and Data-driven Financial Management and Analytics