BCube and Flint: Overcoming the 50% Barrier in Blockchains

1. Background on blockchains

Blockchain is a disruptive technology that enables data exchange and processing among a large number of mutually-distrusting parties. Blockchains are the foundation of many crypto-currencies such as Bitcoin, but the utilities of blockchains extend far beyond crypto-currencies: People believe that blockchains can bring more extensive collaboration and data sharing in many industries, such as trading, finance, and even healthcare.

2. The 50% barrier in blockchains

Security is one of the most important properties of blockchains. Blockchains today can remain secure, even if some of the nodes in the blockchain system are completely malicious. In fact, many blockchains today can tolerate up to almost 50% of the nodes being malicious, or more generally, up to 50% of the computational power, or hash power, or stake being malicious. However, there are growing needs for blockchains to remain secure, even when more than 50% of the nodes are malicious.

Is 50% a fundamental barrier for blockchains?

People are usually pessimistic about this question: First, there is simply no existing blockchain system can remain secure, when more than 50% of the nodes are malicious. For example, the well-known 51% attack breaks Bitcoin. Second, many people even believe that it is simply impossible to remain secure when more than 50% of the nodes are malicious. The intuition is that when half of the nodes are bad, the bad nodes can form their own world, that is separate from and symmetric with the world formed by the good nodes. This results in "split-brain" for the blockchain: We now have two symmetric worlds, and we cannot tell which is the one formed by good nodes.

3. Our work: Overcoming the 50% barrier

Perhaps somewhat surprisingly, in our research, we have invented the very first practical blockchain system (called BCube) that remains secure even when more then 50% of the nodes are malicious. Before proceeding, let us quickly clarify why the "split-brain" argument does not actually apply to blockchains: Indeed, there can be two symmetric worlds. But for a given user of the blockchain, assuming the user is an honest node, then the user can trust the world to which he/she belongs. This breaks the symmetry.

We have proved the security of BCube via formal proofs. We have also implemented a research prototype of BCube, and done experiments with up to 10000 nodes. Our results show that BCube can comfortably tolerate 70% of the nodes being malicious. Such tolerance can further increase, if one does not have stringent requirements on the performance of the blockchain. To achieve these strong results, at the core of BCube, we use a novel byzantine broadcast protocol (called OverlayBB) that we invent. OverlayBB is able to remain secure even when majority of the nodes are malicious. At the same time, it can deliver excellent end-to-end throughput, by using (among other things) for example, aggressive pipelining.

Flint is our follow-up work on BCube. While using byzantine broadcast enables BCube to tolerate a malicious majority, it also results in large block confirmation latency. For example, to confirm a transaction in BCube, it can take several hours. Under certain conditions, such a large confirmation latency is fundamental if the malicious nodes indeed constitute a majority of the entire system. However, one would expect that in common/typical cases, only a small fraction of the nodes will be malicious.

Given such, Flint strives to achieve the best of both worlds. First, under the common cases where the fraction of malicious nodes is small (e.g., 20%), then Flint can guarantee security and also provide fast confirmation. Such fast confirmation only takes a few minutes, rather than a few hours. Second, under the worst-case where the fraction of malicious nodes is large (e.g., 60%), Flint still guarantees security while providing normal confirmation latency (i.e., a few hours). Flint achieves this by have an optimistic track, which confirms transactions fast under the normal case, and a normal track, which guarantees security even under the worst-case.

4. Find more about BCube and Flint in our papers:

Ruomu Hou, Haifeng Yu, and Prateek Saxena, "Using Throughput-Centric Byzantine Broadcast to Tolerate Malicious Majority in Blockchains". IEEE Symposium on Security and Privacy (Oakland), May 2022.

Ruomu Hou and Haifeng Yu, "Optimistic Fast Confirmation While Tolerating Malicious Majority in Blockchains". IEEE Symposium on Security and Privacy (Oakland), May 2023.

You See the Whole Picture When the Puzzle Pieces Match

1. Determining whether I have seen the "whole picture"

Consider a computer network with a certain topology, and some given node \(v\) in the network. In many application scenarios, we need the following basic functionality: Node \(v\) wants to collect information from every node \(w\) in the system, and wants to determine when it has indeed got information from all nodes. (Namely, when it has seen the "whole picture".)

While simple on the surface, this problem is non-trivial to solve: First, the information collected from different \(w\)'s may get combined/aggregated, before such information reaches \(v\). Hence \(v\) cannot easily tally how many pieces of information it has received. Second, node \(v\) may not actually know how many nodes there are in the network. Hence even if \(v\) could tally, it would still not be able to verify the tally count against a given target.

2. Our work: Using puzzle pieces

In this research, we invent a novel and elegant way for \(v\) to check whether it has seen the "whole picture". In our approach, we let each node \(w\) pick a random integer \(x_w \in [1, p]\), where \(p\) is some large prime number. Next, \(w\) sends one copy of \(x_w\) to each of its neighbors, and then keeps a copy of \( (p-x_w) \) for each copy of \(x_w\) that it sends out. Finally, \(w\) adds up all the value that it has kept, together with all the values that it has received from all its neighbors, to get a final number of \(\lambda_w\). In some sense, this \(\lambda_w\) is the ''puzzle piece'' from \(w\). In the above figure, we have \(\lambda_w = 2(p-x_w) + x_a + x_b \).

All the "puzzle pieces" from all the nodes, interestingly, offer some rather nice properties: Let \(U\) be the set of all nodes in the system, and let \(U'\) be any non-empty set of nodes such that \(U'\subsetneq U\). With some caveats, we can show that with high probability: \begin{eqnarray*} \sum_{w\in U} \lambda_w &=& 0 \mbox{ mod } p\\ \sum_{w\in U'} \lambda_w &\ne& 0 \mbox{ mod } p \end{eqnarray*}

Now we can use these properties to help \(v\) to determine whether it has seen the "whole picture", in the following way. When node \(v\) collects information from all the \(w\)'s, node \(v\) will simultaneously also collect the puzzle piece \(\lambda_w\) from all the \(w\)'s. Node \(v\) can now easily determined whether it has got information from all nodes: It simply checks whether all the puzzle pieces it has collected add up to zero. (Interestingly, here node \(v\) does not even need to know the total number of nodes in the system.) Putting it another way:

You see the whole picture when all the puzzle pieces match!

3. Generalizing the design

The above elegant mechanism is useful for solving various challenging problems in distributed computing. The mechanism can be further generalized/tweaked as well, for example, by adjusting how the puzzle pieces are generated. We can then use these puzzle pieces to check other interesting global properties in a distributed system. For more details, please see following:

Massively Parallel Aggregation in Dynamic Networks

Best Paper in SPAA 2020

1. Background on dynamic networks

Imagine a futuristic scenario with many smart vehicles on the road. Each smart vehicle can directly communicate with near-by vehicles using wireless communication, as long as they are within the radio range of each other. This essentially gives us a computer network with a certain topology. Interestingly, the topology of this network can change over time, as vehicles keep moving in and out of each other's radio range. Such networks are hence called dynamic networks. Because of the continuous topology changes in such dynamic networks, even basic problems (such as counting the number of nodes in the network) become challenging to solve.

2. Our work: Obtaining sublinear algorithms in dynamic networks

In this research work, we have invented the very first sublinear-complexity algorithms for solving a range of basic problems (such as counting the total number of nodes) in such networks. Our algorithms are based on the key novel idea of massively parallel aggregation. For example, when we want to send a numerical value from a node \(v\) to a node \(u\), we split the value evenly across all possible paths (of at most a certain length) from \(v\) to \(u\). Note that there can be an exponential number of such paths. We show that such splitting helps to constrain the damage caused by potential topology changes, which eventually leads to algorithm with sublinear time complexity. At the same time, somewhat surprisingly, we also find an efficient way to "implement" such a mechanism with proper data structures, despite the exponential number of paths involved.

3. For more details, please see following: