Veil: Bridging the Gap Between Speed and Rigor in Verifying Complex Distributed Systems
Why Verifying Distributed Systems Is Mission-Critical
Imagine the technology that powers everything from cloud services to blockchains suddenly failing because of a hidden flaw. As our reliance on large, interconnected distributed systems grows, ensuring their correctness and safety becomes not just a technical challenge, but a societal one. A failure in a cloud-based financial system could halt businesses; an error in a blockchain could cost millions. Despite this, verifying these complex systems remains one of the most daunting tasks in computer science.
Traditional methods offer a stark trade-off. On one end, there are fully manual proofs created using powerful interactive proof assistants. These provide a high degree of certainty but require months or even years of painstaking work by specialized experts. On the other end, fully automated verification tools promise speed but often require developers to contort their descriptions of systems into unnatural forms, and even then, they may be unable to prove complex properties.
This gap between thoroughness and usability has limited the scalability of formal verification efforts. Bridging it could revolutionize the reliability of critical digital infrastructure. That’s precisely the promise behind Veil, a groundbreaking tool developed by researchers at NUS Computing’s Verified Systems Engineering (VERSE) Lab led by Associate Professor Ilya Sergey.
The Pain Points of Traditional Verification
The difficulty in verifying distributed systems stems from their sheer complexity. Multiple computers interact across unreliable networks, facing unpredictable failures and delays. To be confident that such a system behaves correctly in every possible scenario is akin to proving that a massive, chaotic ballet will never miss a single step.
Interactive proof assistants like Lean, Coq, or Isabelle can rigorously verify these systems, but at enormous human cost. It’s not uncommon for verification projects to consume thousands of hours from highly trained formal method experts.
Meanwhile, automated tools like Ivy or mypyvy offer quicker checks but with serious limitations. They often require “contorted specifications” – rewriting what you want to verify into awkward forms that fit the tool’s limited language. Worse, when automation fails, these tools provide little support for users to manually prove what remains.
Thus, developers have faced a painful choice: aim for very high assurance but pay the price in time and expertise, or aim for speed and accept weaker, narrower guarantees.
Veil’s Big Idea: The Best of Both Worlds
Veil is designed to break this false dichotomy. It provides “push-button” verification when possible but offers seamless access to the full power of interactive proof when needed. Built on the modern proof assistant Lean 4, Veil offers a unified workflow where developers can first lean on automation and, if necessary, escalate to richer manual proofs without changing tools or losing rigor.
Importantly, the core of Veil, the part that translates system descriptions into logical puzzles for solvers, is itself formally proven sound. This foundational guarantee means that users can have greater trust that verification results reported by Veil are correct, as the risk of bugs in the verifier itself is reduced.
How Veil Works: A Closer Look
Veil structures system verification around three components: the system’s state, its possible actions, and the invariants that must always hold.
- State captures all mutable information, like which node holds a token in a mutual exclusion protocol.
- Actions describe events that can change the state, like sending a message or granting access.
- Invariants are the safety properties you want to guarantee, such as “no two nodes can be in the critical section simultaneously.”
Developers write these components in a high-level, human-readable way. Veil then automatically generates verification conditions – logical formulas that must be true if the system behaves correctly – and passes them to powerful solvers like cvc5 or Z3.
If the solvers succeed, great; the invariant is verified. If not, Veil lets users step into Lean’s interactive proof mode, where they can craft detailed, human-guided proofs with strong automation support to finish the job.
Veil also supports bounded model checking for quick sanity checks and takes care to produce solver-friendly verification conditions. When counterexamples are found, it employs model minimization to simplify them, making debugging easier.
Real-World Testing: Veil in Action
To prove its effectiveness, Veil was tested on a battery of 16 benchmark distributed protocols, including complex algorithms like Vertical Paxos. It successfully verified all of them; something not all other tools could achieve. Moreover, verification times were practical. Many case studies completed in under 10 to 15 seconds, making it realistic for developers to integrate formal verification into everyday development cycles.
Perhaps even more impressively, Veil could handle specifications and properties that fell outside the “effectively propositional” fragment that limits older tools. This means it can tackle richer, more expressive properties, which is key for real-world distributed systems.
In porting formal proofs from other systems, like the Stellar Consensus Protocol (SCP) and Rabia, Veil even helped discover inconsistencies in proofs conducted using a mixture of verification tools, showing the usefulness of a unified framework to avoid human errors which previous approaches are prone to.
Why This Matters: A Future of Safer Systems
The stakes for reliable distributed systems are only rising. Financial systems, healthcare infrastructures, public blockchains, and future metaverse platforms all rely on distributed coordination among computers that must handle failures gracefully.
If we can significantly lower the cost and expertise needed to formally verify these systems, it could unlock a new era of safer, more trustworthy digital infrastructure. Developers could routinely verify complex protocols without requiring years of specialized training. Companies could ship safer products without prohibitive costs. Regulators could demand higher verification standards, knowing that practical tools exist to meet them.
For example, in the context of public blockchains, which manage billions in assets, more accessible formal verification could prevent vulnerabilities like double-spend attacks or smart contract bugs. In cloud computing, companies could more confidently offer fault-tolerant services knowing their distributed algorithms are mathematically guaranteed to behave correctly.
Conclusion: Enabling a New Era of Verified Systems
The development of Veil is more than an academic exercise; it could fundamentally change what is possible in the real-world deployment of complex distributed systems.
Today, many industries hesitate to fully embrace formal verification because of the steep costs and technical barriers. But with Veil’s blend of automation, expressiveness, and efficiency, this equation shifts. It brings us closer to a future where formally verified protocols are not exceptional luxuries, but standard practice.
In cloud computing, for example, services like distributed storage, consensus systems, and fault-tolerant microservices could now be designed with formally verified safety properties, dramatically reducing the risk of silent data loss, split-brain errors, or service outages. Teams could verify key properties of their systems incrementally during development, rather than months after deployment, or worse, after a critical failure.
In blockchain and decentralized finance (DeFi), where billions of dollars flow through public, open networks, Veil’s approach could enable developers to formally verify the safety, liveness, and consistency of smart contracts, consensus protocols, and cross-chain bridges. Rather than reacting to catastrophic vulnerabilities only after they are exploited, blockchain systems could ship with machine-checked guarantees of behavior.
Similarly, in emerging fields like autonomous vehicles, drones, and smart city infrastructure, where distributed coordination among independent agents is crucial, Veil’s framework could be applied to verify that safety-critical invariants—like collision avoidance or priority rules—are always maintained, even in the face of partial failures or network delays.
Beyond these immediate applications, the impact on software engineering culture could be profound. With tools like Veil making formal verification more accessible and faster, new generations of engineers could be trained to think about proofs and invariants as a routine part of system design – just as they are already trained to think about testing or security today. Verified-by-construction systems could become the norm, dramatically improving the reliability and trustworthiness of the digital world.
Veil’s modularity and portability mean it could inspire a broader ecosystem of verification tools tailored to different domains, from IoT networks to distributed AI training platforms. The long-standing tension between moving fast and building safe, reliable distributed systems could finally start to ease.
Veil can also bring about a paradigm shift in how we think about regulation for critical software systems. Rather than solely relying on best practices and ad-hoc testing, regulators and standards bodies could begin requiring formal verification for systems that meet certain criticality thresholds. For instance, financial regulators could mandate that blockchain-based financial infrastructure undergo machine-checkable proofs of key properties like consistency and liveness. Cloud service providers could be asked to produce formal verifications for their consensus protocols before being certified for high-availability claims. Importantly, regulators should also recognize the need to balance burden and practicality. Mandates should evolve alongside tools like Veil, ensuring that compliance is achievable without imposing unrealistic costs on developers.
In short, Veil doesn’t just offer a better way to verify today’s systems. It opens the door to building the next generation of distributed technologies – technologies we can trust from the ground up.
Veil is available open-source under the Apache 2.0 license at https://github.com/verse-lab/veil
Further Readings: Pîrlea, G., Gladshtein, V., Kinsbruner, E., Zhao, Q. and Sergey, I. (2025) “Veil: A Framework for Automated and Interactive Verification of Transition Systems,” 37th International Conference on Computer Aided Verification (CAV 2025), July 21-25, Zagreb, Croatia.