Research Summary (2018-20)

My research program’s long-term vision is to advance automated reasoning techniques to enable computing to deal with increasingly uncertain real-world environments. From the core technical perspective, my research program is motivated by the unprecedented advances in the combinatorial solving techniques over the past three decades, often called as NP revolution, owing to the focus on the fundamental problem in Nondeterministic Polynomial (NP) time complexity: satisfiability (SAT). The NP revolution offers us the opportunity to develop scalable techniques for problems that lie Beyond NP. In particular, I focus on four fundamental problems that lie beyond NP: constrained counting, constrained sampling, functional synthesis, and maximum satisfiability. My research program is focused on enabling Beyond NP Revolution via making progress on the following critical enabling ingredients:

Theme 1: Beyond Worst-Case Algorithmic Design

While theoreticians have studied the aforementioned problems for over thirty years, such studies employed the narrow lens of SAT oracles, which did not broadly allow a more nuanced characterization of the SAT queries’ complexity. In contrast, we focus on designing algorithms with SAT queries that can be efficiently solved by the underlying SAT solvers. In the context of approximate counting, we have developed a hashing-based framework, ApproxMC, that relies on sparse XORs to circumvent the practical hardness of queries based on dense XORs [MSV19a, MSV19b, ABM20, MA20, BM20]. The LICS20 paper [MA20](co-authored with S. Akshay) resolved the longstanding open problem of designing a sparse XOR-based algorithmic framework that can achieve runtime improvement without losing theoretical guarantees. As a high degree of symmetry in the underlying constraints amplifies the practical hardness, we rely on domainlevel symmetry breaking for performance improvement [WUA+20]. In the context of sampling, our framework, Waps, employs the progress in knowledge compilation to achieve scalability without losing theoretical guarantees [SGRM18, GSRM19]. In follow-up work, we improved the underlying algorithmic framework for knowledge compilation via probabilistic caching enabled via universal hashing [SRSM19]. Building on the success of constrained sampling and machine learning techniques, we proposed a novel algorithmic framework, called Manthan, for functional synthesis that achieved significant breakthrough in the number of instances solved [GRM20a].

Theme 2: Applications

A key ingredient in the NP revolution was focus on the instances arising from the real-world, allowing the design of techniques tuned to practical instances. In a similar vein, we focus on discovering new applications for the aforementioned problems and the design of efficient modeling schemes. In an interdisciplinary collaboration, we reduced the problem of reliability of power transmission grids to constrained counting [PDOMV19]. Furthermore, we reduce the quantitative verification of neural networks [NSM+19, BSS+19], and quantification of information flow [BEH+18] to constrained counting. Building on our algorithmic progress in constrained sampling, we design efficient technique for higher $t$-wise coverage in the context of highly configurable software systems [BLM20]. In regards to maximum satisfiability, we showed that the problem of decoding in group testing reduces to maximum satisfiability [CGSM20]. In another line of work, we design efficient provable techniques for learning interpretable classifiers via reduction to MaxSAT [MM18, GM19, GMM20].

Theme 3: Query-Driven Design of Solvers

SAT solvers’ usage inside broader algorithmic frameworks gives rise to the rich structure of the SAT queries. We have taken a twofold approach: On one hand, our recent work on uncovering the structure of solution spaces through the lens of phase transition behavior [PJM19, GRM20b]. As a next step, we focus on engineering the underlying architecture of the SAT solvers. Given the crucial importance of CNF-XOR solving for counting and sampling, we have proposed a novel architecture, BIRD, relying on the tight integration of CDCL and Gauss-Jordan Elimination to achieve significant improvements for CNF-XOR solving [SM19, SGM20]. Motivated by applications in cryptography and collaboration with the Defense Service Organization (DSO), we developed a new architecture, Bosphorus, for CNF-ANF formulas [CSCM19], and further improved performance with phase saving [SM20]. Our most ambitious project, CrystalBall [SKM19], focuses on the data-driven synthesis of SAT Solvers by combining the advances in proof systems with supervised learning.

Theme 4: Testing Framework

Similar to the pivotal role played by testing frameworks in the $NP$ $revolution$, the realization of the Beyond NP revolution would need scalable testing and verification methodologies. As a first step, we have focused on testing of constrained samplers. In contrast to conventional programs, one trace/sample is typically inadequate to prove the non-conformity of the underlying samplers. We are developing an algorithmic framework, Barbarik, that sits at the intersection of property testing and symbolic reasoning [CM19]. Barbarik [CM19, MPC20] can handle product distributions and is the first tester to require a constant number of samples, in contrast to prior work requiring exponentially many samples.

Open-Source Tools

Our research has led to the release of 10 actively maintained open source tools . Our SAT solver entry won 3rd place in the Main Track of the highly prestigious and competitive SAT competition 2020 (the first instance of a top-3 finish by an entry from Singapore) while our model counting entry won 1st place in two of three tracks.