1
|
Sparse Hashing for Scalable Approximate Model Counting: Theory and Practice
Kuldeep S. Meel ⓡ S. Akshay
Proceedings of Logic in Computer science (LICS), 2020.
Given a CNF formula F on n variables, the problem of model counting, also referred to as #SAT, is to compute the number of models or satisfying assignments of F. Model counting is a fundamental but hard problem in computer science with varied applications. Recent years have witnessed a surge of effort towards developing efficient algorithmic techniques that combine the classical 2-universal hashing (from [Stockmeyer 1983]) with the remarkable progress in SAT solving over the past decade. These techniques augment the CNF formula F with random XOR constraints and invoke an NP oracle repeatedly on the resultant CNF-XOR formulas. In practice, the NP oracle calls are replaced by calls to a SAT solver and it is observed that runtime performance of modern SAT solvers (based on conflict-driven clause learning) on CNF-XOR formulas is adversely affected by the size of XOR constraints. However, the standard construction of 2-universal hash functions chooses every variable with probability p =1/2 leading to XOR constraints of size n/2 in expectation. Consequently, the main challenge is to design sparse hash functions, where variables can be chosen with smaller probability and lead to smaller sized XOR constraints, which can then replace 2-universal hash functions. In this paper, our goal is to address this challenge both from a theoretical and a practical perspective. First, we formalize a relaxation of universal hashing, called concentrated hashing, a notion implicit in prior works to design sparse hash functions. We then establish a novel and beautiful connection between concentration measures of these hash functions and isoperimetric inequalities on boolean hypercubes. This allows us to obtain tight bounds on variance as well as the dispersion index and show that p = O(log m /m ) suffices for the design of sparse hash functions from 2^n to 2^m belonging to concentrated hash family. Finally, we use sparse hash functions belonging to this concentrated hash family to develop new approximate counting algorithms. A comprehensive experimental evaluation of our algorithm on 1896 benchmarks with computational effort of over 20,000 computational hours demonstrates significant speedup compared to existing approaches. To the best of our knowledge, this work is the first study to demonstrate runtime improvement of approximate model counting algorithms through the usage of sparse hash functions, while still retaining strong theoretical guarantees (a la 2-universal hash functions).
|
2
|
BIRD: Engineering an Efficient CNF-XOR SAT Solver and its Applications to Approximate Model Counting
Mate Soos and Kuldeep S. Meel
Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019.
Given a Boolean formula F, the problem of model counting, also referred to as #SAT is to compute the number of solutions of F. Model counting is a fundamental problem in artificial intelligence with a wide range of applications including probabilistic reasoning, decision making under uncertainty, quantified information flow, and the like. Motivated by the success of SAT solvers, there has been surge of interest in the design of hashing-based techniques for approximate model counting for the past decade. We profiled the state of the art approximate model counter ApproxMC3 and observed that over 99.99% of time is consumed by the underlying SAT solver, CryptoMinisat. This observation motivated us to ask: Can we design an efficient underlying CNF-XOR SAT solver that can take advantage of the structure of hashing-based algorithms and would this lead to an efficient approximate model counter? The primary contribution of this paper is an affirmative answer to the above question. We present a novel architecture, called BIRD, to handle CNF-XOR formulas arising from hashing-based techniques. The resulting hashing-based approximate model counter, called ApproxMC3, employs the BIRD framework in its underlying SAT solver, CryptoMinisat. To the best of our knowledge, we conducted the most comprehensive study of evaluation performance of counting algorithms involving 1896 benchmarks with computational effort totaling 86400 computational hours. Our experimental evaluation demonstrates significant runtime performance improvement for ApproxMC3 over ApproMC2. In particular, we solve 648 benchmarks more than ApproMC2, the state of the art approximate model counter and for all the formulas where both ApproMC2 and ApproxMC3 did not timeout and took more than 1 seconds, the mean speedup is 284.40 -- more than two orders of magnitude.
|
3
|
Quantitative Verification of Neural Networks And its Security Applications
Teodora Baluta, Shiqi Shen, Shweta Shine, Kuldeep S. Meel, and Prateek Saxena
ACM Conference on Computer and Communications Security (CCS), 2019.
Neural networks are increasingly employed in safety-critical domains. This has prompted interest in verifying or certifying logically encoded properties of neural networks. Prior work has largely focused on checking existential properties, wherein the goal is to check whether there exists any input that violates a given property of interest. However, neural network training is a stochastic process, and many questions arising in their analysis require probabilistic and quantitative reasoning, i.e., estimating how many inputs satisfy a given property. To this end, our paper proposes a novel and principled framework to quantitative verification of logical properties specified over neural networks. Our framework is the first to provide PAC-style soundness guarantees, in that its quantitative estimates are within a controllable and bounded error from the true count. We instantiate our algorithmic framework by building a prototype tool called NPAQ that enables checking rich properties over binarized neural networks. We show how emerging security analyses can utilize our framework in 3 concrete point applications: quantifying robustness to adversarial inputs, efficacy of trojan attacks, and fairness/bias of given neural networks.
|
4
|
On testing of Uniform Samplers
Sourav Chakraborty and Kuldeep S. Meel
Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019.
Recent years have seen an unprecedented adoption of artificial intelligence in wide variety of applications ranging from medical diagnosis, automobile, security to aircraft collision avoidance. Probabilistic reasoning is a key component of such modern artificial intelligence systems. Sampling techniques form the core of the state of the art probabilistic reasoning systems. In contrast to testing for deterministic programs, where one trace is sufficient to prove the existence of a bug; such is not the case for samplers as one sample is typically not sufficient to prove non-conformity of the sampler to the desired distribution. This makes one wonder: whether it is possible to design testing methodology to test whether a sampler under test generates samples close to a given distribution. The primary contribution of this paper is a positive answer to the above question: We design, to the best of our knowledge, the first algorithmic framework, Barbarik, to test whether the distribution generated by $\varepsilon-$close or $\eta-$far from the uniform distribution. In contrast to the sampling techniques that require an exponential or sub-exponential number of samples for sampler whose support can be represented by $n$ bits, Barbarik requires only $\mathcal{O}(1/(\eta-\varepsilon)^2)$ samples. We present a prototype implementation of Barbarik and use it to test three state of the art uniform samplers over the support defined by combinatorial constraints. Barbarik is able to provide a certificate of uniformity to one sampler and demonstrate non-uniformity for the other two samplers.
|
5
|
On Computing Minimal Independent Support and Its Applications to Sampling and Counting
Alexander Ivrii, Sharad Malik, Kuldeep S. Meel, and Moshe Y. Vardi
Proceedings of International Conference on Constraint Programming (CP), 2015.
Best Student Paper Award
Constrained sampling and counting are two fundamental problems arising in domains ranging from artificial intelligence and security, to hardware and software testing. Recent approaches to approximate solutions for these problems rely on employing SAT solvers and universal hash functions that are typically encoded as XOR constraints of length n/2 for an input formula with n variables. As the runtime performance of SAT solvers heavily depends on the length of XOR constraints, recent research effort has been focused on reduction of length of XOR constraints. Consequently, a notion of Independent Support was proposed, and it was shown that constructing XORs over independent support (if known) can lead to a significant reduction in the length of XOR constraints without losing the theoretical guarantees of sampling and counting algorithms. In this paper, we present the first algorithmic procedure (and a corresponding tool, called MIS) to determine minimal independent support for a given CNF formula by employing a reduction to group minimal unsatisfiable subsets (GMUS). By utilizing minimal independent supports computed by MIS, we provide new tighter bounds on the length of XOR constraints for constrained counting and sampling. Furthermore, the universal hash functions constructed from independent supports computed by MIS provide two to three orders of magnitude performance improvement in state-of-the-art constrained sampling and counting tools, while still retaining theoretical guarantees.
|