Automatic Induction Proofs of Data-Structures in Imperative Programs

Duc-Hiep Chu, Joxan Jaffar and Minh-Thai Trinh


We consider the problem of automated reasoning about dynamically manipulated data structures. Essential properties are encoded as predicates whose definitions are formalised via user-defined recursive rules. Traditionally, proving relationships between such properties is limited to the \emph{unfold-and-match} (U+M) paradigm which employs systematic transformation steps of folding/unfolding the rules. A proof, using U+M, succeeds when we find a sequence of transformations that produces a final formula which is obviously provable by simply \emph{matching} terms.

Our contribution here is the addition of the fundamental principle of \emph{induction} to this automated process. We first show that some proof obligations that are dynamically generated in the process can be used as \emph{induction hypotheses} in the future, and then we show how to use these hypotheses in an \emph{induction step} which generates a new proof obligation aside from those obtained from the fold/unfold operations. While the adding of induction is an obvious need in general, no automated method has managed to include this in a systematic and general way. The main reason for this is the problem of avoiding \emph{circular reasoning}. Our main result overcomes this with a novel checking condition. Our contribution is a proof method which -- beyond U+M -- performs \emph{automatic} formula re-writing by treating previously encountered obligations in each proof path as possible induction hypotheses.

In the practical evaluation part of this paper, we show how the commonly used technique of using unproven \emph{lemmas} can be avoided, using realistic benchmarks. This not only removes the current burden of coming up with the appropriate lemmas, but also significantly boosts up the verification process, since lemma applications, coupled with unfolding, often induce a very large search space. In the end, our method automatically reasons about a new class of formulas arising from practical program verification.