A Framework for Describing Recursive Data Structures
Kenneth Roe, Scott Smith
Shape analysis and recursive data structures
The objective is to verify the integrity of dynamic data structures such as lists and trees
Based on the principles of separation logic
Builds on work by Byron Cook and colleagues
The key contribution is reasoning about data structures more complex than linked lists
–Regular expressions are used to describe paths through data structures
Creating a Coq formalism
Sample program data structures
struct list {
    struct list *n;       /* next list element */
    struct tree *t;       /* tree node this element refers to */
};
struct tree {
    struct tree *l, *r;   /* left and right children */
    int value;
};
Sample program code
#include <stdlib.h>

struct list *p;

void build_pre_order(struct tree *r) {
    struct list *i = NULL, *n;
    struct tree *t = r, *x;
    p = NULL;
    while (t) {
        /* push the current tree node onto the list p */
        n = p;
        p = malloc(sizeof(struct list));
        p->t = t;
        p->n = n;
        if (t->l == NULL && t->r == NULL) {
            if (i == NULL) {
                t = NULL;             /* nothing left to visit */
            } else {
                /* pop the next pending subtree from the list i */
                struct list *tmp = i->n;
                t = i->t;
                free(i);
                i = tmp;
            }
        } else if (t->r == NULL) {
            t = t->l;
        } else if (t->l == NULL) {
            t = t->r;
        } else {
            /* both children present: save the right subtree on i, descend left */
            n = i;
            i = malloc(sizeof(struct list));
            i->n = n;
            x = t->r;
            i->t = x;
            t = t->l;
        }
    }
}
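A small driver can make the effect of the sample program concrete. The sketch below restates the traversal using the field names n and t from struct list, and adds three helpers that are not part of the slides (make_node, list_values and demo are hypothetical names) to build a four-node tree and read back the list headed by p.

```c
#include <assert.h>
#include <stdlib.h>

struct tree { struct tree *l, *r; int value; };
struct list { struct list *n; struct tree *t; };

struct list *p;  /* result list, as in the sample program */

/* The traversal from the sample program, using fields n and t of struct list. */
void build_pre_order(struct tree *r) {
    struct list *i = NULL, *n;
    struct tree *t = r;
    p = NULL;
    while (t) {
        n = p;
        p = malloc(sizeof(struct list));
        p->t = t;
        p->n = n;
        if (t->l == NULL && t->r == NULL) {
            if (i == NULL) {
                t = NULL;
            } else {
                struct list *tmp = i->n;
                t = i->t;
                free(i);
                i = tmp;
            }
        } else if (t->r == NULL) {
            t = t->l;
        } else if (t->l == NULL) {
            t = t->r;
        } else {
            n = i;
            i = malloc(sizeof(struct list));
            i->n = n;
            i->t = t->r;
            t = t->l;
        }
    }
}

/* Hypothetical helper: allocate one tree node. */
struct tree *make_node(int value, struct tree *l, struct tree *r) {
    struct tree *node = malloc(sizeof(struct tree));
    node->l = l;
    node->r = r;
    node->value = value;
    return node;
}

/* Hypothetical helper: copy the tree values referenced by the list into out
   (at most max of them) and return the number of list elements. */
int list_values(const struct list *head, int *out, int max) {
    int count = 0;
    for (; head != NULL; head = head->n) {
        if (count < max) out[count] = head->t->value;
        count++;
    }
    return count;
}

/* Hypothetical helper: run the traversal on the tree with root 1,
   left child 2 (whose left child is 4) and right child 3. */
int demo(int *out, int max) {
    struct tree *root = make_node(1,
                                  make_node(2, make_node(4, NULL, NULL), NULL),
                                  make_node(3, NULL, NULL));
    build_pre_order(root);
    return list_values(p, out, max);
}
```

Read from the head of p, the values for this tree are 3, 4, 2, 1, which is the pre-order sequence 1, 2, 4, 3 reversed, because each visited node is pushed onto the front of p. The sketch omits allocation-failure checks and does not free the tree.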
Invariants
The program maintains two well-formed linked lists, the heads of which are pointed to by i and p.
The program maintains a well-formed tree pointed to by r.
t always points to an element in the tree rooted at r.
The two lists and the tree do not share any nodes.
Other than the memory used for the two lists and the tree, no other heap memory is allocated.
The t field of every element in both list structures points to an element in the tree.
[Figure: the lists headed by p and i, each a chain of nodes with n and t fields, and the tree rooted at r with six nodes numbered 1 to 6, each with l and r fields; t points to a node in the tree.]
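Several of these invariants can be checked at run time on a concrete program state, which may help when reading them. The following is a minimal sketch, not part of the framework itself: the function names are hypothetical, the tree and lists are assumed to be finite and nil-terminated, and t == NULL is accepted as an assumption to cover the state in which the loop exits. The "no other heap memory is allocated" invariant is not checkable this way and is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tree { struct tree *l, *r; int value; };
struct list { struct list *n; struct tree *t; };

/* True if x is a node of the tree rooted at r (reachable via l and r fields).
   Assumes the tree is finite and acyclic. */
bool tree_contains(const struct tree *r, const struct tree *x) {
    if (r == NULL) return false;
    if (r == x) return true;
    return tree_contains(r->l, x) || tree_contains(r->r, x);
}

/* True if the t field of every list element points to a node of the tree. */
bool list_targets_in_tree(const struct list *head, const struct tree *r) {
    for (; head != NULL; head = head->n)
        if (!tree_contains(r, head->t)) return false;
    return true;
}

/* True if the two lists share no element. Assumes both are nil-terminated. */
bool lists_disjoint(const struct list *a, const struct list *b) {
    for (; a != NULL; a = a->n)
        for (const struct list *c = b; c != NULL; c = c->n)
            if (a == c) return false;
    return true;
}

/* Conjunction of the run-time checkable invariants for one program state.
   Assumption: t == NULL is allowed, as happens when the loop terminates. */
bool invariants_hold(const struct list *p, const struct list *i,
                     const struct tree *r, const struct tree *t) {
    return (t == NULL || tree_contains(r, t))
        && list_targets_in_tree(p, r)
        && list_targets_in_tree(i, r)
        && lists_disjoint(p, i);
}
```

A dynamic check like this only inspects one state at a time; the point of the framework is to establish the same properties statically for every state the loop can reach.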
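State representation
$$
\begin{aligned}
& r \xrightarrow{(l|r)^*} t \\
\wedge\ & (\forall v.\, \exists z.\; p \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\wedge\ & (\forall v.\, \exists z.\; i \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\mid\ & R_{n}(p,\ ) * R_{(l|r)}(r,\ ) * R_{n}(i,\ )
\end{aligned}
$$
[Figure: the heap drawn as three regions joined by *, labeled R_n(p, ), R_(l|r)(r, ) and R_n(i, ): the list at p, the tree at r, and the list at i, with t pointing into the tree.]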
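One way to read a path expression such as (l|r)* from r to t is that some word in the language of the regular expression, followed field by field starting at r, ends at t. The sketch below illustrates that reading for tree nodes only; follow is a hypothetical helper, and the word is passed as a string over the characters 'l' and 'r'.

```c
#include <assert.h>
#include <stddef.h>

struct tree { struct tree *l, *r; int value; };

/* Follow a word over the alphabet {l, r} starting at node r.
   Returns the node reached, or NULL if the path runs off the tree
   or the word contains any other character. */
const struct tree *follow(const struct tree *r, const char *word) {
    for (; r != NULL && *word != '\0'; word++) {
        if (*word == 'l') r = r->l;
        else if (*word == 'r') r = r->r;
        else return NULL;
    }
    return r;
}
```

Under this reading, t is reachable from r by (l|r)* exactly when follow(r, w) == t for some word w, including the empty word, which gives r itself. The list expression n* can be read the same way over the single field n.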
Backward reasoning
Logic rules for back propagation
Generated preconditions are sufficient to establish the postcondition
Not guaranteed to produce the weakest precondition
The system also contains rules for merging states
–Becomes necessary when joining the branches of an if statement
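As a rough illustration of what such rules can look like, the block below writes two standard Hoare-style rules in LaTeX: one for a field load x = y->f and one for joining the branches of an if statement. These are textbook forms given under the assumption that the framework's rules have a similar shape; the slides do not state the actual rules, and q is assumed to be a fresh variable.

```latex
% Field load: replace x in the postcondition Q by a fresh q that names y->f.
\{\ \exists q.\ y \xrightarrow{f} q \ \wedge\ Q[q/x]\ \}
\quad x = y\texttt{->}f \quad
\{\ Q\ \}

% Joining branches: given \{P_1\}\ S_1\ \{Q\} and \{P_2\}\ S_2\ \{Q\},
\{\ (c \wedge P_1) \ \vee\ (\neg c \wedge P_2)\ \}
\quad \texttt{if}\ (c)\ S_1\ \texttt{else}\ S_2 \quad
\{\ Q\ \}
```

The disjunction in the second rule is where state merging comes in: a system that wants a single state description rather than a disjunction has to combine the two branch preconditions into one.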
Back-chaining example
Last line of source code:
n = i;
i = malloc(sizeof(struct list));
i->n = n;
x = t->r;
i->t = x;
t = t->l;
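Postcondition to propagate backward (the loop invariant):
$$
\begin{aligned}
& r \xrightarrow{(l|r)^*} t \\
\wedge\ & (\forall v.\, \exists z.\; p \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\wedge\ & (\forall v.\, \exists z.\; i \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\mid\ & R_{n}(p,\ ) * R_{(l|r)}(r,\ ) * R_{n}(i,\ )
\end{aligned}
$$
[Figure: heap diagram of the lists at p and i and the tree at r, with t pointing into the tree.]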
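Back-propagating over t = t->l replaces t in the postcondition with a variable q that names the l field of t:
$$
\begin{aligned}
& t \xrightarrow{l} q \ \wedge\ r \xrightarrow{(l|r)^*} q \\
\wedge\ & (\forall v.\, \exists z.\; p \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\wedge\ & (\forall v.\, \exists z.\; i \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\mid\ & R_{n}(p,\ ) * R_{(l|r)}(r,\ ) * R_{n}(i,\ )
\end{aligned}
$$
[Figure: the same heap diagram, annotated with the statement t = t->l.]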
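Because reachability of t from r, together with q being the l field of t, implies reachability of q from r, the reachability conjunct can be restated in terms of t:
$$
\begin{aligned}
& t \xrightarrow{l} q \ \wedge\ r \xrightarrow{(l|r)^*} t \\
\wedge\ & (\forall v.\, \exists z.\; p \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\wedge\ & (\forall v.\, \exists z.\; i \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\mid\ & R_{n}(p,\ ) * R_{(l|r)}(r,\ ) * R_{n}(i,\ )
\end{aligned}
$$
[Figure: the same heap diagram, annotated with the statement t = t->l.]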
We have back-propagated over the last statement. Several more statements remain:
n = i;
i = malloc(sizeof(struct list));
i->n = n;
x = t->r;
i->t = x;
t = t->l;
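After back-propagating over the remaining statements, we end up with the following, which is almost our original invariant:
$$
\begin{aligned}
& t \xrightarrow{l} q \ \wedge\ t \xrightarrow{r} e \ \wedge\ r \xrightarrow{(l|r)^*} t \\
\wedge\ & (\forall v.\, \exists z.\; p \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\wedge\ & (\forall v.\, \exists z.\; i \xrightarrow{n^*} v \Rightarrow v \xrightarrow{t} z \wedge r \xrightarrow{(l|r)^*} z) \\
\mid\ & R_{n}(p,\ ) * R_{(l|r)}(r,\ ) * R_{n}(i,\ )
\end{aligned}
$$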
Future work
Coq verification (in progress)
Arrays
Length predicate
Handling procedures
More information: