Learning Universally Quantified Invariants of Linear Data Structures
Pranav Garg¹, Christof Löding², P. Madhusudan¹ and Daniel Neider²
¹ University of Illinois at Urbana-Champaign
² RWTH Aachen, Germany
Black-box learning of invariants
Renewed interest in applying learning to synthesizing invariants [Sharma et al. CAV-12], [Sharma et al. SAS-13], [Kong et al. APLAS-10].
Advantages of black-box learning with respect to white-box techniques:
- verification of complex programs that have simple invariants
- generalization
- apply extremely scalable machine-learning algorithms to verification
(Diagram: the Learner proposes a hypothesis H to the Teacher, who checks it against the Program.)
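To make this learner/teacher loop concrete, here is a toy, self-contained Python sketch; the class names, the interval "invariant", and the range-based teacher are invented for illustration and are not the paper's setup.

```python
class IntervalLearner:
    """Toy learner: conjectures the interval [min, max] of the values seen so far."""
    def __init__(self):
        self.seen = []
    def propose(self):
        lo = min(self.seen, default=0)
        hi = max(self.seen, default=0)
        return lambda x: lo <= x <= hi            # the hypothesis invariant H
    def absorb(self, counterexample):
        self.seen.append(counterexample)

class ProgramTeacher:
    """Toy teacher: the 'program' reaches exactly the values 0..9."""
    def check(self, hypothesis):
        for x in range(10):
            if not hypothesis(x):
                return x                          # a reachable value that H excludes
        return None                               # H covers everything reachable

def learn_invariant(learner, teacher):
    while True:
        hypothesis = learner.propose()            # learner proposes H
        feedback = teacher.check(hypothesis)      # teacher checks H against the program
        if feedback is None:
            return hypothesis
        learner.absorb(feedback)                  # learner generalizes from the feedback

inv = learn_invariant(IntervalLearner(), ProgramTeacher())
print(inv(5), inv(12))                            # True False: 0..9 is invariant
```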
Active Learning and Passive Learning
Active learning:
- the learner queries the teacher with equivalence and membership queries
Passive learning:
- given a sample S = (examples, counter-examples), learn the simplest concept
(Diagram: the active learner exchanges membership/equivalence queries and yes/no answers with a teacher; the passive learner is handed a sample S.)
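A minimal Python sketch of the two interfaces; the class name ActiveTeacher, its methods, and the toy "sorted list" target concept are illustrative, not from the paper.

```python
# Minimal sketch of the two settings (illustrative names, toy target concept).

class ActiveTeacher:
    """Answers membership and equivalence queries about a target concept."""
    def __init__(self, target):
        self.target = target                      # target: word -> bool

    def membership(self, word):
        return self.target(word)                  # "does the target accept this word?"

    def equivalence(self, hypothesis, test_words):
        for w in test_words:                      # bounded check, enough for a sketch
            if hypothesis(w) != self.target(w):
                return w                          # counterexample to the hypothesis
        return None                               # no disagreement found

# Active setting: the learner drives the interaction through queries.
teacher = ActiveTeacher(target=lambda w: list(w) == sorted(w))
print(teacher.membership([1, 2, 3]))                   # True
print(teacher.equivalence(lambda w: True, [[2, 1]]))   # [2, 1] is a counterexample

# Passive setting: the learner is handed a fixed sample and must generalize from it.
sample = {'positive': [[1, 2, 3], [0, 0, 5]], 'negative': [[3, 1]]}
```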
Overview
Build active learning algorithms for learning quantified formulas over linear data structures (arrays/lists):
- introduce Quantified Data Automata (QDAs), a normal form for such invariants
- build an active learning algorithm for QDAs
Build a passive learning algorithm using the active learning algorithm:
- based on an imprecise teacher that answers questions with respect to the samples
Introduce elastic QDAs (EQDAs) that translate to decidable logics:
- develop learning algorithms for EQDAs
(Example: the list 5 → 7 → 8 → 9 pointed to by head is sorted.)
Program Configurations / Data words
(Figure: a program configuration, an array of data values with pointer variables head and i, and its representation as a data word in which each position carries a data value together with the pointers that point to it.)
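A small sketch of one possible encoding, assuming a data word is represented as a sequence of (data value, set of pointers at that position) pairs; the helper name to_data_word and the concrete values are illustrative.

```python
# Sketch of one possible encoding (assumed representation): a configuration is
# an array of data values plus a map from pointer variables to indices; the
# data word pairs each value with the set of pointers at that position.

def to_data_word(values, pointers):
    """pointers: dict mapping pointer-variable name -> index into values."""
    return [(v, {p for p, idx in pointers.items() if idx == i})
            for i, v in enumerate(values)]

# Illustrative configuration: an array with 'head' on cell 0 and 'i' on cell 3.
config = to_data_word([3, 2, 4, 7, 8, 9], {'head': 0, 'i': 3})
print(config)   # [(3, {'head'}), (2, set()), (4, set()), (7, {'i'}), (8, set()), (9, set())]
```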
Quantified Data Automata
QDAs represent universally quantified properties of linear data structures.
Example: the QDA over head, y1, y2 whose output formula is data(y1) <= data(y2) expresses that the list pointed to by head is sorted.
Quantified Data Automata
Fix P – the program's pointer variables, Y – a set of universally quantified variables, and F – a numerical abstract domain of data formulas.
A QDA over linear data structures:
- reads a data word annotated with the pointers in P and the variables in Y
- checks whether the data stored at these positions satisfies a data property
A QDA accepts a data word w with pointers P if it accepts all possible extensions of w with valuations for Y.
(Figure: the sortedness QDA over head, y1, y2 with output formula data(y1) <= data(y2).)
Valuation words
Valuation word = a data word over P together with a valuation for Y.
Universal quantification: a QDA accepts a data word iff it accepts ALL of its corresponding valuation words.
(Figure: a data word with pointers head and i, and two of its valuation words, obtained by placing y1 and y2 at different positions.)
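A sketch of the universal-acceptance rule under the same assumed representation of data words; valuation_words and qda_accepts_data_word are hypothetical helper names, and the per-valuation acceptance test is passed in as a callable.

```python
# Sketch: enumerate valuation words and apply the universal-acceptance rule.
from itertools import product

def valuation_words(data_word, Y):
    """Yield every extension of the data word with a valuation of the variables in Y."""
    positions = range(len(data_word))
    for choice in product(positions, repeat=len(Y)):      # one position per variable
        yield [(value, labels | {y for y, pos in zip(Y, choice) if pos == i})
               for i, (value, labels) in enumerate(data_word)]

def qda_accepts_data_word(data_word, Y, accepts_valuation_word):
    # universal quantification: the data word is accepted iff EVERY valuation word is
    return all(accepts_valuation_word(v) for v in valuation_words(data_word, Y))

w = [(3, {'head'}), (2, set()), (4, {'i'}), (7, set())]
print(sum(1 for _ in valuation_words(w, ['y1', 'y2'])))   # 16 valuation words for |Y| = 2
```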
Quantified Data Automata
A QDA is a deterministic, finite register automaton over words, with each state q labeled by a data formula f(q).
On a valuation word, the QDA reads the pointer and universal variables and stores the data values at the quantified positions in the register reg. At the final state q, the QDA checks whether these data values satisfy the formula labeling the state:
- reg satisfies f(q): the QDA accepts the valuation word
- reg does not satisfy f(q): the QDA rejects the valuation word
(Figure: a run on a valuation word over head, y1, i, y2, storing the values at y1 and y2 in reg and checking f(q) = data(y1) <= data(y2) at the end.)
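A simplified sketch of such a run, assuming a dictionary-based QDA representation (an initial state, transitions keyed by the set of labels read at a position, and a formula per state); only the transitions needed for this example are listed, so it illustrates the mechanics rather than the paper's automaton model.

```python
def run_qda(qda, valuation_word):
    state, registers = qda['init'], {}
    for value, labels in valuation_word:
        for y in labels & qda['quantified']:
            registers[y] = value                       # store the data value read at y
        state = qda['delta'][(state, frozenset(labels))]
    return qda['formula'][state](registers)            # check the formula at the final state

BLANK = frozenset()
sorted_qda = {
    'init': 'q0',
    'quantified': {'y1', 'y2'},
    'delta': {
        ('q0', frozenset({'head', 'y1'})): 'q1',       # y1 sits on the head cell
        ('q0', frozenset({'head'})): 'q0',
        ('q0', BLANK): 'q0',
        ('q0', frozenset({'y1'})): 'q1',
        ('q1', BLANK): 'q1',
        ('q1', frozenset({'y2'})): 'q2',
        ('q2', BLANK): 'q2',
    },
    'formula': {
        'q0': lambda reg: True,
        'q1': lambda reg: True,
        'q2': lambda reg: reg['y1'] <= reg['y2'],      # f(q) = data(y1) <= data(y2)
    },
}

v = [(3, {'head', 'y1'}), (5, set()), (7, {'y2'}), (9, set())]
print(run_qda(sorted_qda, v))                          # True: 3 <= 7
```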
Learning QDAs
QDAs are finite automata that output data formulas.
We lift Angluin's L* algorithm for learning DFAs to learning QDAs: given a teacher, the unique minimal QDA can be learned in time polynomial in the size of this minimal QDA.
(Figure: the sortedness QDA viewed as a regular expression over head, y1, y2 that outputs data(y1) <= data(y2).)
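For reference, a compact sketch of the classic L* loop for plain DFAs, with Maler–Pnueli-style counterexample handling (all suffixes of the counterexample are added as distinguishing experiments) and a simulated teacher for a toy language; the paper lifts this scheme to QDAs, which this sketch does not attempt.

```python
from itertools import product

class DFATeacher:
    """Teacher for the toy language 'even number of a's' over {a, b}."""
    def __init__(self, alphabet):
        self.alphabet = alphabet
    def member(self, w):
        return w.count('a') % 2 == 0
    def equivalent(self, accepts, max_len=6):
        for n in range(max_len + 1):            # bounded brute-force equivalence check
            for w in map(''.join, product(self.alphabet, repeat=n)):
                if accepts(w) != self.member(w):
                    return w                    # counterexample
        return None

def lstar(teacher, alphabet):
    S, E = [''], ['']                           # access strings and suffixes
    row = lambda s: tuple(teacher.member(s + e) for e in E)
    while True:
        # close the observation table
        changed = True
        while changed:
            changed = False
            for s, a in product(list(S), alphabet):
                if row(s + a) not in {row(t) for t in S}:
                    S.append(s + a)
                    changed = True
        # conjecture: states are the distinct rows, represented by strings in S
        rep = {row(s): s for s in S}
        def accepts(word):
            cur = ''
            for a in word:
                cur = rep[row(cur + a)]
            return row(cur)[0]                  # column for the empty suffix
        cex = teacher.equivalent(accepts)
        if cex is None:
            return rep, accepts
        for i in range(len(cex) + 1):           # add all suffixes of the counterexample
            if cex[i:] not in E:
                E.append(cex[i:])

rep, accepts = lstar(DFATeacher('ab'), 'ab')
print(len(rep), accepts('aa'), accepts('a'))    # 2 True False
```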
Elastic Quantified Data Automata (EQDA)
A subclass of QDAs that translate to decidable logics:
- the Array Property Fragment (APF) [Bradley et al. VMCAI-06]
- a decidable fragment of Strand over lists [Madhusudan et al. POPL-11]
These logics cannot test whether two universal variables are a bounded distance apart.
Restriction for EQDAs: all transitions on blank symbols (positions carrying no pointer or universal variable) must be self-loops.
(Figure: a QDA that places y1 and y2 a bounded distance apart lies outside APF; its elastic counterpart, with blank self-loops, lies inside APF.)
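Under the transition representation assumed in the run_qda sketch above, the elasticity restriction is easy to check mechanically; is_elastic and BLANK are illustrative names.

```python
# Sketch: check the EQDA restriction that every transition on the blank symbol
# (a position holding no pointer or quantified variable) is a self-loop.

BLANK = frozenset()

def is_elastic(delta):
    return all(src == dst
               for (src, symbol), dst in delta.items()
               if symbol == BLANK)

delta = {('q0', BLANK): 'q0', ('q0', frozenset({'y1'})): 'q1', ('q1', BLANK): 'q1'}
print(is_elastic(delta))                        # True: all blank moves are self-loops

delta[('q1', BLANK)] = 'q2'                     # a blank move that changes state
print(is_elastic(delta))                        # False: not elastic
```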
Elastic Quantified Data Automata (EQDA)
Unique minimal over-approximation theorem: a QDA A can be uniquely and minimally over-approximated by a language of valuation words that is accepted by an EQDA A_el.
The construction of A_el from a given QDA A is called elastification.
Hence learning EQDAs reduces to learning QDAs followed by elastification.
(Figure: the EQDA A_el among other elastic over-approximations B_el, C_el of the QDA A.)
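A deliberately simplified illustration of the idea behind elastification, not the paper's construction: merging the endpoints of non-self-loop blank transitions and turning blank moves into self-loops can only enlarge the set of accepted valuation words, hence over-approximates. The real construction additionally keeps the result deterministic and combines the output formulas of merged states, which this sketch omits.

```python
BLANK = frozenset()

def elastify(delta):
    """Quotient the state space by the blank transitions (union-find), self-looping blanks."""
    parent = {}
    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            q = parent[q]
        return q
    # union the endpoints of every blank transition
    for (src, symbol), dst in delta.items():
        if symbol == BLANK:
            parent[find(src)] = find(dst)
    # rebuild transitions over merged states; blank moves become self-loops
    # (targets are sets: the result may be nondeterministic in this sketch)
    merged = {}
    for (src, symbol), dst in delta.items():
        s, d = find(src), find(dst)
        merged.setdefault((s, symbol), set()).add(s if symbol == BLANK else d)
    return merged

delta = {('q0', BLANK): 'q1', ('q1', frozenset({'y1'})): 'q2'}
print(elastify(delta))   # q0 and q1 are merged; the blank move is now a self-loop
```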
Passively learning QDAs
Given the samples S+ and S-, the teacher uses them to answer the active learner; it wants the active learner to construct a QDA that includes S+ and excludes S-.
Membership query for a word s:
- if s belongs to S+, return yes
- if s belongs to S-, return no
- otherwise, return no (erring on the side of keeping the learned concept semantically small)
Equivalence query:
- check whether the conjectured invariant is consistent with S+ and S-
The learned QDA might be non-optimal, but is usually small; the running time is polynomial in the size of the learned QDA.
(Diagram: the passive learner is built from an active learner interacting with a teacher that answers from the sample S+, S-.)
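A minimal sketch of such a sample-based teacher, with illustrative names: membership queries outside the sample are answered "no", and equivalence queries return any sample element the conjecture misclassifies.

```python
class SampleTeacher:
    """Imprecise teacher that answers queries using only the finite samples S+ and S-."""
    def __init__(self, s_plus, s_minus):
        self.s_plus, self.s_minus = set(s_plus), set(s_minus)

    def membership(self, w):
        if w in self.s_plus:
            return True
        if w in self.s_minus:
            return False
        return False        # unknown: answer 'no' to keep the learned concept small

    def equivalence(self, hypothesis):
        for w in self.s_plus:
            if not hypothesis(w):
                return w    # positive example wrongly excluded
        for w in self.s_minus:
            if hypothesis(w):
                return w    # negative example wrongly included
        return None         # consistent with the samples: accept the conjecture

teacher = SampleTeacher(s_plus={'ab', 'aabb'}, s_minus={'a'})
print(teacher.membership('ab'), teacher.membership('ba'))   # True False
```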
Experiments
Run each program on arrays/lists of small bounded sizes, with data values from a bounded data domain, e.g. {0, 1, 2}.
Extract the concrete data structures that manifest at loop headers; this yields the set S+ on which passive learning is performed.
- fix F to the Cartesian lattice of atomic formulas over the relations {=, <, ≤}
Learn QDAs using Angluin's algorithm:
- the learner never asks long membership queries
- the teacher therefore often has correct answers
The learned QDA is over-approximated to an elastic QDA to obtain a quantified invariant over decidable Strand or APF.
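A sketch of what this sampling step might look like for a tiny linear-search program; the program, the bounds, and the helper names are illustrative, not the paper's benchmarks or experimental harness.

```python
# Run a simple program on all small arrays over the data domain {0, 1, 2} and
# record the configuration at the loop header as a data word for the sample S+.
from itertools import product

def to_data_word(values, pointers):
    return tuple((v, frozenset(p for p, idx in pointers.items() if idx == i))
                 for i, v in enumerate(values))

def find(a, key):
    """Linear search; yields the loop-header configuration before each iteration."""
    i = 0
    while i < len(a):
        yield to_data_word(a, {'i': i})
        if a[i] == key:
            break
        i += 1

s_plus = set()
for n in range(1, 4):                                    # small bounded sizes
    for a in product(range(3), repeat=n):                # data domain {0, 1, 2}
        for config in find(list(a), key=2):
            s_plus.add(config)
print(len(s_plus), "loop-header configurations collected")
```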
Experiments

Program           #Equiv.   #Mem    #States   Time (teacher)   Time (learner)
BUBBLE-SORT           3       447      12          0.19             0.01
QUICK-SORT            1        37       5          0.03             0.00
SELECTION-SORT        3       306      11          0.18             0.01
INSERTION-SORT        3       305      11          0.19             0.00
HEAP-SORT             1        57       6          0.05             0.01
SORTED-FIND           6      1683      15          0.04             0.01
SORTED-INSERT         3      1096      20          0.04             0.01
SORTED-MERGE         15      7754      21          0.50             0.06
SORTED-REVERSE        2       439      18          0.02             0.00
COPY                  2       146      10          1.75             0.00
COMPARE               2       146      10          0.51             0.00
MAX                   7      1608      14          0.08             0.00
INIT                  5       879      10          0.07             0.01
FIND                  2       121       8          0.05             0.00
PARTITION            10     11807      38         11.40             0.11
SPLIT                 2       287      14          0.21             0.00
COREUTILS-SORT        1       737       5          0.03             0.07
Related Work
Daikon [Ernst et al. ICSE-00]:
- conjunctive Boolean learning
- learns quantified invariants over arrays, to some extent
Applications of learning in verification:
- rely-guarantee contracts [Cobleigh et al. TACAS-03, Alur et al. CAV-05]
- stateful interfaces [Alur et al. POPL-05]
- learning quantified invariants over predicates [Kong et al. APLAS-10]
Machine learning algorithms for invariant synthesis [Sharma et al. CAV-12, SAS-13, ESOP-13]
Conclusion
Learning universally quantified invariants over linear data structures:
- Quantified Data Automata (QDAs) and elastic QDAs
- Active learning for QDAs
- Unique elastification
- An algorithm for passively learning QDAs/EQDAs
- Experimental validation
Future work:
- Extensions to trees, to capture universally quantified properties such as binary search trees, max-heaps, ...
- Combining automata-based structural learning with machine-learning algorithms for learning data formulas
Thank you!