Automated Theorem Proving: A Retrospection & Applications of Formal Methods CS3234 Aquinas Hobor and Martin Henz.

Outline
Reflections on Coq
Applications of Formal Methods
How to study for the Final

A proverb
“Only those who have been bitten by the snake can understand how it feels.”

Why are theorem provers used?
Very high assurance due to mechanical checking.
Checkers are very thorough: they don’t get tired, don’t get bored, and don’t make mistakes.
If anything, the problem is the opposite: trying to convince a checker that a true thing is true can be frustrating.

In our class…
Many times I helped students who were sure that something was true, and just needed a little help to understand how to convince Coq. Usually it was not more than a line or two…
… several times the difficulty was that the thing that they were sure was true was actually false. (e.g., n – = n : Not true!)

There are some contexts where bugs have enormous cost.
Proofs about real software are hard to get right; Coq found a previously unknown bug in the proof in the (widely used, second-edition) textbook.

Disadvantages of Automated Theorem Proving
Developing the hints / proof by hand can be very labor-intensive.
It can be very difficult to formalize correctness:
– “correct” operating system?
– “correct” web browser?
– “correct” compiler?
The learning curve for these systems can be steep.
Now you understand…

One more advantage… they are fun to use!
A bit like writing software in a scripting language.
“Building such scripts is surprisingly addictive, in a videogame kind of way…” (Xavier Leroy)
The advantage of never having to worry about bugs in the finished product.
Can work on math at 3 AM without fear.
(Unclear if this is an advantage.)

In your own words…
“I found COQ is really an amazing tool. I like to play with it now. haha!”
“I love video games. Coq is the first programming language I use as a game. I really enjoy it haha!”
“[Proof completed.] is the best feeling in the world!”
(and many others)
Of course, it’s harder to give negative feedback, so…

In the words of Nick Benton
Senior Researcher, Microsoft Research, Cambridge UK.
Wrote a paper in 2006 titled “Machine Obstructed Proof” (en-us/um/people/nick/mop.pdf).

In the words of Nick Benton
“After years doing programming language theory without going near a proof assistant, I was finally convinced by the POPLmark ‘buzz’ and conversations at ICFP’05 that it was time to try one.”
… [there was a workshop giving an introduction to Coq] …
“The workshop description says “the available tools are [...] difficult to learn, inadequately documented, and lacking in specific library facilities required for work in programming languages”. I can confirm that I have rarely felt as stupid and frustrated as I did during my first few weeks using Coq.”

In the words of Nick Benton
“Scripts are unreadable by themselves, as one has no idea what the tactics are doing to the proof state, and the documentation for them is incomprehensible to the novice. The only thing that works is lots and lots of trial and error in an interactive environment, and I still couldn’t give a coherent general description of what some of the tactics I’ve used many times actually do, or how they differ from half a dozen apparently similar ones. And basic ones are still missing; I spent days fighting with elim, case, destruct and variations on induction and still kept finding myself having done case splits without the information about which branch I was in. This was so frustrating I gave up on Coq (and spent a week playing with HOL Light) until Georges Gonthier showed me the magic, and frankly bizarre, incantation generalize (refl_equal x); pattern x at -1; case x.”

In the words of Nick Benton
“There are bugs. On Windows, CoqIDE falls over, whilst Proof General only works with the original 2004 version of Coq 8. I spent ages defining Setoid structures on everything, only to find Setoid rewriting throws an ‘anomaly’ exception in interesting contexts.”
(You may recall a HW assignment (hw5) where I said to use “generalize” instead of “spec”.)

In the words of Nick Benton
“I had many similar difficulties, but then started to make progress. Just having intermediate stages of the work in a computerized form rather than on many pages of paper proved a major benefit. Far more often than I’d expected, one can alter definitions and then mildly tweak the previous version of a proof to keep it up to date. On paper, I tend to keep going back to the top and doing everything from scratch to be sure everything is still consistent.”

In the words of Nick Benton
“Automated proving is not just a slightly more fussy version of paper proving and neither (Curry-Howard notwithstanding) is it really like programming. It’s a strange new skill, much harder to learn than a new programming language or application, or even many bits of mathematics. I’m resistant to investing significant effort in tools (I don’t write clever TeX or Emacs macros), but the payoff really came the second time I used Coq: I was able to prove some elementary but delicate results for a different paper in just a day or so. Coq is worth the bother and it, or something like it, is the future, if only we could make the initial learning experience a few thousand times less painful.”

Of note
Somewhat similar courses, using Coq to examine program semantics, have been offered to undergraduates in the past year at:
– University of Pennsylvania
– Princeton University
– Harvard University (a bit different)
– Probably others
For good or evil, you have been on the bleeding edge of formal methods teaching…

So…
Well done for getting through it!
Hopefully you have learned a lot (we noticed that your pen-and-paper induction proofs improved significantly), and had some fun as well.
With a bit of luck you don’t hate your Profs (too much!) for putting it in the course…
If you have suggestions for making the process easier in future years, please let us know.

Outline
Reflections on Coq
Applications of Formal Methods
How to study for the Final

How do people use formal methods?
1. Type theory
2. Program analysis
3. Proof-carrying code
4. Certified compilation

Type Theory
You are familiar with languages like Java, where your variables (and function parameters, etc.) have a declared type.
– int myInt;
– List<List<Integer>> myListofListofInts;
– Object myObject;
The idea is to classify values into sets.
During compilation, there is a phase called type-checking, where the compiler looks for type errors, such as
– myListofListofInts = 5; // Oops… what does this mean?

Type Theory
Why do we do this? Because type errors are almost always mistakes in the program.
– The analogy is to units in science (e.g., a newton of force is 1 kilogram-meter per second squared).
– You check your units when you do science for the same reason that it is a good idea to do type checking when programming.
We can (usually) do type-checking in roughly linear time, and can discover many bugs much faster than by testing.
– This is a form of automatic proof generation (which we have explained very little in this course, but have used a lot, e.g., “auto”, “omega”, etc.).
– The idea with a type system is to trade off expressivity for automatic proof generation.
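
The units analogy can be made concrete in Java itself. The following is a minimal sketch, not something from the lecture: the names Meters, Seconds, MetersPerSecond and speed are hypothetical, and it assumes Java 17 or later for records. Wrapping raw numbers in distinct types lets the type checker reject a unit mix-up at compile time.

```java
final class UnitsAsTypesSketch {
    /** A length; it can only be added to another length. */
    record Meters(double value) {
        Meters plus(Meters other) {
            return new Meters(value + other.value);
        }
    }

    /** A duration. */
    record Seconds(double value) {}

    /** A speed, obtained by dividing a length by a duration. */
    record MetersPerSecond(double value) {}

    static MetersPerSecond speed(Meters distance, Seconds time) {
        return new MetersPerSecond(distance.value() / time.value());
    }

    public static void main(String[] args) {
        Meters total = new Meters(100).plus(new Meters(50));
        System.out.println(speed(total, new Seconds(10)));

        // The next line is a unit error, and the compiler rejects it
        // because a Seconds value is not a Meters value:
        // new Meters(100).plus(new Seconds(10));
    }
}
```

The cost is the trade-off named on the slide: the program has to say more about its values up front, and in exchange a whole class of mistakes is ruled out without running a single test.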

Type Theory
The field that studies program typing is called Type Theory.
It turns out that there is a deep connection between formal logic and type theory, called the Curry-Howard Isomorphism.
Languages with strong type systems include
– Java / C#
– SML / OCaml
– Haskell
Languages with weak type systems include
– C / C++
Languages without any static type system include
– Python / JavaScript
– Assembly (some have weak typing)
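
As a small illustration of the Curry-Howard idea, here is a sketch in Lean 4 (a proof assistant in the same broad family as Coq; the course itself used Coq, and the names below are made up for this example). The proof that "p and q" implies "q and p" is literally a function that swaps the two halves of a pair, and it has the same shape as the ordinary pair-swapping program underneath it.

```lean
-- Proposition as type, proof as program:
-- from a proof of (p ∧ q), build a proof of (q ∧ p).
theorem andSwapExample (p q : Prop) : p ∧ q → q ∧ p :=
  fun ⟨hp, hq⟩ => ⟨hq, hp⟩

-- The corresponding ordinary program: swap the components of a pair.
def pairSwapExample {α β : Type} : α × β → β × α :=
  fun (a, b) => (b, a)
```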

Program Analysis
Another use for formal methods is automatic program analysis; e.g., bounds checking.
Lots of code uses integers that are supposed to stay within certain bounds:
– Arrays
– Overflow (max_int + 1: bad idea)
It is possible to develop software that examines source code and automatically proves that (for example) an array access is always in-bounds.

Bounds Checking
These automatic tools usually require a bit of help from the programmer.
But they are very effective: they can catch many bugs long before they would have been caught through testing.
Companies that use this kind of technology:
– Airbus
– Microsoft
– etc.

Bounds Checking
How it works (one choice):
The programmer puts annotations in the code
– int x __inrange(0,10);
– int n __lessthan(sizeof(myArray));
The tool then uses a form of Hoare logic to propagate these bounds through the code.
If it is unable to prove that the invariant is obeyed, it complains.
To run on real code, tools must be very complex
– Pointer arithmetic, aliasing, etc.
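
To give a feel for what "propagating bounds" means, here is a minimal Java sketch. It is an assumption-laden toy, not how any real tool works: it uses plain interval arithmetic rather than Hoare logic, it handles only "variable plus constant", and the names Interval and provablyInBounds are hypothetical. It assumes Java 17 or later.

```java
final class BoundsCheckSketch {
    /** A closed integer range [lo, hi] that a variable is known to stay within. */
    record Interval(int lo, int hi) {
        Interval {
            if (lo > hi) throw new IllegalArgumentException("empty interval");
        }

        /** Range of (x + k) when x lies in this range; throws ArithmeticException on int overflow. */
        Interval plus(int k) {
            return new Interval(Math.addExact(lo, k), Math.addExact(hi, k));
        }
    }

    /** True only if every index in the range is a valid index for an array of the given length. */
    static boolean provablyInBounds(Interval index, int arrayLength) {
        return index.lo() >= 0 && index.hi() < arrayLength;
    }

    public static void main(String[] args) {
        Interval x = new Interval(0, 10); // plays the role of: int x __inrange(0,10);
        int length = 16;                  // hypothetical array length

        // a[x + 5]: the index range is [5, 15], so the access is provably safe.
        System.out.println(provablyInBounds(x.plus(5), length));

        // a[x + 6]: the index range is [6, 16], so the tool cannot prove safety and complains.
        System.out.println(provablyInBounds(x.plus(6), length));
    }
}
```

Note that a "cannot prove it" answer does not mean the access is wrong, only that the analysis was too weak to show it is right. That is why such tools usually need a bit of help from the programmer.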

Proof-Carrying Code
Problem: we would like to have a way to get code from an untrusted source, which we then want to run on our computer
– e.g., Web Browser
This sounds like a bad idea…
But (it turns out) people really want to do it.

Proof-Carrying Code
So, we will require that the untrusted code provider sends us both
– The code, which we will run only after checking
– A proof that the code obeys some safety policy
Simplest version you are familiar with:
– Java web applets

Proof-Carrying Code
How it works:
– Modify the compiler to take a proof at the source code level (maybe just a proof that the source program is well-typed) and transform that proof, as it compiles, into a proof at the machine level.
– Modify the runtime system to accept both code and proof; it must check the proof, then run the code.
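
The consumer side of this idea (check the proof, then run the code) can be sketched in Java. Everything here is a made-up toy for illustration: a five-instruction stack machine, a safety policy of "never pop an empty stack", and a "proof" that is just the claimed stack depth before each instruction. Real proof-carrying code uses far richer policies and proofs. The sketch assumes Java 17 or later.

```java
import java.util.ArrayDeque;
import java.util.List;

final class ProofCarryingCodeSketch {
    enum Op { PUSH, ADD, POP, JZ, HALT }

    /** One instruction; arg is the constant for PUSH and the jump target for JZ. */
    record Instr(Op op, int arg) {}

    /** Stack entries an instruction needs on entry. */
    static int needs(Op op) {
        return switch (op) { case PUSH, HALT -> 0; case POP, JZ -> 1; case ADD -> 2; };
    }

    /** Net change in stack depth caused by an instruction. */
    static int effect(Op op) {
        return switch (op) { case PUSH -> 1; case HALT -> 0; case POP, JZ, ADD -> -1; };
    }

    /** Checks the certificate in one linear pass; depth[pc] is the claimed depth before instruction pc. */
    static boolean check(List<Instr> code, int[] depth) {
        if (code.isEmpty() || depth.length != code.size() || depth[0] != 0) return false;
        for (int pc = 0; pc < code.size(); pc++) {
            Instr ins = code.get(pc);
            if (depth[pc] < needs(ins.op())) return false;      // would underflow
            if (ins.op() == Op.HALT) continue;                   // no successor to check
            int after = depth[pc] + effect(ins.op());
            if (pc + 1 >= code.size() || depth[pc + 1] != after) return false;
            if (ins.op() == Op.JZ) {
                int target = ins.arg();
                if (target < 0 || target >= code.size() || depth[target] != after) return false;
            }
        }
        return true;
    }

    /** Runs the code only if its certificate checks; returns the top of the stack at HALT (0 if empty). */
    static int run(List<Instr> code, int[] depth) {
        if (!check(code, depth)) throw new SecurityException("certificate rejected");
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            Instr ins = code.get(pc);
            switch (ins.op()) {
                case PUSH -> { stack.push(ins.arg()); pc++; }
                case ADD -> { stack.push(stack.pop() + stack.pop()); pc++; }
                case POP -> { stack.pop(); pc++; }
                case JZ -> pc = (stack.pop() == 0) ? ins.arg() : pc + 1;
                case HALT -> { return stack.isEmpty() ? 0 : stack.peek(); }
            }
        }
    }

    public static void main(String[] args) {
        List<Instr> code = List.of(
            new Instr(Op.PUSH, 0), new Instr(Op.JZ, 4),
            new Instr(Op.PUSH, 1), new Instr(Op.HALT, 0),
            new Instr(Op.PUSH, 2), new Instr(Op.HALT, 0));
        int[] certificate = {0, 1, 0, 1, 0, 1};
        System.out.println(run(code, certificate));
    }
}
```

The design point is that the consumer never has to trust the producer. A wrong or dishonest certificate is simply rejected, and checking a certificate is much simpler than working out from scratch whether the code is safe. The policy here says nothing about termination, so a checked program may still loop forever.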

Certified Compilation
Bugs in a program that are due to a mistake in the compiler (“compiler bugs”) are some of the most difficult to find; they can take weeks or months.
The problem is that the source code is right.
Compilers are some of the most complicated pieces of software there are.
– Hundreds of thousands of lines, highly complicated algorithms

Certified Compilation
The goal of a certified compiler is to have a compiler with no bugs.
– That is, the source program is always correctly translated into the target language.
We saw in lecture 12 some of the tools used: you give both the source and target languages a formal semantics, and then prove:
– $\sigma \mapsto \sigma' \;\Rightarrow\; C(\sigma) \mapsto^{*} C(\sigma')$
– That is, that the compiler C preserves the semantics of the program.
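
To make "the compiler preserves the semantics of the program" concrete, here is a toy sketch in Java, with all names hypothetical and Java 17 or later assumed. The source language is arithmetic expressions, the target is a small stack machine, and each has its own semantics (eval and exec). The sketch only tests the preservation property on one input and states it in evaluate-to-a-result form rather than as the step relation on the slide. A certified compiler proves it for every program, which is a much stronger guarantee.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class ToyCompilerSketch {
    // Source language: arithmetic expressions.
    sealed interface Expr permits Num, Add, Mul {}
    record Num(int value) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Mul(Expr left, Expr right) implements Expr {}

    // Target language: instructions for a stack machine.
    enum Op { PUSH, ADD, MUL }
    record Instr(Op op, int arg) {}

    /** Source semantics: evaluate the expression directly. */
    static int eval(Expr e) {
        if (e instanceof Num n) return n.value();
        if (e instanceof Add a) return eval(a.left()) + eval(a.right());
        if (e instanceof Mul m) return eval(m.left()) * eval(m.right());
        throw new IllegalArgumentException("unknown expression");
    }

    /** The compiler C: translate an expression to stack code (operands first, then the operator). */
    static List<Instr> compile(Expr e) {
        List<Instr> out = new ArrayList<>();
        emit(e, out);
        return out;
    }

    private static void emit(Expr e, List<Instr> out) {
        if (e instanceof Num n) {
            out.add(new Instr(Op.PUSH, n.value()));
        } else if (e instanceof Add a) {
            emit(a.left(), out);
            emit(a.right(), out);
            out.add(new Instr(Op.ADD, 0));
        } else if (e instanceof Mul m) {
            emit(m.left(), out);
            emit(m.right(), out);
            out.add(new Instr(Op.MUL, 0));
        } else {
            throw new IllegalArgumentException("unknown expression");
        }
    }

    /** Target semantics: run the stack code and return the final top of stack. */
    static int exec(List<Instr> code) {
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        for (Instr ins : code) {
            switch (ins.op()) {
                case PUSH -> stack.push(ins.arg());
                case ADD -> stack.push(stack.pop() + stack.pop());
                case MUL -> stack.push(stack.pop() * stack.pop());
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        Expr e = new Mul(new Add(new Num(2), new Num(3)), new Num(4)); // (2 + 3) * 4
        // Semantic preservation on this one input: both sides should agree.
        System.out.println(eval(e) == exec(compile(e)));
    }
}
```

In a proof assistant, the same three definitions (eval, compile, exec) would be followed by a theorem stating that exec(compile(e)) equals eval(e) for all e, proved by induction on e.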

Outline
Reflections on Coq
Applications of Formal Methods
How to study for the Final

Final
There is a paper part and a Coq part
– Total: 90 points
– Paper: 54 points, 60%
– Coq: 36 points, 40%
You will have two hours (combined) to do both parts.
– You can switch back and forth if you like.

Paper part: Format
Fair game topics:
– Everything covered in lectures 1-12, midterm, quizzes 1-3, homework 1-10
Not multiple choice; instead: Proofs! (Oh boy!)
– Natural deduction
– Induction
– Semantic proofs
– Hoare proofs
– etc.

Coq part: Format
Five questions:
1. First-order logic *
2. Induction
3. Modal logic
4. Modal logic *
5. Hoare logic *
* Similar to questions 1 and 2 on the Coq quiz

Coq part: Format
Each problem is worth 9 points.
Maximum 36 points (out of 90, so 40%): pick 4 out of the 5
– Skip one if you get stuck!
No extra credit
– Spend the extra time on the paper part
We are assuming that it will take about an hour
– Around 15 minutes per question
– Quiz: 2 questions, 23 minutes (11.5 minutes each)

Coq part: study strategy
Make sure that you can do the quiz in a reasonable amount of time. (Again, two questions on the final are similar to the two quiz questions. If you get them done quickly, then you have 20% of the points already.)
Review the major frameworks (modal logic in HW5 and Hoare logic in HW10). You will not have enough time to figure them out again during the exam.
You can redo some problems to remind yourself how those parts work.

Test strategy
Remember: you can skip one Coq problem
– If you get stuck, move on: maybe another one is easier.
– Don’t Panic! If you get stuck, do the paper part for a while; maybe an idea for the Coq part will occur to you.
Watch the time. It’s not a short exam. Don’t get so absorbed in the Coq part that you neglect the paper part, or vice versa.

Last but not least…
Good luck on your finals (including ours) and thanks for taking CS3234!
– Martin and Aquinas