Data Model Property Inference and Repair


Data Model Property Inference and Repair Jaideep Nijjar and Tevfik Bultan {jaideepnijjar, bultan}@cs.ucsb.edu Verification Lab Department of Computer Science University of California, Santa Barbara ISSTA 2013

Motivation Web applications influence every aspect of our lives. Our dependence on web applications keeps increasing. It would be nice if we could improve the dependability of web applications.

Acknowledgement: NSF Support

Three-Tier Architecture Browser Web Server Backend Database

Three-Tier Arch. + MVC Pattern Browser Web Server Controller Views Model Backend Database

Model-View-Controller Pattern
Our focus: MVC-based web applications. MVC has become the standard way to structure web applications, and there are MVC frameworks for the popular web development languages: Ruby on Rails, Zend for PHP, CakePHP, Struts for Java, Django for Python, …
Benefits of the MVC pattern: separation of concerns, modularity, abstraction.
Request flow: The user makes a request via her browser. Controllers receive user input and initiate a response by making calls on model objects. Models talk to the database, store and validate data, and perform the business logic. Controllers generate views, the UI (HTML, CSS, JavaScript, etc.). The controller returns the view to the user.
Our idea: exploit these design principles when modeling and verifying MVC web applications. In particular, we focus on verifying the Model, i.e., the data model.
[Diagram: User Request, Controller, Model, DB, View, User Response]

Data Model Data model is the heart of the web application It specifies the set of objects and the associations (i.e., relations) between them Using an Object-Relational-Mapping the data model is mapped to the back-end datastore Any error in the data model can have serious consequences for the dependability of the application

Our Data Model Analysis Approach
[Pipeline diagram: Active Records -> Model Extraction -> Data Model (Schema + Constraints); Data Model Schema -> Property Inference (Orphan Prevention, Transitive Relations, Delete Propagation) -> Inferred Properties; Data Model + Inferred Properties -> Verification -> Failing Properties -> Repair Generation]
Inference is based on the data model schema, which is extracted from the object-relational mapping. From the schema we infer a set of properties about the data model that we expect to hold. The verification component is our earlier work and includes both bounded and unbounded verification. The heuristics for inferring properties are described next.

Outline Motivation Overview of Our Approach Rails Data Models Basic Relations Options to Extend Relations Formalization of Semantics Verification Techniques Property Inference Property Repair Experiments Conclusions and Future Work

A Rails Data Model Example
A sample Active Records specification for a social networking application where users have profiles that store their photo and video files. The photos and videos can be tagged by users, and users can have different roles.

    class User < ActiveRecord::Base
      has_and_belongs_to_many :roles
      has_one :profile, :dependent => :destroy
      has_many :photos, :through => :profile
    end
    class Role < ActiveRecord::Base
      has_and_belongs_to_many :users
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos, :dependent => :destroy
      has_many :videos, :dependent => :destroy, :conditions => "format='mp4'"
    end
    class Tag < ActiveRecord::Base
      belongs_to :taggable, :polymorphic => true
    end
    class Video < ActiveRecord::Base
      belongs_to :profile
      has_many :tags, :as => :taggable
    end
    class Photo < ActiveRecord::Base
      ...
    end

(Note: the body of the Photo class is the same as that of the Video class.)
[Class diagram: User, Role, Profile, Photo, Video, Tag, and Taggable, with the multiplicities declared above; the Profile-Video relation is restricted to format='mp4']

Rails Data Models Data model verification: analyzing the relations between the data objects. Relations are specified in Rails using association declarations inside the ActiveRecord files. Three basic relations: one-to-one, one-to-many, many-to-many. Extensions to the basic relations using options: :through, :conditions, :polymorphic, :dependent.

Three Basic Relations in Rails
One-to-One (User 1 -- 0..1 Profile):

    class User < ActiveRecord::Base
      has_one :profile
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
    end

One-to-Many (Profile 1 -- * Video):

    class Profile < ActiveRecord::Base
      has_many :videos
    end
    class Video < ActiveRecord::Base
      belongs_to :profile
    end

Relationships are expressed by adding a pair of declarations in the corresponding Rails models of the related objects. Rails automatically matches the relation name with the appropriate class by the name given. Inheritance in Rails uses the notation ChildClass < ParentClass. The ActiveRecord::Base class contains all the database-connection functionality.

Three Basic Relations in Rails
Many-to-Many (User * -- * Role):

    class User < ActiveRecord::Base
      has_and_belongs_to_many :roles
    end
    class Role < ActiveRecord::Base
      has_and_belongs_to_many :users
    end

Extensions to the Basic Relations :through Option To express transitive relations :conditions Option To relate a subset of objects to another class :polymorphic Option To express polymorphic relationships :dependent Option On delete, this option expresses whether to delete the associated objects or not For the sake of time, I will only be focusing on :through and :dependent option for the duration of the talk.

The :through Option

    class User < ActiveRecord::Base
      has_one :profile
      has_many :photos, :through => :profile
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos
    end
    class Photo < ActiveRecord::Base
      belongs_to :profile
    end

The :through option can be set on either the has_one or the has_many declaration. This is an example with has_many. In the diagram, the dashed User-Photo line is the join of the other two relations (User 1 -- 0..1 Profile and Profile 1 -- * Photo). It gives direct access to a user's photos from the User class rather than going through the Profile class.

The :dependent Option

    class User < ActiveRecord::Base
      has_one :profile, :dependent => :destroy
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos, :dependent => :destroy
    end

[Diagram: User 1 -- 0..1 Profile 1 -- * Photo]
:delete directly deletes the associated objects without looking at their dependencies. :destroy first checks whether the associated objects themselves have associations with the :dependent option set and recursively propagates the delete to the associated objects.
The User class has the :dependent option set on its :profile relation. Thus, when a User object is deleted, the associated Profile object will be deleted. Further, since the :dependent option is set to :destroy, any relation with the :dependent option set in the Profile class will also have its objects deleted. In this scenario, the delete also gets propagated to the photos associated with the deleted Profile object. If instead of :destroy the :dependent option in User was set to :delete, the associated Profile object would be deleted, but not any Photos associated with the Profile. We can see how this may lead to dangling references if this option is not set correctly. Note that options can be combined, like using the :dependent option with the :conditions option.
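The difference between :destroy and :delete described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the Node class and its has, destroy, and delete methods are hypothetical stand-ins written for this illustration and are not the ActiveRecord API.

```ruby
# Hypothetical in-memory stand-in for persisted objects; NOT the Rails API.
class Node
  attr_reader :name

  def initialize(name, store)
    @name = name
    @store = store
    @children = [] # list of [child, dependent_mode] pairs
    store << self
  end

  # Declares an association to `child` with an optional :dependent mode.
  def has(child, dependent: nil)
    @children << [child, dependent]
    self
  end

  # :destroy semantics: honor this object's own :dependent options first,
  # then remove the object itself.
  def destroy
    @children.each do |child, mode|
      case mode
      when :destroy then child.destroy
      when :delete  then child.delete
      end
    end
    delete
  end

  # :delete semantics: remove only this object, ignoring its associations.
  def delete
    @store.delete(self)
  end
end

# Builds User -> Profile -> Photo; only the User-Profile mode varies.
def build(profile_mode)
  store = []
  photo = Node.new('photo', store)
  profile = Node.new('profile', store).has(photo, dependent: :destroy)
  user = Node.new('user', store).has(profile, dependent: profile_mode)
  [store, user]
end

store, user = build(:destroy)
user.destroy
remaining_after_destroy = store.map(&:name) # nothing is left

store, user = build(:delete)
user.destroy
remaining_after_delete = store.map(&:name)  # the photo is left dangling
```

With :destroy on the User-Profile relation the delete reaches the photo; with :delete the profile is removed but the photo survives, which is the dangling-reference situation the slide warns about.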

Formalizing Data Model Semantics
We can construct a formal data model representing the objects and their relationships in a Ruby on Rails application. We define a formal data model to be a 3-tuple M = <S, C, D>:
S: Data model schema. The sets and relations of the data model, e.g., {Photo, Profile, Role, Tag, Video, User} and the relations between them.
C: Constraints on the relations. Cardinality constraints, transitive relations, conditional relations, polymorphic relations.
D: Dependency constraints about deletions. They express conditions on two consecutive instances of a relation such that deletion of an object from the first instance leads to the other instance, possibly by deleting more objects (based on the :dependent option).
An example of a relational constraint in C comes from the :conditions option, which limits the relation to those objects that meet a certain criterion. In the running example, Video objects are only related to a Profile object if their format field is "mp4". The formalization of this constraint defines a set of objects o_V' that is a subset of the Video objects o_V (corresponding to the Video objects with format field "mp4") and restricts the relation r_{P-V} between the Profile and Video objects to that subset.
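The :conditions constraint described above can be written out explicitly. The following is a sketch of one way to state it, not the talk's own formula: o_P is an assumed name for the set of Profile objects, and the field access notation v.format is also an assumption.

```latex
o_{V'} = \{\, v \in o_V \mid v.\mathit{format} = \text{``mp4''} \,\} \subseteq o_V,
\qquad
r_{P-V} \subseteq o_P \times o_{V'}
```

The first part selects the Video objects that satisfy the condition; the second part states that every pair in the Profile-Video relation has its Video component in that subset.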

Outline Motivation Overview of Our Approach Rails Data Models Verification Techniques Bounded Verification Unbounded Verification Property Inference Property Repair Experiments Conclusions and Future Work

Verification Overview
Input: Active Records + properties. Model extraction produces the formal data model, based on the formalization just presented.
Bounded verification: data model + properties + bound -> Alloy Translator -> formula -> Alloy Analyzer -> instance or unsat -> Results Interpreter. The extra input, the bound, is the maximum number of instances of each object to instantiate during verification.
Unbounded verification: data model + properties -> SMT-LIB Translator -> formula -> SMT Solver (Z3) -> instance or unsat or unknown -> Results Interpreter. The property is checked for all instances of the formal data model. Unknown is a possible output (for unbounded verification only), since quantified formulas over uninterpreted functions are undecidable in general.
Results: Property Verified, Property Failed + Counterexample, or Unknown. A counterexample is a sample data model instance for which the property does not hold. Producing one is a nice feature of these solvers, since not all tools do so (e.g., theorem provers).
Next part of the talk: translation from the formal data model to the input languages of the solvers.

Sample Translation to Alloy

    class User < ActiveRecord::Base
      has_one :profile
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
    end

    sig Profile {}
    sig User {}
    one sig State {
      profiles: set Profile,
      users: set User,
      relation: Profile lone -> one User
    }

The keyword sig is used in Alloy to define a set of objects. Thus, a sig is created for each class in the input Rails data model. In this example, a sig is declared for the Profile and User classes. We also create a State sig, which we use to define the state of a data model instance. Since we only need to instantiate exactly one State object when checking properties, we prepend the sig declaration with a multiplicity of one. The State sig contains fields to hold the set of all objects and related object pairs. In this example, the State sig contains three fields. The first is named profiles and is a binary relation between State and Profile objects. The field uses the multiplicity operator set, meaning 'zero or more'. In other words, the state of a data model instance may contain zero or more Profile objects. The State sig contains a similar field for User objects. Finally, the one-to-one relation between Profile and User objects is translated as another field in the State sig. Named relation, it is defined to be a mapping between Profile and User objects. It uses the multiplicity operators to constrain the mapping between Profiles and Users to be one-to-one. IDAVER does this translation automatically.

Sample Translation to SMT-LIB

    class User < ActiveRecord::Base
      has_one :profile
    end
    class Profile < ActiveRecord::Base
      belongs_to :user
    end

    (declare-sort User)
    (declare-sort Profile)
    (declare-fun relation (Profile) User)
    (assert (forall ((p1 Profile) (p2 Profile))
      (=> (not (= p1 p2))
          (not (= (relation p1) (relation p2))))))

Types in SMT-LIB are declared using the declare-sort command. We use this command to declare types for User and Profile. The relation is translated as an uninterpreted function. Uninterpreted functions are created in SMT-LIB using the declare-fun command. We use this command to declare an uninterpreted function named relation whose domain is Profile and range is User. Since functions can map multiple elements in the domain to the same element in the range, and we instead desire a one-to-one relation, we constrain the function relation to be one-to-one to obtain the desired semantics. This constraint is expressed using the assert command, as shown above.

Property Inference: Motivation
Verification techniques require properties as input. Their effectiveness depends on the quality of the properties written: a verification tool cannot find an error if a property that exposes the error is not provided as input. Manually writing properties is time-consuming and error-prone, tends to lack thoroughness, and requires familiarity with the modeling language. Property specification is therefore a major disadvantage of verification: many errors can be missed, and people avoid verification because of the manual effort required.
We propose techniques for automatically inferring properties about the data model of a web application. Inference is based on the data model schema, a directed, annotated graph that represents the relations. The schema is extracted from the object-relational mapping. The inferred properties can then be checked using a verification technique like the ones discussed earlier.

Data Model Schema Example Extracted from the ORM of a customer relationship management application. Nodes = object classes. Edges = different relation types. We explore the structure of the graph and look for patterns. Based on these patterns, we infer properties.

Outline Motivation Overview of Our Approach Rails Data Models Verification Techniques Property Inference Orphan Prevention Pattern Transitive Relation Pattern Delete Propagation Pattern Property Repair Experiments Conclusions and Future Work

Property Inference: Overview Identify patterns in the data model schema graph that indicate that a certain property should hold in the data model. Extract the data model schema from the ActiveRecords declarations. Search for the identified patterns in the data model schema graph. If a match is found, report the corresponding property.

Orphan Prevention Pattern For objects of a class that has only one relation: when the object it is related to is deleted but the object itself is not, such an object becomes orphaned. Orphan chains can also occur. The heuristic looks at all one-to-many or one-to-one relations to identify all potential orphans or orphan chains, and infers that deleting an object does not create orphans. A potential orphan is a class with one incoming relation: when an object is deleted, its possible orphans should be deleted along with it. [Diagram: a chain of classes 1 . . . n]
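A minimal sketch of the orphan heuristic, under a simplified reading of the slide: a class is treated as a potential orphan when it has exactly one incoming one-to-one or one-to-many relation. The edge encoding, the SCHEMA constant, and the orphan_candidates function are all hypothetical names chosen for this illustration; the class names come from the talk's running example.

```ruby
# Schema edges as [owner, owned, kind]; an assumed encoding of the schema graph.
SCHEMA = [
  ['User', 'Profile', :one_to_one],
  ['Profile', 'Photo', :one_to_many],
  ['Profile', 'Video', :one_to_many],
  ['User', 'Role', :many_to_many]
].freeze

# Returns [owner, owned] pairs where `owned` has exactly one incoming
# one-to-one or one-to-many relation and could therefore be orphaned
# when its owner is deleted.
def orphan_candidates(schema)
  owning = schema.select { |_, _, kind| %i[one_to_one one_to_many].include?(kind) }
  incoming = owning.group_by { |_, owned, _| owned }
  incoming.select { |_, edges| edges.size == 1 }
          .map { |owned, edges| [edges.first[0], owned] }
end

candidates = orphan_candidates(SCHEMA)
# => [["User", "Profile"], ["Profile", "Photo"], ["Profile", "Video"]]
```

In this example User, Profile, and Photo form an orphan chain: deleting a User without propagating the delete orphans its Profile, and deleting the Profile in turn orphans its Photos. The many-to-many User-Role relation is ignored.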

Transitive Relation Pattern Looks at one-to-one or one-to-many relations in the schema. Finds paths of relations that are of length > 1. If there is a direct edge between the first and last node of the path, infer that this edge should be transitive, i.e., the composition of the others. [Diagram: a path 1, 2, . . . , n with a direct edge from 1 to n]
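The pattern search above can be sketched as a path enumeration over the one-to-one and one-to-many edges. This is an illustrative sketch, not the tool's implementation: the EDGES constant and the transitive_candidates function are hypothetical, and the input is assumed to already contain only one-to-one and one-to-many relations.

```ruby
require 'set'

# Directed edges [from, to]; assumed to be one-to-one or one-to-many only.
EDGES = [
  %w[User Profile],
  %w[Profile Photo],
  %w[User Photo],
  %w[Profile Video]
].freeze

# Returns the direct edges [first, last] for which a path of length > 1
# from `first` to `last` also exists; these are inferred to be transitive.
def transitive_candidates(edges)
  succ = Hash.new { |h, k| h[k] = [] }
  edges.each { |from, to| succ[from] << to }
  direct = edges.to_set
  found = Set.new
  succ.keys.each do |start|
    # Depth-first walk that keeps the whole path from `start`.
    stack = succ[start].map { |n| [start, n] }
    until stack.empty?
      path = stack.pop
      last = path.last
      # path.size > 2 means the path has at least two edges.
      found << [start, last] if path.size > 2 && direct.include?([start, last])
      succ.fetch(last, []).each do |nxt|
        stack << path + [nxt] unless path.include?(nxt) # skip cycles
      end
    end
  end
  found.to_a
end

transitive = transitive_candidates(EDGES)
# => [["User", "Photo"]]
```

Here the path User, Profile, Photo has length 2 and a direct User-Photo edge exists, so the User-Photo relation is reported as one that should be the composition of the other two.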

Delete Propagation Pattern Look at the schema with all many-to-many and transitive relations removed. Remove cycles in the graph by collapsing strongly connected components to a single node. (A directed graph is called strongly connected if there is a path from each node in the graph to every other node.) Assign levels to all nodes indicating their depth in the graph: root nodes, those with no incoming edges, are at level 0; each remaining node is at a level 1 more than the maximum level of its predecessors. Propagate deletes if the difference in levels between the nodes is one. The intuition is that if the difference between the levels of the nodes is greater than one, then there could be other classes between these two classes that are related to both of them, and therefore propagating the delete could lead to inconsistencies between the relations. [Diagram: a cycle collapsed into a single node c; nodes 1 to 4 placed at level=0, level=1, level=2]
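The level assignment above can be sketched as follows. To keep the example short, it assumes the preprocessing has already been done: many-to-many and transitive relations are removed and strongly connected components are collapsed, so the input graph is acyclic. The function names levels and delete_propagation_candidates and the node names A, B, C are hypothetical.

```ruby
# Assigns a level to each node of an acyclic graph: roots (no incoming
# edges) get 0, every other node gets 1 + the maximum level of its predecessors.
def levels(nodes, edges)
  preds = Hash.new { |h, k| h[k] = [] }
  edges.each { |from, to| preds[to] << from }
  memo = {}
  level_of = lambda do |node|
    memo[node] ||= begin
      ps = preds.fetch(node, [])
      ps.empty? ? 0 : ps.map { |p| level_of.call(p) }.max + 1
    end
  end
  nodes.each { |n| level_of.call(n) }
  memo
end

# Keeps the edges whose endpoints are exactly one level apart; for these
# the heuristic infers that deletes should propagate.
def delete_propagation_candidates(nodes, edges)
  lv = levels(nodes, edges)
  edges.select { |from, to| lv[to] - lv[from] == 1 }
end

nodes = %w[A B C]
edges = [%w[A B], %w[B C], %w[A C]]

node_levels = levels(nodes, edges)
# => {"A"=>0, "B"=>1, "C"=>2}
propagating = delete_propagation_candidates(nodes, edges)
# => [["A", "B"], ["B", "C"]]
```

The A-C edge spans two levels, with B sitting between the two classes, so no deletePropagates property is inferred for it.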

Repair Generation Check the inferred properties on the formal data model. If a property fails, we can point out the option that needs to be set in the data model to make sure that the inferred property holds. For the delete propagation and orphan prevention patterns: set the :dependent option accordingly to propagate the deletes. For the transitive property: set the :through option accordingly to make a relation the composition of two other relations.
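The mapping from a failing property to a suggested repair can be sketched as a simple lookup. Everything here is an assumption made for illustration: the Failure struct, the suggest_repair function, the shape of the returned hash, and in particular the choice of :destroy as the suggested :dependent value, which the talk does not specify.

```ruby
# A failing inferred property: its kind, the two related classes, and, for
# transitive properties, the intermediate association (hypothetical shape).
Failure = Struct.new(:property, :from, :to, :via)

# Suggests which option to set on the relation from `from` to `to`.
def suggest_repair(failure)
  case failure.property
  when :deletePropagates, :noOrphans
    # Assumption: :destroy is suggested so that the delete keeps propagating.
    { in_class: failure.from, relation_to: failure.to, set: { dependent: :destroy } }
  when :transitive
    { in_class: failure.from, relation_to: failure.to, set: { through: failure.via } }
  else
    raise ArgumentError, "unknown property #{failure.property}"
  end
end

blog_repair = suggest_repair(Failure.new(:deletePropagates, 'Blog', 'Comment', nil))
todo_repair = suggest_repair(Failure.new(:transitive, 'User', 'Todo', :contexts))
```

For a failing deletePropagates[Blog, Comment] the sketch points at the :dependent option on the Blog-Comment relation; for a failing transitive property it points at the :through option with the intermediate association.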

Repair Examples

    class User < ActiveRecord::Base
      has_one :preference, :conditions => "is_active=true",
                           :dependent => :destroy
      has_many :contexts, :dependent => :destroy
      has_many :todos, :through => :contexts
    end
    class Preference < ActiveRecord::Base
      belongs_to :user
    end
    class Context < ActiveRecord::Base
      belongs_to :user
      has_many :todos, :dependent => :delete
    end
    class Todo < ActiveRecord::Base
      belongs_to :context
      # belongs_to :user
      has_and_belongs_to_many :tags
    end
    class Tag < ActiveRecord::Base
      has_and_belongs_to_many :todos
    end

Outline Motivation Overview of Our Approach Rails Data Models Verification Techniques Property Inference Property Repair Experiments Conclusions and Future Work

Experiments on Five Applications

    Application   LOC     Classes   Data Model Classes
    LovdByLess    3787    61        13
    Substruct     15639   85        17
    Tracks        6062    44
    FatFreeCRM    12069   54        20
    OSR           4295    41        15

A Social Networking Application LovdByLess: a social networking application. Users can write blog entries. Users can comment on a friend's blog entry. The friend deletes the blog entry. (The activity could also be an update such as 'Amy uploaded a picture'.) How the error occurs: deletePropagates[Blog, Comment] fails, i.e., the comments are not deleted when the blog entry is deleted. This is a data model error: the Comment entries still exist in the database. It further causes a bug in the application when listing the user's 'recent activity' on her dashboard.

A Social Networking Application A friend writes a blog entry. The user comments on the friend's blog entry. The friend deletes the blog entry.

A Failing Inferred Property deletePropagates property inferred for LovdByLess, a social networking application: deletePropagates[Blog, Comment], i.e., the delete should propagate from Blog to Comment. What happens instead is that the associated comment is not deleted. When the UI gets the comment, it cannot find the associated blog entry to create the link and the proper display string (which is created on the fly), so it compensates by displaying an empty string.

A Todo List Application Tracks: a todo list application. Todos can be organized by Contexts (e.g., School, Work, Home). Users can also create Recurring Todos; say you create a recurring todo like "feed the dog" and put it in the Home context. Delete the Context, then try to edit the Recurring Todo: the application crashes.

A Failing Inferred Property Data model and application error: deletePropagates property inferred for Tracks, a todo list application. The delete should propagate from Context to RecurringTodo. The associated RecurringTodo should have been deleted (Todos are), but the delete is not propagated because the :dependent option is not set correctly. Thus, when the user goes to modify the RecurringTodo, the application crashes because it is not expecting a null reference to Context.

False Positives deletePropagates property inferred for FatFreeCRM: deleting an Account should propagate to the associated Contacts. But in FatFreeCRM it is valid to have a Contact that is not associated with any Account.

False Positives transitive property inferred for LovdByLess over User, ForumTopic, and ForumPost. This is not a transitive relation, due to the semantics of the application: users need not post only to forum topics that they created, as transitivity would require.

Experiment Results
Table columns: Application, Property Type, # Inferred, # Timeout, # Failed
LovdByLess deletePropagates 13 10 noOrphans transitive 1 Substruct 27 16 2 4 Tracks 15 6 12 FatFreeCRM 32 19 5 OSR 7
0-32 properties inferred: a reasonable number, not so many as to be overwhelming. 3 timeouts.

Table columns: Property Type, # Data Model & Application Errors, # Data Model Errors, # Failures Due to Rails Limitations, # False Positives
deletePropagates 1 9 noOrphans transitive 3 5 7 18 6 12
The properties that failed can be categorized into four different types. 29% false positives: reasonable.

Conclusions and Future Work It is possible to extract formal specifications from MVC-based data models and analyze them We can automatically infer properties and find errors in real-world web applications Most of the errors come from the fact that developers are not using the ActiveRecords extensions properly This breaks the modularity, separation of concerns and abstraction principles provided by the MVC pattern We are working on analyzing actions that update the data store We are also investigating verifiable-model-driven development for data models

Related Work
Automated discovery of likely program invariants: Daikon [Ernst et al, ICSE 1999] discovers likely invariants by observing the runtime behaviors of a program. [Guo et al, ISSTA 2006] extends this style of analysis and applies it to the inference of abstract data types. Instead of observing the runtime behavior of programs, we analyze the static structure of an extracted data model to infer properties.
Static verification of inferred properties: [Nimmer and Ernst, 2001] integrate Daikon with ESC/Java, a static verification tool for Java programs. We focus on data model verification in web applications; our domain and technique are different.

Related Work
Verification of web applications: [Krishnamurthi et al, Springer 2006] focuses on correct handling of the control flow given the unique characteristics of web apps. Works such as [Hallé et al, ASE 2010] and [Han et al, MoDELS 2007] use state machines to formally model navigation behavior. The focus has largely been on navigation modeling and verification; in contrast to these works, we focus on analyzing the data model.
Formal modeling of web applications: WebAlloy [Chang, 2009]: the user specifies the data model and access control policies, and the implementation is automatically synthesized. WebML [Ceri et al, Computer Networks 2000]: a modeling language developed specifically for modeling web applications; no verification. In contrast, we perform model extraction (reverse engineering).

Related Work
Verification of Ruby on Rails applications: Rubicon [Near et al, FSE 2012] verifies the Controller whereas we verify the Data Model. Rubicon requires manually written specification files describing the expected behavior of each Controller, whereas we verify manually written properties. It also uses the Alloy Analyzer and is limited to bounded verification.
Data model verification using Alloy: [Cunha and Pacheco, SEFM 2009] maps relational database schemas to Alloy; not automated. [Wang et al, ASWEC 2006] translates ORA-SS (Object-Relationship-Attribute model for Semi-Structured data) specifications to Alloy, and uses the Analyzer to produce instances of the data model to show consistency. [Borbar et al, Trends 2005] uses Alloy to discover bugs in browser and business logic interactions, a different class of bugs than the data model related bugs we focus on. The last two works are limited to bounded verification.

Related Work
Unbounded verification of Alloy specifications using SMT solvers: [Ghazi et al, FM 2011]; approach not implemented. This is a more challenging domain, since the Alloy language contains constructs such as transitive closures, which do not appear in the data models we extract.
Specification and analysis of conceptual data models: [Smaragdakis et al, ASE 2009; McGill et al, ISSTA 2011] use Object Role Modeling (ORM), a popular conceptual modeling language, to express the data model and constraints. Their focus is on checking consistency (whether any instances of a data model exist) and on efficiently generating test data that respect the constraints expressed in ORM. Our focus is on extracting the data model from an existing application and verifying that desirable properties hold, as opposed to generating test cases from a model (reverse engineering vs. forward engineering).

Questions?