Impact Analysis of Database Schema Changes
Andy Maule, Wolfgang Emmerich and David S. Rosenblum
London Software Systems, Dept. of Computer Science, University College London
2008

Database Management Systems (DBMS)
Provide concurrent access
Provide efficient execution of complex queries over large datasets
Often used with OO languages (C++, Java, and C#)
  – Results in the impedance mismatch problem
Complicates impact analysis

Motivation
The effects of a DB schema change are estimated manually, which is:
  – fragile and difficult
  – expensive
Proposal: assess the effects of a DB schema change automatically, in a more reliable and cost-effective way
The focus is not the reconciliation of the impacts themselves, but rather the difficulty of discovering and predicting them

Example
No default value
Null values not allowed

The Impact of Schema Change
An impact is any location in the application which:
  – will behave differently
  – is required to behave differently
Change 1, Change 2, Change 3 (delete Name)
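To make "will behave differently" concrete, here is a minimal sketch. It assumes the example change introduces a column that has no default value and disallows nulls. The table, the `Email` column and the INSERT statement are hypothetical, and SQLite stands in for SQL Server only so that the snippet is self-contained.

```python
import sqlite3

def run_insert(schema_sql):
    """Create a fresh in-memory database and run the application's INSERT."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    try:
        # Hypothetical application query, written against the old schema.
        conn.execute("INSERT INTO Customer (Name) VALUES (?)", ("Ada",))
        return "ok"
    except sqlite3.IntegrityError as exc:
        # The unchanged application code now fails: this location is impacted.
        return f"impacted: {exc}"
    finally:
        conn.close()

old_schema = "CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT)"
# Assumed change: a new column with no default value that disallows nulls.
new_schema = (
    "CREATE TABLE Customer "
    "(Id INTEGER PRIMARY KEY, Name TEXT, Email TEXT NOT NULL)"
)

before = run_insert(old_schema)
after = run_insert(new_schema)
print(before)
print(after)
```

The application code is identical in both runs; only the schema differs. Locations of this kind are what the analysis is meant to find without executing the program.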

Data Access Practices
Entry point types belonging to the persistence API: DBConnection, DBRecord, DBRecordSet, DBParam
Need to know:
  – State of these types
  – Exact query
  – Where/how results are used

Approach
Program Analysis (PA) comprises compile-time techniques for approximating the run-time properties of a program
  – Previously used to extract queries from OO languages
String Analysis (SA) is a form of PA in which the possible run-time values of string variables are predicted for selected locations in the program
  – Gould et al. used SA to predict the values of strings passed to the Java JDBC library methods, in order to check that the queries are type safe with respect to the DB schema
  – Christensen et al. created the JSA application using SA
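The idea behind string analysis can be sketched with finite sets of possible values. This is a deliberate simplification rather than the algorithm of any of the cited tools, which need further machinery (for example widening) to handle loops. The program fragment and all names below are hypothetical.

```python
from itertools import product

def concat(left, right):
    """Abstract concatenation: every pairing of possible left and right values."""
    return {a + b for a, b in product(left, right)}

def join(*value_sets):
    """Merge point after a branch: union of the values from each path."""
    result = set()
    for values in value_sets:
        result |= values
    return result

# Hypothetical fragment being approximated:
#   sql = "SELECT Id, Name FROM Customer"
#   if active_only:
#       sql += " WHERE Active = 1"
#   sql += " ORDER BY Name"
base = {"SELECT Id, Name FROM Customer"}
then_branch = concat(base, {" WHERE Active = 1"})
else_branch = base  # the string is unchanged when the branch is not taken
merged = join(then_branch, else_branch)
possible_queries = concat(merged, {" ORDER BY Name"})

for query in sorted(possible_queries):
    print(query)
```

Each element of `possible_queries` is a query the program might send at run time, so each can be checked against a proposed schema change.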

Approach Overview (Requirements for Context-Sensitivity)
Context sensitivity specifies how precisely the calling context of procedures is represented in dataflow analyses
k-CFA: all or some of the propagated data in the dataflow analysis includes a call string that represents the last k call sites

Approach Overview (Required Precision of Context-Sensitivity)
A context-sensitive analysis distinguishes between different values of the variables belonging to separate calling contexts
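A rough sketch of why the length k of the call string matters: data is tagged with the last k call sites, so contexts that share those last k sites are merged. The call-site names and queries are hypothetical, and the sketch only illustrates call-string truncation, not a full k-CFA analysis.

```python
def push_call(context, call_site, k):
    """Extend a call string and keep only the last k call sites."""
    if k == 0:
        return ()
    return (context + (call_site,))[-k:]

def analyse(call_chains, k):
    """Map each truncated calling context to the queries that reach it."""
    facts = {}
    for chain, query in call_chains:
        context = ()
        for site in chain:
            context = push_call(context, site, k)
        facts.setdefault(context, set()).add(query)
    return facts

# Hypothetical call chains: two business methods reach the database
# through the same helper call site.
chains = [
    (("LoadCustomer:12", "Helper.Run:30"), "SELECT Name FROM Customer"),
    (("LoadOrders:55", "Helper.Run:30"), "SELECT Total FROM Orders"),
]

merged = analyse(chains, k=1)    # both queries share the context (Helper.Run:30,)
separate = analyse(chains, k=2)  # each query keeps its own calling context
print(len(merged), len(separate))
```

With k = 1 the analysis can no longer tell which caller issued which query. Every extra layer of indirection between the business code and the persistence API calls for a longer call string.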

Approach Overview

Program Slicing
  – Extraction of a subset of the source application that can affect, or be affected by, the DB calls
Why?
  – As k increases, k-CFA analysis has exponential complexity with respect to program size

Program Slicing Example
Cite the code used in the paper as an example and show how the slicing will reduce the code.
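The paper's own example is not reproduced in this transcript. As a stand-in, the sketch below treats slicing as reachability over a dependence graph: statements that can affect the database call (backward) or be affected by it (forward) are kept, and the rest are dropped. The six statements and their dependences are hypothetical.

```python
from collections import deque

def reachable(graph, seeds):
    """Breadth-first search: all nodes reachable from the seed nodes."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(graph):
    """Flip every edge so that backward dependences can be followed."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev

def slice_around(graph, seeds):
    """Statements that can affect, or be affected by, the seed statements."""
    return reachable(graph, seeds) | reachable(reverse(graph), seeds)

# Hypothetical statements; an edge a -> b means "b depends on a".
#   s1: sql = "SELECT Name FROM Customer"
#   s2: log("start")
#   s3: rs = db.Execute(sql)      <- the DB call
#   s4: name = rs["Name"]
#   s5: total = 1 + 2
#   s6: print(total)
dependences = {"s1": {"s3"}, "s3": {"s4"}, "s5": {"s6"}}

program_slice = slice_around(dependences, {"s3"})
print(sorted(program_slice))
```

Only the statements in `program_slice` need to be passed to the more expensive dataflow analysis.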

Dataflow Analysis
Computes the set of runtime properties that can occur at a given point in a program
  – The analysis is based on the string analysis by Choi et al.
Two modifications to this existing string analysis:
  – Increased context sensitivity
  – Addition of query types to the string analysis

Dataflow Analysis – Increasing Context Sensitivity
Modify Choi et al.’s algorithm from 1-CFA to k-CFA
  – Modify the property space of the dataflow analysis
  – Abstract variables and abstract heap locations are distinguished from context locations
  – Extend identifiers to include a string of k call sites

Dataflow Analysis – Add Query Types to String Analysis
Query types: all query-representing types, plus those involved in the execution and use of database queries
Generate a dataflow graph by performing a standard fixed-point iteration over the graph from the slicing stage
Result:
  – All query-representing types have an estimated set of possible runtime values
  – Other query-type objects (returned result sets) have unique identifiers assigned based on their instantiation information
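A standard fixed-point iteration can be sketched as a worklist loop that re-evaluates a node whenever the values flowing into it change. This is a generic illustration, not SUITE's implementation: the four-node graph, the transfer functions and the query text are hypothetical, and values are plain finite sets, so termination here relies on the graph having no string-growing loop.

```python
def fixed_point(nodes, edges, transfer):
    """Worklist iteration: propagate value sets until nothing changes."""
    out = {n: set() for n in nodes}
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)
    worklist = list(nodes)
    while worklist:
        node = worklist.pop()
        # Join: union of everything flowing in from predecessor nodes.
        incoming = set().union(*(out[p] for p in preds[node]))
        new_out = transfer[node](incoming)
        if new_out != out[node]:
            out[node] = new_out
            worklist.extend(succs[node])  # successors must be re-evaluated
    return out

# Hypothetical control flow: entry -> (then | else) -> exec
nodes = ["entry", "then", "else", "exec"]
edges = [("entry", "then"), ("entry", "else"), ("then", "exec"), ("else", "exec")]
transfer = {
    "entry": lambda _: {"SELECT Name FROM Customer"},
    "then": lambda s: {q + " WHERE Active = 1" for q in s},
    "else": lambda s: set(s),
    "exec": lambda s: set(s),  # the values reaching the query execution
}

result = fixed_point(nodes, edges, transfer)
print(sorted(result["exec"]))
```

The set computed for `exec` plays the role of the "estimated set of possible runtime values" attached to a query-representing type.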

Dataflow Analysis – Extracting Dataflow Information

Impact Calculation
Predicts the possible effects of a database schema change
  – Uses CrocoPat, a tool that efficiently executes relational programs against arbitrary data
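The relational flavour of this step can be sketched in plain Python. This is a stand-in for illustration, not CrocoPat's own language: facts extracted by the dataflow analysis are matched against the columns touched by a schema change. The source locations and facts below are hypothetical.

```python
# Hypothetical facts from the dataflow analysis: (location, table, column).
uses = {
    ("CustomerDao.cs:42", "Customer", "Name"),
    ("CustomerDao.cs:42", "Customer", "Id"),
    ("OrderDao.cs:17", "Order", "Total"),
    ("ReportPage.cs:88", "Customer", "Name"),
}

def impacted_locations(uses, dropped_columns):
    """Locations whose queries reference a column removed by the change."""
    return sorted(
        {loc for loc, table, column in uses if (table, column) in dropped_columns}
    )

# A change in the spirit of "delete Name".
change = {("Customer", "Name")}
impacts = impacted_locations(uses, change)
print(impacts)
```

Other kinds of change, such as a new required column or a renamed stored procedure, would need their own matching rules over the same extracted facts.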

Impact Calculation (contd…)

Implementation
Currently used only for C# applications that use SQL Server databases
Total size of SUITE: 19 KLOC (written in C#)

Evaluation - Basis
What is this evaluation for?
  – To evaluate the feasibility of the technique presented in this paper
To generalize the evaluation, the subject application had to represent real-world practice for database-driven applications

Evaluation - Setup
Subject application: irPublish (content management system)
  – Consists of 127 KLOC of C# code
  – Uses a database schema of up to 101 tables, 615 columns and 568 stored procedures
Three (interesting) changes out of 62 previous schema changes were chosen for evaluation
System configuration used:
  – 2.13GHz Intel Pentium processor
  – 1.5GB RAM

Evaluation – Setup (contd…)
Schema changes used in this evaluation are as shown:
Value of k used for dataflow analysis is 2 (there were places where the value of k needed to be set as high as 7)

Evaluation Summary – Comparison of Predicted Changes vs. Observed Changes
Program slicing reduced the size to 37% (from instructions to instructions)

Evaluation - Execution times of the analysis vs. increase in context sensitivity

Summary of Results
Highlights the importance of context-sensitive program analysis
  – A high level of context sensitivity is required in many real-world architectures where similar architectural patterns are used
The types of schema change that occurred agree with the predictions of the study

Related Work
Impact analysis of database schema
  – A. Karahasanovic. Supporting Application Consistency in Evolving Object-Oriented Systems by Impact Analysis and Visualization. [2002]
String analysis
  – A. S. Christensen, A. Møller, and M. I. Schwartzbach. Precise analysis of string expressions. [2003]
  – C. Gould, Z. Su, and P. Devanbu. Static Checking of Dynamically Generated Queries in Database Applications. [2004]
  – T.-H. Choi, O. Lee, H. Kim, and K.-G. Doh. A practical string analyzer by the widening approach. [2006]
Dataflow analysis
  – S. Horwitz, T. Reps, and M. Sagiv. Demand interprocedural dataflow analysis. [1995]

Future Work
Investigate alternatives to the program slicing technique to reduce the cost of dataflow analysis
Analyze the impact of the available impact analysis techniques on the development of database applications

Conclusion
SUITE is shown to be more precise than the related work currently available in the area of impact analysis of database changes