A Framework for Testing Database Applications Joint work with Phyllis G. Frankl (Polytechnic) Saikat Dan (Polytechnic) Filippos Vokolos (Lucent Technologies)

Presentation transcript:

A Framework for Testing Database Applications
David Chays, Polytechnic University, Brooklyn, NY
Joint work with Phyllis G. Frankl (Polytechnic), Saikat Dan (Polytechnic), Filippos Vokolos (Lucent Technologies), Elaine J. Weyuker (AT&T Labs - Research)

Motivation
- Database systems play an important role in virtually every modern organization
- Faults can be very costly
- Programmers/testers may lack experience and/or time
- Little attention has been paid to DB application program correctness

Outline of Talk
- Background
- Aspects of DB system correctness
- Issues in testing DB application programs
- Architecture of tool set
- Tool for generating database states
- Additional issues and approaches

DBMS and DB application
- DB application, e.g., /* C program with embedded SQL */
- Database Management System (DBMS)
- DB
- DB schema, e.g., Emp(ssn, name, addr, sal), Dept(id, dept-name)

Relational databases
- Data is viewed as a collection of relations
  - relation schema
  - relation (relation state)
- Tables, tuples, attributes, constraints, for example:
  create table S (ssn char(11) primary key, name char(25) not null)
- Table S (columns ssn, name), with names Johnson, Smith, Jones, Blake

Aspects of Correctness
- Does the DBMS perform all operations correctly?
- Is concurrent access handled correctly?
- Is the system fault-tolerant?
- ...
- Does the application program behave as intended?

Traditional vs. DB programs
- Traditional program: function of imperative nature; input -> output
- DB program: function of declarative nature; input + DB state -> output + DB state

Example of an Informal Specification
Customer-feature table:
- customerID
- address
- features
- ...
Billing table:
- customerID
- billing plan
- ...
Input: customer ID and name of the feature to which the customer wishes to subscribe.
- Invalid ID: return code 0
- Feature unavailable in that area: return code 2
- Feature available but incompatible with existing features: return code 3
- Otherwise: update the customer's feature record, update the billing table, return code 1
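
The informal specification can be sketched as a small program over an in-memory SQLite database. This is only an illustration: the table and column names (customer, customer_feature, availability, incompatible, billing.n_features) are assumptions made for the sketch, not the schema used in the talk, and the specification does not say what happens when the customer already subscribes to the feature.

```python
import sqlite3

def make_db():
    """Build a small hypothetical DB state (all names are illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, area TEXT);
        CREATE TABLE customer_feature (customer_id INTEGER, feature TEXT);
        CREATE TABLE billing (customer_id INTEGER PRIMARY KEY, n_features INTEGER);
        CREATE TABLE availability (feature TEXT, area TEXT);
        CREATE TABLE incompatible (feature TEXT, other TEXT);
        INSERT INTO customer VALUES (1, 'A1');
        INSERT INTO customer_feature VALUES (1, 'F1');
        INSERT INTO billing VALUES (1, 1);
        INSERT INTO availability VALUES ('F1', 'A1'), ('F2', 'A1'), ('F3', 'A1');
        INSERT INTO incompatible VALUES ('F1', 'F2');
    """)
    return conn

def subscribe(conn, customer_id, feature):
    """Return code 0, 2, 3 or 1, following the informal specification."""
    row = conn.execute(
        "SELECT area FROM customer WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return 0  # invalid customer ID
    available = conn.execute(
        "SELECT 1 FROM availability WHERE feature = ? AND area = ?",
        (feature, row[0]),
    ).fetchone()
    if available is None:
        return 2  # feature unavailable in the customer's area
    clash = conn.execute(
        "SELECT 1 FROM customer_feature cf, incompatible i "
        "WHERE cf.customer_id = ? "
        "AND ((i.feature = ? AND i.other = cf.feature) "
        "  OR (i.other = ? AND i.feature = cf.feature))",
        (customer_id, feature, feature),
    ).fetchone()
    if clash is not None:
        return 3  # incompatible with a feature the customer already has
    # Success: update the feature record and the billing table.
    conn.execute("INSERT INTO customer_feature VALUES (?, ?)", (customer_id, feature))
    conn.execute(
        "UPDATE billing SET n_features = n_features + 1 WHERE customer_id = ?",
        (customer_id,),
    )
    conn.commit()
    return 1
```

Note that three of the four return codes depend on the contents of the tables, not only on the two user inputs.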

What are the Input/Output Spaces?
Naïve approach:
- I = {customer-IDs} X {feature-names}
- O = {0,1,2,3}
More suitable approach:
- I = {customer-IDs} X {feature-names} X {database-states}
- O = {0,1,2,3} X {database-states}
Problem:
- must control and observe the DB state
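
A minimal sketch of what "control and observe the DB state" can mean in a test harness: the initial state is part of the test input, and the final state is part of the test output. The helper names (snapshot, run_test), the toy application add_feature and its two-table schema are hypothetical and exist only for this example.

```python
import sqlite3

def snapshot(conn):
    """Observe the DB state as {table name: sorted list of rows}."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: sorted(conn.execute(f"SELECT * FROM {t}").fetchall(), key=repr)
            for t in tables}

def run_test(initial_state_sql, app, user_input):
    """Run app on (user input, DB state); return (output, state before, state after)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(initial_state_sql)  # control: establish the input DB state
    before = snapshot(conn)
    output = app(conn, *user_input)
    after = snapshot(conn)                 # observe: capture the output DB state
    conn.close()
    return output, before, after

def add_feature(conn, customer_id, feature):
    """Toy application: return 0 for an unknown customer, else record the feature."""
    known = conn.execute(
        "SELECT 1 FROM customer WHERE customer_id = ?", (customer_id,)).fetchone()
    if known is None:
        return 0
    conn.execute("INSERT INTO customer_feature VALUES (?, ?)", (customer_id, feature))
    return 1

TABLES = """
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
CREATE TABLE customer_feature (customer_id INTEGER, feature TEXT);
"""
STATE_A = TABLES + "INSERT INTO customer VALUES (1);"  # customer 1 exists
STATE_B = TABLES                                       # no customers at all

# Same user input, different DB states, different behavior.
out_a, before_a, after_a = run_test(STATE_A, add_feature, (1, "F1"))
out_b, before_b, after_b = run_test(STATE_B, add_feature, (1, "F1"))
```

With the naive input space the two runs would count as the same test case, although they exercise different paths and leave different states behind.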

DB Application Testing Goal
- Select “interesting” DB states along with user inputs that exercise “interesting” behavior
- Cover wide variety of situations that could arise in practice
- Do so in a way that facilitates checking of output to user and resulting DB state

Situations to Explore
- Customer already subscribes to that feature
- Feature not available in customer’s area
- Feature available, but incompatible with other features customer already has
- Feature available and compatible with existing features
- Customer doesn’t yet subscribe to any features
- ...

May involve interplay between several tables
- Table 1: incompatible features (columns: feature, incompatible_feature)
- Table 2: features available in various areas (columns: feature, area)
- Table 3: customers and features (columns: ID, area, F1, F2, ..., FN)

Will Live Data Suffice?
- May not reflect sufficiently wide variety of situations
- May be difficult to find the situations of interest
- May violate privacy or security constraints

Generating Synthetic Data
- DB state is a collection of relation states, each of which is a subset of the Cartesian product of some domains
- Generating domain elements and gluing them together isn’t enough, since constraints must be honored
- We attempt to generate interesting data that obey integrity constraints
- Use schema and user supplied info
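
A small sketch of why gluing domain elements together is not enough. It forms the Cartesian product of three tiny, made-up domains for a supplier-part table sp(sno, pno, qty) and keeps only tuples that honor a composite primary key on (sno, pno) and foreign keys into the supplier and part tables. The domain values and the function name build_sp are assumptions chosen for illustration.

```python
from itertools import product

# Keys that already exist in the referenced tables (hypothetical contents).
s_sno = {"S1", "S2"}
p_pno = {"P1", "P2"}

# Candidate domain elements; S9 and P9 are not present in s and p.
sno_domain = ["S1", "S2", "S9"]
pno_domain = ["P1", "P2", "P9"]
qty_domain = [10, 300]

# Every tuple in the Cartesian product is a syntactically valid row.
candidates = list(product(sno_domain, pno_domain, qty_domain))

def build_sp(rows):
    """Keep only rows that satisfy the foreign keys and the primary key."""
    kept, used_keys = [], set()
    for sno, pno, qty in rows:
        if sno not in s_sno or pno not in p_pno:
            continue  # referential integrity would be violated
        if (sno, pno) in used_keys:
            continue  # primary key (sno, pno) must be unique
        used_keys.add((sno, pno))
        kept.append((sno, pno, qty))
    return kept

sp_rows = build_sp(candidates)
```

Only 4 of the 18 candidate tuples survive here, which is why the generator has to be driven by the constraints in the schema rather than by the domains alone.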

Architecture of tool set (figure)
- Components: Input Generator, State Generator, State Checker, Output Checker
- Data shown in the figure: suggestions from tester, DB schema, App source, App exec, user input, output, DB state, results

DB state generator
- Inputs DB schema (in SQL)
- Parses schema to derive info about
  - attributes
  - tables
  - constraints: uniqueness, not-NULL, referential integrity
- Inputs additional info from user
  - suggested attribute values, divided into groups, similar to Category-Partition Testing [Ostrand-Balcer]
  - additional annotations

Example Schema
  create table s (sno char(5), sname char(20), status decimal(3), city char(15),
                  primary key(sno));
  create table p (pno char(6) primary key, pname char(20), color char(6),
                  weight decimal(3), city char(15));
  create table sp (sno char(5), pno char(6), qty decimal(5),
                   primary key(sno,pno),
                   foreign key(sno) references s,
                   foreign key(pno) references p);
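
As a rough illustration of the kind of information the state generator needs from a schema, the sketch below loads the example schema into an in-memory SQLite database and reads attributes, primary keys and foreign keys back from SQLite's catalog. This swaps in SQLite's PRAGMA introspection for the tool's own SQL parser, so it is a stand-in and not the tool's method; the function name describe and the layout of the returned dictionary are assumptions.

```python
import sqlite3

SCHEMA = """
create table s (sno char(5), sname char(20), status decimal(3), city char(15),
                primary key(sno));
create table p (pno char(6) primary key, pname char(20), color char(6),
                weight decimal(3), city char(15));
create table sp (sno char(5), pno char(6), qty decimal(5),
                 primary key(sno,pno),
                 foreign key(sno) references s,
                 foreign key(pno) references p);
"""

def describe(schema_sql):
    """Return {table: attributes, primary_key, foreign_keys} for a schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    info = {}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        info[table] = {
            "attributes": {c[1]: c[2] for c in cols},
            "primary_key": [c[1] for c in sorted(cols, key=lambda c: c[5]) if c[5] > 0],
            "foreign_keys": {fk[3]: fk[2] for fk in fks},
        }
    conn.close()
    return info

info = describe(SCHEMA)
```

For table sp this yields the composite primary key (sno, pno) and the two references to s and p, which is the information a generator needs to keep inserted tuples consistent.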

Parse trees for two ways of declaring a primary key (figure)
Create table s( sno char(5), primary key(sno) );
- Create Stmt: Nodetag type = T_CreateStmt, relname = “s”
  - Column Definition: Nodetag type = T_ColumnDef, colname = “sno”, type name = “bpchar”, Constraints = NIL
  - Table Constraint: Nodetag type = T_Constraint, contype = CONSTR_PRIMARY, keys: T_IDENT name = “sno”
Create table s( sno char(5) primary key );
- Create Stmt: Nodetag type = T_CreateStmt, relname = “s”
  - Column Definition: Nodetag type = T_ColumnDef, colname = “sno”, type name = “bpchar”, Constraints: contype = CONSTR_PRIMARY

Data structure built from the example schema (figure)
- Table records (table | number of attributes): P | 5, S | 4, SP | 3; the figure also shows globalTablePointer and a Null link
- Each table record lists its attributes, each followed by seven flags (all F in the figure)
- Attribute records (attribute | table | type | pr | un | nn), each with a choice pointer cp; ~ marks a flag that is not set:
  city   | P  | char | ~pr | ~un | ~nn
  pname  | P  | char | ~pr | ~un | ~nn
  pno    | P  | char | pr  | un  | ~nn
  weight | P  | dec  | ~pr | ~un | ~nn
  color  | P  | char | ~pr | ~un | ~nn
  sname  | S  | char | ~pr | ~un | ~nn
  sno    | S  | char | pr  | un  | ~nn
  city   | S  | char | ~pr | ~un | ~nn
  status | S  | dec  | ~pr | ~un | ~nn
  pno    | SP | char | pr  | un  | ~nn | foreign
  sno    | SP | char | pr  | un  | ~nn | foreign
  qty    | SP | dec  | ~pr | ~un | ~nn

Selecting Attribute Values
- Initial prototype queries tester for suggested values and guidance on how to use those values
- Values may be partitioned into data groups (choices)
- Tester may specify probabilities for data groups

--choice_name: low
choice_name: medium
choice_name: high

Each category (column) can have a list of choices pointed to by cp, for example cp: low, high, medium.
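
One way to read "data groups with probabilities" is a two-step draw: first pick a group according to its probability, then pick a value inside that group. The sketch below does this for the city attribute, using the group names, probabilities and values from the Parts-Supplier input file. The function name pick_value, the tuple layout and the uniform choice within a group are assumptions; the tool may select values differently.

```python
import random

# (choice_name, choice_prob, values) for the city attribute.
CITY_GROUPS = [
    ("domestic", 90, ["Brooklyn", "Florham-Park", "Middletown"]),
    ("foreign", 10, ["London", "Bombay"]),
]

def pick_value(groups, rng, null_prob=0.0):
    """Pick NULL with probability null_prob (in percent), else a value from a weighted group."""
    if rng.random() * 100 < null_prob:
        return None  # represents SQL NULL
    weights = [prob for _, prob, _ in groups]
    _, _, values = rng.choices(groups, weights=weights, k=1)[0]
    return rng.choice(values)  # assumption: uniform within the chosen group

rng = random.Random(0)  # fixed seed so a run is repeatable
sample = [pick_value(CITY_GROUPS, rng) for _ in range(1000)]
domestic = sum(v in CITY_GROUPS[0][2] for v in sample)
foreign = len(sample) - domestic
```

Over many draws roughly nine in ten cities come from the domestic group, while the foreign values still appear often enough to be exercised.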

DB table generation
- Tester specifies table sizes
- Tool generates tuples for insertion
  - select data group or NULL, guided by annotations
  - select value from data group, obeying constraints
  - keep track of values used
- Outputs sequence of SQL insert statements
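
The steps above can be sketched for a single table as follows. The generator tracks the values already used for unique columns, inserts NULL with a per-column probability, and emits SQL insert statements. The function names, the columns dictionary layout, and the particular NULL probabilities and status values are hypothetical; foreign keys and data groups are left out to keep the sketch short.

```python
import random
import sqlite3

def sql_literal(value):
    """Render a Python value as an SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)

def generate_inserts(table, columns, size, unique=(), rng=None):
    """columns: {name: (values, null_prob in percent)}; unique: columns whose values may not repeat."""
    rng = rng or random.Random()
    used = {col: set() for col in unique}  # keep track of values used
    statements = []
    for _ in range(size):
        row = []
        for col, (values, null_prob) in columns.items():
            if col in used:
                free = [v for v in values if v not in used[col]]
                if not free:
                    raise ValueError(f"not enough distinct values for {col}")
                value = rng.choice(free)
                used[col].add(value)
            elif rng.random() * 100 < null_prob:
                value = None
            else:
                value = rng.choice(values)
            row.append(value)
        statements.append(
            f"insert into {table} ({', '.join(columns)}) "
            f"values ({', '.join(sql_literal(v) for v in row)});")
    return statements

# Illustrative suggestions for table s; probabilities are made up.
S_COLUMNS = {
    "sno": (["S1", "S2", "S3", "S4", "S5"], 0),
    "sname": (["Smith", "Jones", "Blake", "Clark", "Adams"], 25),
    "status": ([0, 1], 50),
    "city": (["Brooklyn", "Florham-Park", "Middletown", "London", "Bombay"], 0),
}
statements = generate_inserts("s", S_COLUMNS, size=4, unique=("sno",),
                              rng=random.Random(1))

# The statements load cleanly into a table with the example schema for s.
conn = sqlite3.connect(":memory:")
conn.execute("create table s (sno char(5), sname char(20), status decimal(3), "
             "city char(15), primary key(sno))")
for statement in statements:
    conn.execute(statement)
```

Asking for more rows than there are distinct sno values raises an error here; a real tool would need some policy for that case.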

Input files for Parts-Supplier database
sno:
  --choice_name: sno
  S1 S2 S3 S4 S5
sname:
  --choice_name: sname
  Smith Jones Blake Clark Adams
pname:
  --choice_name: interior
  seats airbags dashboard
  choice_name: exterior
  doors wheels bumper
city:
  --choice_name: domestic
  --choice_prob: 90
  Brooklyn Florham-Park Middletown
  choice_name: foreign
  --choice_prob: 10
  London Bombay
pno:
  --choice_name: pno
  P1 P2 P3 P4 P5
status:
  --choice_name: status
  --null_prob:
color:
  --choice_name: color
  blue green yellow
weight:
  --choice_name: weight

city:
  --choice_name: domestic
  --choice_prob: 90
  Brooklyn Florham-Park Middletown
  choice_name: foreign
  --choice_prob: 10
  London Bombay
status:
  --choice_name: status
  --null_prob:

A database state produced by the tool
Table s
  sno | sname | status | city
  S1  | NULL  | 0      | Brooklyn
  S2  | Smith | 1      | Florham-Park
  S3  | Jones | NULL   | London
  S4  | Blake | NULL   | Middletown
Table p
  pno | pname   | color  | weight | city
  P1  | NULL    | blue   | 100    | Brooklyn
  P2  | Seats   | green  | 300    | Florham-Park
  P3  | airbags | yellow | 500    | Middletown
Table sp
  sno | pno | qty
  S1  | P1  | 5000
  S1  | P2  | 300
  S1  | P3  | 10
  S2  | P1  | 6000
  S2  | P2  | 400
  S2  | P3  | 5000
  S3  | P1  | 20
  S3  | P2  | 300
  S3  | P3  | 30
  S4  | P1  | 6000

Related work
- Lyons-77, DB-Fill, TestBase
- Like our approach, rely on user to supply attribute values
- Do not handle integrity constraints as completely
- Require tester to describe tables in special-purpose language (rather than SQL)

Testing Techniques in DB literature
- Focus on DB system performance, rather than DB application correctness
- Benchmarks
- Performance of SQL processor
  - Generation of large number of DML statements [Slutz]
- Generation of huge tables with given statistical properties [Gray et al.]

Summary
- Issues
- Framework
- Prototype

Future Work
- Refinement based on feedback from DB application developers / testers
- Other DB state generation heuristics
  - boundary values
  - “missing” constraints
  - difficult SQL features
- Interplay between DB state and user inputs
- Checking DB state after test execution
- Checking application outputs