Marko Vrhovnik 1, Holger Schwarz 1, Oliver Suhre 2, Bernhard Mitschang 1, Volker Markl 3, Albert Maier 2, Tobias Kraft 1 1 Universität Stuttgart 2 IBM.

Presentation transcript:

Marko Vrhovnik (1), Holger Schwarz (1), Oliver Suhre (2), Bernhard Mitschang (1), Volker Markl (3), Albert Maier (2), Tobias Kraft (1). Affiliations: (1) Universität Stuttgart, (2) IBM Böblingen, (3) IBM Almaden. Presented by: Megha Ramesh Kumar, CSE 718. Professor: Michalis Petropoulos.

Topics of Discussion: Introduction; Workflow Languages and Data Management; Rule-Based Optimization of Business Processes; Process Graph Model; Rewrite Rules; Control Strategy; Conclusion.

Introduction: Optimize business process revenues and profits. Introduce a set of rewrite rules that transform a business process into a more efficient one, improve its execution with respect to data management, and leave the semantics of the original process unchanged. Also covered: a semi-procedural process graph model, a multi-stage control strategy, and a case study.

Workflow Languages & Data Mgmt.: The Business Process Execution Language (BPEL) fosters a two-level programming model. Function layer: consists of executable software components in the form of Web services that carry out basic activities. Choreography layer: specifies a process model defining the execution order of activities. BPEL offers many language constructs, including the invoke, assign, sequence, and ForEach activities.

BPEL & Data Management

Database vendors pursue various approaches. IBM WebSphere Process Server: allows data to be processed in a set-oriented manner (BPEL/SQL). Oracle BPEL Process Manager: provides XPath extension functions that are embedded in assign activities; statements to be executed on a remote database are passed as a parameter to the function; the functions support any valid SQL statement; query results are stored in set-oriented process variables. Microsoft Windows Workflow Foundation: uses SQL activities to provide database processing as part of business processes; the entire workflow, including its variables and activities, is described in XOML.

Definitions. SQL activities: allow data sets to be passed between activities by reference rather than by value. Set reference variables: refer to tables stored in a database system. Set variables: set-oriented data structures, each representing a table that is materialized in the process space. Retrieve set activity: a specific SQL activity that loads data from a database system into the process space.
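The difference between a set reference variable and a set variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the remote database system, and the table name `orders` and the variable names are hypothetical, not taken from the slides.

```python
import sqlite3

# Hypothetical in-memory database standing in for the remote database system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("bolt", 10), ("nut", 5), ("bolt", 3)],
)

# Set reference variable: only the table name travels between activities.
orders_ref = "orders"

# Retrieve set activity: materialize the referenced table in the process space.
orders_set = conn.execute(
    f"SELECT item, qty FROM {orders_ref} ORDER BY rowid"
).fetchall()

# An SQL activity working by reference: the rows never leave the database.
total_by_ref = conn.execute(f"SELECT SUM(qty) FROM {orders_ref}").fetchone()[0]

# The same computation by value, over the materialized set variable.
total_by_value = sum(qty for _, qty in orders_set)

print(total_by_ref, total_by_value)
```

Both totals agree; the by-reference variant simply avoids copying the rows into the process space first.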

Sample Process

Rule Based Optimization of Business Processes: Optimizer Engine, rewrite rules. A rewrite rule has a condition and an action. Condition: needed to preserve the semantics of the process; it refers to the control flow dependencies and data flow dependencies of the process. Action: defines the transformations applied to a process, provided the corresponding condition is fulfilled.

Rule Based Optimization of Business Processes: Optimizer Engine, control strategy. The control strategy determines where on the process structure, and in what order, rules are applied. It identifies optimization spheres, and it defines both the order in which rule conditions are checked for applicability and the order in which rules are finally applied.

Rule Based Optimization of Business Processes. Optimization spheres: parts of a process for which applicable rewrite rules should be identified. Determining such spheres is necessary because applying rewrite rules across spheres may change the semantics of a process. Process Graph Model: PGM defines a process as a tuple (A, E_c, E_d, V, P), where A is the set of process activities, E_c the directed control flow edges, E_d the directed data flow edges, V the set of typed variables, and P the partners.
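One way to picture the PGM tuple is as a small Python data structure. The sketch below is an assumption-laden illustration, not the authors' representation: the class names, the field names, and the choice to label each data flow edge with the variable it carries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Activity:
    name: str
    kind: str            # e.g. "sql", "foreach", "invoke", "assign"
    statement: str = ""  # SQL text for SQL activities, empty otherwise


@dataclass
class ProcessGraph:
    activities: dict[str, Activity] = field(default_factory=dict)        # A
    control_edges: set[tuple[str, str]] = field(default_factory=set)     # E_c
    data_edges: set[tuple[str, str, str]] = field(default_factory=set)   # E_d: (writer, reader, variable)
    variables: dict[str, str] = field(default_factory=dict)              # V: name -> type
    partners: set[str] = field(default_factory=set)                      # P

    def successors(self, name: str) -> list[str]:
        """Activities that directly follow `name` in the control flow."""
        return sorted(dst for src, dst in self.control_edges if src == name)

    def readers_of(self, writer: str, variable: str) -> list[str]:
        """Activities that read `variable` after `writer` has written it."""
        return sorted(
            r for w, r, v in self.data_edges if w == writer and v == variable
        )


# A tiny process: an SQL activity fills v_set, a ForEach iterates over it,
# and an SQL INSERT inside the loop consumes each row.
pg = ProcessGraph()
pg.activities = {
    "a_i": Activity("a_i", "sql", "SELECT item, qty FROM orders"),
    "a_j": Activity("a_j", "foreach"),
    "a_k": Activity("a_k", "sql", "INSERT INTO approved VALUES (?, ?)"),
}
pg.control_edges = {("a_i", "a_j"), ("a_j", "a_k")}
pg.data_edges = {("a_i", "a_j", "v_set"), ("a_j", "a_k", "v_row")}
pg.variables = {"v_set": "set", "v_row": "row"}

print(pg.successors("a_i"), pg.readers_of("a_i", "v_set"))
```

Keeping control flow and data flow as separate edge sets lets rule conditions query each kind of dependency independently.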

Generality issues: The PGM optimizer is independent of any specific workflow language and of the underlying database system. Important preconditions: the optimizer engine needs to know the exact statements that are used in data management tasks, and it needs to know control flow dependencies as well as data dependencies.

Classification of rewrite rules

Activity Merging Rules: Web Service Pushdown. Pushes an invoke activity into the SQL activity that depends on the Web service invocation. Hence, the Web service call becomes part of the SQL statement. Precondition: the DBMS supports Web service calls.

Example
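As a rough illustration of Web Service Pushdown, the sketch below uses SQLite's user-defined functions to stand in for a DBMS that can call Web services from SQL. Everything here is an assumption made for the example: `get_price` is a local stub with no network access, and the `orders` table is hypothetical.

```python
import sqlite3


def get_price(item):
    """Stand-in for a Web service call (hypothetical stub, prices in cents)."""
    return {"bolt": 25, "nut": 10}[item]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("bolt", 10), ("nut", 5)])

# Before: an invoke activity calls the service for each row, and the result
# is combined with the data outside the database.
before = []
for item, qty in conn.execute(
    "SELECT item, qty FROM orders ORDER BY item"
).fetchall():
    price = get_price(item)            # invoke activity
    before.append((item, qty * price))

# After: the service is registered as a user-defined function, so the call
# becomes part of the SQL statement itself.
conn.create_function("get_price", 1, get_price)
after = conn.execute(
    "SELECT item, qty * get_price(item) FROM orders ORDER BY item"
).fetchall()

print(before == after)
```

The rewritten form needs a single SQL activity, which is only possible because the database can invoke the function; this mirrors the rule's precondition.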

Assign Pushdown: Directly integrates an assign activity into an SQL activity. The assign operation is pushed into the SQL statement by replacing the variable in question with its definition. This allows the assign activity to be omitted.
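A minimal sketch of Assign Pushdown, assuming an in-memory SQLite database and hypothetical names (`orders`, `min_qty`, `threshold`): the expression computed by the assign activity is substituted for the variable inside the SQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("bolt", 10), ("nut", 5), ("washer", 12)],
)

min_qty = 4  # process variable read by the assign activity

# Before: an assign activity defines threshold, and the SQL activity reads it.
threshold = min_qty * 2
before = conn.execute(
    "SELECT item FROM orders WHERE qty > ? ORDER BY item", (threshold,)
).fetchall()

# After: the variable is replaced by its definition inside the statement,
# so the separate assign activity is no longer needed.
after = conn.execute(
    "SELECT item FROM orders WHERE qty > (? * 2) ORDER BY item", (min_qty,)
).fetchall()

print(before == after)
```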

Eliminate Temporary Table: A table that is created for each single process instance at process startup and dropped as soon as the process instance has finished is called a temporary table. This rule removes the usage of temporary tables within the SQL statements of SQL activities. This reduces the costs of lifecycle management for temporary tables as well as the costs of SQL processing.

Example
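The effect of eliminating a temporary table can be sketched as follows. This is an illustration only, assuming an in-memory SQLite database; the tables `orders`, `big_orders`, and `tmp_totals` are hypothetical. The query that fills the temporary table is inlined as a subquery in the statement that used to read it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.execute("CREATE TABLE big_orders (item TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("bolt", 10), ("nut", 5), ("bolt", 3)],
)

# Before: a per-instance temporary table is created, filled, read, dropped.
conn.execute("CREATE TEMP TABLE tmp_totals (item TEXT, total INTEGER)")
conn.execute(
    "INSERT INTO tmp_totals SELECT item, SUM(qty) FROM orders GROUP BY item"
)
conn.execute(
    "INSERT INTO big_orders SELECT item, total FROM tmp_totals WHERE total > 8"
)
conn.execute("DROP TABLE tmp_totals")
before = conn.execute(
    "SELECT item, total FROM big_orders ORDER BY item"
).fetchall()

conn.execute("DELETE FROM big_orders")  # reset the target for the comparison

# After: the defining query is inlined, so no temporary table is needed.
conn.execute(
    "INSERT INTO big_orders "
    "SELECT item, total FROM "
    "(SELECT item, SUM(qty) AS total FROM orders GROUP BY item) "
    "WHERE total > 8"
)
after = conn.execute(
    "SELECT item, total FROM big_orders ORDER BY item"
).fetchall()

print(before == after)
```

Four statements shrink to one, and the database sees the whole computation in a single query.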

The Insert Tuple-to-Set Rule: Replaces the ForEach activity by a single, set-oriented SQL activity. This avoids calling the database at each step of the loop. Two conditions: the semantics of the process has to remain unchanged, and a process representation that explicitly defines control flow and data dependencies is mandatory. Assumptions: a single data source, and a process without parallel activities referencing the same variable.

The Insert Tuple-to-Set Rule, rule conditions: Process P is transformed into process P*. V = {v_set, v_row, v_sr}, where v_set is a set variable, v_row holds a row of the materialized set, and v_sr is a set reference variable. A is the set of activities.

The Insert Tuple-to-Set Rule, rule conditions. Activity Condition A1: Activity a_i is of type SQL and provides the results of query expression expr_i in a set variable. Activity Condition A2: ForEach activity a_j iterates over the set and provides the current row in a row variable v_row. Activity Condition A3: SQL activity a_k is the only activity in the loop body of a_j. It executes an INSERT statement.

The Insert Tuple-to-Set Rule, rule action: Transform a_k to a_k* by rewriting the SQL statement of a_k. We "pull up" the INSERT statement by joining expr_i with a correlated table reference containing the results of expression expr_k for each row. Due to the correlation between the joined tables within the FROM clause, we add the keyword TABLE to the table reference.

The Insert Tuple-to-Set Rule, rule action (continued): Replace a_j, including a_k, by a_k*. Remove a_i and adapt the control flow accordingly, that is, connect all direct predecessors of a_i with all of its direct successors. This opens up optimization at the database level and thus leads to performance improvements.

The Insert Tuple-to-Set Rule. Data Dependency Condition D1: A single write-read data dependency based on v_set exists between a_i and a_j, such that a_i writes v_set before a_j reads it. Data Dependency Condition D2: A single write-read data dependency based on v_row exists between a_j and a_k, such that a_j writes v_row before a_k reads it.

The Insert Tuple-to-Set Rule. Value Stability Condition S1: v_set is stable, that is, it does not change between its definition and its usage. Value Stability Condition S2: In each iteration of a_j, a_k reads the value of v_row that is provided by a_j.
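The before and after shapes of the Insert Tuple-to-Set rule can be sketched with SQLite. This is a simplified illustration under stated assumptions: the tables and helper functions are hypothetical, and because the loop body inserts the row values directly, a plain INSERT ... SELECT suffices here; the correlated table reference with the TABLE keyword described in the rule action is not needed for this case and is not shown.

```python
import sqlite3


def fresh_db():
    """Build a small hypothetical database for one run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
    conn.execute("CREATE TABLE approved (item TEXT, qty INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("bolt", 10), ("nut", 5), ("washer", 12)],
    )
    return conn


def tuple_at_a_time(conn):
    # a_i: SQL activity materializes expr_i in the set variable v_set.
    v_set = conn.execute(
        "SELECT item, qty FROM orders WHERE qty >= 10"
    ).fetchall()
    # a_j: ForEach over v_set; a_k: one INSERT per row v_row.
    for v_row in v_set:
        conn.execute("INSERT INTO approved VALUES (?, ?)", v_row)


def set_oriented(conn):
    # a_k*: one statement replaces a_i, a_j, and a_k.
    conn.execute(
        "INSERT INTO approved SELECT item, qty FROM orders WHERE qty >= 10"
    )


def approved_rows(conn):
    return conn.execute(
        "SELECT item, qty FROM approved ORDER BY item"
    ).fetchall()


loop_db, set_db = fresh_db(), fresh_db()
tuple_at_a_time(loop_db)
set_oriented(set_db)
print(approved_rows(loop_db) == approved_rows(set_db))
```

The loop issues one database call per row, while the rewritten form issues a single call regardless of how many rows qualify.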

Control Strategy: It divides the overall process into several optimization spheres and applies rewrite rules considering their dependencies. Our control strategy exploits dependencies among rewrite rules. The application of any Activity Merging rule to the activities inside a ForEach activity may reduce the number of these activities to one. In turn, this may enable the application of the Tuple-to-Set rule.

Control Strategy: The application of an Update Merging rule may reduce the number of updates on a table to a single one. If such a single update is executed on a temporary table, the Eliminate Temporary Table rule might become applicable. There is no specific order among the Tuple-to-Set rules.

Enabling Relationships

Control Strategy: Merging activities produces more sophisticated SQL statements. This enables optimization at the database level. The performance gain depends on the optimization potential of the SQL statements and on the capabilities of the query optimizer of the database management system that processes these statements.

Control Strategy. Scope Optimization Sphere (SOS): scope of a closed optimization sphere. Loop Optimization Spheres (LOS): comprise a ForEach activity with its nested activities and all surrounding activities that are necessary for applying a Tuple-to-Set rule.

Control Strategy: A tree represents a hierarchical ordering of all optimization spheres. We process all nested spheres prior to an enclosing sphere. For each sphere type, we use a different control strategy.

Control Strategy

Algorithm
Algorithm: OptimizeSphere
Require: sphere s
Ensure: optimized sphere s
  cs ← getControlStrategy(s)
  while cs is not finished do
    r ← getNextRule(cs)
    while s is not fully traversed do
      a ← getNextActivity(s)
      m ← findMatch(a, s, r)
      if m ≠ ∅ then
        applyRule(m, r)
      end if
    end while
  end while

Algorithm
Algorithm: OptimizeSphereHierarchy
Require: sphere-hierarchy sh
Ensure: optimized sphere-hierarchy sh
  while sh is not fully traversed do
    s ← getNextSphere(sh)
    optimizeSphere(s)
  end while
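The two algorithms above can be sketched in Python as follows. This is one possible reading, not the authors' implementation: the `Rule` and `Sphere` classes, the representation of a control strategy as an ordered rule list, and the toy rule that merges adjacent identical activities are all assumptions made for the example. A rule is applied only when a match is found, and nested spheres are optimized before the sphere that encloses them.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    # Condition: returns the match position, or None if the rule does not apply.
    find_match: Callable[[list, int], Optional[int]]
    # Action: rewrites the activity list in place at the match position.
    apply: Callable[[list, int], None]


@dataclass
class Sphere:
    activities: list
    rules: list  # control strategy, modelled here as an ordered rule list
    children: list = field(default_factory=list)


def optimize_sphere(s: Sphere) -> Sphere:
    for r in s.rules:                       # r <- getNextRule(cs)
        i = 0
        while i < len(s.activities):        # traverse the sphere
            m = r.find_match(s.activities, i)
            if m is not None:               # apply only when a match exists
                r.apply(s.activities, m)    # re-check the same position next
            else:
                i += 1
    return s


def optimize_sphere_hierarchy(s: Sphere) -> Sphere:
    for child in s.children:                # nested spheres come first
        optimize_sphere_hierarchy(child)
    return optimize_sphere(s)


# Toy rule (hypothetical): merge two adjacent identical activities into one.
def _match_adjacent(acts, i):
    return i if i + 1 < len(acts) and acts[i] == acts[i + 1] else None


def _drop_second(acts, m):
    del acts[m + 1]


merge = Rule("merge-adjacent", _match_adjacent, _drop_second)

inner = Sphere(["upd", "upd", "upd"], [merge])
outer = Sphere(["sel", "sel", "ins"], [merge], children=[inner])
optimize_sphere_hierarchy(outer)
print(inner.activities, outer.activities)
```

Each application of the toy rule shortens the activity list, so the inner loop terminates; a real rule set would need a comparable termination argument.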

Experiments

Conclusion: Data management tasks are increasingly treated as first-class citizens in workflow languages, and new optimization opportunities arise. Applying rewrite rules to the definition of business processes results in remarkable performance improvements. Main components of the optimizer engine: a set of rewrite rules, the process graph model as the internal representation of workflows, and a control strategy.