Lecture 21: ML Optimizers


Slides by Boehm and Surve

Announcement
Open office hours tomorrow after 1pm to talk about projects. Please drop by CS4361.

Today
1. DB optimizers
2. ML programs
3. Optimization techniques

1. DB Optimizers

Logical vs. Physical Optimization
Pipeline: SQL Query → Relational Algebra (RA) Plan → Optimized RA Plan → Execution
Logical optimization: Find equivalent plans that are more efficient.
Intuition: Minimize the number of tuples at each step by changing the order of RA operators.
Physical optimization: Find the algorithm with the lowest IO cost to execute our plan.
Intuition: Calculate based on physical parameters (buffer size, etc.) and estimates of data size (histograms).

Translating to RA
Schema: R(A,B), S(B,C), T(C,D)
SELECT R.A, T.D FROM R, S, T WHERE R.B = S.B AND S.C = T.C AND R.A < 10;
RA plan: $\Pi_{A,D}(\sigma_{A<10}(T \bowtie (R \bowtie S)))$
Plan tree, top to bottom: $\Pi_{A,D}$, then $\sigma_{A<10}$, then the join of T(C,D) with the join of R(A,B) and S(B,C).

Optimizing RA Plan
Push down the selection on A so it occurs earlier.
Current plan: $\Pi_{A,D}(\sigma_{A<10}(T \bowtie (R \bowtie S)))$

Optimizing RA Plan
Push down the selection on A so it occurs earlier.
Plan after selection pushdown: $\Pi_{A,D}(T \bowtie (\sigma_{A<10}(R) \bowtie S))$

Optimizing RA Plan
Push down the projection so it occurs earlier.
Current plan: $\Pi_{A,D}(T \bowtie (\sigma_{A<10}(R) \bowtie S))$

Optimizing RA Plan
We eliminate B earlier!
Plan after projection pushdown: $\Pi_{A,D}(T \bowtie \Pi_{A,C}(\sigma_{A<10}(R) \bowtie S))$
In general, when is an attribute not needed…?
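The selection and projection pushdowns can be checked on toy data. The sketch below is an illustration only: the relation contents and the helper names (`select`, `project`, `natural_join`, `as_set`) are made up for this example and are not part of the lecture. It evaluates the original plan and the fully rewritten plan, and compares both the results and the size of the first join.

```python
# Toy instances of R(A,B), S(B,C), T(C,D); the values are hypothetical.
R = [{"A": a, "B": a % 4} for a in range(0, 40, 3)]
S = [{"B": b, "C": c} for b in range(4) for c in range(3)]
T = [{"C": c, "D": 100 + c} for c in range(3)]


def select(rows, pred):
    """Selection: keep the tuples that satisfy the predicate."""
    return [r for r in rows if pred(r)]


def project(rows, attrs):
    """Projection with set semantics: keep attrs, drop duplicate tuples."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left, right):
    """Naive nested-loop natural join on all shared attribute names."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


def as_set(rows, attrs):
    """Order-independent view of a result, for comparing plans."""
    return {tuple(r[a] for a in attrs) for r in rows}


def a_lt_10(r):
    return r["A"] < 10


# Original plan: join everything, then select, then project.
naive_join = natural_join(R, S)
plan1 = project(select(natural_join(T, naive_join), a_lt_10), ["A", "D"])

# Rewritten plan: select on R first, drop B right after the first join.
pushed_join = natural_join(select(R, a_lt_10), S)
plan2 = project(natural_join(T, project(pushed_join, ["A", "C"])), ["A", "D"])

print("first join sizes:", len(naive_join), "vs", len(pushed_join))
print("same result:", as_set(plan1, ["A", "D"]) == as_set(plan2, ["A", "D"]))
```

Both plans return the same set of (A, D) pairs, but the rewritten plan feeds far fewer tuples into the joins, which is the intuition behind logical optimization.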

Physical Optimization
Physical optimization: Find the algorithm with the lowest IO cost to execute our plan.
Intuition: Calculate based on physical parameters (buffer size, etc.) and estimates of data size (histograms).
Index-based selections
Different join algorithms

2. ML Programs

ML Program Compilation

Distributed Matrix Representation

Distributed Matrix Representation (2)

Common Workload Characteristics

3. Optimization Techniques

SystemML’s Compilation Chain
HOP = high-level operations
LOP = low-level operations

Basic HOP and LOP DAG Compilation

Static and Dynamic Rewrites

Example Static Rewrites

Example Dynamic Rewrites

Matrix Multiplication Chain Optimization

Matrix Multiplication Chain Optimization (2)

Matrix Multiplication Chain Optimization (3)
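Only the titles of the matrix multiplication chain slides are available here, so the following is a sketch of the textbook dynamic program for this problem, not necessarily what SystemML implements. It assumes dense matrices and counts scalar multiplications as the cost; the function names and the example dimensions are hypothetical. The example chain is t(X) %*% X %*% v with X of size 1000 x 100, a shape that is common in ML scripts.

```python
def matrix_chain_order(dims):
    """Classic O(n^3) dynamic program for matrix chain ordering.

    Matrix i has shape dims[i] x dims[i + 1]. Returns (cost, split), where
    cost[i][j] is the minimum number of scalar multiplications needed to
    compute the product of matrices i..j and split[i][j] is the best
    position at which to split that sub-chain.
    """
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):           # split between k and k + 1
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k
    return cost, split


def parenthesize(split, i, j, names):
    """Render the optimal order for matrices i..j as a string."""
    if i == j:
        return names[i]
    k = split[i][j]
    left = parenthesize(split, i, k, names)
    right = parenthesize(split, k + 1, j, names)
    return "(" + left + " " + right + ")"


# t(X) is 100 x 1000, X is 1000 x 100, v is 100 x 1.
dims = [100, 1000, 100, 1]
names = ["t(X)", "X", "v"]
cost, split = matrix_chain_order(dims)
best_cost = cost[0][len(names) - 1]
best_order = parenthesize(split, 0, len(names) - 1, names)
print(best_order, best_cost)
```

For this chain the left-to-right order (t(X) X) v needs 10,010,000 scalar multiplications, while t(X) (X v) needs 200,000, so the choice of order matters a great deal when one operand is a vector.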

Example Operator Selection: Matrix Multiplication

Fused Operators: WSLoss
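Only the slide title is available here. The sketch below assumes that WSLoss denotes the weighted squared loss sum(W * (X - U %*% t(V))^2) with a sparse weight matrix W, and illustrates the general idea of operator fusion: the fused version never materializes the dense intermediate U %*% t(V). All data, names, and the sparse dictionary layout are hypothetical and chosen for illustration; this is not SystemML's implementation.

```python
# Hypothetical toy data. W is stored sparsely as {(row, col): weight}.
U = [[1, 2], [3, 4], [5, 6]]                  # 3 x 2 factor
V = [[1, 0], [0, 1], [1, 1], [2, 1]]          # 4 x 2 factor
X = [[1, 2, 3, 4], [3, 4, 7, 9], [5, 6, 11, 16]]
W = {(0, 0): 1, (1, 3): 2, (2, 2): 1}


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def wsloss_unfused(X, U, V, W):
    """Materialize the full dense product U V^T, then apply the weights."""
    UVt = [[dot(u, v) for v in V] for u in U]   # dense 3 x 4 intermediate
    return sum(w * (X[i][j] - UVt[i][j]) ** 2 for (i, j), w in W.items())


def wsloss_fused(X, U, V, W):
    """Compute entries of U V^T only where W is non-zero."""
    return sum(w * (X[i][j] - dot(U[i], V[j])) ** 2
               for (i, j), w in W.items())


loss_unfused = wsloss_unfused(X, U, V, W)
loss_fused = wsloss_fused(X, U, V, W)
print(loss_unfused, loss_fused)
```

Both functions return the same value, but the fused one evaluates one inner product per non-zero weight (3 here) instead of one per cell of the dense product (12 here), which is where the savings come from when W is very sparse.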

More Optimizations
Dynamic recompilation to address unknown/changing sizes and sparsity of intermediates
Decisions:
Split HOP DAGs for recompilation
Mark HOP DAGs with unknown sizes / sparsity

Resource Optimizer

From SystemR to SystemML – A Comparison