CS422 Principles of Database Systems Introduction to Query Processing Chengyu Sun California State University, Los Angeles.

Slides:



Advertisements
Similar presentations
EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS Ayhan KARGIN Ahmet MERAL.
Advertisements

CS 540 Database Management Systems
Evaluation of Relational Operators CS634 Lecture 11, Mar Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
6.830/6.814 Lecture 5 Database Internals Continued September 17, 2014.
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes 1.
Conceptual Architecture of PostgreSQL PopSQL Andrew Heard, Daniel Basilio, Eril Berkok, Julia Canella, Mark Fischer, Misiu Godfrey.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Database Performance Tuning and Query Optimization.
1 Week 4 Questions / Concerns Comments about Lab1 What’s due: Lab1 check off this week (see schedule) Homework #3 due Wednesday (Define grammar for your.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Chapter 10: Compilers and Language Translation Invitation to Computer Science, Java Version, Third Edition.
1-1 Homework 3 Practical Implementation of A Simple Rational Database Management System.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1.
Lexical Analysis: Finite Automata CS 471 September 5, 2007.
INTRODUCTION TO DBS Database: a collection of data describing the activities of one or more related organizations DBMS: software designed to assist in.
. n COMPILERS n n AND n n INTERPRETERS. -Compilers nA compiler is a program thatt reads a program written in one language - the source language- and translates.
CPSC-608 Database Systems Fall 2015 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #6.
Query Processing – Query Trees. Evaluation of SQL Conceptual order of evaluation – Cartesian product of all tables in from clause – Rows not satisfying.
CS422 Principles of Database Systems Introduction to Query Processing Chengyu Sun California State University, Los Angeles.
CS 440 Database Management Systems Lecture 5: Query Processing 1.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
1 Overview of Query Evaluation Chapter Outline  Query Optimization Overview  Algorithm for Relational Operations.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Database Applications (15-415) DBMS Internals- Part VIII Lecture 17, Oct 30, 2016 Mohammad Hammoud.
CS4222 Principles of Database System
CS422 Principles of Database Systems Introduction to Query Processing
CS320 Web and Internet Programming SQL and MySQL
CS 440 Database Management Systems
CS422 Principles of Database Systems Course Overview
Overview of Compilation The Compiler Front End
Overview of Compilation The Compiler Front End
PROGRAMMING LANGUAGES
CS1222 Using Relational Databases and SQL
Database Performance Tuning and Query Optimization
Introduction to Query Optimization
Relational Algebra Chapter 4, Part A
CS416 Compiler Design lec00-outline September 19, 2018
Database Management Systems (CS 564)
Introduction CI612 Compiler Design CI612 Compiler Design.
February 12th – RDBMS Internals
April 20th – RDBMS Internals
CS222P: Principles of Data Management Notes #11 Selection, Projection
CS1222 Using Relational Databases and SQL
Relational Algebra Chapter 4, Sections 4.1 – 4.2
Database Applications (15-415) DBMS Internals- Part IX Lecture 21, April 1, 2018 Mohammad Hammoud.
Conceptual Architecture of PostgreSQL
Query Optimization CS 157B Ch. 14 Mien Siao.
Conceptual Architecture of PostgreSQL
CS2011 Introduction to Programming I Loop Statements (II)
Instructor 彭智勇 武汉大学软件工程国家重点实验室 电话:
CS416 Compiler Design lec00-outline February 23, 2019
Chengyu Sun California State University, Los Angeles
Overview of Query Evaluation
CS1222 Using Relational Databases and SQL
CS1222 Using Relational Databases and SQL
CS3220 Web and Internet Programming SQL and MySQL
Chapter 11 Database Performance Tuning and Query Optimization
CS222P: Principles of Data Management Notes #13 Set operations, Aggregation, Query Plans Instructor: Chen Li.
CS222: Principles of Data Management Notes #11 Selection, Projection
CPSC-608 Database Systems
CPSC-608 Database Systems
Chapter 10: Compilers and Language Translation
CS3220 Web and Internet Programming SQL and MySQL
CS1222 Using Relational Databases and SQL
Chengyu Sun California State University, Los Angeles
Lec00-outline May 18, 2019 Compiler Design CS416 Compiler Design.
CS222/CS122C: Principles of Data Management UCI, Fall 2018 Notes #10 Selection, Projection Instructor: Chen Li.
CS1222 Using Relational Databases and SQL
Presentation transcript:

CS422 Principles of Database Systems Introduction to Query Processing Chengyu Sun California State University, Los Angeles

DBMS psql pgAdmin phpPgAdmin user applications … ClientServer DB SQL Results

DBMS Internals Query Compiler and Optimizer Execution Engine Index and Table Manager Disk Manager Buffer Manager Transaction and Concurrency Control Queries Results index files data files system catalog Logging and Recovery

SimpleDB Developed by Edward Sciore A simplified DBMS for educational purpose Source code - cysun/course_materials/cs422/simpledb cysun/course_materials/cs422/simpledb

SimpleDB Basics Server Server class: simpledb.server.SimpleDB Startup program: simpledb.server.Startup Clients simpledb.client.SQLInterpreter Some other programs

Query Processing Departments( dId, dName ) Students( sId, sName, majorId ) select sName, dName from Students, Departments where majorId = dId and sId = 1; What happens in a DBMS server when we run a query?

Query Processing in SimpleDB ExecutionPlanner Lexer Parser SQLTokens Query PlanResultSet

Query Parsing Analyze the query string and convert it into some data structure that can be used for query execution

Lexical Analysis Split the input string into a series of tokens select sname from students where sid = 1 keywordidentifierdelimiterconstant

Token TypeValue keywordselect identifiersname keywordfrom identifierstudents keywordwhere identifierid delimiter= intconstant1

SimpleDB Token Types Single-character delimiter Integer constants String constants Keywords Identifiers

SimpleDB Lexer Implementation Example: StreamTokenizerTest Java StreamTokenizerSimpleDB Lexer Number Word Quoted String Single-character Token Integer Keyword String Single-character Delimiter Identifier

Lexer API … The API used by the parser Iterate through the tokens Check the current token – “Match”  matchKeyword(), matchId(), matchIntConstant() … Consume the current token – “Eat”  eatKeyword(), eatId(), eatIntConstant() …

… Lexer API select sname from students where sid = 1 lexer.matchKeyword(“select”); lexer.eatKeyword(“select”); current token select sname from students where sid = 1 current token

Syntax A set of rules that describes the strings that could possibly be meaningful statements Example: a syntactically wrong statement select from a and b where c – 3;

Part of SimpleDB Grammar … := IdTok := StrTok | IntTok := | := = := [ AND ] Backus-Naur Form (BNF)

… Part of SimpleDB Grammar := SELECT FROM [ WHERE ] := [, ] := IdTok [, ] := CREATE TABLE IdTok ( ) := [, ] := IdTok := INT | VARCHAR ( IntTok ) Full SimpleDB Grammar at

Recursive Definition in Grammar := [, ] select a, b, c from t where x = 10; ??

Using Grammar Which of the following are valid SimpleDB SQL statements?? create table students (id integer, name varchar(10)) select * from students;

From Grammar to Code … public QueryData query() { lex.eatKeyword( “select” ); Collection fields = selectList(); lex.eatKeyword( “from” ); Collection tables = tableList(); Predicate pred = new Predicate(); if( lex.matchKeyword(“where”) ) { lex.eatKeyword(“where”); pred = predicate(); } return new QueryData( fields, tables, pred ); }

… From Grammar to Code public Collection selectList() { Collection L = new ArrayList (); L.add( field() ); if( lex.matchDelim(‘,’) ) { lex.eatDelim(‘,’); L.addAll( selectList() ); } return L; } public String field() { return lex.eatId(); }

After Parsing select sName, dName from Students, Departments where majorId = dId and sId = 1; More in the simpledb.parse package. QueryData in SimpleDB: fields { “sName”, “dName” } tables { “Students”, “Departments” } pred { {“majorId”, “dId”}, {“sId”, 1} }

Query Planning Break a query into individual operations, and organize them into certain order, i.e. a query plan.

Relational Algebra Operations Selection, projection, product Join Rename Set operations: union, intersection, difference Extended Relation Algebra operations Duplicate elimination Sorting Extended projection, outer join Aggregation and grouping

Selection sidsname 1Joe 2Amy sidsname 1Joe sid=1 InputOutput

Projection sidsname 1Joe 2Amy sname InputOutput sname Joe Amy

Product sidsnamedept 1Joe10 2Amy20 diddname 10CS 20Math sidsnamedeptdiddname 1Joe10 CS 1Joe1020Math 2Amy2010CS 2Amy20 Math Input Output

Scan A scan is an interface to a RA operation implementation public interface Scan { public boolean next(); // move to the next result public int getInt( String fieldName ); public String getString( String fieldName ); }

Scan Example: TableScan public TableScan( TableInfo ti, Transaction tx ) { recordFile = new RecordFile( ti, tx ); } public boolean next() { return recordFile.next(); } public int getInt( String fieldName ) { return recordFile.getInt( fieldName ); } public int getString( String fieldName ) { return recordFile.getString( fieldName ); }

Scan Example: SelectScan public SelectScan( Scan s, Predicate pred ) { this.s = s; this.pred = pred; } public boolean next() { while( s.next() ) if( pred.isSatisfied(s) ) return true; return false; }

Query Execution select name from students where id = 1; ProjectScan ( name ) SelectScan( sid=1) TableScan(students) Result next()

About Implementations of RA Operations Each RA operation can be implemented and optimized independently from others A RA operation may have multiple implementations E.g. table scan vs. index scan for selection The efficiency of an implementation depends on the characteristics of the data

A Query Plan Projection: { sName, dName } Selection: majorId=dId and sId=1 Product StudentsDepartments select sName, dName from Students, Departments where majorId = dId and sId = 1;

A Better Query Plan – Query Optimization Projection: { sName, dName } Selection: majorId=dId Product Students Departments Selection: sId=1

Readings Textbook Chapter 17, 18, 19