Using Partial Indexes with PostgreSQL By Lloyd Albin 4/3/2012.

Presentation transcript:


The Problem
We receive faxes as multi-page TIFFs. The TIFF file name is called the raster ID. Our data contains the full path of the file, the raster ID, and the raster ID with page number.
Examples: 0000/ / /studydata/studyname/0000/000000

Finding the Raster ID
The first thing to do is to find the Raster ID, no matter which format is supplied.
    CREATE FUNCTION find_raster (raster varchar) RETURNS VARCHAR(11) AS $$
    BEGIN
        CASE length(raster)
            WHEN 11 THEN
                -- Format: 1234/ Returns: 1234/
                RETURN raster;
            WHEN 15 THEN
                -- Format: 1234/ Returns: 1234/
                RETURN substr(raster,1,11);
            ELSE
                -- Format: /study_data/study_name/1234/ Returns: 1234/
                RETURN substr(raster,length(raster)-10,11);
        END CASE;
    END;
    $$ LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;

Function Volatility Category
IMMUTABLE, STABLE, VOLATILE
These attributes inform the query optimizer about the behavior of the function. At most one choice can be specified. If none of these appear, VOLATILE is the default assumption.
IMMUTABLE indicates that the function cannot modify the database and always returns the same result when given the same argument values; that is, it does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value.
STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. (It is inappropriate for AFTER triggers that wish to query rows modified by the current command.) Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction.
VOLATILE indicates that the function value can change even within a single table scan, so no optimizations can be made. Relatively few database functions are volatile in this sense; some examples are random(), currval(), timeofday(). But note that any function that has side-effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away; an example is setval().
For additional details see Section 35.6.
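The volatility category matters here because PostgreSQL only accepts IMMUTABLE functions inside an index expression. The following is a minimal sketch of that rule, not part of the original slides. The table volatility_demo and the functions first11_stable and first11_immutable are hypothetical names invented for illustration.

```sql
-- Hypothetical demo objects; none of these names come from the slides.
CREATE TABLE volatility_demo (raster text);

-- Two functions with the same body, differing only in declared volatility.
CREATE FUNCTION first11_stable(text) RETURNS text AS $$
    SELECT substr($1, 1, 11);
$$ LANGUAGE sql STABLE;

CREATE FUNCTION first11_immutable(text) RETURNS text AS $$
    SELECT substr($1, 1, 11);
$$ LANGUAGE sql IMMUTABLE;

-- Show the declared volatility: 's' = STABLE, 'i' = IMMUTABLE, 'v' = VOLATILE.
SELECT proname, provolatile
FROM pg_proc
WHERE proname IN ('first11_stable', 'first11_immutable');

-- Expected to be rejected with:
--   ERROR: functions in index expression must be marked IMMUTABLE
CREATE INDEX volatility_demo_stable_idx
    ON volatility_demo (first11_stable(raster));

-- Accepted, because the function is declared IMMUTABLE.
CREATE INDEX volatility_demo_immutable_idx
    ON volatility_demo (first11_immutable(raster));
```

Run the statements one at a time (for example in psql with autocommit on), since the rejected CREATE INDEX would abort an enclosing transaction. PostgreSQL trusts the declaration, so marking a function IMMUTABLE when it is not can leave an index with stale entries.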

Examples of the find_raster Function
    -- Test returning of Raster ID when Submitting Raster ID
    SELECT find_raster('1234/567890');
    -- Returns: 1234/
    -- Test returning of Raster ID when Submitting Raster ID with 4 Digit Page Number
    SELECT find_raster('1234/ ');
    -- Returns: 1234/
    -- Test returning of Raster ID when Submitting Filename that includes Raster ID
    SELECT find_raster('/study_data/study_name/1234/567890');
    -- Returns: 1234/567890

Create Tables for Testing
    -- Format: 0000/
    CREATE TABLE raster ( raster VARCHAR(11) );
    -- Format: 0000/
    CREATE TABLE raster_page ( raster VARCHAR(15) );
    -- Format: /study_data/study_name/0000/
    CREATE TABLE raster_file ( raster VARCHAR );

Creating Test Data
    -- Now let's create a function to create a lot of data.
    CREATE OR REPLACE FUNCTION create_data () RETURNS INTEGER AS $$
    DECLARE
        count integer;
        r raster%rowtype;
    BEGIN
        -- Fill raster with IDs derived from the current clock time.
        count := 0;
        LOOP
            INSERT INTO raster VALUES (to_char(clock_timestamp(), '0MS/US'));
            count := count + 1;
            EXIT WHEN count >= ;
        END LOOP;
        -- One file path per raster ID.
        INSERT INTO raster_file SELECT '/study_data/study_name/' || raster.raster FROM raster;
        -- Five pages (0005 down to 0001) per raster ID.
        count := 5;
        LOOP
            INSERT INTO raster_page SELECT raster.raster || substr(to_char(count, '0009'),2,4) FROM raster;
            count := count - 1;
            EXIT WHEN count = 0;
        END LOOP;
        RETURN 1;
    END;
    $$ LANGUAGE plpgsql;
    SELECT create_data();

What a Normal Join Looks Like
    SELECT raster.*, raster_page.*
    FROM (SELECT * FROM raster OFFSET LIMIT 100) raster
    LEFT JOIN raster_page ON raster.raster = substr(raster_page.raster, 1, 11);
    Hash Left Join (cost= rows= width=88) (actual time= rows=540 loops=1)
      Hash Cond: ((public.raster.raster)::text = substr((raster_page.raster)::text, 1, 11))
      -> Limit (cost= rows=100 width=40) (actual time= rows=100 loops=1)
            -> Seq Scan on raster (cost= rows=56956 width=40) (actual time= rows=50100 loops=1)
      -> Hash (cost= rows= width=48) (actual time= rows= loops=1)
            -> Seq Scan on raster_page (cost= rows= width=48) (actual time= rows= loops=1)
    Total runtime: ms
    Total runtime: s

What a Normal Join Looks Like
    SELECT raster.*, raster_file.*
    FROM (SELECT * FROM raster OFFSET LIMIT 100) raster
    LEFT JOIN raster_file ON raster.raster = substr(raster_file.raster, length(raster_file.raster)-10, 11);
    Hash Left Join (cost= rows=50000 width=75) (actual time= rows=109 loops=1)
      Hash Cond: ((public.raster.raster)::text = substr((raster_file.raster)::text, (length((raster_file.raster)::text) - 10), 11))
      -> Limit (cost= rows=100 width=12) (actual time= rows=100 loops=1)
            -> Seq Scan on raster (cost= rows= width=12) (actual time= rows=50100 loops=1)
      -> Hash (cost= rows= width=35) (actual time= rows= loops=1)
            -> Seq Scan on raster_file (cost= rows= width=35) (actual time= rows= loops=1)
    Total runtime: ms
    Total runtime: s
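Plans like the two above come from prefixing a query with EXPLAIN ANALYZE. The sketch below assumes the raster and raster_page tables from "Create Tables for Testing" already exist and contain data. The OFFSET value used on the slide is not reproduced, so this version keeps only the LIMIT.

```sql
-- Assumes the raster and raster_page tables created earlier in the slides.
-- The slide's OFFSET value is not reproduced here; only LIMIT 100 is kept.
-- EXPLAIN ANALYZE executes the query and reports estimated vs. actual
-- row counts and timings for every plan node.
EXPLAIN ANALYZE
SELECT raster.*, raster_page.*
FROM (SELECT * FROM raster LIMIT 100) raster
LEFT JOIN raster_page
       ON raster.raster = substr(raster_page.raster, 1, 11);
```

Because EXPLAIN ANALYZE really runs the statement, timings vary between runs and depend on what is already cached.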

Adding the Indexes
    CREATE INDEX [ name ] ON table ( expression )
expression: An expression based on one or more columns of the table. The expression usually must be written with surrounding parentheses, as shown in the syntax. However, the parentheses can be omitted if the expression has the form of a function call.
    CREATE INDEX raster_raster_idx ON raster (find_raster(raster));
    CREATE INDEX raster_file_raster_idx ON raster_file (find_raster(raster));
    CREATE INDEX raster_page_raster_idx ON raster_page (find_raster(raster));
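After creating expression indexes it can help to confirm their definitions and refresh planner statistics. This is a minimal sketch, assuming the three tables and indexes named on the slide exist in the current database. pg_indexes is a standard system view, and ANALYZE also gathers statistics on indexed expressions.

```sql
-- Assumes the raster, raster_file and raster_page tables and their
-- find_raster() indexes from the slides already exist.

-- List each index with its full definition, including the expression.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('raster', 'raster_file', 'raster_page')
ORDER BY tablename, indexname;

-- Refresh statistics so the planner has row estimates for the
-- indexed expression find_raster(raster) as well as the plain columns.
ANALYZE raster;
ANALYZE raster_file;
ANALYZE raster_page;
```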

What the New Join Looks Like
    SELECT raster.*, raster_page.*
    FROM (SELECT * FROM raster OFFSET LIMIT 100) raster
    LEFT JOIN raster_page ON raster.raster = find_raster(raster_page.raster);
    Nested Loop Left Join (cost= rows= width=88) (actual time= rows=540 loops=1)
      -> Limit (cost= rows=100 width=40) (actual time= rows=100 loops=1)
            -> Seq Scan on raster (cost= rows= width=40) (actual time= rows=50100 loops=1)
      -> Index Scan using raster_page_raster_idx on raster_page (cost= rows=2500 width=48) (actual time= rows=5 loops=100)
            Index Cond: ((public.raster.raster)::text = (find_raster(raster_page.raster))::text)
    Total runtime: ms
    Total runtime: s

What the New Join Looks Like
    SELECT raster.*, raster_file.*
    FROM (SELECT * FROM raster OFFSET LIMIT 100) raster
    LEFT JOIN raster_file ON raster.raster = find_raster(raster_file.raster);
    Nested Loop Left Join (cost= rows=50000 width=75) (actual time= rows=109 loops=1)
      -> Limit (cost= rows=100 width=12) (actual time= rows=100 loops=1)
            -> Seq Scan on raster (cost= rows= width=12) (actual time= rows=50100 loops=1)
      -> Index Scan using raster_file_raster_idx on raster_file (cost= rows=500 width=35) (actual time= rows=1 loops=100)
            Index Cond: ((public.raster.raster)::text = (find_raster(raster_file.raster))::text)
    Total runtime: ms
    Total runtime: s
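The index scans above are possible only because the join condition uses the same expression that was indexed. The sketch below illustrates that point with plain EXPLAIN, assuming find_raster() and raster_page_raster_idx from the slides exist. The literal '1234/567890' is an illustrative value and need not be present in the test data.

```sql
-- Assumes find_raster() and raster_page_raster_idx from the slides.
-- '1234/567890' is an illustrative value only.

-- Written with the indexed expression: the planner can consider
-- raster_page_raster_idx for this predicate.
EXPLAIN
SELECT *
FROM raster_page
WHERE find_raster(raster) = '1234/567890';

-- Same result for 15-character values, but a different expression:
-- the planner cannot match it to the find_raster() index.
EXPLAIN
SELECT *
FROM raster_page
WHERE substr(raster, 1, 11) = '1234/567890';
```

Whether the first query actually uses the index still depends on table size and statistics; on a very small table a sequential scan may be cheaper.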

Another Type of Partial Index
    CREATE [ UNIQUE ] INDEX [ name ] ON table [ USING method ] ( { column } [, ...] ) [ WHERE predicate ]
    CREATE TABLE tests (
        subject text,
        target text,
        success boolean,
        ...
    );
    CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target) WHERE success;
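A unique index with a WHERE predicate enforces uniqueness only over the rows that satisfy the predicate. The sketch below shows the effect on a cut-down table. tests_demo and its index name are hypothetical stand-ins for the slide's tests table, whose remaining columns are elided, and the inserted values are made up.

```sql
-- Hypothetical, cut-down version of the slide's tests table.
CREATE TABLE tests_demo (
    subject text,
    target  text,
    success boolean
);

-- Only rows with success = true are indexed, so only they must be unique.
CREATE UNIQUE INDEX tests_demo_success_constraint
    ON tests_demo (subject, target)
    WHERE success;

-- Any number of unsuccessful rows for the same pair is allowed.
INSERT INTO tests_demo VALUES ('s1', 't1', false);
INSERT INTO tests_demo VALUES ('s1', 't1', false);

-- The first successful row for the pair is accepted.
INSERT INTO tests_demo VALUES ('s1', 't1', true);

-- A second successful row for the same pair is expected to fail
-- with a unique-constraint violation on tests_demo_success_constraint.
INSERT INTO tests_demo VALUES ('s1', 't1', true);
```

The index also stays small, because rows where success is false or NULL are never stored in it.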