Partitioning & Creating Hardware Tablespaces for Performance


Partitioning & Creating Hardware Tablespaces for Performance, by Lloyd Albin, 4/27/2014

What you will be learning
Partitioning
  How to split a single table into multiple tables by date, or by some other criterion that appears in your WHERE clauses, so that queries do not need to read all the data.
  How to truncate and reload child tables quickly without affecting the other tables.
  How to remove child tables and archive them, and how to add archived tables back in with a single command.
Tablespaces
  How to split your data across multiple media such as SSDs, hard drive clusters, long-term tape storage, etc.
  How to put your temp tables, used for sorting, etc., onto your fastest media.
Partitioning & Tablespaces
  How to combine partitioning and tablespaces to create a really efficient system for accessing your most frequently needed information.

Partitioning: Creating Parent / Child Table Relationships

Partitioning
Postgres documentation: http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html
Partitions can be created using either ranges (such as date ranges) or a list of values. Here are some examples:
  By Month
  By Quarter
  By Year
  By Customer
  By Study
Parent and child tables do not need to live in the same schema, and each child table can even be in a separate schema. For example, each study could be in its own schema, with the parent table linking all the child studies together.

Create your parent table
This is really simple: just create a normal table.
    CREATE TABLE journal (
      key BIGSERIAL,
      record_timestamp TIMESTAMP,
      dfstudy NUMERIC,
      …
    );

Creating a child table
Now we can create a child table based on the format of the parent table. Let's create two child tables, one for this month and one for last month. It is a nicety to name the constraints, but Postgres will also name them automatically for you, as in the first table below.
    CREATE TABLE journal_201404 (
      CHECK (
        record_timestamp >= '2014-04-01 00:00:00'::timestamp AND
        record_timestamp < '2014-05-01 00:00:00'::timestamp
      ),
      PRIMARY KEY (key)
    ) INHERITS (journal);
    CREATE INDEX journal_201404_dfstudy_idx ON journal_201404 (dfstudy);
    CREATE TABLE journal_201403 (
      CONSTRAINT journal_201403_check CHECK (
        record_timestamp >= '2014-03-01 00:00:00'::timestamp AND
        record_timestamp < '2014-04-01 00:00:00'::timestamp
      ),
      CONSTRAINT journal_201403_pkey PRIMARY KEY (key)
    ) INHERITS (journal);
    CREATE INDEX journal_201403_dfstudy_idx ON journal_201403 (dfstudy);
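The overview promises that child tables can be removed for archiving and added back with a single command, but the slides do not show those commands. A minimal sketch, assuming the `journal` parent and `journal_201403` child defined above, and using the `NO INHERIT` / `INHERIT` forms of `ALTER TABLE`:

```sql
-- Detach last month's child table from the parent. The table and its data
-- remain, but queries against journal no longer see its rows.
ALTER TABLE journal_201403 NO INHERIT journal;

-- ... archive the detached table here (for example with pg_dump -t) ...

-- Re-attach the child. Its columns must still match the parent's, and
-- existing rows are expected to satisfy the child's CHECK constraint.
ALTER TABLE journal_201403 INHERIT journal;
```

Detaching rather than dropping keeps the data available until the archive has been verified.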

Inserting Data
There are three ways to insert data into the child tables:
  Directly into the child table
  Parent table trigger
  Parent table rules

Inserting Data Directly
If you reload the data on a schedule, such as from a cron job, inserting directly has a benefit: we can quickly truncate the child table and then reload the data via a COPY command. TRUNCATE takes an exclusive lock, so any query that tries to read this child table during the transaction will wait until the COMMIT completes.
    BEGIN;
    TRUNCATE TABLE journal_201403;
    -- HEADER is only valid with the CSV format
    COPY journal_201403 FROM STDIN WITH (FORMAT csv, HEADER true, DELIMITER '|');
    COMMIT;

Data via Trigger
A row-level BEFORE trigger on the parent table routes each write to the correct child table. The function returns NULL so that the row is never written to the parent table itself. The body of each branch is shown on the following slides.
    CREATE OR REPLACE FUNCTION journal_trigger_func() RETURNS TRIGGER AS $$
    DECLARE
      subtable TEXT;
      old_subtable TEXT;
    BEGIN
      IF (TG_OP = 'INSERT') THEN
        …
      ELSIF (TG_OP = 'DELETE') THEN
        …
      ELSIF (TG_OP = 'UPDATE') THEN
        …
      ELSE
        RAISE EXCEPTION '% is not supported via the journal table!', TG_OP;
      END IF;
      RETURN NULL; -- Abort write to parent table
    EXCEPTION WHEN OTHERS THEN
      -- Include the original error text so the cause is not hidden
      RAISE EXCEPTION 'Writing journal child table failed: %', SQLERRM;
    END;
    $$ LANGUAGE plpgsql;
    CREATE TRIGGER journal_trigger
      BEFORE INSERT OR UPDATE OR DELETE ON journal
      FOR EACH ROW EXECUTE PROCEDURE journal_trigger_func();

Inserting Data via Trigger
We need to build the name of the table that we want to insert into and then insert the data into that table. The month must be zero-padded (to_char with 'YYYYMM') so that the name matches journal_201404, and the table name has to be spliced into the statement text, because EXECUTE ... USING can only pass data values, not identifiers. You can also create the child tables automatically from within this function if they do not exist.
    IF (TG_OP = 'INSERT') THEN
      subtable := 'journal_' || to_char(NEW.record_timestamp, 'YYYYMM');
      -- CREATE TABLE IF NOT EXISTS subtable
      IF length(subtable) = 14 THEN
        -- %I quotes the table name as an identifier
        EXECUTE format('INSERT INTO %I SELECT ($1).*', subtable) USING NEW;
      ELSE
        -- Also reached when record_timestamp is NULL
        RAISE EXCEPTION 'Failed to properly generate journal subtable name!';
      END IF;
    ELSIF (TG_OP = 'DELETE') THEN

Deleting Data via Trigger
You should also check that the child table exists, so that you can trap that error specifically instead of relying on the general error handler provided by WHEN OTHERS.
    ELSIF (TG_OP = 'DELETE') THEN
      old_subtable := 'journal_' || to_char(OLD.record_timestamp, 'YYYYMM');
      -- Check to make sure the table exists
      IF length(old_subtable) = 14 THEN
        EXECUTE format('DELETE FROM %I WHERE key = $1', old_subtable) USING OLD.key;
      ELSE
        RAISE EXCEPTION 'Failed to properly generate journal old_subtable name!';
      END IF;
    ELSIF (TG_OP = 'UPDATE') THEN

Updating Data via Trigger
You should also check that the child table exists, so that you can trap that error specifically instead of relying on the general error handler provided by WHEN OTHERS. If the new timestamp stays in the same month, the row is updated in place; otherwise it is deleted from the old child table and inserted into the new one.
    ELSIF (TG_OP = 'UPDATE') THEN
      subtable := 'journal_' || to_char(NEW.record_timestamp, 'YYYYMM');
      old_subtable := 'journal_' || to_char(OLD.record_timestamp, 'YYYYMM');
      -- Check to make sure subtable exists
      IF subtable = old_subtable THEN
        EXECUTE format('UPDATE %I SET record_timestamp = $1, dfstudy = $2, … WHERE key = $3', old_subtable)
          USING NEW.record_timestamp, NEW.dfstudy, …, OLD.key;
      ELSE
        EXECUTE format('DELETE FROM %I WHERE key = $1', old_subtable) USING OLD.key;
        EXECUTE format('INSERT INTO %I SELECT ($1).*', subtable) USING NEW;
      END IF;
    ELSE

Inserting Data via Rule
Rules are ignored by COPY commands, which is why trigger functions are normally used. A rule has significantly more overhead than a trigger, but that overhead is paid once per query rather than once per row, so rules can be faster for bulk INSERTs.
    CREATE RULE journal_insert_201404 AS
    ON INSERT TO journal WHERE (
      record_timestamp >= '2014-04-01 00:00:00'::timestamp AND
      record_timestamp < '2014-05-01 00:00:00'::timestamp
    )
    DO INSTEAD
      INSERT INTO journal_201404 VALUES (NEW.*);

Selecting Data
The first example gets all the data from the parent and child tables. The second example gets the data only from the parent table and nothing from the child tables. The third and fourth examples get data directly from the child tables. The fifth example gets the data from the journal and journal_201404 tables only, because the CHECK constraints rule out the other child tables.
    SELECT * FROM journal;
    SELECT * FROM ONLY journal;
    SELECT * FROM journal_201403;
    SELECT * FROM journal_201404;
    SELECT * FROM journal WHERE record_timestamp = '2014-04-26 12:00:00';

Query Plan
In our example we are using 18 child tables, one per month, with a total of 6,425,250 rows of data at the time this example was run.
368 rows returned in 31 ms
    SELECT * FROM journal
    WHERE record_timestamp >= '2014-01-01 00:00:00'::timestamp
      AND record_timestamp < '2014-01-02 00:00:00'::timestamp;
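The plan itself is not reproduced in this transcript. One way to see which child tables the planner visits is to run the same query under EXPLAIN with constraint exclusion enabled. The sketch below assumes the `journal` tables from the earlier slides. The `SET` only makes the default setting explicit.

```sql
-- 'partition' is the default: CHECK constraints are examined only for
-- inheritance children and UNION ALL subqueries.
SET constraint_exclusion = partition;

EXPLAIN
SELECT *
FROM journal
WHERE record_timestamp >= '2014-01-01 00:00:00'::timestamp
  AND record_timestamp <  '2014-01-02 00:00:00'::timestamp;
-- Child tables whose CHECK constraint cannot match the WHERE clause
-- should be absent from the resulting plan.
```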

Traditional Single Table Query Plan
In this example we are using one table with a total of 6,425,250 rows of data at the time this example was run.
368 rows returned in 4.228 sec
The traditional single-table version is more than 100 times slower.
    CREATE TABLE journal_all AS SELECT * FROM journal;
    SELECT * FROM journal_all
    WHERE record_timestamp >= '2014-01-01 00:00:00'::timestamp
      AND record_timestamp < '2014-01-02 00:00:00'::timestamp;

Caveats
  There is no automatic check for whether two of the CHECK constraints overlap, for example when both cover the same day.
  You can't use non-constant timestamps such as NOW() in the WHERE clause, because constraint exclusion happens at plan time and all child tables will be searched.
  The CHECK constraint of every child table must be evaluated whenever a query is executed against the parent table.
  Use up to hundreds of partitions, but not thousands.
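Because overlapping CHECK constraints are not detected automatically, it can help to list them side by side and review them by eye. A minimal sketch against the system catalogs, assuming the parent table is named `journal` as in the earlier slides:

```sql
-- List every CHECK constraint defined on the children of journal.
SELECT c.conrelid::regclass        AS child_table,
       c.conname                   AS constraint_name,
       pg_get_constraintdef(c.oid) AS definition
FROM pg_constraint AS c
JOIN pg_inherits   AS i ON i.inhrelid = c.conrelid
WHERE i.inhparent = 'journal'::regclass
  AND c.contype = 'c'              -- CHECK constraints only
ORDER BY c.conrelid::regclass::text;
```

This only displays the definitions. Deciding whether two ranges collide is still a manual step.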

Tablespaces

Multiple Disks & Directories
Tablespaces are physical disk locations that Postgres may use to store information.
Tablespaces are cluster-wide: the definition of a tablespace lives at the cluster level and is not backed up by pg_dump.
Multiple databases can use the same or different tablespaces. You may even select different tablespaces per table, index, etc.
Tablespaces can only be created by superusers, but may be owned by a specified user.
[Diagram panels: Multiple Disks (A, B, C); Multiple Directories (A, B, C); Multiple Disks & Directories (A, B, C, D, E)]

Defining Tablespaces
    CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
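Once a tablespace exists, individual tables and indexes can be placed in it. The sketch below reuses the `fastspace` tablespace defined above. The `journal_hot` table and its index are hypothetical names used only for illustration.

```sql
-- Place a new table in the tablespace (journal_hot is a hypothetical name).
CREATE TABLE journal_hot (
  key BIGSERIAL,
  record_timestamp TIMESTAMP
) TABLESPACE fastspace;

-- Indexes do not follow their table automatically; name the tablespace.
CREATE INDEX journal_hot_ts_idx
  ON journal_hot (record_timestamp)
  TABLESPACE fastspace;

-- Move an existing table. This rewrites the table's files and holds an
-- exclusive lock on the table while it runs.
ALTER TABLE journal_201403 SET TABLESPACE fastspace;
```

The directory named in CREATE TABLESPACE must already exist, be empty, and be owned by the operating system user that runs Postgres.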

Default Tablespace
For any connection, you may change the default tablespace.
    SET default_tablespace = 'tbname';
    RESET default_tablespace;

Tablespaces for your temp tables
    SET temp_tablespaces = 'tbname,…';
    RESET temp_tablespaces;

Tablespaces for different drive types
Example:
  SSD storage for data that you query all the time, large data sets, indexes, and temp tables.
  Local storage for data that you access regularly.
  NAS storage for data that is accessed less often.
  Tape library storage for data that is only accessed once a year or less.
[Diagram: Multiple Disks (SSD, Local, NAS, Tape)]

Caveats
  pg_dump will not dump your tablespace definitions. You must use pg_dumpall to get that information.
  Tablespaces can't be created inside transactions.
  Tablespaces are only supported on systems that support symbolic links.

Using Partitions & Tablespaces

Tablespaces for different drive types
Example:
  SSD for the last two months of data
  Local storage for 3 months to 1 year
  NAS storage for 1 to 5 years
  Tape storage for 6 years +
[Diagram: Multiple Disks (SSD, Local, NAS, Tape)]
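A sketch of how this tiering might look in practice, combining the child-table definition from the partitioning section with a TABLESPACE clause. The tablespace names `ssd_space` and `local_space` are assumptions and must already have been created. The `journal_201402` table and its index are hypothetical and assume the same naming pattern as the other child tables.

```sql
-- Current month: create the child table and its primary-key index on SSD.
CREATE TABLE journal_201404 (
  CHECK (
    record_timestamp >= '2014-04-01 00:00:00'::timestamp AND
    record_timestamp <  '2014-05-01 00:00:00'::timestamp
  ),
  PRIMARY KEY (key) USING INDEX TABLESPACE ssd_space
) INHERITS (journal) TABLESPACE ssd_space;

-- When a month ages out of the two-month window, move it to local disk.
-- Indexes are moved separately from their table.
ALTER TABLE journal_201402 SET TABLESPACE local_space;
ALTER INDEX journal_201402_dfstudy_idx SET TABLESPACE local_space;
```

Each move rewrites the table on the target media and locks it for the duration, so a job like this is probably best scheduled for a quiet period.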