Teradata Physical Implementation – Case Study


Create Table
  - Distribution Check & PI Change
  - Fallback
Create Index
  - USI
  - NUSI
Create Join Index
Create & Collect Statistics

Create Table – Copy Data

Log in using your own ID. If privileges are missing, grant them to your user:

    GRANT SELECT ON DBC TO TRAINER;
    GRANT SELECT ON TPCH TO TRAINER;

Check the size of each table in the TPCH database:

    SELECT databasename, Tablename, sum(CurrentPerm)
    FROM DBC.TABLESIZE
    where databasename = 'TPCH'
    group by databasename, Tablename;

    DatabaseName  TableName  Sum(CurrentPerm)
    TPCH          ORDERTBL            7334912
    TPCH          LINEITEM           34191360
    TPCH          PARTTBL             1211392
    TPCH          PARTSUPP            5183488
    TPCH          NATION                 5632
    TPCH          REGION                 3072
    TPCH          CUSTOMER            1080832
    TPCH          SUPPLIER              67584

Show the definition of the source table:

    show table TPCH.Customer;

    CREATE SET TABLE TPCH.Customer ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    UNIQUE PRIMARY INDEX ( C_CUSTKEY );

Data Distribution Check

1) Create the Customer table in your own user/database.
   - Keep the same definition (No Fallback & same PI).
   - You can create the table and then load the data, or do both in one statement as below.

    CREATE TABLE TRAINER.CUSTOMER AS TPCH.CUSTOMER WITH DATA;

   Show the table and check the definition and the data in the table.

    show table TRAINER.Customer;

2) Check the table size by AMP.

    SELECT * FROM DBC.TABLESIZE
    where databasename = 'TRAINER' and Tablename = 'CUSTOMER';

    Vproc  DatabaseName  AccountName  TableName  CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER        540672    540672
    1      TRAINER       DBC          CUSTOMER        540160    540160

   The two AMPs hold almost the same amount of data. The current PI of the table is C_CUSTKEY, whose degree of uniqueness is 100%:

    select count(distinct C_CUSTKEY), count(1) from TRAINER.CUSTOMER;

    Count(Distinct(C_CUSTKEY))  Count(1)
    6000                        6000
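The per-AMP figures from the distribution check can be condensed into a single skew number. The sketch below is plain Python arithmetic, not Teradata SQL. The formula 100 * (1 - average / maximum) is one common convention and is an assumption here, since the slides do not define a skew metric; the name `skew_percent` is illustrative.

```python
def skew_percent(per_amp_perm):
    """Return skew as 100 * (1 - average / maximum) of per-AMP CurrentPerm.

    Assumed convention (not defined in the slides): 0 means perfectly even.
    """
    average = sum(per_amp_perm) / len(per_amp_perm)
    return 100.0 * (1.0 - average / max(per_amp_perm))


# CurrentPerm for Vproc 0 and Vproc 1, copied from the DBC.TABLESIZE output.
customer_perm = [540672, 540160]

total_perm = sum(customer_perm)
customer_skew = skew_percent(customer_perm)

print(f"total CurrentPerm: {total_perm}")
print(f"skew: {customer_skew:.3f}%")
```

The total, 1080832 bytes, matches the CurrentPerm reported for TPCH.CUSTOMER, and the skew comes out below 0.1%, which is what a unique PI should give.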

Change in PI

1) Consider a different PI for the CUSTOMER table. C_MKTSEGMENT has only 5 distinct values across 6000 rows:

    select C_MKTSEGMENT, count(1) from TRAINER.CUSTOMER group by C_MKTSEGMENT;

    C_MKTSEGMENT  Count(1)
    FURNITURE         1169
    MACHINERY         1174
    BUILDING          1296
    HOUSEHOLD         1171
    AUTOMOBILE        1190

    select count(distinct C_MKTSEGMENT), count(1) from TRAINER.CUSTOMER;

    Count(Distinct(C_MKTSEGMENT))  Count(1)
    5                              6000

2) Create the table CUSTOMER_PI with the same definition as Customer but with C_MKTSEGMENT as the PI.

    CREATE MULTISET TABLE TRAINER.Customer_PI ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( C_MKTSEGMENT );

3) Check the size by AMP. The data is no longer spread evenly.

    Vproc  DatabaseName  AccountName  TableName    CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER_PI       444416    444416
    1      TRAINER       DBC          CUSTOMER_PI       635904    635904
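Why a low-cardinality PI skews can be checked by brute force. All rows with the same PI value hash to the same AMP, so on a two-AMP system the five segments must be divided between the AMPs as whole groups. The Python sketch below enumerates every possible assignment using the row counts from the slide and reports the most balanced one. It does not reproduce Teradata's actual hash function, and the variable names are illustrative.

```python
from itertools import product

# Row counts per C_MKTSEGMENT value, copied from the GROUP BY output.
segment_rows = {
    "FURNITURE": 1169,
    "MACHINERY": 1174,
    "BUILDING": 1296,
    "HOUSEHOLD": 1171,
    "AUTOMOBILE": 1190,
}
NUM_AMPS = 2
total_rows = sum(segment_rows.values())

# Try every way of assigning whole segments to AMPs and keep the
# assignment whose busiest AMP holds the fewest rows.
best_heaviest = total_rows
for assignment in product(range(NUM_AMPS), repeat=len(segment_rows)):
    load = [0] * NUM_AMPS
    for amp, rows in zip(assignment, segment_rows.values()):
        load[amp] += rows
    best_heaviest = min(best_heaviest, max(load))

best_share = 100.0 * best_heaviest / total_rows

# Observed per-AMP CurrentPerm for CUSTOMER_PI (bytes, not rows).
observed_perm = [444416, 635904]
observed_share = 100.0 * max(observed_perm) / sum(observed_perm)

print(f"best case: {best_heaviest} of {total_rows} rows ({best_share:.1f}%) on one AMP")
print(f"observed:  {observed_share:.1f}% of CurrentPerm on one AMP")
```

Even the most favourable assignment leaves 3514 of 6000 rows (about 58.6%) on one AMP, which is in line with the roughly 58.9% of CurrentPerm observed on Vproc 1.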

Fallback Impact

1) Create the table CUSTOMER_FB with FALLBACK on and check the table size.

    CREATE MULTISET TABLE TRAINER.Customer_FB , FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( C_MKTSEGMENT );

2) Check the size by AMP.
   - Note that the total size is doubled.
   - This is a two-AMP system, so each AMP is the fallback for the other. With FALLBACK, both AMPs therefore show the same size.

    Vproc  DatabaseName  AccountName  TableName    CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER_FB      1079296   1079296
    1      TRAINER       DBC          CUSTOMER_FB      1079296   1079296
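Both fallback observations can be checked against the figures from the slides. The Python sketch below assumes, as the slide states, that on a two-AMP system each AMP keeps the fallback copy of the other AMP's rows. The per-AMP primary sizes are taken from the NO FALLBACK table CUSTOMER_PI, which has the same columns and PI; the variable names are illustrative.

```python
# CurrentPerm per AMP for CUSTOMER_PI (NO FALLBACK), from the slides.
primary_perm = {0: 444416, 1: 635904}

# CurrentPerm per AMP for CUSTOMER_FB (FALLBACK), from the slides.
observed_fallback_perm = {0: 1079296, 1: 1079296}

# Each AMP stores its own primary rows plus the fallback copy of the
# rows whose primary copy lives on the other AMP.
expected_fallback_perm = {
    amp: primary_perm[amp] + primary_perm[1 - amp] for amp in primary_perm
}

growth = sum(observed_fallback_perm.values()) / sum(primary_perm.values())

print("expected per AMP:", expected_fallback_perm)
print("observed per AMP:", observed_fallback_perm)
print(f"total size grew by a factor of {growth:.3f}")
```

The expected value is the same 1080320 bytes on both AMPs even though the primary copies are skewed, and the observed 1079296 bytes per AMP is within about 0.1% of it, so the total is very nearly doubled.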


Creating USI

1) Explain the query below (the PI is C_MKTSEGMENT).

    EXPLAIN SELECT * FROM CUSTOMER_PI WHERE C_CUSTKEY = 1613;

    We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of (

   Note: it is doing a full table scan.

2) Create a USI on C_CUSTKEY.

    CREATE UNIQUE INDEX IDX_CKEY (C_CUSTKEY) ON CUSTOMER_PI;

3) Explain the query again. Note the change to an index access in the Explain and the change in the estimated response time (0.14 seconds without the USI, 0.02 seconds with it).

Table definition, for reference:

    CREATE MULTISET TABLE TRAINER.Customer_PI ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( C_MKTSEGMENT );

EXPLAIN – WITHOUT USI
(The output shown references CUSTOMER_FB, which has the same columns and PI as CUSTOMER_PI.)

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
    2) Next, we lock TRAINER.CUSTOMER_FB for read.
    3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_CUSTKEY = 1613") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH USI

    1) First, we do a two-AMP RETRIEVE step from TRAINER.CUSTOMER_PI by way of unique index # 4 "TRAINER.CUSTOMER_PI.C_CUSTKEY = 1613" with no residual conditions. The estimated time for this step is 0.02 seconds.
    -> The row is sent directly back to the user as the result of statement 1. The total estimated time is 0.02 seconds.
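The step from an all-AMPs scan to a two-AMP retrieve can be pictured with a toy model. The Python sketch below is an illustration only: `toy_amp_for` is a made-up stand-in for Teradata's row hash, `ToyTable` is a hypothetical class, and four AMPs are used (instead of the two in the case study) so that the difference is visible. It models base rows placed by the PI value and a USI subtable whose rows are placed by the indexed value and point to the base row.

```python
NUM_AMPS = 4  # toy value; the case study itself runs on two AMPs


def toy_amp_for(value):
    """Toy stand-in for the row hash: map a value to an AMP number."""
    return sum(str(value).encode()) % NUM_AMPS


class ToyTable:
    """Rows are placed by C_MKTSEGMENT (the PI); a USI covers C_CUSTKEY."""

    def __init__(self):
        self.base = [[] for _ in range(NUM_AMPS)]  # base rows per AMP
        self.usi = [{} for _ in range(NUM_AMPS)]   # USI subtable per AMP

    def insert(self, custkey, mktsegment):
        base_amp = toy_amp_for(mktsegment)
        self.base[base_amp].append((custkey, mktsegment))
        position = len(self.base[base_amp]) - 1
        # The index row lives on the AMP chosen by the USI value and
        # points at the AMP and position of the base row.
        self.usi[toy_amp_for(custkey)][custkey] = (base_amp, position)

    def full_scan(self, custkey):
        """Without the USI: every AMP scans all of its rows."""
        amps_touched = set(range(NUM_AMPS))
        hits = [row for rows in self.base for row in rows if row[0] == custkey]
        return hits, amps_touched

    def usi_lookup(self, custkey):
        """With the USI: one AMP for the index row, one for the base row."""
        index_amp = toy_amp_for(custkey)
        amps_touched = {index_amp}
        pointer = self.usi[index_amp].get(custkey)
        if pointer is None:
            return [], amps_touched
        base_amp, position = pointer
        amps_touched.add(base_amp)
        return [self.base[base_amp][position]], amps_touched


table = ToyTable()
for key, segment in [(1613, "BUILDING"), (1614, "MACHINERY"),
                     (1615, "FURNITURE"), (1616, "HOUSEHOLD")]:
    table.insert(key, segment)

scan_rows, scan_amps = table.full_scan(1613)
usi_rows, usi_amps = table.usi_lookup(1613)
print(len(scan_amps), "AMPs touched by the full scan")
print(len(usi_amps), "AMPs touched by the USI lookup")
```

Both access paths return the same row, but the scan touches every AMP while the index lookup touches at most two, whatever the size of the system.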

Creating NUSI

1) Explain the query below.

    EXPLAIN select * from CUSTOMER_PI where C_NATIONKEY = 17;

   Note: it is doing a full table scan.

2) Create a NUSI on C_NATIONKEY.

    CREATE INDEX IDX_NTKEY (C_NATIONKEY) ON CUSTOMER_PI;

3) Explain the same query again and check the Explain for an index scan and for a change in response time.
   - Here the index is not used: the plan is still an all-rows scan. Only the row estimate changes (707 rows with no confidence becomes 283 rows with low confidence) and the estimated time moves from 0.14 to 0.13 seconds.
   - We can only create the index; we cannot enforce its usage. Whether it is used depends on the Optimizer.

EXPLAIN – WITHOUT NUSI

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_PI.
    2) Next, we lock TRAINER.CUSTOMER_PI for read.
    3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_PI.C_NATIONKEY = 17") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH NUSI

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_PI.
    2) Next, we lock TRAINER.CUSTOMER_PI for read.
    3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_PI.C_NATIONKEY = 17") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 283 rows. The estimated time for this step is 0.13 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.13 seconds.


Creating JOIN Index

1) Explain the query below.

    EXPLAIN SELECT SUM(C_ACCTBAL) FROM CUSTOMER WHERE C_NATIONKEY = 10;

2) Create a single-table join index on CUSTOMER with C_NATIONKEY as its primary index.

    CREATE JOIN INDEX CUSTOMER_JI AS
    SELECT C_CUSTKEY, C_NATIONKEY, C_ACCTBAL, C_MKTSEGMENT
    FROM CUSTOMER
    PRIMARY INDEX (C_NATIONKEY);

3) Run the same Explain again. The all-AMPs scan of CUSTOMER is replaced by a single-AMP primary index access to CUSTOMER_JI.

EXPLAIN – WITHOUT JI

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER.
    2) Next, we lock TRAINER.CUSTOMER for read.
    3) We do an all-AMPs SUM step to aggregate from TRAINER.CUSTOMER by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER.C_NATIONKEY = 10"). Aggregate Intermediate Results are computed globally, then placed in Spool 3. The size of Spool 3 is estimated with high confidence to be 1 row. The estimated time for this step is 0.13 seconds.
    4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 1 row. The estimated time for this step is 0.03 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1.

EXPLAIN – WITH JI

    1) First, we do a single-AMP SUM step to aggregate from TRAINER.CUSTOMER_JI by way of the primary index "TRAINER.CUSTOMER_JI.C_NATIONKEY = 10" with no residual conditions, and the grouping identifier in field1. Aggregate Intermediate Results are computed locally, then placed in Spool 3. The size of Spool 3 is estimated with high confidence to be 1 row. The estimated time for this step is 0.03 seconds.
    2) Next, we do a single-AMP RETRIEVE step from Spool 3 (Last Use) by way of the primary index "TRAINER.CUSTOMER_JI.C_NATIONKEY = 10" into Spool 1 (one-amp), which is built locally on that AMP. The size of Spool 1 is estimated with high confidence to be 1 row. The estimated time for this step is 0.03 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1.


Collect Statistics

1) Run the Explain below.

    EXPLAIN SELECT * FROM CUSTOMER_FB WHERE C_NATIONKEY = 10;

   Note that the estimate is made with no confidence and that the estimated number of rows is 707.

2) Collect statistics.

    COLLECT STATISTICS ON CUSTOMER_FB COLUMN(C_NATIONKEY);

3) Explain again. Note the HIGH confidence and the estimated number of rows, 290.

4) Compare with the actual count in the table.

    SELECT COUNT(1) FROM CUSTOMER_FB WHERE C_NATIONKEY = 10;

    246

EXPLAIN – WITHOUT STATISTICS

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
    2) Next, we lock TRAINER.CUSTOMER_FB for read.
    3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_NATIONKEY = 10") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH STATISTICS

    1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
    2) Next, we lock TRAINER.CUSTOMER_FB for read.
    3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_NATIONKEY = 10") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 290 rows. The estimated time for this step is 0.13 seconds.
    -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.13 seconds.
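How much the collected statistics improve the Optimizer's row estimate can be quantified from the three numbers on this slide. The Python sketch below is simple arithmetic on those figures; the helper name `relative_error_percent` is illustrative.

```python
def relative_error_percent(estimated_rows, actual_rows):
    """Relative error of a row estimate against the true count, in percent."""
    return 100.0 * abs(estimated_rows - actual_rows) / actual_rows


ACTUAL_ROWS = 246  # result of the COUNT(1) query for C_NATIONKEY = 10

# Spool 1 row estimates taken from the two EXPLAIN outputs.
estimates = {
    "without statistics (no confidence)": 707,
    "with statistics (high confidence)": 290,
}

errors = {label: relative_error_percent(rows, ACTUAL_ROWS)
          for label, rows in estimates.items()}

for label, error in errors.items():
    print(f"{label}: off by {error:.0f}%")
```

The estimate moves from roughly 187% too high to roughly 18% too high, even though the access path itself (an all-rows scan) does not change.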