Teradata Parallel Transporter Scripts with Simplified Syntax


Teradata Parallel Transporter Scripts with Simplified Syntax
Nam Tran, Teradata Parallel Transporter
Teradata Corporation
October 6, 2011

Today’s Agenda
- Introduction
- Job Script Example Before Simplified Syntax
- Using Operator Templates
- Using Generated Schemas
- Using Generated SQL INSERT Statements
- Job Script Example After Simplified Syntax
- Q & A

Introduction
- Customer ease of use is of paramount importance to Teradata’s strategy.
- Teradata Parallel Transporter 13.10 and 14.0 introduce a simplified script syntax:
    - Smaller and simpler job scripts
    - Less job script maintenance through the use of operator templates
    - User-defined templates that allow any degree of customization
    - Generated schema objects
    - Generated SQL INSERT statements
    - Standardized names for the job variables that correspond to operator attributes
- Target audience of this presentation:
    - Users of script-based TPT
    - Users who write TPT job scripts
    - Users who work with TPT job script generation tools

Job Script Example Before Simplified Syntax
Two main script sections.
Declarative Section: defines the “what”
- DEFINE SCHEMA object(s)
- DEFINE OPERATOR object(s)

    DEFINE JOB File_Load
    DESCRIPTION 'Load table from flat file'
    (
      DEFINE SCHEMA Customer_Accounts
      (
        Account_Number INTEGER,
        Account_Name   VARCHAR(50),
        Trans_Number   INTEGER,
        Trans_Date     ANSIDATE,
        Trans_Amount   VARCHAR(20)
      );

      DEFINE OPERATOR File_Reader
      TYPE DATACONNECTOR PRODUCER
      SCHEMA Customer_Accounts
      ATTRIBUTES
      (
        VARCHAR FileName       = @Filename,
        VARCHAR Format         = @Format,
        VARCHAR OpenMode       = @Openmode,
        VARCHAR TextDelimiter  = @Delimiter,
        VARCHAR PrivateLogName = @DCLog
      );
      /* continued on next page */

Job Script Example Before Simplified Syntax (continued)
Executive Section: defines the “how”
- APPLY statement(s)

      /* continued from previous page */
      DEFINE OPERATOR Load_Op
      TYPE LOAD
      SCHEMA *
      ATTRIBUTES
      (
        VARCHAR TdpId          = @TdpId,
        VARCHAR UserName       = @UserName,
        VARCHAR UserPassword   = @UserPW,
        VARCHAR TargetTable    = @TarTable,
        VARCHAR LogTable       = @LogTable,
        VARCHAR PrivateLogName = @LoadLog
      );

      APPLY
        'INSERT INTO TABLE_X VALUES (:Account_Number, :Account_Name,
         :Trans_Number, :Trans_Date, :Trans_Amount);'
      TO OPERATOR( Load_Op[2] )
      SELECT * FROM OPERATOR( File_Reader );
    );

Job Variables File
- Stores and specifies common operator attributes
- Reusable across multiple job scripts

     Tdpid         = 'drill'
    ,Username      = 'johndoe'
    ,Userpw        = 'janedoe'
    ,TarTable      = 'TABLE_X'
    ,LogTable      = 'TABLE_X_LOG'
    ,LoadLog       = 'load.log'
    ,Filename      = 'flatfile1.txt'
    ,Format        = 'formatted'
    ,Openmode      = 'read'
    ,TextDelimiter = '|'
    ,DCLog         = 'dc.log'

Job Variables
- Defined as a <name> = 'value' pair in the "-u" command line option, in an external job variables file, or in a job script SET directive
- Once defined, @<name> can be used anywhere in the job script to specify 'value' (except within quoted strings and comments)
- Most commonly used to assign values to operator attributes
- Script users are encouraged to use job variables for easier script maintenance
- TPT simplified syntax will rely heavily on predefined job variable names
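The substitution rule above (replace @<name> everywhere except inside quoted strings and comments) can be sketched in Python. This is only an illustration of the rule, not how TPT is implemented. The function name is hypothetical, values are assumed to be strings, and names are matched case-insensitively, an assumption suggested by the example script pairing @UserPW with Userpw.

```python
import re

# One pattern with three alternatives, tried in this order at each position:
# a /* ... */ comment, a single-quoted string ('' is an escaped quote),
# or an @name job variable reference (captured in group 1).
_TOKEN = re.compile(
    r"/\*.*?\*/|'(?:[^']|'')*'|@([A-Za-z_][A-Za-z0-9_]*)", re.DOTALL
)


def substitute_job_variables(script: str, variables: dict[str, str]) -> str:
    """Replace @name references with quoted values, skipping strings and comments."""
    # Assumption: job variable names are compared case-insensitively.
    lookup = {name.lower(): value for name, value in variables.items()}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name is None:
            return match.group(0)  # a comment or quoted string: leave untouched
        if name.lower() not in lookup:
            return match.group(0)  # unknown variable: leave the reference as-is
        # Assumption: every value is a string, emitted as a quoted literal.
        return "'" + lookup[name.lower()].replace("'", "''") + "'"

    return _TOKEN.sub(replace, script)
```

Matching comments and strings in the same pattern as the variable reference is what keeps an "@" inside them from being substituted: the scanner consumes the whole comment or string before it can see the reference.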

Simplifying TPT Syntax
Our goal? To make job scripts simpler by:
- Removing the declarative section
- Reusing common job variables
- Reducing potential keystroke errors

- DEFINE OPERATOR object: using Operator Templates instead
- DEFINE SCHEMA object: using Generated Schemas instead
- SQL INSERT statement: using Generated SQL INSERT Statements instead

Using Operator Templates
What are operator templates?
- Stored DEFINE OPERATOR statements for all TPT operators
- They use formulaic names for the operators they define: $EXPORT, $LOAD, $UPDATE, $STREAM, etc.
- They declare all possible operator attributes
- Each operator attribute has a job variable reference as its value
- They standardize the job variable names that correspond to operator attributes

Operator Template Example

    DEFINE OPERATOR $DDL
    DESCRIPTION 'Teradata Parallel Transporter DDL Operator'
    TYPE DDL
    ATTRIBUTES
    (
      VARCHAR UserName                = @TargetUserName,
      VARCHAR UserPassword            = @TargetUserPassword,
      VARCHAR TdpId                   = @TargetTdpId,
      VARCHAR AccountId               = @TargetAccountId,
      VARCHAR WorkingDatabase         = @TargetWorkingDatabase,
      VARCHAR LogonMech               = @TargetLogonMech,
      VARCHAR LogonMechData           = @TargetLogonMechData,
      VARCHAR DataEncryption          = @DDLDataEncryption,
      VARCHAR ARRAY ErrorList         = @DDLErrorList,
      VARCHAR LogSQL                  = @DDLLogSQL,
      VARCHAR PrivateLogName          = @DDLPrivateLogName,
      VARCHAR ARRAY QueryBandSessInfo = @DDLQueryBandSessInfo,
      VARCHAR ReplicationOverride     = @DDLReplicationOverride,
      VARCHAR ARRAY TraceLevel        = @DDLTraceLevel
    );

Referencing Templates in Your Script
- Templates are imported into the job script when operators are referenced in an APPLY statement using their template name, for example:

    APPLY 'INSERT INTO TABLE_X (:col1, :col2);'
    TO OPERATOR ($LOAD);

- Because the script references the $LOAD operator template name, TPT knows to import the corresponding template
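As a rough picture of how a script's template references could be collected, the Python sketch below scans for OPERATOR( $NAME ... ) references. It is an assumption-laden illustration, not TPT's parser: the function name is hypothetical, and the scan does not skip quoted strings or comments.

```python
import re

# A template reference is taken to be "$" plus an identifier that directly
# follows "OPERATOR(" (whitespace allowed). $INSERT never follows OPERATOR(,
# so it is not picked up.
_TEMPLATE_REF = re.compile(
    r"OPERATOR\s*\(\s*(\$[A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


def referenced_templates(script: str) -> list[str]:
    """Return the distinct template names referenced, in order of first use."""
    seen: list[str] = []
    for name in _TEMPLATE_REF.findall(script):
        if name.upper() not in seen:
            seen.append(name.upper())
    return seen
```

For a step such as APPLY $INSERT TO OPERATOR( $LOAD[2] ) SELECT * FROM OPERATOR( $FILE_READER ); this yields $LOAD and $FILE_READER, which are the two templates that would need importing.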

Using Template Job Variables
- Templates contain predefined, conventionally named job variables assigned to each operator attribute
- Assign predefined job variables as <name> = 'value' pairs, for example:

    SourceUserName     = 'johndoe'
    SourceUserPassword = 'janedoe'

- If a template job variable is left unassigned, TPT ignores the attribute assignment that references it
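One way to read the last point is that an attribute whose job variable has no value is simply dropped from the operator definition. The Python sketch below models that reading. The function and the dictionary representation of a template are hypothetical, chosen only for illustration.

```python
def resolve_template(
    template: dict[str, str], job_variables: dict[str, str]
) -> dict[str, str]:
    """Map attribute -> value, keeping only attributes whose variable is assigned.

    `template` maps an attribute name to the job variable name it references
    (without the leading '@').
    """
    resolved: dict[str, str] = {}
    for attribute, variable in template.items():
        if variable in job_variables:
            resolved[attribute] = job_variables[variable]
        # Otherwise the attribute is left out, modelling "assignment ignored".
    return resolved
```

With a template that references TargetUserName, TargetUserPassword, and TargetAccountId, and a job variables file that assigns only the first two, the resolved operator carries UserName and UserPassword and no AccountId.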

Job Variables Naming Convention
- For producer templates, logon-associated: Source<AttributeName>

    VARCHAR UserName     = @SourceUserName,
    VARCHAR UserPassword = @SourceUserPassword,
    VARCHAR TdpId        = @SourceTdpId

- For consumer templates, logon-associated: Target<AttributeName>

    VARCHAR UserName     = @TargetUserName,
    VARCHAR UserPassword = @TargetUserPassword,
    VARCHAR TdpId        = @TargetTdpId

- All other job variable names: <TemplateName><AttributeName>

    VARCHAR ARRAY ErrorList = @DDLErrorList,
    VARCHAR PrivateLogName  = @DDLPrivateLogName
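The convention can be written down as a small Python function. This is a sketch under assumptions: the set of logon-associated attributes is taken from the attributes that carry the Target prefix in the $DDL template example, the template prefix (for example DDL, Load, FileReader) is passed in rather than derived from the template name, and the function name is hypothetical.

```python
# Assumption: these are the logon-associated attributes, taken from the
# attributes that use the Target prefix in the $DDL template example.
LOGON_ATTRIBUTES = {
    "UserName", "UserPassword", "TdpId", "AccountId",
    "WorkingDatabase", "LogonMech", "LogonMechData",
}


def job_variable_name(template_prefix: str, attribute: str, role: str) -> str:
    """Return the conventional job variable name for a template attribute.

    `role` is "producer" or "consumer"; `template_prefix` is the name part
    used in variable names, such as "DDL", "Load" or "FileReader".
    """
    if role not in ("producer", "consumer"):
        raise ValueError("role must be 'producer' or 'consumer'")
    if attribute in LOGON_ATTRIBUTES:
        return ("Source" if role == "producer" else "Target") + attribute
    return template_prefix + attribute
```

For instance, job_variable_name("Load", "TargetTable", "consumer") gives LoadTargetTable, and job_variable_name("Export", "UserName", "producer") gives SourceUserName.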

User-Defined Templates
Users can create their own templates to maximize usefulness:
- The template name must begin with '$' and be unique
- Use SCHEMA * for a consumer template and SCHEMA $ for a producer template
- All operator attributes not assigned fixed values should be assigned predefined, conventionally named job variables
- Store the template in the $TWB_ROOT/template directory

User-Defined Operator Template Example

    DEFINE OPERATOR $MY_DELIMITED_READER
    DESCRIPTION 'My user-defined delimited file reader'
    TYPE DATACONNECTOR PRODUCER
    SCHEMA $
    ATTRIBUTES
    (
      VARCHAR FileName       = @SourceFileName,
      VARCHAR Format         = 'Delimited',
      VARCHAR TextDelimiter  = ',',
      INTEGER IOBufferSize   = @FileReaderIOBufferSize,
      :
      :
      VARCHAR PrivateLogName = @FileReaderPrivateLogName,
      VARCHAR TraceLevel     = @FileReaderSkipRows
    );

Using Generated Schemas
All producer operators require a schema specification. How does TPT simplified syntax overcome this?
- Explicitly generated schemas: a simpler way of defining a schema object via a DBS table name and allowing TPT to auto-generate the column definitions
- Inferred generated schemas: a method by which TPT analyzes job step information to auto-generate a schema object and its column definitions

Explicitly Generated Schemas
- Explicitly specify the name of a DBS table in DEFINE SCHEMA to:
    - auto-generate a schema, for example:

        DEFINE SCHEMA WEEKLY_TRANSACTIONS 'Weekly_Trans';

    - auto-generate a delimited-file schema, for example:

        DEFINE SCHEMA WEEKLY_TRANSACTIONS DELIMITED 'Weekly_Trans';

- Explicitly specify the name of a DBS table in the APPLY statement’s producer operator reference to do the same, with or without DELIMITED:

    APPLY 'INSERT INTO TABLE_X (:col1, :col2);'
    TO OPERATOR ($LOAD);

    FROM OPERATOR ($EXPORT ('TABLE_X'));
    FROM OPERATOR ($EXPORT (DELIMITED 'TABLE_X'));

Inferred Generated Schemas
- What is an inferred schema? The schema of a producer template operator whose column content can be inferred from information in the same job step
- How does TPT infer a schema? By analyzing the script information of all operators invoked in any job step that employs one or more producer templates

Inferred Generated Schemas: Example 1

    STEP LOAD_2
    (
      APPLY <DML statement> TO OPERATOR( $LOAD )
      SELECT * FROM OPERATOR( $EXPORT )
      UNION ALL
      SELECT * FROM OPERATOR( EXPORT_OP_2 );
    );

- Producer operator $EXPORT:
    - is an operator template instantiation
    - does not have an explicit schema
- Producer operator EXPORT_OP_2:
    - is defined previously in the job script
    - has a script-defined schema
- Source data from both producers is merged into a single input data stream
- Thus, TPT will “infer” the same schema for both operators

Inferred Generated Schemas: Example 2

    STEP INSERT_DAILY_TRANS
    (
      APPLY <DML statement> TO OPERATOR( $LOAD )
      SELECT * FROM OPERATOR( $SELECTOR
        ATTR( SelectStmt = 'SELECT * FROM Daily_Trans;' ));
    );

- Producer operator $SELECTOR:
    - is an operator template instantiation
    - does not have an explicit schema
- The SelectStmt attribute has been assigned an SQL SELECT statement
- Thus, TPT will “infer” the schema based on the columns of the SELECT’s results table
- Works for producer templates $EXPORT and $SELECTOR, since both require the SelectStmt attribute

Inferred Generated Schemas: Example 3

    STEP INSERT_MONTHLY_SHIPMENTS
    (
      APPLY <DML statement> TO OPERATOR( $LOAD
        ATTR( TargetTable = 'Monthly_Shipments' ))
      SELECT * FROM OPERATOR( $FILE_READER );
    );

- Consumer operator $LOAD:
    - is an operator template instantiation
    - has a TargetTable attribute assigned to 'Monthly_Shipments'
- Producer operator $FILE_READER:
    - does not have an explicit schema
- TPT will assume source data is loaded unchanged and “infer” the source schema based on the TargetTable
- Limitations:
    - The assumption may not be correct in all cases, resulting in a schema mismatch that causes the job to terminate
    - TPT cannot infer a template’s schema from a target table when the job step contains multiple target tables
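The three inference examples can be summarized as a decision function. The Python sketch below is only a model of the slides: the function name and parameters are hypothetical, and the order in which the three sources are tried is an assumption, since the presentation does not state a precedence.

```python
from typing import Optional


def infer_schema_source(
    sibling_schema: Optional[str],
    select_stmt: Optional[str],
    target_tables: list[str],
) -> str:
    """Describe where a producer template's schema would be inferred from."""
    # Example 1: another producer in the same UNION ALL has a script-defined schema.
    if sibling_schema:
        return f"schema {sibling_schema}"
    # Example 2: the SelectStmt attribute determines the result columns.
    if select_stmt:
        return f"columns of {select_stmt}"
    # Example 3: a single target table, assuming data is loaded unchanged.
    if len(target_tables) == 1:
        return f"table {target_tables[0]}"
    # Stated limitation: no inference from the target when there are several tables.
    raise ValueError("cannot infer a schema for this job step")
```

The final branch reflects the stated limitation: with several target tables in the step and no other source of column information, the sketch refuses to guess.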

How TPT Generates Schema
Whenever TPT generates a schema based on a DBS table, it must:
- Make a HELP TABLE call to DBS
- Construct a DEFINE SCHEMA object
- Merge the generated DEFINE SCHEMA object into the job script
- Substitute the generated schema for the SCHEMA $ in the producer operator template
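The construct and substitute steps can be sketched in Python, assuming the column names and types have already been obtained and expressed as TPT schema types. The HELP TABLE call and the translation of its output into those types are left out, and both function names are hypothetical.

```python
def build_define_schema(schema_name: str, columns: list[tuple[str, str]]) -> str:
    """Build DEFINE SCHEMA text from (column name, TPT type) pairs."""
    body = ",\n".join(f"    {name} {tpt_type}" for name, tpt_type in columns)
    return f"DEFINE SCHEMA {schema_name}\n(\n{body}\n);"


def substitute_schema(template_text: str, schema_name: str) -> str:
    """Replace the SCHEMA $ placeholder of a producer template with a schema name."""
    return template_text.replace("SCHEMA $", f"SCHEMA {schema_name}", 1)
```

Calling build_define_schema("Customer_Accounts", [("Account_Number", "INTEGER"), ("Account_Name", "VARCHAR(50)")]) produces a two-column schema in the same shape as the hand-written one in the "before" script.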

Generated Schemas Advantages & Limitations
Advantages
- A major convenience when the number of schema column definitions is large
- Reduced keyboarding time and fewer keystroke errors
- TPT job scripts that are simpler, more compact, and easier to read
Limitations
- Requires the name of a DBS table that already exists before the job runs
- Inferred schemas based on the TargetTable attribute assume the data is loaded unchanged

Using Generated SQL INSERT Statements
TPT supports one additional feature to reduce script size and eliminate keystroke errors:
- Generating any SQL INSERT statement, if the target table is specified or can be unambiguously determined

For example, DBS table 'Invoice_Counts' has 4 columns: col1, col2, col3, and col4. By specifying:

    APPLY $INSERT 'Invoice_Counts'
    TO OPERATOR( $UPDATE )
    SELECT * FROM OPERATOR( $SELECTOR );

TPT will replace $INSERT 'Invoice_Counts' with the following generated SQL INSERT statement:

    'INSERT INTO Invoice_Counts VALUES ( :col1, :col2, :col3, :col4 );'
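The generated statement follows directly from the table name and its column list. A minimal Python sketch of that expansion, with a hypothetical function name and the column list supplied by the caller rather than fetched from the database:

```python
def generate_insert(table: str, columns: list[str]) -> str:
    """Build the INSERT statement that $INSERT would stand for."""
    if not columns:
        raise ValueError("at least one column is required")
    # Each column becomes a :name placeholder, in table column order.
    placeholders = ", ".join(f":{column}" for column in columns)
    return f"INSERT INTO {table} VALUES ( {placeholders} );"
```

For the Invoice_Counts table with columns col1 through col4, this returns the same statement text shown in the example above (without the surrounding quotes).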

Our Example Job Script – Simplified!

Before:

    DEFINE JOB File_Load
    DESCRIPTION 'Load table from flat file'
    (
      DEFINE SCHEMA Customer_Accounts
      (
        Account_Number INTEGER,
        Account_Name   VARCHAR(50),
        Trans_Number   INTEGER,
        Trans_Date     ANSIDATE,
        Trans_Amount   VARCHAR(20)
      );

      DEFINE OPERATOR File_Reader
      TYPE DATACONNECTOR PRODUCER
      SCHEMA Customer_Accounts
      ATTRIBUTES
      (
        VARCHAR FileName       = @Filename,
        VARCHAR Format         = @Format,
        VARCHAR OpenMode       = @Openmode,
        VARCHAR TextDelimiter  = @Delimiter,
        VARCHAR PrivateLogName = @DCLog
      );
      /* continued on next page */

After:

    DEFINE JOB File_Load
    DESCRIPTION 'Load table from flat file'
    (
      APPLY $INSERT TO OPERATOR( $LOAD[2] )
      SELECT * FROM OPERATOR( $FILE_READER );
    );

Our Example Job Script – Simplified! (continued)

Before:

      /* continued from previous page */
      DEFINE OPERATOR Load_Op
      TYPE LOAD
      SCHEMA *
      ATTRIBUTES
      (
        VARCHAR TdpId          = @TdpId,
        VARCHAR UserName       = @UserName,
        VARCHAR UserPassword   = @UserPW,
        VARCHAR TargetTable    = @TarTable,
        VARCHAR LogTable       = @LogTable,
        VARCHAR PrivateLogName = @LoadLog
      );

      APPLY
        'INSERT INTO TABLE_X VALUES (:Account_Number, :Account_Name,
         :Trans_Number, :Trans_Date, :Trans_Amount);'
      TO OPERATOR( Load_Op[2] )
      SELECT * FROM OPERATOR( File_Reader );
    );

After:

    DEFINE JOB File_Load
    DESCRIPTION 'Load table from flat file'
    (
      APPLY $INSERT TO OPERATOR( $LOAD[2] )
      SELECT * FROM OPERATOR( $FILE_READER );
    );

Job Variables Files

Before:

     Tdpid         = 'drill'
    ,Username      = 'johndoe'
    ,Userpw        = 'janedoe'
    ,TarTable      = 'TABLE_X'
    ,LogTable      = 'TABLE_X_LOG'
    ,LoadLog       = 'load.log'
    ,Filename      = 'flatfile1.txt'
    ,Format        = 'formatted'
    ,Openmode      = 'read'
    ,TextDelimiter = '|'
    ,DCLog         = 'dc.log'

After, with Conventionalized Names:

     TargetTdpId              = 'drill'
    ,TargetUserName           = 'johndoe'
    ,TargetUserPassword       = 'janedoe'
    ,LoadTargetTable          = 'TABLE_X'
    ,LoadLogTable             = 'TABLE_X_LOG'
    ,LoadPrivateLogName       = 'load.log'
    ,FileReaderFileName       = 'flatfile1.txt'
    ,FileReaderFormat         = 'formatted'
    ,FileReaderOpenmode       = 'read'
    ,FileReaderTextDelimiter  = '|'
    ,FileReaderPrivateLogName = 'dc.log'
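Moving an existing job variables file to the conventional names is a plain rename. The Python sketch below encodes the old-to-new pairs shown in the two files above. The function name is hypothetical, and any name not in the table is passed through unchanged, which is a design choice of this sketch rather than TPT behavior.

```python
# Old name -> conventional name, copied from the two job variables files
# in this presentation.
RENAMES = {
    "Tdpid": "TargetTdpId",
    "Username": "TargetUserName",
    "Userpw": "TargetUserPassword",
    "TarTable": "LoadTargetTable",
    "LogTable": "LoadLogTable",
    "LoadLog": "LoadPrivateLogName",
    "Filename": "FileReaderFileName",
    "Format": "FileReaderFormat",
    "Openmode": "FileReaderOpenmode",
    "TextDelimiter": "FileReaderTextDelimiter",
    "DCLog": "FileReaderPrivateLogName",
}


def rename_job_variables(old: dict[str, str]) -> dict[str, str]:
    """Return a copy of `old` with known names replaced by conventional ones."""
    return {RENAMES.get(name, name): value for name, value in old.items()}
```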

Q & A

Thank you!