A – Pre Join Indexes
Types of Joins Ad-hoc joins Join index Joined at query time An internal structure that relates table columns Represents two or more tables, including inner, left outer, and right outer rows May provide better performance than ad-hoc joins Requires additional disk space and time to create May 6, 2019
Ad-Hoc Joins A join merges rows from two tables into a result table Six types of ad-hoc join algorithms Nested loop Hash Sort/Merge Sort/Merger push down Nested loop push down Hash push down Cartesian nested loop Nested loop: Takes a row from the larger table of the join and compares it to each row of the smaller table You want the rows from the smaller side of the join to stay in memory Hash: Similar to nested loop, except the smaller side is “hashed” into an in-memory table with a hashing function applied to the join keys Requires fewer comparisons than a nested loop, but adds the cost of computing the hash key for each row on both sides of the join Sort/merge: Sorts the rows on both sides and then compares the rows Sort/merge push down: Bloom-filter based semijoin that significantly reduces the number of rows projected and the number of rows to be sorted Nested loop push down: IQ turns the keys of the smaller side of the join into an IN list on the large table Hash push down: Combination of a hash join and ASIQ’s special index processing Cartesian nested loop: A variation of the nested loop join. A many-to-many or Cartesian join will produce a row for every combination of rows from both Cartesian nested loop sides. May 6, 2019
When to Use Ad-Hoc Joins Instead of Join Indexes Always try ad-hoc joins first before creating join indexes If the ratio of data in join tables is higher than 10 to 1 For example, lookup tables Ad-hoc joins recommended due to data explosion For self-joins For many-to-many relationships May 6, 2019
When to Create a Join Index After trying ad-hoc queries If certain tables are typically joined together in queries There is adequate disk space Fairly static tables Ratio of data in join index is less than 10 to 1 This limit is due to high disk overload Recommend a ratio of less than 5 to 1 for performance reasons For performance reasons, a ratio of less than 5 to 1 is preferred The space required for a join index can be HUGE Join indexes can take a LONG time to build May 6, 2019
Ad-Hoc Joins Versus Join Indexes Join indexes can involve multiple tables and are subject to certain rules Even if extra build time is available and additional disk space is not an issue, join indexes cannot be created in all cases Synchronizing is discussed in detail later in the module May 6, 2019
Creating a Join Index Step Overview Step 1: Create the tables involved in the join index CREATE TABLE... Step 2: Define the relationships Step 3: Create the indexes for the tables being joined CREATE INDEX... Step 4: Load the data into the tables LOAD TABLE... or INSERT... Step 5: Create the join index CREATE JOIN INDEX... Step 6: Synchronize join index, if required SYNCHRONIZE JOIN INDEX... May 6, 2019
Step 1: Create Tables Create the tables involved in the join index CREATE TABLE command Sybase Central See Module 7, “Creating Tables and Indexes” Example: CREATE TABLE customer (customer_id int not null ,cust_type char(4) null ,organization char(30) null ...) The example in this module joins customer and product tables. You would also need to create the product table as part of step 1. May 6, 2019
Step 2: Define the Relationships Must determine the relationships between each pair of tables Types One-to-many (ONE>>MANY) Many-to-one (MANY>>ONE) One-to-one (ONE>>ONE) ONE >>MANY: One row in one table potentially matches with one or more rows in another table MANY>>ONE: One or many rows in one table potentially match with one row in another table ONE >>ONE: One row in the first table potentially matches with one row in the second table If the join relationship specified is incorrect, the extra rows are discarded and not included in the join index May 6, 2019
Join Hierarchy All IQ join relationships must have a join hierarchy A tree that illustrates how all the tables in a join are connected Determine the top table, which is always on the MANY side Rules A table can occur only once A table can have only one arrow leaving it All tables must be connected An arrowhead is the MANY side of a relationship In this example, Table C is the “top table” May 6, 2019
Types of Join Hierarchies Linear joins Each table is related to the table above it, until you reach the top table Each pair of tables represents a one-to-many relationship Star joins Each table is related to one table in the middle The table in the middle represents the MANY side, sometimes called the fact table The ONE tables are called dimension tables You can create join indexes or have ad-hoc joins for both linear and star joins May 6, 2019
Entity-Relationship Diagram This example is used throughout this module In this entity-relationship diagram, a customer can have one or more orders May 6, 2019
Join Hierarchy with the Top Table order becomes the top table because it is on the MANY side of the relationship order table is the “Top Table” Continued … May 6, 2019
Creating a Join Index Step Overview Step 1: Create the tables involved in the join index CREATE TABLE... Step 2: Define the relationships Step 3: Create the indexes for the tables being joined CREATE INDEX... Step 4: Load the data into the tables LOAD TABLE... or INSERT... Step 5: Create the join index CREATE JOIN INDEX... Step 6: Synchronize the join index, if required SYNCHRONIZE JOIN INDEX ... May 6, 2019
Step 3: Create the Indexes Create the column indexes for the tables being joined Columns involved in joins must have an HG or LF index Example: CREATE HG index customer_id_HG on customer(customer_id); CREATE LF index cust_type_LF on customer(cust_type); ... To determine the index types for each column, see Module 5, “ASIQ Index Types and Data Types” To create indexes on each column, see Module 6, “Creating Tables and Indexes” Remember to create all the indexes for the customer and order tables Indexes can be created either using the CREATE INDEX command or using Sybase Central Continued … May 6, 2019
Step 4: Load Data Continued … Load the data into the tables Example: Can use LOAD TABLE or INSERT command See Modules 9, 10, and 11 on data loading Example: LOAD TABLE customer (customer_id '|' ... ) FROM ... ; You need to load or insert data into the order table also Continued … May 6, 2019
Creating a Join Index Step Overview Step 1: Create the tables involved in the join index CREATE TABLE... Step 2: Define the relationships Step 3: Create the indexes for the tables being joined CREATE INDEX... Step 4: Load the data into the tables LOAD TABLE... or INSERT... Step 5: Create the join index CREATE JOIN INDEX... Step 6: Synchronize join index, if required SYNCHRONIZE JOIN INDEX... May 6, 2019
Step 5: Create the Join Index Must be DBA or the table owner and have RESOURCE authority to CREATE JOIN INDEX Syntax: CREATE JOIN INDEX join-index-name FOR join-clause Example: CREATE JOIN INDEX customer_order_join FOR customer FULL OUTER JOIN order ON customer.customer_id = order.customer_id join-clause [ ( ] join-expression join-type join-expression [ on search-condition ] [ ) ] join-expression { table-name | join-clause } join-type [ NATURAL ] FULL [ OUTER ] JOIN You need to code the join index correctly. ASIQ assumes the sub-tree on the left is the ’one’, while the join expression on the right is the ’many’ side. If you specify incorrectly, the create will return an error. Another example and further explanation follow later in this module Continued … May 6, 2019
Step 5: Create the Join Index Specifying the Join Type Join type is always FULL OUTER; OUTER keyword is optional If you are joining equivalent columns with the same names, you specify NATURAL JOIN If you are joining columns based on keys, you must have specified FOREIGN and PRIMARY keys Use the ON clause to join equivalent values in two tables Continued … May 6, 2019
Step 5: Create the Join Index Same example (natural join): CREATE JOIN INDEX nat_customer_order_join FOR customer NATURAL FULL OUTER JOIN order Join column names must be the same in the tables being joined Foreign and primary keys must be defined during the CREATE TABLE operation for customer and order tables in the following Key join example: CREATE JOIN INDEX key_customer_order_join FOR customer FULL OUTER JOIN order Continued … May 6, 2019
Step 5: Create the Join Index Results: When you create a join index, a message in the log appears: : Join Index 'customer_order_join' created from the following join relations. : Table Name Relationship :-------------------------------------------- :1. customer joined to order One >> Many :The ultimate/top table is order You can create a join index in Sybase Central. Sybase Central See Chapter 4 of the Adaptive Server IQ Administration and Performance Guide Continued … May 6, 2019
Creating a Join Index Step Overview Step 1: Create the tables involved in the join index CREATE TABLE... Step 2: Define the relationships Step 3: Create the indexes for the tables being joined CREATE INDEX... Step 4: Load the data into the tables LOAD TABLE... or INSERT... Step 5: Create the join index CREATE JOIN INDEX... Step 6: Synchronize join index, if required SYNCHRONIZE JOIN INDEX... May 6, 2019
Step 6: Synchronize Join Indexes This step ensures that the data is loaded in the correct order for the join It automatically occurs when: You create the join index Changes are made to the top table of the join hierarchy Otherwise, must be manually performed, which can be time-consuming Schedule during off-peak hours Synchronize after entire set of inserts and deletes (do not do after every insert or delete) If you complete step 5 (CREATE JOIN INDEX) before step 4 (LOAD TABLE or INSERT), you do not have to synchronize the data for the initial setup Make sure you have plenty of space in your IQ Temporary Store. The synchronize process uses the IQ Tempory Store. Continued … May 6, 2019
Step 6: Synchronize Join Indexes Syntax: SYNCHRONIZE JOIN INDEX [ join-index-name [, join-index-name] If you omit the index name, IQ synchronizes all join indexes Because this process is so time consuming, it is best to specify joins and schedule them individually Example: SYNCHRONIZE JOIN INDEX customer_order_join May 6, 2019
Multiple-Table Join Example Join Hierarchy: May 6, 2019
To Create the Join Index CREATE JOIN INDEX cust_prod_order_join FOR customer FULL OUTER JOIN order ON customer.customer_id = order.customer_id FULL OUTER JOIN product ON order.product_id = product.product_id If no key definitions exist a ONE>>MANY is assumed left to right May 6, 2019
Rules for Join Indexes Conditions and Restrictions You must be Database Administrator or table owner You can combine steps 1 and 2 by defining relationships when you create the table (define keys) You need to synchronize after additional inserts/deletes unless you have updated the top table This approach is not recommended for data ratios greater than 10 to 1 (less than 5 to 1 may provide more optimal performance) Join indexes cannot be created on many-to-many relationships May 6, 2019
Estimating Join Index Size Stored procedure syntax: sp_iqestjoin table1, #ofrows, table2, #ofrows [, relation] [, iq_page_size] Estimates the space needed to create join indexes Example: sp_iqestjoin customer, 10000, order, 30000 You must run this procedure for each pair of tables being joined Default values: Relationship: one-to-many IQ page size: 64KB May 6, 2019
Actual Join Index Size Stored procedure syntax: sp_iqjoinindexsize (join-index-name) Returns the total size of the index in bytes Example: sp_iqjoinindexsize customer_order_join May 6, 2019
Dropping a Join Index Syntax: Example: DROP JOIN INDEX join-index-name Example: DROP JOIN INDEX customer_order_join You must first drop the join index before dropping an underlying table May 6, 2019
Data Modification in a Join Index Always insert, update, or delete from the underlying tables, not from the join index You cannot do partial-width inserts to tables that participate in a join index Drop the join index, do the partial-width insert, and then recreate the join index Do not forget to synchronize Insert to include: LOAD TABLE INSERT FROM SELECT INSERT VALUES Prior to version 12.4.3, update was not allowed on joined indexes May 6, 2019
How Join Indexes Are Used The SELECT statement determines whether a join index will be used Determined by the FROM and WHERE clauses Do not need to specify the join index name in a query IQ can use the same join index to resolve a query that involves the full join relationship or a query that involves a contiguous subset Continued … May 6, 2019
How Join Indexes Are Used Queries that can be resolved Entire relationship A to D B to D D to F C to E E to F C to E to F Example of a query that cannot be resolved using the join index: E to D May 6, 2019
Summary Try using ad-hoc joins first. Create a join index if: Ratio of data in join tables is less than 10 to 1 (recommend less than 5 to 1). No many-to-many relationships. Same tables are frequently joined together in queries. There is adequate disk space. May 6, 2019