Database Tuning Seminar JOIN Dongmin Shin IDS Lab., SNU 2008.07.25.


1 Database Tuning Seminar JOIN Dongmin Shin IDS Lab., SNU 2008.07.25.

2 Index
- Join Method
  - What is Join Method?
  - Necessity of Join
  - Nested-Loop Join
  - Sort Merge Join
  - Hash Join
- Join Type
  - What is Join Type?
  - Basic Join
  - Outer Join
  - Semijoin
  - Antijoin
  - Cartesian Join
Copyright © 2006 by CEBT, Center for E-Business Technology

3 Chapter 17 Join Method

4 Necessity of Join
- Without Join: one table, EMPLOY (500,000 rows)
  Columns: Employee Number Varchar(5), Employee Name Varchar(10), Address Varchar(10), Salary Varchar(10), Department Varchar(10), Department Address Varchar(10)
  Size of table = 235 bytes * 500,000 = 112MB
- With Join: two tables, EMP (500,000 rows) and DEPT (1,000 rows)
  EMP columns: Employee Number Varchar(5), Employee Name Varchar(10), Address Varchar(10), Salary Varchar(10), Department Varchar(10)
  DEPT columns: Department Varchar(10), Department Address Varchar(10)
  Size of tables = 135 bytes * 500,000 + 110 bytes * 1,000 = about 65MB
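
The size comparison on this slide can be checked with a few lines of arithmetic. This is only a sketch: the row sizes and row counts come from the slide, while the function name and the choice of 2**20 bytes per MB are assumptions.

```python
# Row sizes and row counts are from the slide; MIB = 2**20 bytes is an assumption.
MIB = 1024 * 1024

def table_size_mib(row_bytes: int, rows: int) -> float:
    """Approximate table size, ignoring block and row overhead."""
    return row_bytes * rows / MIB

# One wide table that repeats the department data on every employee row.
without_join = table_size_mib(235, 500_000)

# Two tables: narrow EMP rows plus a small DEPT table.
with_join = table_size_mib(135, 500_000) + table_size_mib(110, 1_000)

print(f"without join: {without_join:.1f} MiB")
print(f"with join:    {with_join:.1f} MiB")
```

Under these assumptions the two-table layout comes out slightly under 65, which suggests the slide's figure is rounded up.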

5 Join Method
- Join Method: how a Join Type (explained in the next chapter) is evaluated
- Join Types: Basic Join, Outer Join, Semijoin, Cartesian Join, Antijoin
- Ex. A Basic Join can be evaluated with the Sort Merge Join method for some SQL statements and with the Nested-Loop Join method for others

6 Nested-Loop Join
- Pros: suits OLTP or small data sets
- Cons: unsuitable for large data sets
- Issues
  - Join condition: Full Scan Join Condition or Index Scan Join Condition
  - Driving (Outer) Table and Inner Table

7 Nested-Loop Join: Full Scan Join Condition
[Figure: Driving Table DEPT, Inner Table EMP]

8 Nested-Loop Join: Full Scan Join Condition
- Considerations for effective evaluation plans
  - If an index can be added, consider the Index Scan Join Condition
  - If not, consider Hash Join instead of Nested-Loop Join
  - Make the smaller table the Inner Table
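
A minimal sketch of a Nested-Loop Join under the Full Scan Join Condition may help here. The sample rows and the function name are hypothetical; only the DEPT/EMP tables and the DEPTNO, LOCATION and SAL columns come from the slides.

```python
# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
]
EMP = [
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 150},
    {"ename": "PARK", "deptno": 20, "sal": 400},
    {"ename": "CHOI", "deptno": 30, "sal": 250},
]

def nested_loop_full_scan(driving, inner, key):
    """Join two row lists; the inner list is scanned in full once per driving row."""
    joined, inner_reads = [], 0
    for d in driving:
        for i in inner:
            inner_reads += 1          # every inner row is read, matching or not
            if d[key] == i[key]:
                joined.append({**d, **i})
    return joined, inner_reads

seoul_depts = [d for d in DEPT if d["location"] == "SEOUL"]
rows, reads = nested_loop_full_scan(seoul_depts, EMP, "deptno")
print(len(rows), "joined rows,", reads, "inner rows read")
```

The inner read count is the product of the two input sizes, which is why the slide recommends either an index on the inner table or a Hash Join when no index is available.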

9 Nested-Loop Join: Double Index Scan Join Condition (indexes on both tables)
[Figure: Driving Table DEPT with index I_DEPT (LOCATION+DEPTNO), Inner Table EMP with index I_EMP (DEPTNO)]

10 Nested-Loop Join: Double Index Scan Join Condition
- Considerations for effective evaluation plans
  - Choosing the Driving Table: make the table whose Feature Access Set is smaller the Driving Table
  - Feature Access Set (the set of rows to be processed): every row that must be accessed in order to extract the rows actually wanted. A small Feature Access Set does not mean that the WHERE condition returns few rows. It describes how much of the whole table must be processed to extract the rows that satisfy the WHERE condition. For example, if a full table scan extracts 2 rows, the Feature Access Set of that table is every row in the table.
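
The effect of the Index Scan Join Condition on the number of rows accessed can be sketched as follows. Dictionaries stand in for the I_DEPT and I_EMP indexes, which is a simplification (a real index is an ordered structure, and I_DEPT also includes DEPTNO); the sample rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
]
EMP = [
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 150},
    {"ename": "PARK", "deptno": 20, "sal": 400},
    {"ename": "CHOI", "deptno": 30, "sal": 250},
]

def build_index(rows, key):
    """Map each key value to the rows holding it (a stand-in for an index)."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

def index_nested_loop(driving_rows, inner_index, key):
    """For each driving row, read only the inner rows the index points to."""
    joined, inner_reads = [], 0
    for d in driving_rows:
        for i in inner_index.get(d[key], []):
            inner_reads += 1
            joined.append({**d, **i})
    return joined, inner_reads

i_dept = build_index(DEPT, "location")   # simplified stand-in for I_DEPT
i_emp = build_index(EMP, "deptno")       # stand-in for I_EMP (DEPTNO)

driving = i_dept.get("SEOUL", [])        # access set of DEPT: 2 rows, not 3
rows, reads = index_nested_loop(driving, i_emp, "deptno")
print(len(rows), "joined rows,", reads, "inner rows read")
```

Here the access set of each table is only the rows reached through its index, so the inner side reads 3 rows where a full scan per driving row would read 8.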

11 Nested-Loop Join: Single Index Scan Join Condition (index on one table)
Case I: DEPT is the Driving Table and EMP is the Inner Table
[Figure: Driving Table DEPT with index I_DEPT (LOCATION+DEPTNO), Inner Table EMP]

12 Nested-Loop Join: Single Index Scan Join Condition (index on one table)
Case II: EMP is the Driving Table and DEPT is the Inner Table
[Figure: Driving Table EMP, Inner Table DEPT with index I_DEPT (LOCATION+DEPTNO)]

13 Nested-Loop Join: Single Index Scan Join Condition
- Considerations for effective evaluation plans
  - Make the table that satisfies the Index Scan Join Condition the Inner Table
  - Make the table that satisfies the Full Scan Join Condition the Driving Table

14 Nested-Loop Join: Check Condition & Index Scan Condition
- Convert a Check Condition to an Index Scan Condition, if possible
- Check Condition: does not reduce the range of the table that must be processed
- Index Scan Condition: reduces the range of the table that must be processed

15 Nested-Loop Join

16 Sort Merge Join: Full Scan
1. Full scan the DEPT table, keep the rows whose LOCATION value is 'SEOUL', and sort them by DEPTNO
2. Full scan the EMP table, keep the rows whose SAL value is more than 200, and sort them by DEPTNO
3. Merge the two sorted sets
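
The three steps above can be sketched in a few lines. This is an illustrative in-memory version, not how a database engine implements it; the sample rows and the function name are hypothetical.

```python
# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
]
EMP = [
    {"ename": "CHOI", "deptno": 30, "sal": 250},
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 150},
    {"ename": "PARK", "deptno": 20, "sal": 400},
]

def sort_merge_join(left, right, key):
    """Sort both inputs on the join key, then merge them in one forward pass."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    joined, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            k = left[i][key]
            # Find the run of right rows sharing this key value.
            run_end = j
            while run_end < len(right) and right[run_end][key] == k:
                run_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == k:
                joined.extend({**left[i], **r} for r in right[j:run_end])
                i += 1
            j = run_end
    return joined

dept = [d for d in DEPT if d["location"] == "SEOUL"]   # step 1 filter
emp = [e for e in EMP if e["sal"] > 200]               # step 2 filter
result = sort_merge_join(dept, emp, "deptno")          # sorts, then step 3
print([(r["dname"], r["ename"]) for r in result])
```

Neither input is probed at random: each is read once for the sort and once for the merge, which is the property the next slide lists as an advantage.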

17 Sort Merge Join
- Predecessor of Hash Join
- Hardly used now
- Pros
  - A full scan Sort Merge Join removes random access
  - Better than Nested-Loop Join for large data
  - Not affected by whether an index exists
- Cons
  - An index scan Sort Merge Join causes random access
  - Hash Join or Nested-Loop Join is better in the general case
    - Small data: Nested-Loop Join with an Index Scan Condition is better
    - Large data: Hash Join is better
  - An efficient sort algorithm is needed
- Conclusion
  - Hardly used
  - If Sort Merge Join appears in an evaluation plan, consider tuning it

18 Hash Join
1. Scan the index entries of the DEPT table (I_DEPT index) where LOCATION='SEOUL'
   If the accessed rows amount to more than 3~5% of the whole table, consider a full table scan using Parallel Processing
2. Apply the hash function to DEPTNO of the DEPT table and build the hash table from the resulting hash values
3. Apply the hash function to DEPTNO of the EMP table and probe the hash table with the resulting hash values
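
The build and probe phases described above can be sketched with a Python dictionary standing in for the hash table. The sample rows and the function name are hypothetical, and the sketch ignores memory limits and spilling to disk.

```python
from collections import defaultdict

# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
]
EMP = [
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 150},
    {"ename": "PARK", "deptno": 20, "sal": 400},
    {"ename": "CHOI", "deptno": 30, "sal": 250},
]

def hash_join(build_rows, probe_rows, key):
    """Build a hash table on one input, then probe it once per row of the other."""
    hash_table = defaultdict(list)
    for b in build_rows:                    # build phase (step 2)
        hash_table[b[key]].append(b)
    joined = []
    for p in probe_rows:                    # probe phase (step 3)
        for b in hash_table.get(p[key], []):
            joined.append({**b, **p})
    return joined

seoul_depts = [d for d in DEPT if d["location"] == "SEOUL"]   # step 1
result = hash_join(seoul_depts, EMP, "deptno")
print([(r["dname"], r["ename"]) for r in result])
```

The smaller input is used as the build side so that the hash table stays small, which is the point the next slide makes about disk I/O.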

19 Hash Join
- Considerations for effective evaluation plans
  - Build table and Probe table
    - Select the table whose Feature Access Set is smaller as the Build Table
    - The bigger the hash table, the more disk I/O is needed
  - Parallel Processing
    - Apply parallel processing to the tables to be joined
    - Elapsed time improves through effective resource management
  - Parameters
    - Hash_area_size: related to the number of disk I/Os
    - Db_file_multiblock_read_count: related to full table scans

20 Hash Join
- Pros
  - Good for processing large data
  - Not affected by whether an index exists
  - Can make full use of system resources
- Cons
  - Excessive sort and disk I/O can occur
  - Excessive use of system resources

21 Chapter 18 Join Type

22 Join Type
- Join Type: the set that will be extracted; it is executed by a Join Method
  - Basic Join: A ∩ B
  - Outer Join: A or B
  - Semijoin: A ∩ B (executed differently from Basic Join)
  - Antijoin: A - B
  - Cartesian Join: A * B
[Figure: Venn diagram of sets A and B]

23 Basic Join
SQL> SELECT A.DNAME, B.ENAME, B.SAL
     FROM EMP B, DEPT A
     WHERE A.LOCATION = 'SEOUL'
     AND B.SAL > 200
     AND A.DEPTNO = B.DEPTNO;
[Figure: the result is the overlap of EMP rows with SAL > 200 and DEPT rows with LOCATION = 'SEOUL' whose DEPTNO is identical]

24 Outer Join
- Feature Access Set = (A - B) + (A ∩ B) = A
SQL> SELECT A.DNAME, B.ENAME, B.SAL
     FROM EMP B, DEPT A
     WHERE A.LOCATION = 'SEOUL'
     AND B.SAL(+) > 200
     AND A.DEPTNO = B.DEPTNO(+);
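
What the outer join query returns can be sketched in plain Python. The sample rows and the function name are hypothetical; the sketch only mirrors the result set, where every 'SEOUL' department is kept and the EMP columns are empty when no employee with SAL > 200 matches.

```python
# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
]
EMP = [
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 150},
    {"ename": "PARK", "deptno": 20, "sal": 400},
    {"ename": "CHOI", "deptno": 30, "sal": 180},
]

def dept_outer_join_emp(dept, emp):
    """Keep every SEOUL department; attach employees with SAL > 200 when present."""
    rows = []
    for d in dept:
        if d["location"] != "SEOUL":
            continue
        # Both EMP predicates carry (+), so they filter EMP before the join.
        matches = [e for e in emp if e["deptno"] == d["deptno"] and e["sal"] > 200]
        if matches:
            rows.extend((d["dname"], e["ename"], e["sal"]) for e in matches)
        else:
            rows.append((d["dname"], None, None))   # department preserved
    return rows

result = dept_outer_join_emp(DEPT, EMP)
print(result)
```

The HR department survives with empty employee columns, whereas the Basic Join on the previous slide would drop it.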

25 Outer Join: Comparison between Basic Join and Outer Join

26 Outer Join
- Considerations for effective evaluation plans
  - The join order of the tables is fixed
    - The DEPT table is the Driving Table (for Nested-Loop Join) or the Build Table (for Hash Join)
    - Take this restriction into account when writing the SQL
  - Extraction of exact data
    - Is the outer join really needed?

27 Semijoin
- A join type that checks for existence
[Figure: Driving Table DEPT, Inner Table EMP with index I_EMP (DEPTNO)]

28 Semijoin: Filter Process
- The join method used to execute a Semijoin
- Before Oracle 8i, the Filter Buffer could store only one value

29 Semijoin: Filter Buffer
- Problem of the Filter Buffer
- From Oracle 9i, a Multi Filter Buffer is supported
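
Why a one-value buffer is a problem can be shown with a toy model. This is not Oracle's internal implementation: it is a hypothetical cache that remembers subquery results per input value, with either one slot or many, and it only counts how often the subquery has to run.

```python
def run_filter(driving_values, subquery, multi_buffer):
    """Count subquery executions with a one-slot or a multi-slot result buffer."""
    buffer = {}
    executions = 0
    results = []
    for value in driving_values:
        if value not in buffer:
            if not multi_buffer:
                buffer.clear()              # one slot: forget the previous value
            buffer[value] = subquery(value)
            executions += 1
        results.append(buffer[value])
    return results, executions

# Hypothetical subquery: does any employee in this department earn more than 200?
HIGH_PAID_DEPTS = {10}

def exists_high_paid(deptno):
    return deptno in HIGH_PAID_DEPTS

# DEPTNO values arriving from the main query in unsorted order.
incoming = [10, 20, 10, 20, 20]

single_results, single_runs = run_filter(incoming, exists_high_paid, multi_buffer=False)
multi_results, multi_runs = run_filter(incoming, exists_high_paid, multi_buffer=True)
print(single_runs, multi_runs)
```

In this model the one-slot buffer only helps when equal values arrive consecutively, while the multi-slot buffer runs the subquery once per distinct value.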

30 Semijoin
- Considerations for effective evaluation plans
  - Effective iterative execution of the Subquery
    - Add a DEPTNO+SAL index to the EMP table
  - Effective extraction of the Feature Access Set of the Main Query
    - If the rows extracted by LOCATION='SEOUL' are within 3~5% of the whole table, use an Index Scan
    - If not, use a Full Scan with Parallel Processing
SQL> SELECT DEPTNO
     FROM DEPT A
     WHERE EXISTS ( SELECT 'X'
                    FROM EMP B
                    WHERE A.DEPTNO = B.DEPTNO
                    AND B.SAL > 200 )
     AND A.LOCATION = 'SEOUL';
The outer SELECT on DEPT is the Main Query; the SELECT inside EXISTS is the Subquery.
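
The existence check in this query can be sketched as follows. The sample rows and the function name are hypothetical, and a dictionary keyed by DEPTNO stands in for the suggested DEPTNO+SAL index.

```python
# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "dname": "SALES", "location": "SEOUL"},
    {"deptno": 20, "dname": "RESEARCH", "location": "BUSAN"},
    {"deptno": 30, "dname": "HR", "location": "SEOUL"},
]
EMP = [
    {"ename": "KIM", "deptno": 10, "sal": 300},
    {"ename": "LEE", "deptno": 10, "sal": 350},
    {"ename": "PARK", "deptno": 20, "sal": 400},
    {"ename": "CHOI", "deptno": 30, "sal": 180},
]

def semi_join(dept, emp):
    """Return each SEOUL DEPTNO once if at least one employee earns more than 200."""
    salaries_by_dept = {}                   # stand-in for an index on DEPTNO+SAL
    for e in emp:
        salaries_by_dept.setdefault(e["deptno"], []).append(e["sal"])
    return [
        d["deptno"]
        for d in dept
        if d["location"] == "SEOUL"
        # any() stops at the first qualifying salary, like EXISTS.
        and any(sal > 200 for sal in salaries_by_dept.get(d["deptno"], []))
    ]

result = semi_join(DEPT, EMP)
print(result)
```

Department 10 appears once even though two of its employees qualify, which is what distinguishes a Semijoin from a Basic Join on the same condition.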

31 Antijoin
- Negative Join Type (A - B)
- Considerations for effective evaluation plans
  - Same as those of Semijoin
SQL> SELECT COUNT(*)
     FROM EMP
     WHERE ENAME LIKE 'KIM%'
     AND IB IS NOT NULL
     AND DEPTNO IS NOT NULL
     AND DEPTNO NOT IN ( SELECT /*+ HASH_AJ */ DEPTNO
                         FROM DEPT
                         WHERE LOCATION = 'SEOUL' );
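
A hash-based antijoin for this query can be sketched with a set of excluded DEPTNO values. The sample rows and the function name are hypothetical, and the IB predicate is left out because the sketch does not model that column.

```python
# Hypothetical sample data; table and column names follow the slides.
DEPT = [
    {"deptno": 10, "location": "SEOUL"},
    {"deptno": 20, "location": "BUSAN"},
]
EMP = [
    {"ename": "KIM MINSU", "deptno": 10},
    {"ename": "KIM JIHO", "deptno": 20},
    {"ename": "KIM YUNA", "deptno": None},
    {"ename": "LEE SORA", "deptno": 20},
]

def hash_anti_join_count(emp, dept):
    """Count KIM% employees whose DEPTNO is not among the SEOUL departments."""
    seoul_deptnos = {d["deptno"] for d in dept if d["location"] == "SEOUL"}
    return sum(
        1
        for e in emp
        if e["ename"].startswith("KIM")
        and e["deptno"] is not None         # mirrors DEPTNO IS NOT NULL
        and e["deptno"] not in seoul_deptnos
    )

count = hash_anti_join_count(EMP, DEPT)
print(count)
```

Only rows of A with no partner in B are kept, so the set lookup replaces one subquery execution per employee row.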

32 Cartesian Join
- No join condition
- Every combination of rows qualifies for the join

33 Cartesian Join
- In almost all cases, a Cartesian Join ends up in the evaluation plan by mistake
  - Is Cartesian Join useless, then?
- Star Join
  - Running a Nested-Loop Join between each Dimension Table and the Fact Table to analyze the Fact Table is inefficient

34 Cartesian Join
- A better way, using Cartesian Join
  - Execute a Cartesian Join between the Dimension Tables to generate every combination
  - Join the result with the Fact Table
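
The two steps above can be sketched with itertools.product for the Cartesian Join. The dimension and fact tables, their columns, and the function name are all hypothetical; a dictionary keyed by the pair of dimension keys stands in for a composite index on the Fact Table.

```python
from itertools import product

# Hypothetical dimension and fact tables.
REGION = [
    {"region_id": 1, "region": "SEOUL"},
    {"region_id": 2, "region": "BUSAN"},
]
ITEM = [
    {"item_id": "A", "item": "PEN"},
    {"item_id": "B", "item": "BOOK"},
]
SALES = [
    {"region_id": 1, "item_id": "A", "amount": 100},
    {"region_id": 1, "item_id": "A", "amount": 50},
    {"region_id": 2, "item_id": "B", "amount": 70},
]

def star_join(dim_a, key_a, dim_b, key_b, fact):
    """Cartesian-join two small dimensions, then look each pair up in the fact table."""
    fact_index = {}                          # stand-in for an index on (key_a, key_b)
    for f in fact:
        fact_index.setdefault((f[key_a], f[key_b]), []).append(f)
    joined = []
    for a, b in product(dim_a, dim_b):       # every combination of dimension rows
        for f in fact_index.get((a[key_a], b[key_b]), []):
            joined.append({**a, **b, **f})
    return joined

result = star_join(REGION, "region_id", ITEM, "item_id", SALES)
print(len(result), sum(r["amount"] for r in result))
```

The Cartesian Join is cheap here because the Dimension Tables are small, and the large Fact Table is then accessed once per combination instead of once per dimension.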

