Query Rewrite Starburst Model (IBM)
DB2 Query Optimizer (Starburst)
[Figure: compile-time stages Parsing and Semantic Checking, Query Rewrite, and Plan Optimization, followed at run time by the Query Evaluation System. Labels: Query Graph Model, Executable Plan. Legend: data flow, control flow.]
Goal of Query Rewrite
- Make queries as declarative as possible: poorly expressed queries could force the optimizer into choosing suboptimal plans.
- Perform natural heuristics, for example "predicate pushdown".
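To make "predicate pushdown" concrete, here is a minimal sketch in plain Python. The tables `orders` and `customers`, their contents, and both function names are hypothetical and exist only for illustration. The sketch applies the same filter either after the join or before it and records the size of the intermediate join result.

```python
# Hypothetical in-memory tables (not from the slides).
orders = [(i, i % 5) for i in range(100)]         # (order_id, customer_id)
customers = [(c, f"name{c}") for c in range(5)]   # (customer_id, name)

def join_then_filter():
    """Join everything first, then apply the predicate customer_id = 3."""
    joined = [(o, c) for o in orders for c in customers if o[1] == c[0]]
    rows = [(o[0], c[1]) for o, c in joined if c[0] == 3]
    return rows, len(joined)

def filter_then_join():
    """Push the predicate below the join: filter customers first."""
    kept = [c for c in customers if c[0] == 3]
    joined = [(o, c) for o in orders for c in kept if o[1] == c[0]]
    rows = [(o[0], c[1]) for o, c in joined]
    return rows, len(joined)

late_rows, late_size = join_then_filter()
early_rows, early_size = filter_then_join()
print(late_rows == early_rows, late_size, early_size)
```

Both variants return the same rows, but the pushed-down variant builds a smaller intermediate result, which is the point of the heuristic.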
Components of Rewrite Engine
- Rewrite rules (more later)
- Rule engine
  - Control strategies:
    - sequential (rules are processed sequentially)
    - priority (higher-priority rules are given a chance first)
    - statistical (the next rule is chosen randomly based on a user-defined probability distribution)
  - Budget: to avoid spending too much time on rewrites, processing stops at a consistent state of the QGM when the budget is exhausted
- Search facility: browses through the QGM, providing the context for the rules to work on
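The control strategies and the budget can be sketched as a small loop. This is a toy illustration, not Starburst's rule engine: the QGM is stood in for by a plain list of predicate strings, and `Rule`, `rewrite`, `drop_tautology`, and `drop_duplicate` are hypothetical names. Each rule application produces a complete new list, so stopping when the budget runs out always leaves a consistent state.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Rule:
    name: str
    priority: int       # used by the "priority" strategy
    weight: float       # used by the "statistical" strategy
    apply: Callable[[List[str]], Optional[List[str]]]  # None = not applicable

def drop_tautology(preds):
    """Remove one trivially true predicate, if present."""
    if "1=1" in preds:
        out = list(preds)
        out.remove("1=1")
        return out
    return None

def drop_duplicate(preds):
    """Remove one predicate that repeats an earlier one, if any."""
    for i, p in enumerate(preds):
        if p in preds[:i]:
            return preds[:i] + preds[i + 1:]
    return None

def rewrite(preds, rules, strategy="sequential", budget=10, rng=None):
    rng = rng or random.Random(0)
    applied = []
    while budget > 0:
        # Collect every rule that can fire on the current state.
        applicable = [(r, r.apply(preds)) for r in rules]
        applicable = [(r, out) for r, out in applicable if out is not None]
        if not applicable:
            break  # fixpoint: nothing left to rewrite
        if strategy == "sequential":
            rule, out = applicable[0]
        elif strategy == "priority":
            rule, out = max(applicable, key=lambda pair: pair[0].priority)
        elif strategy == "statistical":
            weights = [r.weight for r, _ in applicable]
            rule, out = rng.choices(applicable, weights=weights, k=1)[0]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        preds = out
        applied.append(rule.name)
        budget -= 1  # one unit of budget per rule application
    return preds, applied

rules = [
    Rule("drop_duplicate", priority=1, weight=1.0, apply=drop_duplicate),
    Rule("drop_tautology", priority=5, weight=1.0, apply=drop_tautology),
]
query = ["a>1", "1=1", "a>1", "b=2"]
print(rewrite(query, rules, "sequential"))
print(rewrite(query, rules, "priority"))
print(rewrite(query, rules, "sequential", budget=1))
```

With a budget of 1 the loop stops after a single rewrite, leaving a query that is only partly simplified but still valid.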
Problem
How do we choose between competing, incompatible transformations?
- Optimal solution: apply cost analysis and pick the transformation leading to a cheaper plan.
- Practical solution (why?): generate multiple alternatives and send them to the plan optimization phase (problems?).
Rewrite Rules: SELECT Merge

Original (a view, and a query over the view):
    CREATE VIEW itpv AS
      (SELECT DISTINCT itp.itemn, pur.vendn
       FROM itp, pur
       WHERE itp.ponum = pur.ponum AND pur.odate > '85')

    SELECT itm.itmn, itpv.vendn
    FROM itm, itpv
    WHERE itm.itemn = itpv.itemn
      AND itm.itemn > '01' AND itm.itemn < '20'

After merging the view into the query:
    SELECT DISTINCT itm.itmn, pur.vendn
    FROM itm, itp, pur
    WHERE itp.ponum = pur.ponum
      AND itm.itemn = itp.itemn
      AND pur.odate > '85'
      AND itm.itemn > '01' AND itm.itemn < '20'

Speedup: 200 times
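The equivalence behind SELECT merge can be checked on a small example. The sketch below uses Python's `sqlite3` with made-up rows. The column types and the uniqueness constraints on `itm` are assumptions: adding DISTINCT to the merged query is only safe when the outer query cannot produce duplicates on its own. The merged query joins `itm.itemn = itp.itemn`, since `itpv` no longer appears in its FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itm (itmn TEXT UNIQUE, itemn TEXT PRIMARY KEY);  -- assumed keys
CREATE TABLE itp (itemn TEXT, ponum TEXT);
CREATE TABLE pur (ponum TEXT PRIMARY KEY, vendn TEXT, odate TEXT);
-- Illustrative rows only; the slides give no data.
INSERT INTO itm VALUES ('A','05'), ('B','10'), ('C','25'), ('D','01');
INSERT INTO itp VALUES ('05','P1'), ('05','P2'), ('10','P3'), ('25','P1'), ('10','P4');
INSERT INTO pur VALUES ('P1','V1','86'), ('P2','V1','87'), ('P3','V2','84'), ('P4','V3','90');
CREATE VIEW itpv AS
  SELECT DISTINCT itp.itemn, pur.vendn
  FROM itp, pur
  WHERE itp.ponum = pur.ponum AND pur.odate > '85';
""")

over_view = """
SELECT itm.itmn, itpv.vendn
FROM itm, itpv
WHERE itm.itemn = itpv.itemn
  AND itm.itemn > '01' AND itm.itemn < '20'
"""

merged = """
SELECT DISTINCT itm.itmn, pur.vendn
FROM itm, itp, pur
WHERE itp.ponum = pur.ponum
  AND itm.itemn = itp.itemn
  AND pur.odate > '85'
  AND itm.itemn > '01' AND itm.itemn < '20'
"""

view_rows = sorted(conn.execute(over_view).fetchall())
merged_rows = sorted(conn.execute(merged).fetchall())
print(view_rows, merged_rows)
```

The merged form gives the optimizer one three-way join to order freely, instead of forcing it to materialize the view first.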
Rewrite Rules: Existential Subquery Merge

Original:
    SELECT *
    FROM itp
    WHERE itp.itemn IN
      (SELECT itl.itmn
       FROM itl
       WHERE itl.wkcen = 'WK468' AND itl.locan = 'L')

After merging the subquery:
    SELECT DISTINCT itp.*
    FROM itp, itl
    WHERE itp.itemn = itl.itmn
      AND itl.wkcen = 'WK468' AND itl.locan = 'L'

Speedup: 15 times
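A small `sqlite3` sketch shows why the merged form needs DISTINCT. The rows are made up, and the sketch assumes that `itp` has column `itemn`, that `itl` has column `itmn`, and that `itp` contains no duplicate rows (here enforced by a hypothetical primary key). Without that last assumption, DISTINCT could remove duplicates that the IN form would have kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itp (itemn TEXT, ponum TEXT, PRIMARY KEY (itemn, ponum));  -- assumed key
CREATE TABLE itl (itmn TEXT, wkcen TEXT, locan TEXT);
-- Illustrative rows only; itl deliberately holds a duplicate match.
INSERT INTO itp VALUES ('05','P1'), ('10','P3'), ('25','P1');
INSERT INTO itl VALUES ('05','WK468','L'), ('05','WK468','L'),
                       ('10','WK100','L'), ('25','WK468','X');
""")

with_subquery = """
SELECT * FROM itp
WHERE itp.itemn IN (SELECT itl.itmn FROM itl
                    WHERE itl.wkcen = 'WK468' AND itl.locan = 'L')
"""

merged = """
SELECT DISTINCT itp.* FROM itp, itl
WHERE itp.itemn = itl.itmn
  AND itl.wkcen = 'WK468' AND itl.locan = 'L'
"""

# Same join without DISTINCT, to show the duplicates it would introduce.
plain_join = merged.replace("SELECT DISTINCT", "SELECT")

subquery_rows = sorted(conn.execute(with_subquery).fetchall())
merged_rows = sorted(conn.execute(merged).fetchall())
plain_rows = conn.execute(plain_join).fetchall()
print(subquery_rows, merged_rows, len(plain_rows))
```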
Rewrite Rules: Intersect to Exists

Original:
    SELECT itemn
    FROM wor
    WHERE empno = 'EMPN1279'
    INTERSECT
    SELECT itmn
    FROM itl
    WHERE entry_time = '9773' AND wkctr = 'WK195'

After the rewrite:
    SELECT DISTINCT itemn
    FROM wor, itl
    WHERE empno = 'EMPN1279'
      AND entry_time = '9773' AND wkctr = 'WK195'
      AND itl.itmn = wor.itemn

Speedup: 8 times
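The INTERSECT query and the join form shown on the slide can be compared with `sqlite3`. The rows below are made up, and the sketch assumes the compared columns contain no NULLs: INTERSECT treats two NULLs as equal, whereas the join predicate `itl.itmn = wor.itemn` does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wor (itemn TEXT NOT NULL, empno TEXT);
CREATE TABLE itl (itmn TEXT NOT NULL, entry_time TEXT, wkctr TEXT);
-- Illustrative rows only; both sides contain duplicates on purpose.
INSERT INTO wor VALUES ('05','EMPN1279'), ('05','EMPN1279'),
                       ('10','EMPN1279'), ('25','EMPN0001');
INSERT INTO itl VALUES ('05','9773','WK195'), ('05','9773','WK195'),
                       ('10','9773','WK100'), ('25','9773','WK195');
""")

intersect_query = """
SELECT itemn FROM wor WHERE empno = 'EMPN1279'
INTERSECT
SELECT itmn FROM itl WHERE entry_time = '9773' AND wkctr = 'WK195'
"""

join_query = """
SELECT DISTINCT itemn FROM wor, itl
WHERE empno = 'EMPN1279'
  AND entry_time = '9773' AND wkctr = 'WK195'
  AND itl.itmn = wor.itemn
"""

intersect_rows = sorted(conn.execute(intersect_query).fetchall())
join_rows = sorted(conn.execute(join_query).fetchall())
print(intersect_rows, join_rows)
```

DISTINCT in the join form plays the role of INTERSECT's built-in duplicate elimination.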
The Count Bug
Schema: parts(PNUM, QOH), supply(PNUM, QUAN, SHIPDATE)
Query: find the part numbers of those parts whose quantities on hand equal the number of shipments of those parts before a given date.
    select PNUM
    from parts
    where QOH = (select count(SHIPDATE)
                 from supply
                 where supply.PNUM = parts.PNUM and SHIPDATE < [date])
The Count Bug (cont.)
Nested query:
    select PNUM
    from parts
    where QOH = (select count(SHIPDATE)
                 from supply
                 where supply.PNUM = parts.PNUM and SHIPDATE < [date])

Transformed into a grouped temporary table plus a join:
    temp (SUPPNUM, CT) = (select PNUM, count(SHIPDATE)
                          from supply
                          where SHIPDATE < [date]
                          group by PNUM)

    select PNUM
    from parts, temp
    where parts.QOH = temp.CT and temp.SUPPNUM = parts.PNUM
The Count Bug (cont.)
Evaluating the nested query:
    select PNUM
    from parts
    where QOH = (select count(SHIPDATE)
                 from supply
                 where supply.PNUM = parts.PNUM and SHIPDATE < [date])
[Figure: example rows of Parts (PNUM, QOH) and Supply (PNUM, QUAN, SHIPDATE).]
Result: PNUM = 10, 8
The Count Bug (cont.)
Building the temporary table:
    temp (SUPPNUM, CT) = (select PNUM, count(SHIPDATE)
                          from supply
                          where SHIPDATE < [date]
                          group by PNUM)
[Figure: Parts (PNUM, QOH), Supply (PNUM, QUAN, SHIPDATE), and the resulting Temp (SUPPNUM, CT).]
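The Count Bug (cont.)
Joining Parts with Temp:
    select PNUM
    from parts, temp
    where parts.QOH = temp.CT and temp.SUPPNUM = parts.PNUM
[Figure: Parts (PNUM, QOH) and Temp (SUPPNUM, CT).]
Result: PNUM = 10
Part 8, which the nested query returns, is missing: a part with no qualifying shipments has no group in temp, whereas the nested query counts 0 for it.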
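The bug can be reproduced with `sqlite3`. The slides' table rows and cutoff date did not survive extraction, so the rows and the `CUTOFF` value below are assumptions chosen only to give the same results as the slides (10 and 8 for the nested query, 10 for the transformed one). The CTE is named `tmp` here to avoid the SQLite keyword `temp`.

```python
import sqlite3

CUTOFF = "1980-01-01"  # hypothetical cutoff date, not taken from the slides

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (PNUM INTEGER PRIMARY KEY, QOH INTEGER)")
conn.execute("CREATE TABLE supply (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT)")
# Illustrative rows: part 8 has no shipment before CUTOFF and QOH = 0.
conn.executemany("INSERT INTO parts VALUES (?, ?)", [(3, 6), (10, 1), (8, 0)])
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?)",
    [(3, 4, "1979-07-03"), (3, 2, "1978-10-01"),
     (10, 1, "1978-06-08"), (10, 2, "1981-08-10"),
     (8, 5, "1983-05-07")],
)

nested = """
SELECT PNUM FROM parts
WHERE QOH = (SELECT COUNT(SHIPDATE) FROM supply
             WHERE supply.PNUM = parts.PNUM AND SHIPDATE < ?)
"""

# Group-by transformation: parts with no qualifying shipment get no group.
transformed = """
WITH tmp(SUPPNUM, CT) AS (
  SELECT PNUM, COUNT(SHIPDATE) FROM supply
  WHERE SHIPDATE < ?
  GROUP BY PNUM
)
SELECT parts.PNUM FROM parts, tmp
WHERE parts.QOH = tmp.CT AND tmp.SUPPNUM = parts.PNUM
"""

nested_result = sorted(r[0] for r in conn.execute(nested, (CUTOFF,)))
transformed_result = sorted(r[0] for r in conn.execute(transformed, (CUTOFF,)))
print(nested_result, transformed_result)
```

The two queries disagree exactly on the part whose correlated count is 0, which is why the transformation is unsafe for COUNT.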
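The Count Bug: solution with outer joins
Outer join R =+ S: every row of R is kept; where S has no match, the S columns are null.
    R:      X = A, B
    S:      Y = B, C, E
    R =+ S: (X, Y) = (A, null), (B, B)
The S rows C and E have no match in R and do not appear.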
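The Count Bug: solution with outer joins
    temp (SUPPNUM, CT) = (select parts.PNUM, count(SHIPDATE)
                          from parts, supply
                          where SHIPDATE < [date]
                            and parts.PNUM =+ supply.PNUM
                          group by parts.PNUM)
[Figure: result of parts.PNUM =+ supply.PNUM (for SHIPDATE < [date]), with columns Parts.PNUM, Parts.QOH, Supply.PNUM, Supply.QUAN, Supply.SHIPDATE; parts without a qualifying shipment have null in the Supply columns.]
Because count(SHIPDATE) ignores nulls, such parts get CT = 0 and therefore appear in temp.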
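The outer-join repair can be checked the same way. This sketch writes the slides' `=+` as a standard `LEFT JOIN` and puts the date predicate in the `ON` clause, so that it restricts `supply` before the outer join rather than discarding the null-extended rows afterwards. The rows and `CUTOFF` are assumed values for illustration, and `tmp` stands in for the slides' `temp`.

```python
import sqlite3

CUTOFF = "1980-01-01"  # hypothetical cutoff date, not taken from the slides

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (PNUM INTEGER PRIMARY KEY, QOH INTEGER)")
conn.execute("CREATE TABLE supply (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT)")
# Illustrative rows: part 8 has no shipment before CUTOFF and QOH = 0.
conn.executemany("INSERT INTO parts VALUES (?, ?)", [(3, 6), (10, 1), (8, 0)])
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?)",
    [(3, 4, "1979-07-03"), (3, 2, "1978-10-01"),
     (10, 1, "1978-06-08"), (10, 2, "1981-08-10"),
     (8, 5, "1983-05-07")],
)

nested = """
SELECT PNUM FROM parts
WHERE QOH = (SELECT COUNT(SHIPDATE) FROM supply
             WHERE supply.PNUM = parts.PNUM AND SHIPDATE < ?)
"""

# Every part keeps a group; COUNT skips the NULL SHIPDATE of unmatched parts.
counts = """
SELECT parts.PNUM, COUNT(supply.SHIPDATE)
FROM parts LEFT JOIN supply
  ON parts.PNUM = supply.PNUM AND supply.SHIPDATE < ?
GROUP BY parts.PNUM
"""

fixed = f"""
WITH tmp(SUPPNUM, CT) AS ({counts})
SELECT parts.PNUM FROM parts, tmp
WHERE parts.QOH = tmp.CT AND tmp.SUPPNUM = parts.PNUM
"""

tmp_rows = sorted(conn.execute(counts, (CUTOFF,)).fetchall())
nested_result = sorted(r[0] for r in conn.execute(nested, (CUTOFF,)))
fixed_result = sorted(r[0] for r in conn.execute(fixed, (CUTOFF,)))
print(tmp_rows, nested_result, fixed_result)
```

Moving the date predicate into a `WHERE` clause after the `LEFT JOIN` would filter out the null-extended rows and bring the bug back.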