Oracle Optimizer
Types of Optimizers There are different modes for the optimizer RULE: Rule-based optimizer (RBO) –Deprecated; not updated since 1994 CHOOSE: Default option. Chooses cost-based optimizer (CBO) if statistics available, rule- based optimizer otherwise. ALTER SESSION SET optimizer_mode = {choose|rule|first_rows(_n)|all_rows}
Types of Optimizers ALL_ROWS (CBO): Execute the query so that all of the rows are returned as quickly as possible –Merge Join has priority over Block Nested Loop Join FIRST_ROWS(n) (CBO): execute the query so that all of the first n rows are returned as quickly as possible –Block Nested Loop Join has priority over Merge Join
analyze table | index | compute statistics | estimate statistics [sample rows | percent] | delete statistics; analyze table Sailors estimate statistics sample 25 percent; Analyzing the Data
Viewing the Execution Plan You need a PLAN_TABLE table. So, the first time that you want to see execution plans, run the command: Set autotrace on to see all plans –Display the execution path for each query, after being
Operations that Access Tables TABLE ACCESS FULL: sequential table scan –Oracle optimizes by reading multiple blocks –Used whenever there is no where clause on a query select * from Sailors TABLE ACCESS BY ROWID: access rows by their RowID values. –How do you get the rowid? From an index! select * from Sailors where sid > 10
Types of Indexes Unique: each row of the indexed table contains a unique value for the indexed column Nonunique: the row’s indexed values can repeat
Operations that Use Indexes INDEX UNIQUE SCAN: Access of an index that is defined to be unique INDEX RANGE SCAN: Access of an index that is not unique or access of a unique index for a range of values
When are Indexes Used/Not Used? If you set an indexed column equal to a value, e.g., sname = 'Jim' If you specify a range of values for an indexed column, e.g., sname like 'J%' –sname like '%m': will not use an index –UPPER(sname) like 'J%' : will not use an index –sname is null: will not use an index, since null values are not stored in the index –sname is not null: will not use an index, since every value in the index would have to be accessed
When are Indexes Used? (cont) 2*age = 20: Index on age will not be used. Index on 2*age will be used. sname != 'Jim': Index will not be used. MIN and MAX functions: Index will be used Equality of a column in a leading column of a multicolumn index. For example, suppose we have a multicolumn index on (sid, bid, day) –sid = 12: Can use the index –bid = 101: Cannot use the index
Optimizer Hints You can give the optimizer hints about how to perform query evaluation Hints are written in /*+ */ right after the select Note: These are only hints. The Oracle optimizer can choose to ignore your hints
Hints FULL hint: tell the optimizer to perform a TABLE ACCESS FULL operation on the specified table ROWID hint: tell the optimizer to perform a TABLE ACCESS BY ROWID operation on the specified table INDEX hint: tells the optimizer to use an index- based scan on the specified table
Examples Select /*+ FULL (sailors) */ sid From sailors Where sname=‘Joe’; Select /*+ INDEX (sailors) */ sid From sailors Where sname=‘Joe’; Select /*+ INDEX (sailors s_ind) */ sid From sailors S, reserves R Where S.sid=R.sid AND sname=‘Joe’;
Combining Output From Multiple Index Scans Suppose we have 2 indexes: sname, rating select * from sailors where sname = 'Jim' and rating = SELECT STATEMENT Optimizer=CHOOSE 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SAILORS‘ 2 1 AND-EQUAL 3 2 INDEX (RANGE SCAN) OF 'SNAME_IND' (NON-UNIQUE) 4 2 INDEX (RANGE SCAN) OF 'RAT' (NON-UNIQUE)
Operations that Manipulate Data Sets Up until now, all operations returned the rows as they were found There are operations that must find all rows before returning a single row Try to avoid these operations for online users! –SORT ORDER BY: query with order by select sname, age from Sailors order by age;
Operations that Manipulate Data Sets –SORT UNIQUE: sorting records while eliminating duplicates (e.g., query with distinct; query with minus, intersect or union) –SORT AGGREGATE, SORT GROUP BY: queries with aggregate or grouping functions (like MIN, MAX)
Is the table always accessed? What if there is no index?
Operations that Manipulate Data Sets Consider the query: –select sname from sailors union select bname from boats;
Operations that Manipulate Data Sets Consider the query: –select sname from sailors minus select bname from boats; How do you think that Oracle implements intersect? union all?
Select MIN(age), COUNT(*) from Sailors GROUP BY rating SELECT STATEMENT Optimizer=CHOOSE 1 0 SORT (GROUP BY) 2 1 TABLE ACCESS (FULL) OF 'SAILORS' Operations that Manipulate Data Sets
Distinct What should Oracle do when processing the query (assuming that sid is the primary key): –select distinct sid from Sailors Try it!!
Join Methods Select * from Sailors, Reserves where Sailors.sid = Reserves.sid Oracle can use an index on Sailors.sid or on Reserves.sid (note that both will not be used) Join Methods: MERGE JOIN, NESTED LOOPS, HASH JOIN
Nested Loops Joins Block nested loop join NESTED LOOPS TABLE ACCESS FULL OF our_outer_table TABLE ACCESS FULL OF our_inner_table Index nested loop join NESTED LOOPS TABLE ACCESS FULL OF our_outer_table TABLE ACCESS BY ROWID OF our_inner_table INDEX RANGE SCAN OF inner_table_index
Nested Loops Joins Select * from reserves r, sailors s where r.sid=s.sid; SELECT STATEMENT Optimizer=CHOOSE 1 0 NESTED LOOPS 2 1 TABLE ACCESS (FULL) OF 'RESERVES‘ 3 1 TABLE ACCESS (BY INDEX ROWID) OF 'SAILORS‘ 4 3 INDEX (UNIQUE SCAN) OF 'SYS_C ' (UNIQUE)
When Are Nested Loops Joins Used? If tables are of unequal size If results should be returned online
Hash Join Plan HASH JOIN TABLE ACCESS FULL OF table_A TABLE ACCESS FULL OF table_B
When Are Hash Joins Used? If tables are small If results should be returned online
Sort-Merge Join Plan MERGE JOIN SORT JOIN TABLE ACCESS FULL OF table_A SORT JOIN TABLE ACCESS FULL OF table_B
When Are Sort/Merge Joins Used? Performs badly when tables are of unequal size. Why?
Hint Examples for Joins Select /*+ USE_NL (sailors) */ sid From sailors S, reserves R Where S.sid=R.sid AND sname=‘Joe’; Select /*+ USE_MERGE (sailors, reserves) */ sid From sailors S, reserves R Where S.sid=R.sid AND sname=‘Joe’; Select /*+ USE_HASH */ sid From sailors S, reserves R Where S.sid=R.sid AND sname=‘Joe’; inner table
XML (Extensible Markup Language) and the Semi-Structured Data Model
Motivation We have seen that relational databases are very convenient to query. However: –There is a LOT of data not in relational databases!! Perhaps the most widely accessed database is the web, and it certainly isn’t a relational database.
Querying the Web The web can be queried using a search engine, however, we can’t ask questions like: –What is the lowest price for which a Jaguar is sold on the web? Problems: –There are no facilities for asking complex questions, such as aggregation of data
Understanding the Web In order to query the web, we must be able to understand it. 2 Computer Science Approaches: –Artificial Intelligence Approach –Database Approach
Database Approach “The web is unstructured and we will structure it” Sometimes problems that are very difficult can be solved easily by enforcing a standard Encourage the use of XML as a standard for data exchange on the web
Jeff Cohen Irma Levy Example XML Document Opening Tag Attribute Element Closing Tag
Very Unstructured XML The insured’s Corolla broke through the guard rail and plummeted into the ravine. The cause was determined to be faulty brakes. Amazingly there were no casualties.
XML Vs. HTML XML and HTML are brothers. They are both special cases of SGML. HTML has specific tag and attribute names. These are associated with a specific meaning XML can have any tag and attribute name. These are not associated with any meaning HTML is used to specify visual style XML is used to specify meaning
Characteristics of Semistructured Data (From Missing or additional attributes Multi-valued attributes Different types in different objects Heterogeneous collections Self-describing, irregular data, no a priori structure
Relational vs Semi-Structured RelationalSemi-Structured Abstract Model Sets of tuples Labeled Directed Graph Concrete Model TablesXML Documents Standard for Storing Data Data Exchange Separating Content from Style
Data Exchange Problem: Many data sources, each of a different type (different vendor), with a different schema. –How can the data be combined and used together? –How can different companies collaborate on their data? –What format should be used to exchange the data?
Separating Content from Style Web sites develop over time Important to separate style from data in order to allow changes to the site structure and appearance Using XML, we can store data alone CSS separates style from data only in a limited way Using XSL, this data can be translated into HTML The data can be translated differently as the site develops
Write Once Use Everywhere XML Data XSL WML (hand-held devices) XSL HTML (web browser XSL TEXT (Excel)
Using XML Quering and Searching XML: There are query languages and search engines that query XML and return XML. Examples: Xpath, Xquery /SQL4X, Equix, XSEarch Displaying XML: An XML document can have an associated style-sheet which specifies how the document should be translated to HTML. Examples: CSS, XSL