Download presentation
Presentation is loading. Please wait.
Published byJocelyn Cole Modified over 9 years ago
1
Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004
2
Outline Overview Measures of query cost Selection operation –Basic algorithms –Selections using indices –Selections involving comparisons –Implementation of complex selections Sorting
3
Overview Query processing – the range of activities involved in extracting data from a database. –Includes: translation of queries, query- optimizing transformations, evaluation of queries.
4
Overview (cont.) Steps involved in processing a query: 1)Parsing and translation 2)Optimization 3)Evaluation
5
Step 1. Parsing and translation System checks the syntax of the query. Creates a parse-tree representation of the query. Translates the query into a relational- algebra expression.
6
Step 2. Optimization Each relational-algebra operation can be executed by one of several different algorithms. Ex. Every tuple can be searched linearly or a B+ tree can be used to index the tuples. A query optimizer must know the cost of each operation.
7
Step 3. Evaluation A relational-algebra expression and evaluation primitive are needed. –evaluation primitive - relational-algebra expression annotated with instructions specifying how to evaluate each operation.
8
Step 3. Evaluation (cont.) Query execution plan or query-evaluation plan – sequence of primitive operations that can be used to evaluate the expression. Π balance (σ balance < 2500 (account))
9
Step 3. Evaluation (cont.) Then the query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
10
Steps in query processing
11
Measures of Query Cost Measured in terms of different resources, including disk accesses and CPU execution time Measured in response time for a query-evaluation plan (clock time to execute the plan)
12
Measures of Query Cost (cont.) Disk accesses, number of block transfers from disk, are usually the most important cost To calculate add these numbers: 1)# of seek operations performed 2)# of blocks read 3)# of blocks written After multiplying them by the average seek time, average transfer time, and average transfer time, respectively.
13
Selection Operation File scan is the lowest-level operator to access data. File scans are search algorithms that locate and retrieve records that fulfill a selection condition. Allows an entire relation to be read where the relation is stored in a single, dedicated file.
14
Basic Algorithms Two basic scan algorithms to implement the selection operation: A1 (linear search). –The system scans each file block and tests all records to see whether they satisfy the selection condition. –System terminates if the required record is found, without looking at the other records. A2 (binary search). –System performs the binary search on the blocks of the file. –File is ordered on an attribute, and selection condition is an equality comparison.
15
Selections Using Indices Index structures referred to a access paths, since they provide a path through which data can be located and accessed. Primary index – allows records to be read in an order that corresponds to the physical order in the file. Secondary index – any index that is not a primary index.
16
Selections Using Indices (cont.) A3 (primary index, equality on key). –Use the index to retrieve a single record that satisfies the corresponding equality condition. A4 (primary index, equality on nonkey). –Same as A3, but multiple records may need to be fetched.
17
Selections Using Indices (cont.) A5 (secondary index, equality). –Retrieve a single record if the equality condition is on a key. –Retrieve multiple records if the indexing field is not a key.
18
Selections Involving Comparisons Consider the form σ A ≥ v (r). Can be implemented by linear or binary search or by using indices: A6 (primary index, comparison). –For A ≥ v, look for first tuple that has the value of A = v, return all the tuples starting from the tuple up to the end of the file. –For A < v, file scan from the beginning up to (but not including) the first tuple with attribute A = v.
19
Selections Involving Comparisons (cont.) A7 (secondary index, comparison). –The lowest-level blocks are scanned From the smallest value up to v for < and ≤. From v up to the maximum value for > and ≥.
20
Implementation of Complex Selections Previously only considered simple selection conditions with an equality or comparison operation. Now consider more complex selection predicates.
21
Implementation of Complex Selections (cont.) Conjunction: σ θ1 ∩ θ2 ∩ … ∩ θn (r) Disjunction: σ θ1 ∪ θ2 ∪ … ∪ θn (r) Negation: σ -θ (r)
22
Implementation of Complex Selections (cont.) A8 (conjunctive selection using one index). –Determine if there is an access path for an attribute in one of the simple conditions. –If yes, algorithms A2 through A7 can be used. –On all records retrieved test if they satisfy the remaining simple conditions.
23
Implementation of Complex Selections (cont.) A9 (conjunctive selection using composite index (multiple attributes). –Search index directly if selection specifies an equality condition on 2 or move attributes and a composite index exists. –Type of index determines if A3, A4, or A5 will be used.
24
Implementation of Complex Selections (cont.) A10 (conjunction selection by intersection of identifiers). –Requires indices with record pointers. –Scan each index for pointers that satisfy an individual condition. –Take intersection of all retrieved pointers to get set of pointers that satisfy the conjunctive condition. –Then pointers used to retrieve actual records. –If indices not available on all conditions, then algorithm tests retrieved records against remaining conditions.
25
Implementation of Complex Selections (cont.) A11 (disjunctive selection by union of identifiers). –Each index is scanned for pointers to tuples that satisfy the individual condition. –Union of all the retrieved pointers gives the set of pointers to all tuples that satisfy the disjunctive condition. –Pointers then used to retrieve actual records.
26
Sorting Focus on external sorting where relations that are bigger than memory. Most common is external merge-sort algorithm.
27
Sorting (cont.) N-way merge –A number of runs are created and sorted. –Runs are merged.
28
Conclusion Query processing - translation, transformation, and evaluation. Query cost measured by number of disk accesses. Selection operation algorithms Sorting – N-way merge.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.