1 B + -Trees: Search  If there are n search-key values in the file,  the path is no longer than  log  f/2  (n)  (worst case).

2 External Sort-Merge  Sorting phase:  Sorts n B pages at a time  n B = # of main memory pages buffer  creates n R =  b/n B  initial sorted runs on disk  b = # of file blocks (pages) to be sorted  Sorting Cost = read b blocks + write b blocks = 2 b

3 External Sort-Merge  Merging phase:  The sorted runs are merged during one or more passes.  The degree of merging (d M ) is the number of runs that can be merged in each pass.  d M = Min (n B -1, n R )  n P =  (log dM (n R ))   n P : number of passes.  In each pass,  One buffer block is needed to hold one block from each of the runs being merged, and  One block is needed for containing one block of the merged result.

4 External Sort-Merge  Degree of merging (d M )  # of runs that can be merged together in each pass = min (n B - 1, n R )  Number of passes n P =  (log dM (n R ))   In our example  d M = 4 (four-way merging)  min (n B -1, n R ) = min(5-1, 205) = 4  Number of passes n P =  (log dM (n R ))  =  (log 4 (205))  = 4  First pass: –205 initial sorted runs would be merged into 52 sorted runs  Second pass: –52 sorted runs would be merged into 13  Third pass: –13 sorted runs would be merged into 4  Fourth pass: –4 sorted runs would be merged into 1

5 External Sort-Merge  External Sort-Merge: Cost Analysis  Disk accesses for initial run creation (sort phase) as well as in each merge pass is 2b  reads every block once and writes it out once  Initial # of runs is n R =  b/n B  and # of runs decreases by a factor of n B - 1 in each merge pass, then the total # of merge passes is n p =  log d M (n R )   In general, the cost performance of Merge-Sort is  Cost = sort cost + merge cost  Cost = 2b + 2b * n p  Cost = 2b + 2b * log dM n R  = 2b (  log d M (n R )  + 1)

6 Catalog Information  Attribute  d:# of distinct values of an attribute  sl (selectivity):  the ratio of the # of records satisfying the condition to the total # of records in the file.  s (selection cardinality) = sl * r  average # of records that will satisfy an equality condition on the attribute  For a key attribute:  d = r,sl = 1/r,s = 1  For a nonkey attribute:  assuming that d distinct values are uniformly distributed among the records  the estimated sl = 1/d,s = r/d

7 Using Selectivity and Cost Estimates in Query Optimization  Examples of Cost Functions for SELECT  S1. Linear search (brute force) approach  C S1a = b;  For an equality condition on a key, C S1a = (b/2) if the record is found; otherwise C S1a = b.  S2. Binary search:  C S2 = log 2 b + (s/bfr)  –1  For an equality condition on a unique (key) attribute, C S2 =log 2 b  S3. Using a primary index (S3a) or hash key (S3b) to retrieve a single record  C S3a = x + 1; C S3b = 1 for static or linear hashing;  C S3b = 1 for extendible hashing;

8 Using Selectivity and Cost Estimates in Query Optimization  Examples of Cost Functions for SELECT (contd.)  S4. Using an ordering index to retrieve multiple records:  For the comparison condition on a key field with an ordering index, C S4 = x + (b/2)  S5. Using a clustering index to retrieve multiple records:  C S5 = x + ┌ (s/bfr) ┐  S6. Using a secondary (B+-tree) index:  For an equality comparison, C S6a = x + s;  For an comparison condition such as >, =, or <=,  C S6a = x + (b I1 /2) + (r/2)

9 Using Selectivity and Cost Estimates in Query Optimization  Examples of Cost Functions for SELECT (contd.)  S7. Conjunctive selection:  Use either S1 or one of the methods S2 to S6 to solve.  For the latter case, use one condition to retrieve the records and then check in the memory buffer whether each retrieved record satisfies the remaining conditions in the conjunction.  S8. Conjunctive selection using a composite index:  Same as S3a, S5 or S6a, depending on the type of index.

1 B + -Trees: Search  If there are n search-key values in the file,  the path is no longer than  log  f/2  (n)  (worst case).

Similar presentations

Presentation on theme: "1 B + -Trees: Search  If there are n search-key values in the file,  the path is no longer than  log  f/2  (n)  (worst case)."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 B + -Trees: Search  If there are n search-key values in the file,  the path is no longer than  log  f/2  (n)  (worst case).

Similar presentations

Presentation on theme: "1 B + -Trees: Search  If there are n search-key values in the file,  the path is no longer than  log  f/2  (n)  (worst case)."— Presentation transcript:

Similar presentations

About project

Feedback