February 12th – RDBMS Internals

Slides:



Advertisements
Similar presentations
CS 540 Database Management Systems
Advertisements

Query Execution, Concluded Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems November 18, 2003 Some slide content may.
ACS-4902 Ron McFadyen Chapter 15 Algorithms for Query Processing and Optimization.
1 Lecture 20: Indexes Friday, February 25, Outline Representing data elements (12) Index structures (13.1, 13.2) B-trees (13.3)
Query Optimization 3 Cost Estimation R&G, Chapters 12, 13, 14 Lecture 15.
Query Processing & Optimization
1 Relational Operators. 2 Outline Logical/physical operators Cost parameters and sorting One-pass algorithms Nested-loop joins Two-pass algorithms.
1 Physical Data Organization and Indexing Lecture 14.
CSCE Database Systems Chapter 15: Query Execution 1.
CPS216: Advanced Database Systems Notes 07:Query Execution Shivnath Babu.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Query Processing. Steps in Query Processing Validate and translate the query –Good syntax. –All referenced relations exist. –Translate the SQL to relational.
Lecture 5 Cost Estimation and Data Access Methods.
CS4432: Database Systems II Query Processing- Part 2.
Lecture 17: Query Execution Tuesday, February 28, 2001.
CS 440 Database Management Systems Lecture 5: Query Processing 1.
Query Processing – Implementing Set Operations and Joins Chap. 19.
Query Execution Chapter 15 Section 15.1 Presented by Khadke, Suvarna CS 257 (Section II) Id
Query Optimization Heuristic Optimization
CS 540 Database Management Systems
CS522 Advanced database Systems
Record Storage, File Organization, and Indexes
CS 540 Database Management Systems
CS 440 Database Management Systems
Database Management System
Storage and Indexes Chapter 8 & 9
Indexing ? Why ? Need to locate the actual records on disk without having to read the entire table into memory.
Chapter 12: Query Processing
Database Performance Tuning and Query Optimization
Introduction to Query Optimization
Evaluation of Relational Operations
Cse 344 February 14th – Indexing.
Chapter 15 QUERY EXECUTION.
Database Management Systems (CS 564)
Evaluation of Relational Operations: Other Operations
Introduction to Database Systems
Examples of Physical Query Plan Alternatives
Cse 344 April 25th – Disk i/o.
April 20th – RDBMS Internals
Cse 344 APRIL 23RD – Indexing.
Faloutsos/Pavlo C. Faloutsos – A. Pavlo Lecture#13: Query Evaluation
February 16th – Disk i/o and estimation
Selected Topics: External Sorting, Join Algorithms, …
Lecture 21: Indexes Monday, November 13, 2000.
Database Applications (15-415) DBMS Internals- Part IX Lecture 21, April 1, 2018 Mohammad Hammoud.
Lecture 19: Data Storage and Indexes
April 27th – Cost Estimation
Chapters 15 and 16b: Query Optimization
Lecture 2- Query Processing (continued)
Advance Database Systems
CSE 544: Lecture 11 Storing Data, Indexes
Query Execution Presented by Jiten Oswal CS 257 Chapter 15
Relational Algebra Friday, 11/14/2003.
Chapter 12 Query Processing (1)
Overview of Query Evaluation
Chapter 11 Database Performance Tuning and Query Optimization
Relational Query Optimization
CS222P: Principles of Data Management Notes #13 Set operations, Aggregation, Query Plans Instructor: Chen Li.
Lecture 23: Query Execution
Evaluation of Relational Operations: Other Techniques
Introduction to Database Systems CSE 444 Lectures 19: Data Storage and Indexes May 16, 2008.
Wednesday, 5/8/2002 Hash table indexes, physical operators
Database Systems (資料庫系統)
External Sorting Sorting is used in implementing many relational operations Problem: Relations are typically large, do not fit in main memory So cannot.
Lecture 11: B+ Trees and Query Execution
Yan Huang - CSCI5330 Database Implementation – Query Processing
Lecture 20: Indexes Monday, February 27, 2006.
Evaluation of Relational Operations: Other Techniques
Lecture 20: Query Execution
CSE 190D Database System Implementation
Presentation transcript:

February 12th – RDBMS Internals Cse 344 February 12th – RDBMS Internals

Administrivia HW5 out tonight OQ5 out Wednesday Both due February 21 (11:30 & 11:00) Exam grades on canvas by Wednesday Handed back in section on Thursday

Today Back to RDBMS ”Query plans” and DBMS planning Management between SQL and execution Optimization techniques Indexing and data arrangement

Query Evaluation Steps SQL query Parse & Rewrite Query Logical plan (RA) Select Logical Plan Query optimization Select Physical Plan Physical plan Analogy with sorting: we have decided the ordering of operations, and now we need to choose their implementation. For instance, there are many ways to implement sort. Query Execution Disk

Logical vs Physical Plans Logical plans: Created by the parser from the input SQL text Expressed as a relational algebra tree Each SQL query has many possible logical plans Physical plans: Goal is to choose an efficient implementation for each operator in the RA tree Each logical plan has many possible physical plans

Review: Relational Algebra Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) SELECT sname FROM Supplier x, Supply y WHERE x.sid = y.sid and y.pno = 2 and x.scity = ‘Seattle’ and x.sstate = ‘WA’ πsname σscity=‘Seattle’ and sstate=‘WA’ and pno=2 sid = sid Relational algebra expression is also called the “logical query plan” Supplier Supply

Physical Query Plan 1 (On the fly) πsname (On the fly) Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Physical Query Plan 1 (On the fly) πsname A physical query plan is a logical query plan annotated with physical implementation details (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 SELECT sname FROM Supplier x, Supply y WHERE x.sid = y.sid and y.pno = 2 and x.scity = ‘Seattle’ and x.sstate = ‘WA’ (Nested loop) sid = sid Describe the “nested loop join”: we have not discussed physical operators in class yet, but it should be clear what a nested loop means. Supplier Supply (File scan) (File scan)

Physical Query Plan 2 (On the fly) πsname (On the fly) Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Physical Query Plan 2 (On the fly) πsname Same logical query plan Different physical plan (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 SELECT sname FROM Supplier x, Supply y WHERE x.sid = y.sid and y.pno = 2 and x.scity = ‘Seattle’ and x.sstate = ‘WA’ (Hash join) sid = sid Briefly mention that a hash join is a different implementation. Normally, most students in class should have no problem imagining what it might be, but no need to describe it in detail, we will cover it in two lectures. Supplier Supply (File scan) (File scan)

Physical Query Plan 3 (On the fly) πsname (d) (Sort-merge join) (c) Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Physical Query Plan 3 Different but equivalent logical query plan; different physical plan (On the fly) πsname (d) SELECT sname FROM Supplier x, Supply y WHERE x.sid = y.sid and y.pno = 2 and x.scity = ‘Seattle’ and x.sstate = ‘WA’ (Sort-merge join) (c) sid = sid (Scan & write to T1) (a) σscity=‘Seattle’ and sstate=‘WA’ Same here, briefly mention that sort-merge is yet another way to implement join. (b) σpno=2 (Scan & write to T2) Supplier Supply (File scan) (File scan) CSE 344 - 2017au

Query Optimization Problem For each SQL query… many logical plans For each logical plan… many physical plans Next: we will discuss physical operators; how exactly are query executed?

Physical Operators Each of the logical operators may have one or more implementations = physical operators Will discuss several basic physical operators, with a focus on join

Main Memory Algorithms Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Main Memory Algorithms Logical operator: Supplier ⨝sid=sid Supply Propose three physical operators for the join, assuming the tables are in main memory:

Main Memory Algorithms Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Main Memory Algorithms Logical operator: Supplier ⨝sid=sid Supply Propose three physical operators for the join, assuming the tables are in main memory: Nested Loop Join O(??) Merge join O(??) Hash join O(??) CSE 344 - 2017au

Main Memory Algorithms Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Main Memory Algorithms Logical operator: Supplier ⨝sid=sid Supply Propose three physical operators for the join, assuming the tables are in main memory: Nested Loop Join O(n2) Merge join O(n log n) Hash join O(n) … O(n2) CSE 344 - 2017au

BRIEF Review of Hash Tables Separate chaining: A (naïve) hash function: 1 2 3 4 5 6 7 8 9 Duplicates OK WHY ?? h(x) = x mod 10 503 103 503 Operations: 76 666 find(103) = ?? insert(488) = ?? 48

BRIEF Review of Hash Tables insert(k, v) = inserts a key k with value v Many values for one key Hence, duplicate k’s are OK find(k) = returns the list of all values v associated to the key k CSE 344 - 2017au

Iterator Interface Each operator implements three methods: open() next() close() CSE 344 - 2017au

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Iterator child) { this.p = p; this.child = child; } Tuple next () { boolean found = false; while (!found) { Tuple in = child.next(); if (in == EOF) return EOF; found = p(in); return in; void close () { child.close(); }

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Operator child) { this.p = p; this.child = child; } Tuple next () { boolean found = false; while (!found) { Tuple in = child.next(); if (in == EOF) return EOF; found = p(in); return in; void close () { child.close(); }

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Operator child) { this.p = p; this.child = child; } Tuple next () {

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Operator child) { this.p = p; this.child = child; } Tuple next () { boolean found = false; Tuple r = null; while (!found) { r = child.next(); if (r == null) break; found = p(in);

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Operator child) { this.p = p; this.child = child; } Tuple next () { boolean found = false; Tuple r = null; while (!found) { r = child.next(); if (r == null) break; found = p(in); return r;

Iterator Interface Example “on the fly” selection operator interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } class Select implements Operator {... void open (Predicate p, Operator child) { this.p = p; this.child = child; } Tuple next () { boolean found = false; Tuple r = null; while (!found) { r = child.next(); if (r == null) break; found = p(in); return r; void close () { child.close(); }

Iterator Interface interface Operator { // initializes operator state // and sets parameters void open (...); // calls next() on its inputs // processes an input tuple // produces output tuple(s) // returns null when done Tuple next (); // cleans up (if any) void close (); } Operator q = parse(“SELECT ...”); q = optimize(q); q.open(); while (true) { Tuple t = q.next(); if (t == null) break; else printOnScreen(t); } q.close(); Query plan execution

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join open() (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join open() (On the fly) πsname open() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join open() (On the fly) πsname open() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 open() (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join open() (On the fly) πsname open() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 open() (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). open() Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join open() (On the fly) πsname open() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 open() (Nested loop) sno = sno Say: Recall: All operator implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind). open() open() Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname next() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname next() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 next() (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname next() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 next() (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. next() Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname next() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 next() (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. next() next() Suppliers Supplies (File scan) (File scan)

Discuss: open/next/close for nested loop join Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss: open/next/close for nested loop join next() (On the fly) πsname next() (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 next() (Nested loop) sno = sno Slide assumes first call to next on the right hand side does not result in a join. next() next() next() Suppliers Supplies (File scan) (File scan)

Discuss hash-join in class Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss hash-join in class (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Hash Join) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Suppliers Supplies (File scan) (File scan)

Pipelining Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss hash-join in class (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Hash Join) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Tuples from here are pipelined Suppliers Supplies (File scan) (File scan)

Pipelining Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Pipelining Discuss hash-join in class (On the fly) πsname (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 Tuples from here are “blocked” (Hash Join) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Tuples from here are pipelined Suppliers Supplies (File scan) (File scan)

Discuss merge-join in class Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Blocked Execution (On the fly) πsname Discuss merge-join in class (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Merge Join) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Suppliers Supplies (File scan) (File scan)

Discuss merge-join in class Supplier(sid, sname, scity, sstate) Supply(sid, pno, quantity) Blocked Execution (On the fly) πsname Discuss merge-join in class (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2 (Merge Join) sno = sno Slide assumes first call to next on the right hand side does not result in a join. Blocked Blocked Suppliers Supplies (File scan) (File scan)

Pipelined Execution Tuples generated by an operator are immediately sent to the parent Benefits: No operator synchronization issues No need to buffer tuples between operators Saves cost of writing intermediate data to disk Saves cost of reading intermediate data from disk This approach is used whenever possible Mention that some engines process tuples in batches rather than one at a time. Others process one at a time.

Query Execution Bottom Line SQL query transformed into physical plan Access path selection for each relation Scan the relation or use an index (next lecture) Implementation choice for each operator Nested loop join, hash join, etc. Scheduling decisions for operators Pipelined execution or intermediate materialization Pipelined execution of physical plan

Recall: Physical Data Independence Applications are insulated from changes in physical storage details SQL and relational algebra facilitate physical data independence Both languages input and output relations Can choose different implementations for operators

Query Performance My database application is too slow… why? One of the queries is very slow… why? To understand performance, we need to understand: How is data organized on disk How to estimate query costs In this course we will focus on disk-based DBMSs

Student Data Storage ID fName lName 10 Tom Hanks 20 Amy … DBMSs store data in files Most common organization is row-wise storage On disk, a file is split into blocks Each block contains a set of tuples In the example, we have 4 blocks with 2 tuples each 10 Tom Hanks 20 Amy block 1 50 … 200 block 2 DBMS decides how to map relations to OS files. DBMSs can also store data directly on disk, bypassing the OS. We won’t get into those details here. 220 240 block 3 420 800

Data File Types The data file can be one of: Heap file Unsorted Student Data File Types ID fName lName 10 Tom Hanks 20 Amy … The data file can be one of: Heap file Unsorted Sequential file Sorted according to some attribute(s) called key

Data File Types The data file can be one of: Heap file Unsorted Student Data File Types ID fName lName 10 Tom Hanks 20 Amy … The data file can be one of: Heap file Unsorted Sequential file Sorted according to some attribute(s) called key Note: key here means something different from primary key: it just means that we order the file according to that attribute. In our example we ordered by ID. Might as well order by fName, if that seems a better idea for the applications running on our database. CSE 344 - 2017au

Index An additional file, that allows fast access to records in the data file given a search key

Index An additional file, that allows fast access to records in the data file given a search key The index contains (key, value) pairs: The key = an attribute value (e.g., student ID or name) The value = a pointer to the record

Index Key = means here search key An additional file, that allows fast access to records in the data file given a search key The index contains (key, value) pairs: The key = an attribute value (e.g., student ID or name) The value = a pointer to the record Could have many indexes for one table Key = means here search key

Keys in indexing Different keys: Primary key – uniquely identifies a tuple Key of the sequential file – how the data file is sorted, if at all Index key – how the index is organized

Example 1: Index on ID Data File Student fName lName 10 Tom Hanks 20 Amy … Data File Student Index Student_ID on Student.ID 10 Tom Hanks 20 Amy 10 20 50 200 220 240 420 800 50 … 200 220 240 Note: this doesn’t speed up a sequential scan if the file data is organized by ID, but it allows fast access to an ID without having to scan the full table Student_ID is the name of the index on the attribute Student.ID 420 800 950 …

Example 2: Index on fName Student Example 2: Index on fName ID fName lName 10 Tom Hanks 20 Amy … Index Student_fName on Student.fName Data File Student 10 Tom Hanks 20 Amy Amy Ann Bob Cho … 50 … 200 220 240 420 800 … Tom

Index Organization We need a way to represent indexes after loading into memory so that they can be used Several ways to do this: Hash table B+ trees – most popular They are search trees, but they are not binary instead have higher fanout Will discuss them briefly next Specialized indexes: bit maps, R-trees, inverted index

(preferably in memory) Student Hash table example ID fName lName 10 Tom Hanks 20 Amy … Data File Student Index Student_ID on Student.ID 10 Tom Hanks 20 Amy 10 20 50 200 220 240 420 800 … 50 … 200 220 240 After this, explain Binary trees (they should know this from 311, but just as a refresher) 420 800 Index File (preferably in memory) Data file (on disk)

B+ Tree Index by Example Find the key 40 80 40 <= 80 20 60 100 120 140 20 < 40 <= 60 Good for scanning ranges of tuples 10 15 18 20 30 40 50 60 65 80 85 90 30 < 40 <= 40 10 15 18 20 30 40 50 60 65 80 85 90

Clustered vs Unclustered B+ Tree B+ Tree Index entries Index entries (Index File) (Data file) Data Records Data Records CLUSTERED UNCLUSTERED Every table can have only one clustered and many unclustered indexes Why? 12

Index Classification Clustered/unclustered Clustered = records close in index are close in data Option 1: Data inside data file is sorted on disk Option 2: Store data directly inside the index (no separate files) Unclustered = records close in index may be far in data

Index Classification Clustered/unclustered Primary/secondary Clustered = records close in index are close in data Option 1: Data inside data file is sorted on disk Option 2: Store data directly inside the index (no separate files) Unclustered = records close in index may be far in data Primary/secondary Meaning 1: Primary = is over attributes that include the primary key Secondary = otherwise Meaning 2: means the same as clustered/unclustered

Index Classification Clustered/unclustered Primary/secondary Clustered = records close in index are close in data Option 1: Data inside data file is sorted on disk Option 2: Store data directly inside the index (no separate files) Unclustered = records close in index may be far in data Primary/secondary Meaning 1: Primary = is over attributes that include the primary key Secondary = otherwise Meaning 2: means the same as clustered/unclustered Organization B+ tree or Hash table