Lecture#18: Physical Database Design

Carnegie Mellon Univ. Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
Lecture#18: Physical Database Design
Faloutsos/Pavlo CMU 15-415/615

Administrivia
- HW6 is due right now.
- HW7 is out today:
  - Phase 1: Wed Nov 9th
  - Phase 2: Mon Nov 28th

HW7: CMUPaper
- Python / Django App
- Postgres Database
- Phase 1: Design Spec
- Phase 2: Implementation
- Complete instructions and walkthrough: http://15415.courses.cs.cmu.edu/fall2016/hws/HW7

HW7: CMUPaper
- We’re providing you a Linux VM image that will automatically create your development environment.
  - Tested on Windows, Mac, Linux
  - Only tested on 64-bit CPUs with VT-X
- If you don’t have your own laptop to use, please contact us.

Last Class
- Decomposition: Lossless, Dependency Preserving
- Normal Forms: 3NF, BCNF

Today’s Class
- Introduction
- Index Selection
- Denormalization
- Decomposition
- Partitioning
- Advanced Topics

Introduction
- After ER design, schema refinement, and the view definitions, we have a conceptual and external schema for our database.
- The next step is to create the physical design of the database.

Physical Database Design
- Physical design is tightly linked to query optimization.
- Query optimization is usually a “top down” concept.
- But in this lecture we’ll discuss this from the “bottom up”.

Physical Database Design
- It is important to understand the application’s workload:
  - What kind of queries/updates does it execute?
  - How fast is the database growing?
  - What is the desired performance metric?

Understanding Queries
- For each query in the workload:
  - Which relations does it access?
  - Which attributes are retrieved?
  - Which attributes are involved in selection/join conditions?
  - How selective are these conditions likely to be?

Understanding Updates
- For each update in the workload:
  - Which attributes are involved in predicates?
  - How selective are these conditions likely to be?
  - What types of update operations and what attributes do they affect?
  - How often are records inserted/updated/deleted?

Consequences
- Changing a database’s design does not magically make every query run faster.
- May require you to modify your queries and/or application logic.
  - APIs hide implementation details and can help prevent upstream apps from breaking when things change.

General DBA Advice
- Modifying the physical design of a database is expensive.
- DBAs usually do this when the application demand is low.
  - Typically Sunday mornings.
- May have to do it whenever the application changes.

Index Selection
- Which relations should have indexes?
- What attribute(s) or expressions should be the search key?
- What order to use for attributes in index?
- How many indexes should we create?
- For each index, what kind of an index should it be?

Example #1
    CREATE TABLE users (
      userID  INT,
      servID  VARCHAR,
      data    VARCHAR,
      updated DATETIME,
      PRIMARY KEY (userID)
    );
    CREATE TABLE locations (
      locationID INT,
      servID     VARCHAR,
      coordX     FLOAT,
      coordY     FLOAT
    );

    SELECT U.*, L.coordX, L.coordY
      FROM users AS U INNER JOIN locations AS L
        ON (U.servID = L.servID)
     WHERE U.userID > $1
       AND EXTRACT(dow FROM U.updated) = 2;

Get the location coordinates of a service for any user with an id greater than some value and whose record was updated on a Tuesday.

Example #1: Join Clause
- Examine the attributes in the join clause:
  - Is there an index?
  - What is the cardinality of the attributes?
Join clause of the Example #1 query:
    FROM users AS U INNER JOIN locations AS L
      ON (U.servID = L.servID)

Example #1: Where Clause
- Examine the attributes in the where clause:
  - Is there an index?
  - How are they accessed?
Where clause of the Example #1 query:
    WHERE U.userID > $1
      AND EXTRACT(dow FROM U.updated) = 2;

Example #1: Output Clause
- Examine the query’s output clause:
  - What attributes from what tables are needed?
Output clause of the Example #1 query:
    SELECT U.*, L.coordX, L.coordY

Example #1: Summary
- Join: U.servID, L.servID
- Where: U.userID, U.updated
- Output: U.userID, U.servID, U.data, U.updated, L.coordX, L.coordY

Index Selection
- We already have an index on U.userID. Why?
- What if we created separate indexes for U.servID and L.servID?
    CREATE INDEX idx_u_servID ON users (servID);
    CREATE INDEX idx_l_servID ON locations (servID);
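
A minimal sketch of checking both questions empirically. It uses Python's built-in sqlite3 module in place of Postgres, which is an assumption made only to keep the example self-contained: SQLite's planner and its EXPLAIN QUERY PLAN output differ from Postgres's EXPLAIN. The helper name plan is hypothetical. The sketch lists the indexes on users (including the one created implicitly by the primary key) and prints the plan for the join once both servID indexes exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        userID INT, servID VARCHAR, data VARCHAR, updated DATETIME,
        PRIMARY KEY (userID));
    CREATE TABLE locations (
        locationID INT, servID VARCHAR, coordX FLOAT, coordY FLOAT);
    CREATE INDEX idx_u_servID ON users (servID);
    CREATE INDEX idx_l_servID ON locations (servID);
""")


def plan(sql, params=()):
    """Return the human-readable steps of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[3] for row in rows]


# Declaring a PRIMARY KEY makes the DBMS build a unique index implicitly,
# so it shows up here next to the index we created by hand.
user_indexes = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'index' AND tbl_name = 'users'")]
print(user_indexes)

join_plan = plan("""
    SELECT U.*, L.coordX, L.coordY
      FROM users AS U INNER JOIN locations AS L
        ON (U.servID = L.servID)
     WHERE U.userID > ?""", (10,))
print("\n".join(join_plan))
```

Which of the two servID indexes the planner picks depends on the join order it chooses, so the printed plan is best read as evidence that an index is used for the join, not as a fixed result.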

Index Selection (2)
- We still have to look up U.updated.
- What if we created another index?
    CREATE INDEX idx_u_updated ON users (updated);
- This doesn’t help our query. Why?
    SELECT U.*, L.coordX, L.coordY
      FROM users AS U INNER JOIN locations AS L
        ON (U.servID = L.servID)
     WHERE U.userID > $1
       AND EXTRACT(dow FROM U.updated) = 2;
- The predicate is on EXTRACT(dow FROM U.updated), not on U.updated itself, so index the expression instead:
    CREATE INDEX idx_u_servID ON users (servID);
    CREATE INDEX idx_l_servID ON locations (servID);
    CREATE INDEX idx_u_updated ON users (EXTRACT(dow FROM updated));
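
The difference between the two indexes can be sketched with sqlite3. The assumptions here are that strftime('%w', updated) stands in for EXTRACT(dow FROM updated) (both number Sunday as 0, but SQLite returns a string), that the bundled SQLite is recent enough to allow date functions in index expressions, and that the index name idx_u_dow is hypothetical, chosen so both indexes can coexist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        userID INT PRIMARY KEY, servID VARCHAR,
        data VARCHAR, updated DATETIME)""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [(1, "a", "x", "2016-11-01"),   # a Tuesday
     (2, "a", "y", "2016-11-02")])  # a Wednesday

QUERY = "SELECT * FROM users WHERE strftime('%w', updated) = '2'"


def plan(sql):
    """Return SQLite's query plan as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)


# A plain index on the column cannot serve a predicate on f(column).
conn.execute("CREATE INDEX idx_u_updated ON users (updated)")
plain_plan = plan(QUERY)

# An index on the same expression the predicate uses can.
conn.execute("CREATE INDEX idx_u_dow ON users (strftime('%w', updated))")
expr_plan = plan(QUERY)

tuesday_users = [row[0] for row in conn.execute(QUERY)]
print(plain_plan)
print(expr_plan)
print(tuesday_users)
```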

Index Selection (3)
- The query outputs L.coordX and L.coordY.
- This means that we have to fetch the location record.
- We can create a covering index.
    CREATE INDEX idx_u_servID ON users (servID);
    CREATE INDEX idx_l_servID ON locations (servID, coordX, coordY);
    CREATE INDEX idx_u_updated ON users (EXTRACT(dow FROM updated));
  Alternative syntax (only MSSQL):
    CREATE INDEX idx_l_servID ON locations (servID) INCLUDE (coordX, coordY);
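
A small sketch of what "covering" means in practice, again using sqlite3 as a stand-in for Postgres (an assumption; the plan text is SQLite-specific). When every attribute the query needs is in the index, the plan reports a covering index and the table itself is never read. When one attribute is missing, the same index is still used for the lookup, but each match requires a fetch from the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (
        locationID INT, servID VARCHAR, coordX FLOAT, coordY FLOAT);
    CREATE INDEX idx_l_servID ON locations (servID, coordX, coordY);
""")


def plan(sql, params=()):
    """Return SQLite's query plan as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[3] for row in rows)


# Every needed attribute (servID, coordX, coordY) is stored in the index.
covered = plan(
    "SELECT coordX, coordY FROM locations WHERE servID = ?", ("s1",))

# locationID is not in the index, so the table must be visited as well.
not_covered = plan(
    "SELECT locationID, coordX FROM locations WHERE servID = ?", ("s1",))

print(covered)
print(not_covered)
```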

Index Selection (4)
- Can we do any better?
- Is the index U.servID necessary?
- Create a partial index:
    CREATE INDEX idx_u_servID ON users (servID)
     WHERE EXTRACT(dow FROM updated) = 2;
  Repeat for the other six days of the week!
Indexes so far:
    CREATE INDEX idx_u_servID ON users (servID);
    CREATE INDEX idx_l_servID ON locations (servID, coordX, coordY);
    CREATE INDEX idx_u_updated ON users (EXTRACT(dow FROM updated));
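
A hedged sketch of why one partial index per day is needed, using sqlite3 with strftime('%w', updated) standing in for EXTRACT(dow FROM updated) (both are assumptions; the index name idx_u_servID_tue is hypothetical). A partial index only contains the rows that satisfy its WHERE clause, so the planner can use it only for queries whose predicate implies that clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        userID INT PRIMARY KEY, servID VARCHAR,
        data VARCHAR, updated DATETIME);
    -- Only rows updated on a Tuesday are stored in this index.
    CREATE INDEX idx_u_servID_tue ON users (servID)
     WHERE strftime('%w', updated) = '2';
""")


def plan(sql):
    """Return SQLite's query plan as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)


# The query predicate matches the index predicate: the index is usable.
tuesday_plan = plan(
    "SELECT * FROM users "
    "WHERE servID = 'a' AND strftime('%w', updated) = '2'")

# A different day is not covered by this partial index.
wednesday_plan = plan(
    "SELECT * FROM users "
    "WHERE servID = 'a' AND strftime('%w', updated) = '3'")

print(tuesday_plan)
print(wednesday_plan)
```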

Index Selection (5)
- Should we make the index on users a covering index?
  - What if U.data is large?
- Should userID come before servID?
- Do we still need the primary key index?
Candidate definitions (alternatives):
    CREATE INDEX idx_u_everything ON users (servID, userID, data)
     WHERE EXTRACT(dow FROM updated) = 2;
    CREATE INDEX idx_u_everything ON users (userID, servID)
     WHERE EXTRACT(dow FROM updated) = 2;
    CREATE INDEX idx_u_everything ON users (servID, userID)
     WHERE EXTRACT(dow FROM updated) = 2;
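
The attribute-order question can be illustrated without a DBMS. The sketch below models a composite index as a sorted Python list of key tuples and counts how many entries must be examined for an equality on servID combined with a range on userID. The data, the function names, and the sorted-list model are all assumptions for illustration; a real B+Tree behaves analogously but not identically.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: 100 users spread over 4 services, as (userID, servID).
rows = [(uid, f"s{uid % 4}") for uid in range(1, 101)]


def lookup_serv_first(serv, min_user):
    """Index on (servID, userID): matching entries are contiguous."""
    index = sorted((s, u) for u, s in rows)
    lo = bisect_right(index, (serv, min_user))      # first userID > min_user
    hi = bisect_left(index, (serv, float("inf")))   # end of this servID
    return [u for _, u in index[lo:hi]], hi - lo


def lookup_user_first(serv, min_user):
    """Index on (userID, servID): scan the whole range, then filter."""
    index = sorted(rows)
    lo = bisect_left(index, (min_user + 1,))        # first userID > min_user
    scanned = index[lo:]
    return [u for u, s in scanned if s == serv], len(scanned)


matches_a, examined_a = lookup_serv_first("s1", 50)
matches_b, examined_b = lookup_user_first("s1", 50)
print(examined_a, examined_b)
```

Both orders return the same users, but with the equality attribute first only the matching entries are touched, whereas with the range attribute first every entry above the threshold is examined and filtered.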

Other Index Decisions
- What type of index to use?
  - B+Tree, Hash table, Bitmap, R-Tree, Full Text

Denormalization
- Joins can be expensive, so it might be better to denormalize two tables back into one.
- This goes against all of the BCNF goodness that we talked about.
  - But we have bills to pay, so this is an example where reality conflicts with the theory…

Game Example #1
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE prefs (
      playerID INT PRIMARY KEY REFERENCES players (playerID),
      data     VARBINARY
    );

Game Example #1
Get player preferences (1:1):
    SELECT P1.*, P2.*
      FROM players AS P1 INNER JOIN prefs AS P2
        ON P1.playerID = P2.playerID
     WHERE P1.playerID = $1

Game Example #1
Denormalize into parent table: the data attribute of prefs moves into players, and the prefs table is no longer needed.
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      data     VARBINARY,
      ⋮
    );

    SELECT P1.*
      FROM players AS P1
     WHERE P1.playerID = $1
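
A minimal sqlite3 sketch of the 1:1 case, showing that the single-table lookup returns the same record as the join. The name column is a hypothetical stand-in for the elided player attributes, players_denorm is a hypothetical table name used so both designs can coexist in one database, and BLOB stands in for VARBINARY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: preferences live in their own table (1:1 with players).
    CREATE TABLE players (playerID INT PRIMARY KEY, name VARCHAR);
    CREATE TABLE prefs (
        playerID INT PRIMARY KEY REFERENCES players (playerID),
        data BLOB);
    INSERT INTO players VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO prefs VALUES (1, x'01'), (2, x'02');

    -- Denormalized: the prefs attribute is folded into the parent table.
    CREATE TABLE players_denorm (
        playerID INT PRIMARY KEY, name VARCHAR, data BLOB);
    INSERT INTO players_denorm
        SELECT P1.playerID, P1.name, P2.data
          FROM players AS P1 INNER JOIN prefs AS P2
            ON P1.playerID = P2.playerID;
""")

joined = conn.execute("""
    SELECT P1.playerID, P1.name, P2.data
      FROM players AS P1 INNER JOIN prefs AS P2
        ON P1.playerID = P2.playerID
     WHERE P1.playerID = ?""", (1,)).fetchone()

single = conn.execute("""
    SELECT playerID, name, data
      FROM players_denorm
     WHERE playerID = ?""", (1,)).fetchone()

print(joined, single)
```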

Denormalization (1:n)
- It’s harder to denormalize tables with a 1:n relationship. Why?
- Example: There are multiple “game” instances in our application that players participate in. We need to keep track of each player’s score per game.

Game Example #2
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE games (
      gameID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE scores (
      gameID   INT REFERENCES games (gameID),
      playerID INT REFERENCES players (playerID),
      score    INT,
      PRIMARY KEY (gameID, playerID)
    );

Game Example #2
Get the list of playerIDs for a particular game sorted by their score:
    SELECT S.gameID, S.score, G.*, P.*
      FROM players AS P, games AS G, scores AS S
     WHERE G.gameID = $1
       AND G.gameID = S.gameID
       AND S.playerID = P.playerID
     ORDER BY S.score DESC

Game Example #2
Denormalized: the scores table is folded into games as repeated attribute pairs.
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE games (
      gameID    INT PRIMARY KEY,
      playerID1 INT REFERENCES players (playerID),
      score1    INT,
      playerID2 INT REFERENCES players (playerID),
      score2    INT,
      playerID3 INT REFERENCES players (playerID),
      score3    INT,
      ⋮
    );

Arrays
- Denormalize 1:n relationships by storing multiple values in a single attribute.
  - Oracle: VARRAY
  - Postgres: ARRAY
  - DB2/MSSQL/SQLite: UDTs
  - MySQL: Fake it with VARBINARY
- Requires you to modify your application to manage these arrays.
  - DBMS will not enforce foreign key constraints.

Game Example #2
Store the player IDs and scores of a game as two parallel arrays (INT ARRAY in standard SQL, INT[] in Postgres):
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE games (
      gameID    INT PRIMARY KEY,
      playerIDs INT[],
      scores    INT[],
      ⋮
    );

    INSERT INTO players VALUES (1), (2), (3), (4), (5);
    INSERT INTO games VALUES (
      1,                           --gameID
      '{4, 3, 1, 5, 2}',           --playerIDs
      '{900, 800, 700, 600, 500}'  --scores
    );

idx is not a standard SQL function. See: https://wiki.postgresql.org/wiki/Array_Index
    CREATE OR REPLACE FUNCTION idx(anyarray, anyelement)
      RETURNS INT AS
    $$
      SELECT i FROM (
        SELECT generate_series(array_lower($1,1), array_upper($1,1))
      ) g(i)
      WHERE $1[i] = $2
      LIMIT 1;
    $$ LANGUAGE SQL IMMUTABLE;

    SELECT P.*, G.scores[idx(G.playerIDs, P.playerID)] AS score
      FROM players AS P JOIN games AS G
        ON P.playerID = ANY(G.playerIDs)
     WHERE G.gameID = $1
     ORDER BY score DESC;
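
To make the semantics of the array lookup concrete, here is a plain-Python sketch of what the query computes for the sample game. It is not how Postgres evaluates it; the dictionary layout and the Python idx function are assumptions that mirror the SQL idx() helper, including its 1-based positions.

```python
def idx(array, element):
    """1-based position of the first match, or None if absent."""
    for position, value in enumerate(array, start=1):
        if value == element:
            return position
    return None


players = [1, 2, 3, 4, 5]
game = {
    "gameID": 1,
    "playerIDs": [4, 3, 1, 5, 2],
    "scores": [900, 800, 700, 600, 500],
}

# JOIN ... ON playerID = ANY(playerIDs), then pick the score that sits at
# the same position as the player's ID in the parallel array.
rows = []
for player_id in players:
    position = idx(game["playerIDs"], player_id)
    if position is not None:
        rows.append((player_id, game["scores"][position - 1]))

rows.sort(key=lambda row: row[1], reverse=True)  # ORDER BY score DESC
print(rows)
```

The pairing of a player with a score exists only through matching positions in the two arrays, which is why the application, not the DBMS, has to keep them consistent.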

Game Example #2
Alternative: store (playerID, score) pairs in a single two-dimensional array.
    CREATE TABLE players (
      playerID INT PRIMARY KEY,
      ⋮
    );
    CREATE TABLE games (
      gameID       INT PRIMARY KEY,
      playerScores INT[][], -- (playerID, score)
      ⋮
    );

    INSERT INTO players VALUES (1), (2), (3), (4), (5);
    INSERT INTO games VALUES
      (1, '{{4,900}, {3,800}, {1,700}, {5,600}, {2,500}}');

No easy way to query this in pure SQL…

Decomposition
- Split physical tables up to reduce the overhead of reading data during query execution.
  - Vertical: Break off attributes from a table.
  - Horizontal: Split data from a single table into multiple tables.
- This is an application of normalization.

Wikipedia Example
    CREATE TABLE pages (
      pageID  INT PRIMARY KEY,
      title   VARCHAR UNIQUE,
      latest  INT REFERENCES revisions (revID),
      updated DATETIME
    );
    CREATE TABLE revisions (
      revID   INT PRIMARY KEY,
      pageID  INT REFERENCES pages (pageID),
      content TEXT,
      updated DATETIME
    );

Wikipedia Example
Load latest revision for page:
    SELECT P.*, R.*
      FROM pages AS P INNER JOIN revisions AS R
        ON P.latest = R.revID
     WHERE P.pageID = $1
Get all revision history for page:
    SELECT R.revID, R.updated, ...
      FROM revisions AS R
     WHERE R.pageID = $1

Wikipedia Example
Avg. size of a Wikipedia revision (the content attribute of revisions): ~16KB

Vertical Decomposition
- Split out large attributes into a separate table (aka “normalize”).
- Trade-offs:
  - Some queries will have to perform a join to get the data that they need.
  - But other queries will read less data.

Vertical Decomposition
The content attribute moves out of revisions into its own table:
    CREATE TABLE revisions (
      revID   INT PRIMARY KEY,
      pageID  INT REFERENCES pages (pageID),
      updated DATETIME
    );
    CREATE TABLE revData (
      revID   INT REFERENCES revisions (revID),
      content TEXT
    );
Loading the latest revision now needs one more join:
    SELECT P.*, R.*, RD.*
      FROM pages AS P, revisions AS R, revData AS RD
     WHERE P.pageID = $1
       AND P.latest = R.revID
       AND R.revID = RD.revID
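
A small sqlite3 sketch of the trade-off, with made-up sample rows (the titles, dates, and contents are assumptions, and the foreign key from pages.latest is omitted to keep the setup short). The history query never touches the large content attribute, while loading the latest revision pays for an extra join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (
        pageID INT PRIMARY KEY, title VARCHAR UNIQUE,
        latest INT, updated DATETIME);
    CREATE TABLE revisions (
        revID INT PRIMARY KEY,
        pageID INT REFERENCES pages (pageID),
        updated DATETIME);
    CREATE TABLE revData (
        revID INT REFERENCES revisions (revID),
        content TEXT);
    INSERT INTO pages VALUES (1, 'Main', 11, '2016-11-02');
    INSERT INTO revisions VALUES
        (10, 1, '2016-11-01'), (11, 1, '2016-11-02');
    INSERT INTO revData VALUES (10, 'old text'), (11, 'new text');
""")

# Revision history: reads only the small revisions table.
history = conn.execute("""
    SELECT R.revID, R.updated
      FROM revisions AS R
     WHERE R.pageID = ?
     ORDER BY R.revID""", (1,)).fetchall()

# Latest revision: needs the extra join to reach the content.
latest = conn.execute("""
    SELECT P.title, RD.content
      FROM pages AS P, revisions AS R, revData AS RD
     WHERE P.pageID = ?
       AND P.latest = R.revID
       AND R.revID = RD.revID""", (1,)).fetchone()

print(history, latest)
```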

Horizontal Decomposition
- Replace a single table with multiple tables where tuples are assigned to a table based on some condition.
- Can mask the changes to the application using views and triggers.

Horizontal Decomposition
    CREATE TABLE revisions (
      revID   INT PRIMARY KEY,
      pageID  INT REFERENCES pages (pageID),
      updated DATETIME
    );
All new revisions are first added to this table:
    CREATE TABLE revDataNew (
      revID   INT REFERENCES revisions (revID),
      content TEXT
    );
Then a revision is moved to this table when it is no longer the latest:
    CREATE TABLE revDataOld (
      revID   INT REFERENCES revisions (revID),
      content TEXT
    );
A view hides the split from the application:
    CREATE VIEW revData AS
      (SELECT * FROM revDataNew)
      UNION
      (SELECT * FROM revDataOld)
The revision history query is unchanged:
    SELECT R.revID, R.updated, ...
      FROM revisions AS R
     WHERE R.pageID = $1
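
A hedged sqlite3 sketch of the move-on-update logic. Here the application function add_revision (a hypothetical name) plays the role that a trigger would play in a real deployment, the foreign keys are omitted for brevity, and the view is written without parentheses around the SELECTs because SQLite does not accept them there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revDataNew (revID INT, content TEXT);
    CREATE TABLE revDataOld (revID INT, content TEXT);
    CREATE VIEW revData AS
        SELECT * FROM revDataNew
        UNION
        SELECT * FROM revDataOld;
""")


def add_revision(rev_id, content, previous_latest=None):
    """Insert a new latest revision and demote the previous one."""
    with conn:  # one transaction: both tables change together
        if previous_latest is not None:
            conn.execute(
                "INSERT INTO revDataOld "
                "SELECT * FROM revDataNew WHERE revID = ?",
                (previous_latest,))
            conn.execute(
                "DELETE FROM revDataNew WHERE revID = ?",
                (previous_latest,))
        conn.execute(
            "INSERT INTO revDataNew VALUES (?, ?)", (rev_id, content))


add_revision(10, "first draft")
add_revision(11, "second draft", previous_latest=10)

new_ids = [r[0] for r in conn.execute("SELECT revID FROM revDataNew")]
old_ids = [r[0] for r in conn.execute("SELECT revID FROM revDataOld")]
all_ids = [r[0] for r in conn.execute(
    "SELECT revID FROM revData ORDER BY revID")]
print(new_ids, old_ids, all_ids)
```

Queries that only need the latest content read the small revDataNew table, while code written against revData keeps working unchanged.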

Partitioning
- Split single logical table into disjoint physical segments that are stored/managed separately.
- Ideally partitioning is transparent to the application.
  - The application accesses logical tables and doesn’t care how things are stored.
  - Not always true.

Vertical Partitioning
- Store a table’s attributes in a separate location (e.g., file, disk volume).
- Have to store tuple information to reconstruct the original record.
[Figure: Tuple#1 to Tuple#4 of a table with attributes revID, pageID, updated, content, ... split by attribute across Partition #1 and Partition #2.]

Horizontal Partitioning
- Divide the tuples of a table up into disjoint segments based on some partitioning key.
  - Hash Partitioning
  - Range Partitioning
  - Predicate Partitioning
- We will cover this more in depth when we talk about distributed databases.
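
The three schemes differ only in how a tuple is mapped to a segment. The following is a minimal sketch of that mapping; the function names, the modulo hash, and the boundary values are all assumptions chosen for illustration, not how any particular DBMS implements it.

```python
from bisect import bisect_right


def hash_partition(rev_id, num_partitions):
    """Hash partitioning: spread keys evenly, no ordering preserved."""
    return rev_id % num_partitions


def range_partition(rev_id, upper_bounds):
    """Range partitioning: upper_bounds are sorted, exclusive limits.

    Keys at or above the last bound fall into one final partition.
    """
    return bisect_right(upper_bounds, rev_id)


def predicate_partition(is_latest):
    """Predicate partitioning: an arbitrary condition picks the segment."""
    return "revDataNew" if is_latest else "revDataOld"


rev_ids = [3, 10, 11, 250, 1000]
by_hash = [hash_partition(r, 4) for r in rev_ids]
by_range = [range_partition(r, [100, 500]) for r in rev_ids]
print(by_hash)
print(by_range)
print(predicate_partition(True), predicate_partition(False))
```

Range partitioning keeps neighbouring keys together, which suits range scans, while hash partitioning trades that locality for an even spread.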

Horizontal Partitioning (Postgres)
    CREATE TABLE revisions (
      revID   INT PRIMARY KEY,
      pageID  INT REFERENCES pages (pageID),
      updated DATETIME
    );
    CREATE TABLE revData (
      revID    INT REFERENCES revisions (revID),
      content  TEXT,
      isLatest BOOLEAN DEFAULT true
    );
    CREATE TABLE revDataNew (
      CHECK (isLatest = true)
    ) INHERITS (revData);
    CREATE TABLE revDataOld (
      CHECK (isLatest = false)
    ) INHERITS (revData);
Still need triggers to move data between partitions on update.

Caching
- Repeated queries for content that rarely changes put unnecessary load on the database.
- Use external cache to store objects.
  - Memcached, Facebook Tao
- Application has to maintain consistency.
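
A minimal cache-aside sketch of what "the application has to maintain consistency" means. A plain Python dict stands in for an external cache such as Memcached, sqlite3 stands in for the database, and get_title, set_title, and db_reads are hypothetical names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (pageID INT PRIMARY KEY, title VARCHAR)")
conn.execute("INSERT INTO pages VALUES (1, 'Main Page')")

cache = {}     # stand-in for an external cache
db_reads = 0   # counts how often we actually hit the database


def get_title(page_id):
    """Read through the cache; fall back to the database on a miss."""
    global db_reads
    if page_id in cache:
        return cache[page_id]
    db_reads += 1
    row = conn.execute(
        "SELECT title FROM pages WHERE pageID = ?", (page_id,)).fetchone()
    title = row[0] if row else None
    cache[page_id] = title
    return title


def set_title(page_id, title):
    """Write to the database, then invalidate the stale cache entry."""
    conn.execute(
        "UPDATE pages SET title = ? WHERE pageID = ?", (title, page_id))
    cache.pop(page_id, None)


first = get_title(1)    # miss: goes to the database
second = get_title(1)   # hit: served from the cache
reads_before_update = db_reads
set_title(1, "Home")
third = get_title(1)    # miss again, because the entry was invalidated
print(first, second, third, db_reads)
```

If set_title skipped the invalidation step, the third read would return the stale title. That is the consistency burden that moves from the DBMS to the application.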

Auto-Tuning
- Vendors include tools that can help with the physical design process:
  - IBM DB2 Advisor
  - Microsoft AutoAdmin
  - Oracle SQL Tuning Advisor
  - Random MySQL/Postgres tools
- Still a very manual process.
- We are working on something better…

Next Three Weeks
- Database System Internals
  - Concurrency Control
  - Logging & Recovery
- Distributed DBMSs
- Column Store DBMSs