CS145: Intro to Database Management Systems


1 CS145: Intro to Database Management Systems
Lecture 1: Course Overview

2 “data is the future” – my cab driver in Pittsburgh

3 Outline
Introduction
Administrative stuff
What is a database and why do we use it?
Summary

4 Big Data Landscape… Infrastructure is Changing
New tech. Same Principles.

5 Why should you study databases?
Mercenary: make more $$$
  Startups need DB talent right away = low employee #
  Massive industry…
Intellectual:
  Science: data poor to data rich
    No idea how to handle the data!
  Fundamental ideas to/from all of CS: systems, theory, AI, logic, stats, analysis…
  Many great computer systems ideas started in DB.

6 What this course is (and is not)
Discuss fundamentals of data management
  How to query databases, design databases, build applications with them.
Not how to be a DBA or how to tune Oracle 12g.
We won’t get to cover the principles of how to build database management systems; for those, see 245, 345, and 346.

7 Who we are…
Instructor (me): Chris Ré (sounds like Ray)
  Faculty in the InfoLab
  Research: theory of data processing, statistical analytics, and machine reading.
  Office hours: MW in Gates 433

8 Course Assistants (CAs)!
Remember: CAs are people (students) too!

9 Joy Kim Angela Gong Sam Keller Kevin McKenzie Curran Kaushik Vien Dinh Duong Michael Fitzpatrick Firas Abuzaid Vishnu Sundaresan Raven Jiang Gina Pai Yifei Huang Patrick Harvey

10 Communication w/ Course Staff
Piazza, Course mailing list, Office hours, and By appointment! All are (or will be soon) listed on the course page!

11 Course Logistics cs145.stanford.edu
The Webpage contains the most up-to-date information. cs145.stanford.edu

12 Course Elements
This class is semi-flipped: Learn from your classmates!
  Some classes are flipped, some are not… The Red F is your guide!
Attendance (10%)
Lectures or Videos per week
Videos and Slides.

13 Lectures
Lecture slides cover essential material
  You can (almost) always watch Jennifer instead!
  Database Systems and Locking are new this time.
Try to cover same thing in many ways: lecture, lecture notes, homework, exams (no shock)
Attendance makes your life easier…
  8 lectures mandatory… must attend GUEST LECTURES!

14 Graded Elements
Attendance (10%): 8 classes.
Problem Sets & EdX Questions (20%)
  You can retake EdX until you get a perfect score.
Programming project (20%)
  Auction base. Up now!
Midterm & final exam (20%/30% of grade)
All but the final assignment are due on Monday before class.

15 What is expected from you
Attend lectures
  If you don’t, it’s at your own peril
Be active
  Ask questions, post comments on forums
Do programming and homework projects
  Start early and be honest.
Study for tests and exams.

16 Now to databases...

17 What is a DBMS?
A database is a large, integrated collection of data
  Models a real-world enterprise
    Entities (e.g., Students, Courses)
    Relationships (e.g., Alice is enrolled in 145)
A Database Management System (DBMS) is a piece of software designed to store and manage databases

18 A Motivating, Running Example
Consider building a course management system (CMS):
  Entities: students, courses, professors
  Relationships: who takes what, who teaches what

19 Data models
A data model is a collection of concepts for describing data
A schema is a description of a particular collection of data, using the given data model
The relational model of data is the most widely used model today
  Main concept: relation, essentially a table
  Every relation has a schema describing types, etc.

20 “Relational databases form the bedrock of western civilization” – Bruce Lindsay, IBM Research

21 Modeling the CMS
Logical Schema
  Students(sid: string, name: string, gpa: float)
  Courses(cid: string, cname: string, credits: int)
  Enrolled(sid: string, cid: string, grade: string)
Relations
  Students
    sid   name   gpa
    101   Bob    3.2
    123   Mary   3.8
  Courses
    cid   cname   credits
    564   564-2   4
    308   417     2
  Enrolled
    sid   cid   grade
    123   564   A
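The logical schema on this slide can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the choice of SQLite, the primary keys, and the foreign-key references are assumptions for illustration and are not stated on the slide. The rows are the ones shown in the slide's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema from the slide. PRIMARY KEY / REFERENCES are assumptions.
conn.executescript("""
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, gpa REAL);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT REFERENCES Students(sid),
                       cid TEXT REFERENCES Courses(cid),
                       grade TEXT);
""")

# Sample rows as they appear on the slide.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("101", "Bob", 3.2), ("123", "Mary", 3.8)])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)",
                 [("564", "564-2", 4), ("308", "417", 2)])
conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", ("123", "564", "A"))

# Who takes what, and with which grade?
rows = conn.execute("""
    SELECT s.name, c.cname, e.grade
    FROM Enrolled e
    JOIN Students s ON s.sid = e.sid
    JOIN Courses  c ON c.cid = e.cid
""").fetchall()
print(rows)  # [('Mary', '564-2', 'A')]
```

The Enrolled relation carries the "who takes what" relationship, so answering the question means joining it with the two entity tables.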

22 Other Schemata…
Physical Schema: describes data layout
  Relations as unordered files
  Some data in sorted order (index)
Logical Schema: previous slide
External Schema (Views):
  Course_info(cid: string, enrollment: integer)
  Derived from other tables for “authorized users”
Administrators, Applications
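As a rough illustration of an external schema, the Course_info view can be defined on top of the Enrolled relation. This is a sketch only: it uses Python's sqlite3 module, assumes enrollment means "number of Enrolled rows per course", and adds two made-up Enrolled rows so the counts are not all 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
INSERT INTO Enrolled VALUES ('123', '564', 'A');
-- The next two rows are invented for illustration.
INSERT INTO Enrolled VALUES ('101', '564', 'B');
INSERT INTO Enrolled VALUES ('101', '308', 'A');

-- External schema: a view derived from the Enrolled table.
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment
    FROM Enrolled
    GROUP BY cid;
""")

# Users of the view never see sid or grade, only the derived columns.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)  # [('308', 1), ('564', 2)]
```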

23 Data independence
Applications do not need to worry about how the data is structured and stored
Logical data independence is protection from changes in the logical structure of the data
Physical data independence is protection from physical layout changes
NB: One of the most important reasons to use a DBMS
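Physical data independence can be seen in a small experiment: changing the physical layout (here, adding an index) leaves the application's query and its answer unchanged. This is a hedged sketch with Python's sqlite3 module; the index name idx_students_gpa and the query are hypothetical choices for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("101", "Bob", 3.2), ("123", "Mary", 3.8)])

# The application's query; it never mentions files or indexes.
QUERY = "SELECT name FROM Students WHERE gpa > 3.5"

before = conn.execute(QUERY).fetchall()

# Change the physical layout: keep gpa values in sorted order via an index.
conn.execute("CREATE INDEX idx_students_gpa ON Students(gpa)")

after = conn.execute(QUERY).fetchall()
print(before, after)  # same answer before and after the layout change
```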

24 Challenges with Many Users
CMS application serves 1000s+ of users
Security: Different users, different roles
Performance: Need to provide concurrent access
Consistency: Concurrency can lead to update problems
Disk/SSD access is slow; the DBMS hides the latency by doing more CPU work concurrently
The DBMS allows users to write programs as if they were the only user.
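One classic "update problem" is the lost update. The sketch below simulates it deterministically in plain Python, with no real threads or DBMS involved; the enrollment counter and the two transactions T1 and T2 are made up for the example.

```python
# Two users each try to add one student to the same enrollment counter.
db = {"enrollment": 10}

# Interleaved schedule with no concurrency control:
t1_read = db["enrollment"]      # T1 reads 10
t2_read = db["enrollment"]      # T2 reads 10
db["enrollment"] = t1_read + 1  # T1 writes 11
db["enrollment"] = t2_read + 1  # T2 writes 11, overwriting T1's update
interleaved = db["enrollment"]

# Serial schedule: T1 runs to completion, then T2.
db = {"enrollment": 10}
db["enrollment"] = db["enrollment"] + 1  # T1
db["enrollment"] = db["enrollment"] + 1  # T2
serial = db["enrollment"]

print(interleaved, serial)  # 11 12
```

The interleaved run loses one of the two updates, which is exactly what the DBMS prevents when it lets each user program as if they were the only user.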

25 Transactions
Key concept is a transaction: an atomic sequence of DB actions (reads/writes)
Transactions leave the DB in a consistent state
  Users may write integrity constraints, e.g., each course is assigned to exactly one room
  But the DBMS does not understand the real semantics of the data, so the consistency burden is still on the user!
Atomicity: An action either completes entirely or not at all
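To make atomicity and integrity constraints concrete, here is a minimal sketch using Python's sqlite3 module. The table name Assigned and the room names are hypothetical; the constraint "each course is assigned to exactly one room" is approximated by making cid the primary key and room NOT NULL. The transaction's second insert violates the constraint, so the whole transaction is rolled back, including its first insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity constraint (approximation): one row per course, room required.
conn.execute(
    "CREATE TABLE Assigned (cid TEXT PRIMARY KEY, room TEXT NOT NULL)")
conn.execute("INSERT INTO Assigned VALUES ('145', 'Room A')")
conn.commit()

try:
    # 'with conn' commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("INSERT INTO Assigned VALUES ('245', 'Room B')")
        # Violates the constraint: a second room for course 145.
        conn.execute("INSERT INTO Assigned VALUES ('145', 'Room C')")
except sqlite3.IntegrityError:
    pass  # the transaction was rolled back as a unit

rows = conn.execute(
    "SELECT cid, room FROM Assigned ORDER BY cid").fetchall()
print(rows)  # [('145', 'Room A')], the insert for 245 was undone too
```

Note that the DBMS only enforces the constraint as declared; whether "exactly one room" is the right rule for the real world is still up to the user.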

26 Scheduling concurrent transactions
DBMS ensures that execution of {T1, …, Tn} is equivalent to some serial execution
Locking: before reading or writing, a transaction requests a lock from the DBMS and holds it until the end
Idea: if Ti writes an item x and Tj reads x, then Ti and Tj conflict
  Only one winner gets the lock; the loser is blocked until the winner finishes
What if Ti asks for X before Tj and Tj asks for Y before Ti? Deadlock! One is aborted…
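The deadlock scenario on this slide can be sketched with a toy lock table. This is an illustrative model, not how any particular DBMS implements locking: it assumes exclusive locks only, and it detects deadlock by following a "waits-for" chain (a common textbook technique that the slide does not name). The class and method names are hypothetical.

```python
class LockManager:
    """Toy exclusive-lock table with waits-for deadlock detection."""

    def __init__(self):
        self.owner = {}      # item -> transaction holding its lock
        self.waits_for = {}  # blocked transaction -> transaction it waits on

    def acquire(self, txn, item):
        """Return 'granted', 'blocked', or 'deadlock'."""
        holder = self.owner.get(item)
        if holder is None or holder == txn:
            self.owner[item] = txn
            return "granted"
        # Follow the waits-for chain from the holder; reaching txn is a cycle.
        cur = holder
        while cur is not None:
            if cur == txn:
                return "deadlock"
            cur = self.waits_for.get(cur)
        self.waits_for[txn] = holder
        return "blocked"

    def release_all(self, txn):
        """Drop every lock held by txn (at commit or abort)."""
        self.owner = {i: t for i, t in self.owner.items() if t != txn}
        self.waits_for = {w: h for w, h in self.waits_for.items()
                          if w != txn and h != txn}


lm = LockManager()
r1 = lm.acquire("T1", "X")   # T1 asks for X first
r2 = lm.acquire("T2", "Y")   # T2 asks for Y first
r3 = lm.acquire("T1", "Y")   # T1 now waits for T2
r4 = lm.acquire("T2", "X")   # T2 would wait for T1: a cycle
lm.release_all("T2")         # one transaction is aborted
r5 = lm.acquire("T1", "Y")   # T1 can now proceed
print(r1, r2, r3, r4, r5)    # granted granted blocked deadlock granted
```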

27 Ensuring Atomicity
DBMS ensures atomicity (the all-or-nothing property), even if a transaction crashes!
Idea: Keep a log of all writes the DB does
  Write-ahead log (WAL): before a change is made, the corresponding log entry is forced to disk
Idea: After a crash, partially executed transactions are undone using the log
  NB: Thanks to WAL, if a log entry is not present, then the change was not applied to the DB

28 More details about the log
The following actions are in the log:
  Ti writes an object: old value and new value
  Ti commits/aborts
Log records are chained by Xact ID, so undo is easy
Log is on “stable” storage
All log maintenance and concurrency handled transparently by DBMS
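A write-ahead log with undo can be sketched in a few lines. This is a toy model under several assumptions: a Python list stands in for the log on stable storage, a dict stands in for the database, None marks "no old value", and recovery only undoes uncommitted writes (real recovery protocols do considerably more). The class and method names are hypothetical.

```python
class ToyDB:
    """In-memory stand-in for a DB with a write-ahead undo log."""

    def __init__(self):
        self.data = {}  # the "database"
        self.log = []   # stands in for the log on stable storage

    def write(self, txn, key, new):
        old = self.data.get(key)
        # Write-ahead: the log record (old and new value) goes first...
        self.log.append(("write", txn, key, old, new))
        # ...and only then is the change applied to the DB.
        self.data[key] = new

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        """After a crash, undo writes of transactions with no commit record."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in reversed(self.log):  # undo newest writes first
            if rec[0] == "write" and rec[1] not in committed:
                _, _, key, old, _ = rec
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old


db = ToyDB()
db.write("T1", "x", 1)
db.commit("T1")
db.write("T2", "x", 2)  # T2 is partially executed...
db.write("T2", "y", 5)  # ...and "crashes" before committing
db.recover()
print(db.data)  # {'x': 1}
```

Because every write record carries the transaction ID and the old value, undoing T2 is a backward scan that restores old values, which is the "easy undo" the slide refers to.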

29 Friends of Databases (people made happy)
End users and DBMS vendors
  Reduces cost and makes money
DB application programmers
  e.g., smart webmasters
Database administrators (DBA)
  Designs logical/physical schema
  Handles security/authorization
  Tuning, crash recovery, and more…
Must understand DB internals

30 Summary of DBMS
DBMS used to maintain, query, and manage large datasets.
  Provides concurrency, recovery from crashes, quick application development, integrity, and security
Key abstractions give independence
DBMS R&D is one of the broadest, most exciting fields in CS. Fact!

