1 Database Management Systems CS 564 Lecture #1 (with some slides integrated from those of Raghu Ramakrishnan, Jeff Ullman, Alon Halevy, and Dan Suciu.)



Yes. This is the Room for CS 564 We moved from Humanities 1111 All future lectures/discussions will be in this room Please sit a bit closer to the screen, so that I don’t have to shout Room doors are usually locked; I will unlock 15 minutes before each class 2

3 A Bit about Myself Born in Vietnam Grew up in a fishing village Nice name: AnHai Doan “Nghe An” “Hai Phong” Until my brother was born as HaiAn Doan

4 Vietnam → Hungary → US High school in Vietnam Undergrad in Hungary –had a lot of beer –learned seven languages –Hungarian, English, C, C++, Ada, Pascal, PL/I When the Iron Curtain fell back in 1993, one of the first to reach the US to study

5 Wisconsin → Seattle → Illinois → Wisconsin Masters at Wisconsin-Milwaukee Ph.D. at Washington-Seattle –where I failed to take “CS 564” Started at Univ of Illinois-Urbana –with corn, cow, campus In Madison since 2006 –where the four major food groups are

6 Random Comments from Students Take instruction seriously, … gave lots of really excellent dating advice All in-class examples revolve around beer His accent is very annoying … His accent is great. It’s so hard to understand that I’m forced to concentrate in lectures … His accent is a bonus feature of the class. Prepared me to work in Silicon Valley I now love databases …When I own Oracle, I will pay you back.

What is this Course about? Numerous applications must deal with a lot of data They typically put data into a database The database will be managed by a system called database management system Applications then interact with this system to access and use the database 7

An Illustration 8 [Figure: applications App 1 and App 2 interact with a database management system, which manages databases DB 1, DB 2, and DB 3.]

Questions What form should the data be in? –way back in the 1970s, people suggested storing data in tables –so each database is a set of tables 9

Students
ID | First Name | Last Name
1  | Barack     | Obama
2  | George     | Bush

Addresses
ID | City       | State
1  | Washington | DC
2  | Dallas     | TX

Questions What form should the data be in? –each table can be thought of as a relation in the mathematical sense –so such a database is referred to as a relational DB 10

Students
ID | First Name | Last Name
1  | Barack     | Obama
2  | George     | Bush

Addresses
ID | City       | State
1  | Washington | DC
2  | Dallas     | TX
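To make the "table as a relation" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (FirstName, LastName) and the choice of SQLite are assumptions made for illustration; the slides do not prescribe a particular system. Each table is read back as a set of tuples, which is what a relation is in the mathematical sense.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Two tables mirroring the slide (column spellings are assumed).
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)")
conn.execute("CREATE TABLE Addresses (ID INTEGER PRIMARY KEY, City TEXT, State TEXT)")
conn.execute("INSERT INTO Students VALUES (1, 'Barack', 'Obama'), (2, 'George', 'Bush')")
conn.execute("INSERT INTO Addresses VALUES (1, 'Washington', 'DC'), (2, 'Dallas', 'TX')")

# A relation is a set of tuples: row order carries no meaning.
students = set(conn.execute("SELECT ID, FirstName, LastName FROM Students"))
addresses = set(conn.execute("SELECT ID, City, State FROM Addresses"))

print(students)
print(addresses)
```

Because the rows are collected into Python sets, the order in which SQLite returns them does not matter, which matches the relational view of a table.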

So the management system is called a relational database management system (or RDBMS for short) 11 [Figure: applications App 1 and App 2 interact with a database management system, which manages databases DB 1, DB 2, and DB 3.]

Since the 1970s, RDBMSs have been studied intensively, and have taken over the world They are now a cornerstone of the modern world Powering virtually all data-intensive apps $20B industry Bought island in Hawaii Since then new types of data have emerged –that are not well suited to being modeled as tables 12

New types of database management systems have also emerged –e.g., NoSQL systems But RDBMSs remain foundational and pervasive, and will be so in the future This class focuses on RDBMSs –we will learn how to design a relational database –how to store it in an RDBMS –how to use an RDBMS –look into the internals of an RDBMS 13

Lessons that you learn in this class will carry over to newer types of database management systems You will learn fundamentals of managing a large amount of data –critical as the world is becoming increasingly data centric Good for you when you go applying for a job –many jobs require knowing how to use RDBMSs It’s fun 14

If you are interested in more data management stuff –CS 764: gory details about RDBMSs –CS 784: newer types of data and how to manage them (beyond RDBMSs) 15

16 Course Logistics

17 Prerequisite Must have data structures and algorithms background –CS 367 is a must; CS 537 might be useful For the project –a lot of programming will be required –in a high-level language of your own choosing (or rather your team’s choosing) –could be Java, C, C++, Perl, Python, etc. –must know how to build a Web-based application or be willing to learn

18 Textbook –There is no ideal textbook, unfortunately –Database Management Systems, by R. Ramakrishnan and J. Gehrke, third edition –Database Systems: The Complete Book, by Garcia-Molina, Ullman and Widom, second edition –The best thing to do is to attend the lectures, make notes, and read the lecture notes –Consult the textbooks –If you do this, you will be fine

19 Course Format For all students –two 75-min lectures / week –project: programming, 4-5 stages, may include some basic homework questions –a midterm and a final exam Attending lectures on Wed/Fri is important We also use the Mon slots occasionally for make-up lectures So if you can’t make Monday 2:25-3:15, do not take the class

In fact, for next week I’m traveling on W and F So we will have a make-up lecture on Monday, Jan 26 20

21 Lectures Lecture slides in ppt format will be posted shortly before or after the lecture –are to complement the lectures Many issues discussed in the lectures will be covered in the exams –hence try to attend lectures regularly Will not cover ALL materials on the slides –attending lectures will tell you which is covered and which is not

22 Project Select an application that needs a database Build a database application from start to finish Significant amount of programming Will be done in stages –you will submit some work at the end of each stage May have to show a demo at semester end

23 Project Groups Project will be done in groups of 3-4 students –a lot of work, difficult to design so that one person can do it all –learn how to work in a group: a valuable skill –groups are like broccoli, they are good for you Try to form groups as soon as possible –can start by posting requests on Piazza There will be a deadline later for forming groups If you have not formed groups by then –we will help assign you to groups

24 More on Grouping All group members receive the same grade If someone drops out, the rest pick up the work

25 Exams Midterm & final –will be announced shortly –check dates and make sure no conflict! There may be some brief review before each exam If you have conflicts –do let us know in advance The Uncle problem

26 Tentative Grading Breakdown Midterm: 25% Final: 35% Project: 40% Will attempt to grade on an absolute scale as much as possible –not on a curve

27 Contacting the staff...

28 Staff & Office Hours Instructor: AnHai Doan TAs: –Avinaash Gupta –Harneet Singh See class homepage for office hours, contact information

29 Communications class homepage – mailing list: –vitally important! –make sure to check it regularly for new announcements Piazza: will be set up shortly If you have a question/problem –talk to people in your group first –post your question on Piazza – TA –go to office hours to talk to TA or instructor

30 Now onto database studies...

At the Beginning A program typically consists of code + data E.g., need to sort 1000 numbers –2, 4, 6, 8, 1, 13, 9,... Store these numbers in an array Write some code to sort Both code + data are stored in memory, and mixed together –this was typical of the sort programs you learned in CS

Eventually people realized that –the data part could be huge; maybe not sorting 1000 numbers, but 1 trillion numbers –this posed serious problems: what happens if the data doesn’t fit into memory? –another issue is that many apps may want to access and do the same thing with data –should we write duplicate code for each of these apps? –maybe we should factor out common code –thus the motivation for databases and DB management systems 32

An Illustration 33 [Figure: applications App 1 and App 2 interact with a database management system, which manages databases DB 1, DB 2, and DB 3.]

34 Another Motivating Example Suppose we want to store, manipulate, and query information about: –students –courses –professors –who takes what, who teaches what

35 Application Requirements store the data for a long period of time –large amounts (100s of GB) –protect against crashes –protect against unauthorized use allow users to query/update: –who teaches “CS 367” –enroll “Mary” in “CS 564”

36 allow several (100s, 1000s) users to access the data simultaneously allow administrators to change the schema –add information about TAs

37 Trying Without a DBMS Why Direct Implementation Won’t Work: Storing data: file system is limited –size less than 4GB (on 32-bit machines) –when system crashes we may lose data –password-based authorization insufficient Query/update: –need to write a new C++/Java program for every new query –need to worry about performance

38 Concurrency: limited protection –need to worry about interfering with other users –need to offer different views to different users (e.g. registrar, students, professors) Schema change: –entails changing file formats –need to rewrite virtually all applications Better let a database system handle it
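The schema-change problem can be sketched in a few lines of Python. Everything here is hypothetical and invented for illustration: the flat-file layout (one "professor,course" pair per line), the names, and the helper functions. The point is only that a hand-written query hard-codes the file format, so adding a TA column breaks it.

```python
# Hypothetical flat file: one "professor,course" pair per line (not from the lecture).
OLD_FORMAT = "Smith,CS 367\nJones,CS 564"
# The same data after a schema change that appends a TA column.
NEW_FORMAT = "Smith,CS 367,Lee\nJones,CS 564,Kim"


def who_teaches(course, raw):
    """Hand-written 'query': scan every line and filter on the course field."""
    professors = []
    for line in raw.splitlines():
        professor, taught = line.split(",")  # hard-codes the two-field layout
        if taught == course:
            professors.append(professor)
    return professors


def survives_schema_change(course, raw):
    """Return True if the hand-written query still runs on this file layout."""
    try:
        who_teaches(course, raw)
        return True
    except ValueError:  # tuple unpacking fails once a third field appears
        return False


print(who_teaches("CS 367", OLD_FORMAT))
print(survives_schema_change("CS 367", OLD_FORMAT))
print(survives_schema_change("CS 367", NEW_FORMAT))
```

On the old layout the query finds Smith; on the new layout the same function raises ValueError, so every program that parses the file would need to be rewritten. Each new question (for example, "which courses does Jones teach?") would also need its own hand-written function.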

39 What Can a DBMS Do for Us? Data Definition Language - DDL Data Manipulation Language - DML –query language Storage management Transaction Management –concurrency control –recovery Think buying a plane ticket! Can you do it without a DBMS?
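The plane-ticket example can be sketched with Python's built-in sqlite3 module. The table names (Seats, Payments), the seat numbers, and the `book` helper are assumptions made for illustration, and the sketch shows only the all-or-nothing side of transaction management for a single connection, not concurrency control between many users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (table and column names are illustrative).
conn.execute("CREATE TABLE Seats (seat TEXT PRIMARY KEY, passenger TEXT)")
conn.execute("CREATE TABLE Payments (passenger TEXT, amount INTEGER)")

# DML: populate the table with two free seats.
conn.execute("INSERT INTO Seats VALUES ('12A', NULL), ('12B', NULL)")
conn.commit()


def book(seat, passenger, amount):
    """Charge the passenger and assign the seat as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO Payments VALUES (?, ?)", (passenger, amount))
            cur = conn.execute(
                "UPDATE Seats SET passenger = ? WHERE seat = ? AND passenger IS NULL",
                (passenger, seat),
            )
            if cur.rowcount != 1:
                raise ValueError("seat is not available")
        return True
    except ValueError:
        return False


first = book("12A", "Mary", 300)
second = book("12A", "John", 300)  # same seat: the payment must be undone
payments = conn.execute("SELECT passenger, amount FROM Payments").fetchall()
holder = conn.execute("SELECT passenger FROM Seats WHERE seat = '12A'").fetchone()[0]
print(first, second, payments, holder)
```

The second booking fails after its payment row has already been inserted, and the rollback removes that row again. Without transactions, the application would have to undo the partial work by hand.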

40 What Can a DBMS Do for Us? Automate a lot of boring/mundane operations on data –so that we don’t have to program over and over –so that we can write complex data manipulations in just a few lines, so that we can concentrate on app logic Make execution very fast –so that it scales up to very large data sets Make concurrent access/modification possible –so that many users can use the data at the same time

41 Building an Application with a DBMS Requirements modeling (conceptual, pictures) –Decide what entities should be part of the application and how they should be linked. Schema design and implementation –Decide on a set of tables, attributes. –Define the tables in the database system. –Populate database (insert tuples). Write application programs using the DBMS –way easier now that the data management is taken care of.

42 Conceptual Modeling [Figure: entity-relationship diagram with entities Student, Course, and Professor; relationships Takes, Teaches, and Advises; attribute labels name, ssn, cid, category, quarter, address, field.]

43 Schema Design and Implementation Tables: Separates the logical view from the physical view of the data. Students:Takes: Courses:

44 Querying a Database Find all courses that “Mary” takes S(tructured) Q(uery) L(anguage)
    select C.name
    from Students S, Takes T, Courses C
    where S.name = 'Mary' and S.ssn = T.ssn and T.cid = C.cid
Query processor figures out how to answer the query efficiently.
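The query on this slide can be run end to end with Python's built-in sqlite3 module. The sample rows, the ssn and cid values, and the column types below are assumptions made for illustration; only the table names, column names, and the query itself come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema following the slide's column names; the rows are made-up sample data.
conn.execute("CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Courses (cid TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Takes (ssn TEXT, cid TEXT)")
conn.execute("INSERT INTO Students VALUES ('111', 'Mary'), ('222', 'John')")
conn.execute("INSERT INTO Courses VALUES ('CS564', 'Databases'), ('CS367', 'Data Structures')")
conn.execute("INSERT INTO Takes VALUES ('111', 'CS564'), ('222', 'CS367'), ('111', 'CS367')")

# The slide's query, with the student name passed as a parameter.
QUERY = """
    SELECT C.name
    FROM Students S, Takes T, Courses C
    WHERE S.name = ? AND S.ssn = T.ssn AND T.cid = C.cid
"""

# Sort the result because SQL does not promise an order without ORDER BY.
mary_courses = sorted(name for (name,) in conn.execute(QUERY, ("Mary",)))
print(mary_courses)
```

The query says nothing about how to find the answer: which table to scan first and how to match rows are left to the query processor.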

45 Query Optimization Declarative SQL query:
    select C.name
    from Students S, Takes T, Courses C
    where S.name = 'Mary' and S.ssn = T.ssn and T.cid = C.cid
Imperative query execution plan: [Figure: operator tree over Students, Takes, and Courses, with joins on sid=sid and cid=cid, a selection name='Mary', and a projection on sname.]
Plan: tree of Relational Algebra operators, choice of algorithms at each operator Goal:
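To show what an imperative plan looks like next to the declarative query, here is a hand-built plan in plain Python. It is a sketch under assumptions: the sample rows are invented, the join key is called ssn as in the SQL text (the diagram labels it sid), and the hash join is just one possible algorithm choice, not necessarily what any particular optimizer would pick.

```python
# Made-up sample relations, stored as lists of tuples.
students = [("111", "Mary"), ("222", "John")]                     # (ssn, name)
takes = [("111", "CS564"), ("222", "CS367"), ("111", "CS367")]    # (ssn, cid)
courses = [("CS564", "Databases"), ("CS367", "Data Structures")]  # (cid, name)


def select(rows, predicate):
    """Selection operator: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def hash_join(left, right, left_key, right_key):
    """Equi-join: build a hash index on the right input, then probe it."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    return [l + r for l in left for r in index.get(l[left_key], [])]


def project(rows, columns):
    """Projection operator: keep only the listed column positions."""
    return [tuple(row[c] for c in columns) for row in rows]


# The plan, bottom-up: filter early, then join, then project.
marys = select(students, lambda s: s[1] == "Mary")   # rows: (ssn, name)
with_takes = hash_join(marys, takes, 0, 0)           # rows: (ssn, name, ssn, cid)
with_courses = hash_join(with_takes, courses, 3, 0)  # adds (cid, course name)
result = project(with_courses, [5])                  # keep the course name only

print(result)
```

Applying the selection before the joins keeps the intermediate results small. Joining all three tables first and filtering afterwards would give the same answer with more work, and choosing between such alternatives is the optimizer's job.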

46 Database Industry Relational databases are a great success of theoretical ideas. Big DBMS companies are among the largest software companies in the world. Oracle IBM (with DB2) Microsoft (SQL Server, Microsoft Access) Others $20B industry.

47 The Study of DBMS Several aspects: –Modeling and design of databases –Database programming: querying and update operations –Database implementation DBMS study cuts across many fields of Computer Science: OS, languages, AI, Logic, multimedia, theory...