CS 440 Database Management Systems


CS 440 Database Management Systems: Course overview

Welcome to CS440! Instructor: Arash Termehchy, Assistant Professor at EECS. Research on data management and analytics. Information & Data Management and Analytics (IDEA) Lab.

The Era of Big Data. Technological shifts, e.g., mobile devices, have created a staggering number of enormous data sets. Both opportunities and challenges.

Opportunities: unreasonable effectiveness of data. A. Halevy et al., The Unreasonable Effectiveness of Data, IEEE Intelligent Systems, 2009. Observation from working with large datasets at Google: more data generally outperforms complex statistical models in data-centric prediction and discovery. Conclusion: usually, there is no need for overly complex statistical models.

Opportunities are priceless! The story of John Snow: “In the mid-1850s, Dr. John Snow plotted cholera deaths on a map, and in the corner of a particularly hard-hit buildings was a water pump. A 19th-century version of Big Data, which suggested an association between cholera and the water pump.” Integrating data sets has saved millions of lives!

Paradigm-shifting influence on scientific discovery. “The Fourth Paradigm: Data-Intensive Scientific Discovery”, Jim Gray. The four paradigms: empirical, theoretical, computational, data-centric. The Sloan Sky Server database is a top-cited resource in the field of astronomy. Astronomical observation => database query. Spread of diseases by analyzing the Google query log. Personalized medicine, drug discovery, …

Challenges: data volume. Sloan Sky Server will soon store 30 terabytes per day. The Large Hadron Collider can generate 500 exabytes per day. 90% of the world's data was generated in the last two years (2013). Every two years: ten times more data.

Challenges: data variety/diversity. Database systems used to deal with a single static database. Now there is a need to transform and/or integrate a large number of evolving data sets, which is impossible to do manually. “A data integration expert is never without a job.”

Challenges: usability. “… (in the next few years) we project a need for 1.5 million additional analysts in the United States who can analyze data effectively …”, McKinsey Big Data Study, 2012. Current systems are not built for scientists and normal users. “It may take a PhD in computer science to successfully deploy a data analytics algorithm!”

The notion of database management system (DBMS). Data processing used to be mostly ad hoc programming. W. McGee, Generalization: Key to Successful Electronic Data Processing, Journal of the ACM, 1959. Generalization, aka abstraction or data modeling. File: a sequence of records. Operations: sort, select part of the file, … This makes data management and processing usable: people can learn and use the abstraction instead of developing new data processing programs. Two questions follow: how to build models that provide nice generalizations, and how to implement them efficiently.
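The generalization idea can be illustrated with a minimal sketch. It assumes a file is modeled as a Python list of dictionary records; the names `select` and `sort_by` and the sample `employees` data are hypothetical and not taken from the lecture. The operations are written once and can be reused on any file, instead of writing a new program per data set.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]   # one record: field name -> value
File = List[Record]          # a file: a sequence of records


def select(file: File, predicate: Callable[[Record], bool]) -> File:
    """Return the records of the file that satisfy the predicate."""
    return [record for record in file if predicate(record)]


def sort_by(file: File, field: str) -> File:
    """Return a new file with the records ordered by the given field."""
    return sorted(file, key=lambda record: record[field])


# Hypothetical sample data, for illustration only.
employees: File = [
    {"name": "Ann", "dept": "sales", "salary": 50},
    {"name": "Bob", "dept": "eng", "salary": 70},
    {"name": "Cy", "dept": "eng", "salary": 60},
]

# Compose the generic operations: engineers ordered by salary.
engineers = sort_by(select(employees, lambda r: r["dept"] == "eng"), "salary")
print([r["name"] for r in engineers])  # prints ['Cy', 'Bob']
```

The same two functions would work unchanged on a file of students or orders, which is the usability benefit the slide describes.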

Abstraction is the key. How to develop usable abstractions for our data? Data models and query languages: relational data model, graph data model, … How to implement these abstractions efficiently? Database system internals: storage management, indexing, …
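To make the split between abstraction and implementation concrete, here is a small sketch in which the same logical question ("which names belong to a department?") is answered by two implementations: a full scan and a lookup in a hash index. The data and the function names `lookup_scan`, `build_index`, and `lookup_index` are assumptions made for illustration; real systems use disk-based structures such as B+ trees.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (name, dept)

# Hypothetical sample rows, for illustration only.
rows: List[Row] = [("Ann", "sales"), ("Bob", "eng"), ("Cy", "eng")]


def lookup_scan(rows: List[Row], dept: str) -> List[str]:
    """Answer the query by examining every row (time linear in the data)."""
    return [name for name, d in rows if d == dept]


def build_index(rows: List[Row]) -> Dict[str, List[str]]:
    """Build an in-memory hash index from dept to the names in it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for name, d in rows:
        index[d].append(name)
    return index


def lookup_index(index: Dict[str, List[str]], dept: str) -> List[str]:
    """Answer the same query with one hash lookup instead of a scan."""
    return list(index.get(dept, []))


index = build_index(rows)
# Both implementations give the same answer; only the cost differs.
print(lookup_scan(rows, "eng") == lookup_index(index, "eng"))  # prints True
```

The user of the abstraction only states the question; choosing between the scan and the index is the system's job.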

Topics. How to develop usable abstractions for our data? Relational data model, graph data model, database programming. How to implement these abstractions efficiently? Storage management and indexing, query processing algorithms, query optimization, transaction management, parallel and distributed data processing.

Our plan. Learn the fundamental concepts and ideas: foundational models, algorithms, and systems, through textbooks, resources, and lectures. Apply them to new problems: apply the lessons learned to interesting database problems by doing assignments.

Learning the fundamentals: Lectures. Lectures review and discuss the material. They will be available on the course website after the class. They provide the road map for studying, since the course material can seem overwhelming. Attendance is not required but encouraged. Read the course material before the class. Participate and ask questions!

Learning the fundamentals: Readings. Textbooks: Database Management Systems, 3rd edition, R. Ramakrishnan and J. Gehrke (the "Cow book"); Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeff Ullman (free online). Papers for newer material: posted on the course website.

Learning the fundamentals: Readings. Recommended: Database Systems: The Complete Book, 2nd edition, Hector Garcia-Molina, Jeffrey Ullman, and Jennifer Widom (the "Complete book"); Foundations of Databases, Serge Abiteboul, Richard Hull, Victor Vianu (the "Alice book").

Learning the fundamentals: Exam. Midterm exam in class, closed books and notes. Tests your knowledge of the subjects discussed in the class. 40% of the overall grade. No final exam.

Apply your understanding: assignments. Seven assignments, announced on Piazza and posted on the course website. Both written and programming. Submit using TEACH. Write using word processors and submit in PDF. Start early! 60% of the overall grade.

How to get the most out of the course? Communicate with the course staff. TAs: Vahid Ghadakchi, Parisa Ataie. Piazza is the preferred method of communication. Office hours: Arash, Tuesday 4:30 – 5:30 pm; Vahid, Monday/Wednesday 4 – 5 pm; Parisa, Monday 9 – 10 am. Email the staff for other types of questions, using the [cs440] tag in the subject line. Communicate with your peers on course materials and lectures. Check Piazza and the course website for announcements or possible changes in the schedule.

What is next? A review of the relational model, relational algebra, and SQL. You will refresh your memory by working on some advanced problems on the relational model and database design.
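As a small warm-up for that review, the sketch below runs one SQL query with Python's built-in `sqlite3` module. The `student` and `enrolled` tables and their rows are hypothetical, chosen only for illustration. In relational algebra the query is a join followed by a selection and a projection: π_name(σ_course='CS440'(student ⋈ enrolled)).

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student(sid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE enrolled(sid INTEGER, course TEXT);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO enrolled VALUES (1, 'CS440'), (2, 'CS325'), (2, 'CS440');
""")

# Join the two relations, select the CS440 rows, project the name column.
rows = conn.execute(
    "SELECT s.name FROM student s JOIN enrolled e ON s.sid = e.sid "
    "WHERE e.course = ? ORDER BY s.name",
    ("CS440",),
).fetchall()

names = [name for (name,) in rows]
print(names)  # prints ['Ann', 'Bob']
conn.close()
```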