Lecture 1: Overview of CSCI 585


Lecture 1: Overview of CSCI 585 Prof. Shahram Ghandeharizadeh Director of USC Database Lab (http://dblab.usc.edu) Computer Science Department University of Southern California

Logistics Collection of technical papers: ACM/IEEE/Springer digital libraries. URLs work from USC machines. Pre-req for the course: CSCI 485: Introduction to File and Database Management, and knowledge of the C++ programming language. Extensive use of Blackboard for homework and project submissions. Make sure to have access to: http://den.usc.edu PowerPoint slides of the presentations are also available from http://dblab.usc.edu

Pre-Req 585 assumes you know the following: Transactions and their ACID properties. Concurrency control protocols such as locking and time-stamp based protocols. Crash recovery techniques such as logging and shadow paging. Physical characteristics of magnetic disks. SQL. Relational algebra operators. ER data modeling. Alternative normal forms. Visit http://dblab.usc.edu/csci485 for an overview of this material.
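As a quick refresher on the first prerequisite, the sketch below illustrates atomicity (the "A" in ACID) using Python's built-in sqlite3 module. It is not course material: the account table, the names alice and bob, and the transfer helper are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction.

    Returns True if both updates committed, False if the
    transaction was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back on an exception
            # Credit first, so a later failure leaves a partial update to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

An overdraft attempt such as transfer(conn, "alice", "bob", 500) fails on the second update, and the rollback also undoes the credit already applied to bob, so neither balance changes.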

Instructor Details Dr. Shahram Ghandeharizadeh. Office: SAL 208. E-mail: shahram@usc.edu. Phone: 213-740-4781. Office Hours: Tuesday 12:30 to 2 pm, Thursday 4:30 to 5:30 pm. Class URL: http://dblab.usc.edu/csci585

TA Shahin Shayandeh. Office: SAL 200C. E-mail: shayande@usc.edu. Office Hours: Monday 3:30 to 5 pm, Thursday 12:30 to 2 pm.

Outline Motivation for DBMS. An outline of the course material. Grading: assignments and projects.

Database Management Systems (DBMS) Used on an almost daily basis for either individual or business use. Relational database vendors formed one of the fastest growing sectors during the .COM boom!

DATABASE & DBMS Database: An integrated collection of data, usually stored on secondary storage, typically describing the activities of one or more related organizations. Database management system (DBMS): A collection of software/programs designed to assist in maintaining and utilizing large collections of data.

BEFORE DBMS [Figure: User 1 and User 2, application programs, and two separate Data stores.]

AFTER DBMS [Figure: User 1 and User 2, application programs, a DBMS, and data managed by the DBMS.]

WHY A DBMS? Reduced application development time. Data independence: application programs do not depend on data representation and storage details. Data sharing: data is better utilized (discovered and reused), and redundancy of data is minimized. Data integrity and consistency: one may enforce consistency constraints on data, e.g., number of seats sold ≤ number of seats on the plane × 1.1. Centralized control: the DBA tunes the database to balance users' needs. Security: mechanisms to prevent unauthorized access. These mechanisms are based on content rather than on a file-oriented approach. Concurrency control: avoids undesirable race conditions that arise with simultaneous access/updates to data. Crash recovery: ensures the integrity of data in the presence of failures.
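The seat-overbooking constraint above can be declared once in the schema so that the DBMS enforces it, instead of every application program checking it separately. The sketch below uses Python's built-in sqlite3 module. The flight table, its column names, the flight number XY100 and the sell_seats helper are hypothetical names chosen for illustration. The factor 1.1 is written as the equivalent integer comparison seats_sold * 10 <= seats * 11 to avoid floating-point rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flight (
        flight_no  TEXT PRIMARY KEY,
        seats      INTEGER NOT NULL,
        seats_sold INTEGER NOT NULL DEFAULT 0,
        -- seats_sold <= seats * 1.1, expressed with integers only
        CHECK (seats_sold * 10 <= seats * 11)
    )
    """
)
conn.execute("INSERT INTO flight VALUES ('XY100', 100, 0)")
conn.commit()


def sell_seats(conn, flight_no, count):
    """Try to sell `count` seats; return False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE flight SET seats_sold = seats_sold + ? WHERE flight_no = ?",
                (count, flight_no),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the CHECK constraint fails.
        return False


def seats_sold(conn, flight_no):
    """Return the number of seats sold so far on the given flight."""
    row = conn.execute(
        "SELECT seats_sold FROM flight WHERE flight_no = ?", (flight_no,)
    ).fetchone()
    return row[0]
```

With a 100-seat plane, selling 110 seats is accepted, while selling one more is rejected by the database itself, whichever application issued the update.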

DBMS ARCHITECTURE [Figure: User 1 … User n, DBMS, DB, conceptual schema, and physical data.]

An Emerging Phenomenon [Figure: User 1 and User 2, application programs, a DBMS, and data managed by the DBMS.]

Example F. Chang et al. Bigtable: A Distributed Storage System for Structured Data. In OSDI 2006. Last paragraph of the paper: “Finally, we have found that there are significant advantages to building our own storage solution at Google. We have gotten a substantial amount of flexibility from designing our own data model for Bigtable. In addition, our control over Bigtable’s implementation, and the other Google infrastructure upon which Bigtable depends, means that we can remove bottlenecks and inefficiencies as they arise.”

WHAT HAS CHANGED? Relational database technology is now more than a quarter of a century old. While concepts such as concurrency control are extremely valuable, the performance loss attributed to their use is not justified for some non-banking applications, e.g., a social networking site is not a banking application. RDBMS vendors increased functionality for their own niche, increasing complexity. Each application used a decreasing fraction of the provided features. A deployment requires a specialist, trained in database administration, for maintenance. Availability of data is paramount. Cost of downtime is estimated at thousands of dollars per minute. SQL is too general and cumbersome to use with some applications. Storage has become larger and more economical: 10 cents per Gigabyte of magnetic disk storage. Flash as a new layer in the storage hierarchy: DRAM, Flash, Disk. 7 to 8 dollars per Gigabyte of DRAM. A bank’s data (TPC benchmark) becomes main memory resident!
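To make the storage prices above concrete, the sketch below computes what it would cost to hold a data set on disk versus entirely in DRAM. The per-Gigabyte prices are the ones quoted on the slide. The 500 GB data set size is a hypothetical figure, not one taken from the lecture or from any TPC specification. Prices are kept in integer cents so the arithmetic is exact.

```python
# Prices from the slide, in cents per Gigabyte.
DISK_CENTS_PER_GB = 10        # 10 cents per GB of magnetic disk
DRAM_CENTS_PER_GB = (700, 800)  # 7 to 8 dollars per GB of DRAM


def storage_cost(size_gb):
    """Return the cost in dollars of storing `size_gb` Gigabytes on each medium."""
    low, high = DRAM_CENTS_PER_GB
    return {
        "disk": size_gb * DISK_CENTS_PER_GB / 100,
        "dram_low": size_gb * low / 100,
        "dram_high": size_gb * high / 100,
    }


# Hypothetical 500 GB data set (an assumed size, for illustration only).
costs = storage_cost(500)
for medium, dollars in costs.items():
    print(f"{medium}: ${dollars:,.2f}")
```

Under these assumptions, keeping the whole data set in DRAM costs 70 to 80 times as much as disk, yet the absolute amount stays in the low thousands of dollars, which is what makes a main-memory-resident database plausible.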

Cross-roads Since 1998, database researchers have been aware of the limitations: More modular architecture based on simple, component-based building blocks. One architecture will not satisfy all applications.

585 Syllabus Storage and Storage Management: 2-3 weeks. M. Seltzer. Beyond Relational Databases. Communications of the ACM, July 2008, Vol. 51, No. 7. D. A. Patterson, G. Gibson, and R. H. Katz. A Case for Redundant Arrays of Inexpensive Disks (RAID). ACM SIGMOD, 1988. G. Graefe. The five-minute rule twenty years later, and how flash memory changes the rules. Proceedings of the Third International Workshop on Data Management on New Hardware (DaMoN), 2007. Flash as a new storage medium. Start homework 1 using Berkeley DB.

585 Syllabus (Cont…) Parallel DBMS: 2 weeks. D. DeWitt et al. The Gamma Database Machine Project. IEEE Transactions on Knowledge and Data Engineering, Vol. 2, 1990. F. Chang et al. Bigtable: A Distributed Storage System for Structured Data. In OSDI 2006. J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Communications of the ACM, Vol. 51, No. 1, 2008. Data intensive applications can be parallelized effectively.

585 Syllabus (Cont…) Spatial Index Structures: 2 weeks. A. Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. In ACM SIGMOD 1984. P. E. O’Neil and D. Quass. Improved Query Performance with Variant Indexes. In ACM SIGMOD 1997. No substitute for smart data indexing techniques! Brute-force approaches are not acceptable. Start your project to build relational query processing software using Berkeley DB.

585 Syllabus (Cont…) Query optimization: 2 weeks. P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, T. G. Price. Access Path Selection in a Relational Database Management System. In ACM SIGMOD 1979. S. Chaudhuri. An Overview of Query Optimization in Relational Systems. PODS 1998. Techniques to select index structures. Focus is on your project.

585 Syllabus (Cont…) Decision Support: 2-3 weeks. R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. In VLDB 1994. J. Gray et al. Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab and SubTotals. Data Mining and Knowledge Discovery 1(1), 1997. C. Stolte, D. Tang, and P. Hanrahan. Polaris: A System for Query, Analysis, and Visualization of Multidimensional Databases. Communications of the ACM, Vol. 51, No. 11, November 2008. Discovery of trends in large data sets and their visualization.

585 Syllabus (Cont…) Main Memory Databases: 2 weeks. P. A. Boncz, M. L. Kersten, and S. Manegold. Breaking the Memory Wall in MonetDB. Communications of the ACM, December 2008, Vol. 51, No. 12. Use the L2 cache of a CPU!

585 Syllabus (Cont…) Cache Management: Time permitting, 1 to 2 weeks. S. Ghandeharizadeh and S. Shayandeh. Greedy Cache Management Techniques for Mobile Devices. In First International IEEE Workshop on Ambient Intelligence, Media and Sensing. April 2007. Effective support for variable sized objects.

Grading Midterm 1: 35%. Midterm 2: 35%. Assignments: 10%. Project: 20%.
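The weights above combine into a final score as a weighted average. A minimal sketch, assuming each component is graded on a 0 to 100 scale (the slide does not state the scale); the component keys and the sample scores are hypothetical.

```python
# Weights from the grading slide, in percent.
WEIGHTS = {"midterm1": 35, "midterm2": 35, "assignments": 10, "project": 20}


def course_score(scores):
    """Return the weighted course score for per-component scores on a 0-100 scale."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"midterm1": 80, "midterm2": 90, "assignments": 100, "project": 70}
print(course_score(example))  # 83.5
```

The two midterms together carry 70% of the grade, so the sample student's perfect assignment score moves the total by only a few points.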

For next lecture Read: M. Seltzer. Beyond Relational Databases. Communications of the ACM, July 2008, Vol. 51, No. 7.