Tallahassee, Florida, 2016 COP5725 Advanced Database Systems Introduction Spring 2016.

Welcome to COP5725!
COP5725: Advanced Database Systems
– Course website: all you need to know about COP
– Time: 2pm-3:15pm, Mondays and Wednesdays
– Venue: LOV 103
Please go over the syllabus carefully before taking the class!

Welcome to COP5725!
Instructor
– Prof. Peixiang Zhao
– Office hours: Monday, Wednesday: 3:30pm-4:30pm, or by appointment
– Office: LOV 262
– Research interest: Database, data mining, information/social network analysis
TA
– Dr. Esra Akbas
– Office hours: Tuesday 10am-11am
– Office: MCH 106-A

The Goal of COP5725!
1. Reflection of the foundation:
– Climb up to the shoulders: the foundational models, representations, systems, and techniques for relational database systems, by way of reading and lectures
2. Projection on the outlook:
– And look out from here! Be inspired: what are the next advanced database systems?
– By way of reading and presenting the classics and the state of the art, and by way of doing projects!
“We can do it!”

The Contents of COP5725!
Relational Database Internals
– Fundamentals for relational databases
– Data storage and representation
– Advanced indexing
– Query processing and execution
– Query optimization
– ……
Advanced Database Topics
– Parallel/Distributed databases (MapReduce)
– Data mining (selected topics)
– Data on the Web
– ……

Welcome to COP5725!
Textbook
– Database Systems: The Complete Book, 2nd edition
– Hector Garcia-Molina, Jeff Ullman and Jennifer Widom
Recommended reading
– Database Management Systems, 3rd edition, by Raghu Ramakrishnan and Johannes Gehrke
– Readings in Database Systems, 5th edition, by P. Bailis, J. Hellerstein and M. Stonebraker
– The Web
Prerequisites
– COP4710: Introduction to Database Systems
– COP4530: Data Structures and Algorithms
– Good programming skills

Welcome to COP5725!
Components of the course
1. Two lectures every week
2. Two assignments (10%)
3. A series of papers to be read and summarized (15%)
  – A one- or two-page paper summary to be submitted during the class on the due date
4. Paper presentation (5%)
  – Every student (or group?) will present one paper related to her/his project in the class for 20(?) minutes
5. Semester-long project (30%)
  – Research-flavor
  – Implementation-flavor
6. A set of quizzes (5%)
7. Final exam (35%)

Paper Summaries
Milestone papers in database systems
Every paper will be posted early on the course website and can be downloaded within the campus network
A one- to two-page summary includes
– What is the problem?
– Why is this problem important and worthy of a thorough study?
– Why is this problem difficult?
– What are the innovative ideas and technical merits?
– Comments on the experimental evaluations
– Any drawbacks and potential improvements?
Summarize based on your own understanding. Verbatim copying from the paper results in low scores
The contents of the papers will be tested in the final exam!

Paper Presentation
Every student (or group) will have a chance to select one paper to present in the class
– The paper should be closely related to the project you are conducting
– The slides (pptx/ppt/pdf) should be sent to the instructor at least one day prior to the class in which you will be presenting
– The organization of the slides should follow the requirements of the paper summary
– 20(?)-minute presentation and Q&A
Students will sign up for the presentation in the near future

Project
Theme: choose either of the two
1. Research-flavor: mainly for Ph.D. students
  – Find an interesting, nontrivial data management problem and propose a novel and effective solution to it
2. Implementation-flavor: mainly for M.S. students
  – Find an interesting method/algorithm in a data management paper, implement it and perform experimental studies
Teamwork: a group of one or two students (but no more!)
The project is partitioned into multiple milestones, each of which requires deliverables

Multi-stage Project
1. Group formation (0%)
2. Project proposal (10%)
  – What do I want to do?
3. Literature survey (20%)
  – What is the state of the art?
4. Status report (10%)
  – What have I achieved thus far?
5. Source code, software and final report (60%)
  – Dude, these are my deliverables!

Implementation Project
Topics:
– Choose a research paper published in the following conferences/journals after 2001, implement the idea and finish all experimental studies related to this idea
– Conferences: SIGMOD, VLDB, ICDE, KDD, ICDM, SDM, SIGIR, WWW, CIKM
– Journals: TODS, VLDB Journal, TKDD, TKDE
Workload (in C/C++, Java, or Python)
– lines of code; real/synthetic data, experimental studies
Expectation
– Source code, software, detailed readmes and scripts, and a final report
– Repeatability, completeness of datasets and experimental studies, efficiency, effectiveness, scalability ……
– You may demo your implementation to the TA

Research Project
Topics:
– A state-of-the-art data management or mining problem in your research area
Workload
– Problem definition, algorithm design and analysis, implementation (more than 3000 lines of code, in C/C++, Java, or Python), experimental studies
– Your innovative ideas!
Expectation
– A conference-quality (potentially publishable) paper
– Source code, software, detailed readmes and scripts
– You may demo your implementation to the TA

Quizzes
The first quiz will be held on Monday 01/11
– Takes up 3% of your full credit!
– Coverage: fundamentals in relational DB; data structures and algorithms
Remaining quizzes will be held throughout the semester
– Call for attendance
– Get feedback and suggestions from students

Is This Course Suitable For Me?
First-day Attendance Policy at FSU
Prerequisites MUST be satisfied
– Introduction to database systems
  Relational model, relational algebra, relational design, SQL, B/B+ tree, hashing, transaction management, crash recovery ……
– Data structures and algorithms
  Difference between stack and queue?
  Worst-case complexity for insertion/deletion in Red-black trees?
  Dijkstra algorithm for shortest-path computation
  Set-cover is NP-complete
  ……
Feel comfortable with programming (a lot)
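As a quick self-check against two of the prerequisite questions above (stack vs. queue, and Dijkstra's shortest-path algorithm), here is a minimal Python sketch. It is not course material: the function names `stack_vs_queue` and `dijkstra` and the adjacency-list graph format are assumptions made for illustration, and the Dijkstra sketch assumes non-negative edge weights.

```python
import heapq
from collections import deque


def stack_vs_queue(items):
    """Return (LIFO order, FIFO order) obtained from the same insertion sequence."""
    stack, queue = list(items), deque(items)
    lifo = [stack.pop() for _ in range(len(stack))]      # last in, first out
    fifo = [queue.popleft() for _ in range(len(queue))]  # first in, first out
    return lifo, fifo


def dijkstra(graph, source):
    """Shortest-path distances from source.

    graph maps each node to a list of (neighbor, weight) pairs;
    weights are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    print(stack_vs_queue([1, 2, 3]))
    example = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
    print(dijkstra(example, "a"))
```

If reading and modifying a sketch like this feels unfamiliar, that is a sign to revisit the data structures and algorithms prerequisite before taking the class.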

COP5725 = How DB Knowledge is created + How to create more
In terms of topics, COP5725 is not:
– about Linux + Apache + PHP + MySQL (LAMP)
– about designing DBs that are in BCNF
– about SQL3 and stored procedures
– about Oracle tuning and implementation
In terms of methodology, COP5725 is not solely:
– by reading the textbook and acing it
– by implementing a well-specified DB algorithm, e.g., B+tree

How to Get the Most out of COP5725?
Read and think before class
– read the textbooks for related concepts
– read the papers
Use lectures as a road map for studying
– Lecture notes won’t cover all the material
Use your peers in learning
– discuss in/out of classes to enhance understanding
Explore interesting projects creatively
– learning by doing

Any questions so far?

Evolution of Data Management
Jim Gray: Evolution of Data Management. IEEE Computer 29(10): (1996)

Prehistory Thoughts: Emergence of the Notion of DBMS
William C. McGee: Generalization: Key to Successful Electronic Data Processing. J. ACM 6(1): 1-23 (1959)
When data processing was mostly ad-hoc programs, generalization was needed, e.g.,
– sorting
– file maintenance
– data access
– modification and update
– report generation
– ……

How Did We Get Here?
The dominating relational database system, which we take for granted now, was deemed impossible to implement and difficult to use in its early days
But, quoting Jim Gray:
These innovations give one of the best examples of research prototypes turning into products. The relational model, parallel database systems, active databases, and object-relational databases all came from the academic and industrial research labs. The development of database technology has been a textbook case of successful collaboration between academy and industry.
-- Evolution of Data Management

Examples

In Industry

In Science – Turing Awardees
– Charles Bachman, 1973
– Edgar Codd, 1981
– James Gray, 1998
– Michael Stonebraker, 2014

The Grand Challenges of Data Management
The relational DBMS was invented in the early 70’s, and is now a mature 50+ billion industry
What are we still working on? Big Data!
What is the ultimately advanced DB?
– Data of all sorts, prevalent on the Web!
– What have you been searching lately?
– What you search is what you want?
New challenges naturally arise
– structured vs. unstructured data
– querying vs. analysis vs. searching
– closed “base” vs. the open Web

Have fun!
What Does 'Big Data' Mean and Who Will Win?