CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#1: Introduction Christos Faloutsos CMU www.cs.cmu.edu/~christos.


CMU SCS Copyright: C. Faloutsos (2012)
Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
Data Mining

Problem
Given a large collection of (multimedia) records, or graphs, find similar/interesting things, i.e.:
– Allow fast, approximate queries, and
– Find rules/patterns

Sample queries
Similarity search
– Find pairs of branches with similar sales patterns
– Find medical cases similar to Smith's
– Find pairs of sensor series that move in sync
– Find shapes like a spark-plug
– (nn: ‘case based reasoning’)

Sample queries – cont’d
Rule discovery
– Clusters (of branches; of sensor data; ...)
– Forecasting (total sales for next year?)
– Outliers (e.g., unexpected part failures; fraud detection)

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
Data Mining

Detailed Outline
Intro to DB
Relational DBMS - what and why?
– inserting, retrieving and summarizing data
– views; security/privacy
– (concurrency control and recovery)

What is the goal of rel. DBMSs
Electronic record-keeping: fast and convenient access to information.
E.g.: students, taking classes, obtaining grades; find my GPA

Why Databases?
Flexibility
data independence (can add new tables; new attributes)
data sharing/concurrency control
recovery

Why NOT Databases?
Price
additional expertise (SQL/DBA)
over-kill for small data sets

Main vendors/products
Commercial: Oracle; IBM/DB2; MS SQL-server; Sybase; (MS Access, ...)
Open source: Postgres (UCB); mySQL, sqlite, miniBase (Wisc) (

Detailed Outline
Intro to DB
Relational DBMS - what and why?
– inserting, retrieving and summarizing data
– views; security/privacy
– (concurrency control and recovery)

How do DBs work?
We use sqlite3 as an example, from

How do DBs work?
% sqlite3 mydb    # mydb: file
sql> create table student (ssn fixed, name char(20));

How do DBs work?
sql> insert into student values (123, 'Smith');
sql> select * from student;

How do DBs work?
sql> create table takes (ssn fixed, c_id char(5), grade fixed);
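The create / insert / select steps above can also be run from a script. This is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database in place of the file mydb; the grade value 4 is made up for illustration.

```python
import sqlite3

# Assumption: an in-memory database stands in for the slides' file "mydb".
conn = sqlite3.connect(":memory:")

# Same two tables as on the slides.
conn.execute("create table student (ssn fixed, name char(20))")
conn.execute("create table takes (ssn fixed, c_id char(5), grade fixed)")

# '?' placeholders keep the values separate from the SQL text.
conn.execute("insert into student values (?, ?)", (123, "Smith"))
conn.execute("insert into takes values (?, ?, ?)", (123, "15826", 4))  # hypothetical grade

rows = conn.execute("select * from student").fetchall()
print(rows)
```

The interactive sqlite3 shell and the script send the same SQL; only the way results come back (printed rows vs. a list of tuples) differs.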

How do DBs work - cont’d
More than one table - joins
E.g., roster (names only) for

How do DBs work - cont’d
sql> select name from student, takes
     where student.ssn = takes.ssn and takes.c_id = '15826';
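To see the join produce a roster, here is a small sketch using Python's sqlite3 module. The second student (Jones) and the course id 15415 are made-up sample data, not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (ssn fixed, name char(20));
    create table takes (ssn fixed, c_id char(5), grade fixed);
    -- Hypothetical sample rows for illustration only.
    insert into student values (123, 'Smith'), (234, 'Jones');
    insert into takes values (123, '15826', 4), (234, '15415', 3);
""")

# The join pairs each student row with the takes rows that share its ssn,
# then keeps only the rows for the requested course.
roster = conn.execute(
    "select name from student, takes "
    "where student.ssn = takes.ssn and takes.c_id = ?",
    ("15826",),
).fetchall()
print(roster)
```

Only Smith appears, since Jones has no takes row for course 15826.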

SQL-DML
General form:
select a1, a2, … an
from r1, r2, … rm
where P
[group by …]
[having …]
[order by …]
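One query that uses every clause of the general form may help show what each clause does. This is a sketch with made-up grades, run through Python's sqlite3 module: where filters rows, group by forms one group per student, having filters groups, and order by sorts the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table takes (ssn fixed, c_id char(5), grade fixed);
    -- Hypothetical sample rows for illustration only.
    insert into takes values
        (123, '15826', 4), (123, '15415', 3),
        (234, '15826', 2),
        (345, '15826', 4), (345, '15415', 4);
""")

rows = conn.execute("""
    select ssn, avg(grade) as gpa
    from takes
    where grade > 0          -- row filter, applied before grouping
    group by ssn             -- one group per student
    having count(*) >= 2     -- keep students with at least two courses
    order by gpa desc        -- sort the surviving groups
""").fetchall()
print(rows)
```

Student 234 drops out at the having step because that group has a single row.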

Aggregation
Find ssn and GPA for each student

Aggregation
sql> select ssn, avg(grade) from takes group by ssn;
DM!
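A quick way to check the aggregation is to run it on a few rows. The sketch below uses Python's sqlite3 module; the grades are made up, and an order by is added only so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table takes (ssn fixed, c_id char(5), grade fixed);
    -- Hypothetical sample rows for illustration only.
    insert into takes values
        (123, '15826', 4), (123, '15415', 3), (234, '15826', 2);
""")

# avg(grade) is computed once per distinct ssn.
gpas = conn.execute(
    "select ssn, avg(grade) from takes group by ssn order by ssn"
).fetchall()
print(gpas)
```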

Detailed Outline
Intro to DB
Relational DBMS - what and why?
– inserting, retrieving and summarizing data
– views; security/privacy
– (concurrency control and recovery)

Views - what and why?
– suppose you ONLY want to see ssn and GPA (e.g., in your data-warehouse)
– suppose secy is only allowed to see GPAs, but not individual grades
– (or, suppose you want to create a short-hand for a query you ask again and again)
-> VIEWS!

Views
sql> create view fellowship as (
       select ssn, avg(grade) from takes group by ssn);

Views
Views = ‘virtual tables’

Views
sql> select * from fellowship;

Views
sql> grant select on fellowship to secy;
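The ‘virtual table’ idea can be checked directly: a view stores the query, not the rows, so it reflects later inserts. This is a sketch using Python's sqlite3 module with made-up grades; the view is written without the outer parentheses and with a gpa alias, both choices of this sketch. The grant statement is left out because SQLite has no user accounts and does not implement GRANT; it applies to server DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table takes (ssn fixed, c_id char(5), grade fixed);
    -- Hypothetical sample rows for illustration only.
    insert into takes values
        (123, '15826', 4), (123, '15415', 3), (234, '15826', 2);
    create view fellowship as
        select ssn, avg(grade) as gpa from takes group by ssn;
""")

before = conn.execute("select * from fellowship order by ssn").fetchall()

# A new grade changes what the view returns: the view is re-evaluated
# against the base table each time it is queried.
conn.execute("insert into takes values (234, '15415', 4)")
after = conn.execute("select * from fellowship order by ssn").fetchall()

print(before)
print(after)
```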

Detailed Outline
Intro to DB
Relational DBMS - what and why?
– inserting, retrieving and summarizing data
– views; security/privacy
– (concurrency control and recovery)
What if slow?
Conclusions

What if slow?
sqlite> select * from irs_table where ssn='123';
Q: What to do, if it takes 2 hours?
A: build an index
Q’: on what attribute?
Q’’: what syntax?
DM!
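One plausible answer to Q’ and Q’’ is an index on ssn, the attribute in the where clause, built with create index. The sketch below uses Python's sqlite3 module; the irs_table columns, the index name irs_ssn_idx and the 1000 generated rows are assumptions for illustration. It compares SQLite's query plan before and after the index is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the slides do not give irs_table's schema; two columns suffice here.
conn.execute("create table irs_table (ssn char(9), name char(20))")
conn.executemany(
    "insert into irs_table values (?, ?)",
    [(str(i), f"person{i}") for i in range(1000)],  # made-up rows
)

query = "select * from irs_table where ssn = '123'"

# 'explain query plan' returns rows whose last column describes each step.
plan_before = conn.execute("explain query plan " + query).fetchall()

# Index on the attribute used in the where clause.
conn.execute("create index irs_ssn_idx on irs_table (ssn)")
plan_after = conn.execute("explain query plan " + query).fetchall()

hits = conn.execute(query).fetchall()
print(plan_before[0][-1])
print(plan_after[0][-1])
```

Before the index the plan reports a full table scan; afterwards it reports a search using irs_ssn_idx. The exact wording of the plan text varies between SQLite versions.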

What if slow - #2?
sqlite> create table friends (p1, p2);
Facebook-style: find the 2-step-away people

What if slow - #2?
sqlite> create table friends (p1, p2);
sqlite> select f1.p1, f2.p2 from friends f1, friends f2 where f1.p2 = f2.p1;
Q: too slow – now what?
A: ‘explain’:
sqlite> explain select …
DM!
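A sketch of the ‘explain’ step for the 2-step-away query, using Python's sqlite3 module. The four friendship edges and the index name friends_p1_idx are made up. It uses explain query plan, SQLite's higher-level variant of explain, and the index on p1 is only one option the planner may or may not choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table friends (p1, p2)")
conn.executemany(
    "insert into friends values (?, ?)",
    [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e")],  # hypothetical edges
)

two_step = (
    "select f1.p1, f2.p2 from friends f1, friends f2 "
    "where f1.p2 = f2.p1"
)
pairs = sorted(conn.execute(two_step).fetchall())
print(pairs)

# Assumption: an index on the join column p1 gives the planner a lookup path.
conn.execute("create index friends_p1_idx on friends (p1)")

# The last column of each plan row describes one step of the chosen strategy.
plan = [row[-1] for row in conn.execute("explain query plan " + two_step)]
for step in plan:
    print(step)
```

Reading the plan shows whether each copy of friends is scanned in full or searched through an index, which is the first thing to check when a self-join like this is slow.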

Long term answer:
Check the query optimizer (see, say, Ramakrishnan + Gehrke 3rd edition, chapter 15)

Conclusions
(relational) DBMSs: electronic record keepers
customize them with create table commands
ask SQL queries to retrieve info

Conclusions cont’d
main advantages over flat files & scripts:
– logical + physical data independence (i.e., flexibility of adding new attributes, new tables and indices)
– concurrency control and recovery for free

For more info:
Sqlite3:
postgres:
Microsoft Access: available on ANDREW clusters (PC)
Ramakrishnan + Gehrke, 3rd edition
web page, e.g., –