overview today’s ideas relational databases

Presentation transcript:

6001 structure & interpretation of computer programs recitation 8/ october 10, 1997

overview
  today’s ideas
    relational databases
    relational operators and queries
    simple query optimization
10/31/2019 daniel jackson

history
  origins
    1960’s: network and hierarchical databases
    1970: Codd invents the relational model; revolutionizes the field
    early 1980’s: IBM starts DB2, Oracle founded
  growth
    late 1980’s: RDBs become big business: Oracle, Sybase, Ingres, Informix, etc
    1997: Oracle’s revenue is $5B, total RDB market about $10-15B
  future
    object-oriented databases unlikely to take off
    but hybrid ‘object-relational databases’ are starting to get popular

what’s a database?
  what does a database do?
    stores large volumes of data
    allows rapid access for many different kinds of query
  databases include mechanisms for handling
    secondary storage management: indexing and storage of data that doesn’t fit in memory
    persistence: data shouldn’t go away!
    concurrency control: many users at once
    distribution: users at different sites
    data protection: making sure updates are consistent
    recovery: when system fails, eg
  the difference between DB systems and file systems
    programmer doesn’t have to worry about how data is laid out on disk (sometimes called the ‘data independence principle’)

the big idea
  what was wrong with early databases (network and hierarchical)
    the user’s operations were coded directly in terms of data structures
    these data structures closely mirrored the way the data was actually stored
    when developer changed the data structures for performance, the queries had to be rewritten
  the solution
    have an abstract data model
    queries and insertions all expressed in terms of abstract model
    hide the details of the actual data structures from the user
  in other words… data abstraction!
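A minimal Python sketch of the data abstraction idea, under stated assumptions: the class names `RowStore` and `ColumnStore`, the `rows()` method, and the `sponsor_of` query are all hypothetical, invented here for illustration. The point is only that a query written against the abstract interface does not change when the storage layout does.

```python
# Hypothetical illustration: two storage layouts behind one abstract interface.

class RowStore:
    """Keeps the table as a list of rows (one dict per row)."""

    def __init__(self, rows):
        self._rows = [dict(r) for r in rows]

    def rows(self):
        return [dict(r) for r in self._rows]


class ColumnStore:
    """Keeps the same table as one list of values per column."""

    def __init__(self, rows):
        rows = list(rows)
        self._names = list(rows[0].keys()) if rows else []
        self._cols = {n: [r[n] for r in rows] for n in self._names}
        self._count = len(rows)

    def rows(self):
        # Rebuild row dicts from the per-column lists.
        return [{n: self._cols[n][i] for n in self._names}
                for i in range(self._count)]


def sponsor_of(table, name):
    """Query written only against rows(); it never sees the layout."""
    return [r["SPONSOR"] for r in table.rows() if r["NAME"] == name]


# Sample rows (assumed data for the sketch).
data = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
]
print(sponsor_of(RowStore(data), "ben"))
print(sponsor_of(ColumnStore(data), "ben"))
```

Switching from `RowStore` to `ColumnStore` is the kind of performance-motivated change that forced query rewrites in the early systems; here `sponsor_of` is untouched.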

tables
  what’s in a relational database
    an RDB consists of a collection of tables or relations
    each table has a name
    columns have names too (called attributes)
    each row is a sequence of values, one for each column
    these may be numbers, booleans, strings, etc
    rows are also called tuples or records
  example
    EVENTS table
      EVENT              CAFEMEISTER  HOURS
      java-jump          ben          10
      tour-de-café       alice        15
      seattle-sleepless  carol        8
    ATHLETES table
      NAME   SPONSOR
      alice  coffee-connection
      ben    starbucks
      carol  green-mountain
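One way to model these tables in Python, as a sketch rather than how a real database stores them: each table is a list of rows, and each row is a dict from column name to value. The helper `columns` is a hypothetical name introduced here.

```python
# Each table is a list of rows; each row maps column name -> value.
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]

ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def columns(table):
    """Return the column names (attributes) of a non-empty table."""
    return list(table[0].keys())


print(columns(EVENTS))
print(columns(ATHLETES))
```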

queries
  to extract information from the RDB
    present it with a query
  examples
    who won the java-jump event?
    what company is sponsoring ben?
    which company sponsored the winner of the java-jump event?
  kinds of query
    some queries can be answered by examining only one table
      who won the java-jump event? -> look at EVENTS table
      what company is sponsoring ben? -> look at ATHLETES table
    some need more than one table
      -> look up winner in EVENTS and then find sponsor in ATHLETES
  query processing
    queries are declarative: they say what you want, not how to get it
    database system translates query into operations on internal data structures
    may optimize the query before applying, so that it goes faster

queries on a single table
  for uniformity
    think of result of a query as a table itself
  two ways to make a table smaller
    eliminate some columns
    eliminate some rows
  eliminating columns
    query
      select <column names> from <table>
    example
      select CAFEMEISTER, HOURS from EVENTS
        CAFEMEISTER  HOURS
        ben          10
        alice        15
        carol        8
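A minimal sketch of select in Python, assuming a table is a list of dicts (a representation chosen for this example, not something the slides prescribe). This version keeps duplicate rows if dropping columns makes two rows identical.

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]


def select(cols, table):
    """Keep only the named columns of every row; the result is again a table."""
    return [{c: row[c] for c in cols} for row in table]


# select CAFEMEISTER, HOURS from EVENTS
for row in select(["CAFEMEISTER", "HOURS"], EVENTS):
    print(row)
```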

simple queries, continued
  eliminating rows
    query
      filter <condition> from <table>
    example
      filter NAME = ben from ATHLETES
        NAME  SPONSOR
        ben   starbucks
  operators
    filter and select are called relational operators
    they are like the operations of an abstract data type
    take a table and produce another table
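A matching sketch of filter, again assuming tables are lists of dicts. The name `filter_rows` is hypothetical (chosen to avoid shadowing Python's built-in `filter`), and the condition is passed as a function from a row to a boolean.

```python
ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def filter_rows(condition, table):
    """Keep only the rows for which condition(row) is true."""
    return [row for row in table if condition(row)]


# filter NAME = ben from ATHLETES
print(filter_rows(lambda row: row["NAME"] == "ben", ATHLETES))
```

Like select, it takes a table and returns a table, which is what lets the operators be nested.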

combining queries
  example
    who won the java-jump event?
    want a table containing only the name of the winner
      so only column is CAFEMEISTER
    want to restrict rows so that EVENT is java-jump
    query:
      select CAFEMEISTER from filter EVENT = java-jump from EVENTS
    could do it the other way round too
  other examples
    who sponsors alice?
    which athletes did more than 9 hours?
    whom does starbucks sponsor?
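The nested query can be sketched in Python by composing two small functions (`select` and `filter_rows` are hypothetical helpers over a list-of-dicts table). One reading of doing it the other way round is shown as well; note that it only works if the inner select keeps the EVENT column, since the filter still has to test it.

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]


def select(cols, table):
    """Keep only the named columns."""
    return [{c: row[c] for c in cols} for row in table]


def filter_rows(condition, table):
    """Keep only the rows satisfying the condition."""
    return [row for row in table if condition(row)]


def is_java_jump(row):
    return row["EVENT"] == "java-jump"


# select CAFEMEISTER from filter EVENT = java-jump from EVENTS
winner = select(["CAFEMEISTER"], filter_rows(is_java_jump, EVENTS))

# Columns first: keep EVENT for the filter, then drop it at the end.
narrowed = select(["EVENT", "CAFEMEISTER"], EVENTS)
winner_alt = select(["CAFEMEISTER"], filter_rows(is_java_jump, narrowed))

print(winner)
print(winner_alt)
```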

relational join
  relational join
    given two tables T1 and T2
      with columns a1, b1, c1, … and a2, b2, c2, …
    make a new table with columns a1, b1, c1, … a2, b2, c2, …
    for each row R1 in T1 and each row R2 in T2, make a new row R1 R2
  how many rows if each table has k rows?
    k^2
  assume that
    column names of T1 and T2 are disjoint

example of a join
  join EVENTS and ATHLETES
    EVENT              CAFEMEISTER  HOURS  NAME   SPONSOR
    java-jump          ben          10     alice  coffee-connection
    tour-de-café       alice        15     alice  coffee-connection
    seattle-sleepless  carol        8      alice  coffee-connection
    java-jump          ben          10     ben    starbucks
    tour-de-café       alice        15     ben    starbucks
    seattle-sleepless  carol        8      ben    starbucks
    java-jump          ben          10     carol  green-mountain
    tour-de-café       alice        15     carol  green-mountain
    seattle-sleepless  carol        8      carol  green-mountain
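The join of two tables with disjoint column names is every pairing of a row from one with a row from the other. A minimal Python sketch, assuming list-of-dicts tables (the row order it produces may differ from the table above, which does not matter for a relation):

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]
ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def join(t1, t2):
    """Pair every row of t1 with every row of t2.

    Assumes the column names of t1 and t2 are disjoint.
    """
    return [{**r1, **r2} for r1 in t1 for r2 in t2]


joined = join(EVENTS, ATHLETES)
print(len(joined))  # k * k rows when each table has k rows
```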

queries involving two tables
  now we can formulate a query on two tables by
    joining the tables together into a single table
    applying filter and select to the new table
  example
    which company sponsored the winner of the java-jump event?
    query is
      select SPONSOR from filter EVENT = java-jump from filter CAFEMEISTER = NAME from join EVENTS and ATHLETES
    before the final select, we have
      EVENT      CAFEMEISTER  HOURS  NAME  SPONSOR
      java-jump  ben          10     ben   starbucks
    final table (result of query) is
      SPONSOR
      starbucks
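The two-table query can be traced step by step in Python. This is a sketch over list-of-dicts tables; `select`, `filter_rows`, and `join` are hypothetical helpers standing in for the relational operators.

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]
ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def select(cols, table):
    return [{c: row[c] for c in cols} for row in table]


def filter_rows(condition, table):
    return [row for row in table if condition(row)]


def join(t1, t2):
    return [{**r1, **r2} for r1 in t1 for r2 in t2]


# Innermost step first, mirroring the query read right to left.
all_pairs = join(EVENTS, ATHLETES)
same_person = filter_rows(lambda r: r["CAFEMEISTER"] == r["NAME"], all_pairs)
java_jump = filter_rows(lambda r: r["EVENT"] == "java-jump", same_person)
result = select(["SPONSOR"], java_jump)

print(java_jump)  # the one-row table before the final select
print(result)
```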

more example queries
  exercises
    which events did coffee-connection sponsor the winner of?
    who sponsored an athlete who did more than 12 hours?

query optimization
  note
    join is expensive!
    want to apply it to small tables wherever possible
    could have done one of the filters before the join
  instead of
    select SPONSOR from filter EVENT = java-jump from filter CAFEMEISTER = NAME from join EVENTS and ATHLETES
  we could apply the join to an already filtered table
    join ATHLETES and (filter EVENT = java-jump from EVENTS)
    now join gives table with only 3 rows instead of 9
  we’d like to write the query either way, and have the database figure out a better way to write it
    this is called query optimization: can be expressed with simple rules
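A small Python sketch comparing the two plans on the example data. The helpers are hypothetical, and the optimized version is written out in full here (the remaining filter and the select still have to be applied after the smaller join).

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]
ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def select(cols, table):
    return [{c: row[c] for c in cols} for row in table]


def filter_rows(condition, table):
    return [row for row in table if condition(row)]


def join(t1, t2):
    return [{**r1, **r2} for r1 in t1 for r2 in t2]


def is_java_jump(row):
    return row["EVENT"] == "java-jump"


def same_person(row):
    return row["CAFEMEISTER"] == row["NAME"]


# Plan 1: join everything, then filter.
big_join = join(EVENTS, ATHLETES)
naive = select(["SPONSOR"],
               filter_rows(is_java_jump, filter_rows(same_person, big_join)))

# Plan 2: filter EVENTS first, so the join sees a one-row table.
small_join = join(ATHLETES, filter_rows(is_java_jump, EVENTS))
optimized = select(["SPONSOR"], filter_rows(same_person, small_join))

print(len(big_join), len(small_join))  # rows produced by each join
print(naive == optimized)              # both plans give the same answer
```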

join of tables with shared column names
  convenient to use same names for related columns
    for example, CAFEMEISTER in EVENTS and NAME in ATHLETES could both be NAME
  refine the meaning of join
    can’t have two columns with same name
    so form a table with both columns
    eliminate rows with different values in the two columns
    merge the two columns
  example
    assume CAFEMEISTER column in EVENTS is now called NAME
    join of EVENTS and ATHLETES:
      EVENT              NAME   HOURS  SPONSOR
      tour-de-café       alice  15     coffee-connection
      java-jump          ben    10     starbucks
      seattle-sleepless  carol  8      green-mountain
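The refined join can be sketched in Python as follows: pair the rows, keep only pairs that agree on every shared column, and merge them into one row. The function name `natural_join` is an assumption of this sketch (it is the usual textbook term, not one the slides use), and the tables are lists of dicts.

```python
EVENTS = [
    {"EVENT": "java-jump", "NAME": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "NAME": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "NAME": "carol", "HOURS": 8},
]
ATHLETES = [
    {"NAME": "alice", "SPONSOR": "coffee-connection"},
    {"NAME": "ben", "SPONSOR": "starbucks"},
    {"NAME": "carol", "SPONSOR": "green-mountain"},
]


def natural_join(t1, t2):
    """Join on shared column names, merging each shared column into one."""
    result = []
    for r1 in t1:
        for r2 in t2:
            shared = set(r1) & set(r2)
            # Drop pairs whose values differ in any shared column.
            if all(r1[c] == r2[c] for c in shared):
                result.append({**r1, **r2})
    return result


for row in natural_join(EVENTS, ATHLETES):
    print(row)
```

When the two tables share no column names, the condition is trivially true and this reduces to the plain join with every pairing kept.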

query optimizations
  which of these rules are valid?
    filter C1 from filter C2 from T = filter C1 and C2 from T
    select COLS1 from select COLS2 from T = select COLS1 U COLS2 from T
  under what conditions are these valid?
    select COLS from (join of T1 and T2) = join of (select COLS from T1) and (select COLS from T2)
    filter C from (join of T1 and T2) = join of (filter C from T1) and T2
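One way to explore a candidate rule is to evaluate both sides on example data. Agreement on one example proves nothing, but a mismatch does refute the rule as written. The sketch below, with hypothetical helpers and arbitrarily chosen conditions and column sets, tries the first two rules on the EVENTS table.

```python
EVENTS = [
    {"EVENT": "java-jump", "CAFEMEISTER": "ben", "HOURS": 10},
    {"EVENT": "tour-de-café", "CAFEMEISTER": "alice", "HOURS": 15},
    {"EVENT": "seattle-sleepless", "CAFEMEISTER": "carol", "HOURS": 8},
]


def select(cols, table):
    return [{c: row[c] for c in cols} for row in table]


def filter_rows(condition, table):
    return [row for row in table if condition(row)]


def c1(row):
    return row["HOURS"] > 9


def c2(row):
    return row["CAFEMEISTER"] != "alice"


# Rule 1: filter C1 from filter C2 from T = filter C1 and C2 from T
lhs1 = filter_rows(c1, filter_rows(c2, EVENTS))
rhs1 = filter_rows(lambda r: c1(r) and c2(r), EVENTS)
rule1_agrees = lhs1 == rhs1

# Rule 2: select COLS1 from select COLS2 from T = select COLS1 U COLS2 from T
cols1, cols2 = ["EVENT"], ["EVENT", "HOURS"]
lhs2 = select(cols1, select(cols2, EVENTS))
rhs2 = select(sorted(set(cols1) | set(cols2)), EVENTS)
rule2_agrees = lhs2 == rhs2

print(rule1_agrees, rule2_agrees)
```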

puzzles for student presentations
  procedures with variable length argument lists
    find out from the R4RS manual how to write a procedure that can take a variable number of arguments
    make up an entertaining procedure that exploits this feature
  powerlists
    define a procedure that takes a list and returns a list of all the sublists
    for example, (p (list 1 2 3)) should return a list containing
      () (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3)
    in some order
    use map!
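For comparison, here is one possible shape of the powerlist procedure, sketched in Python rather than Scheme. The name `powerlist` is hypothetical. The idea is that the sublists of a list are the sublists of its tail, plus those same sublists with the head put on the front, and `map` does the second part.

```python
def powerlist(items):
    """Return a list of all sublists of items, in some order."""
    if not items:
        return [[]]  # the empty list has exactly one sublist: itself
    head, tail = items[0], items[1:]
    without_head = powerlist(tail)
    with_head = list(map(lambda sub: [head] + sub, without_head))
    return without_head + with_head


for sub in powerlist([1, 2, 3]):
    print(sub)
```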