Introduction to Relational Databases


Introduction to Relational Databases Obtained from Portland State University

Introduction
- Database: a collection of persistent data
- Database Management System (DBMS): a software system that supports creation, population, and querying of a database

Relational Database
- Relational Database Management System (RDBMS)
- Consists of a number of tables and a single schema (the definition of tables and attributes)
- Students (sid, name, login, age, gpa)
  - Students identifies the table
  - sid, name, login, age, gpa identify attributes
  - sid is the primary key

An Example Table
Students (sid: string, name: string, login: string, age: integer, gpa: float)

sid    name     login          age  gpa
50000  Dave     dave@cs        19   3.3
53666  Jones    jones@cs       18   3.4
53688  Smith    smith@ee            3.2
53650  Smith    smith@math          3.8
53831  Madayan  madayan@music  11   4.0
53832  Guldu    guldu@music    12   3.0

Another example: Courses
Courses (cid, instructor, quarter, dept)

cid          instructor  quarter    dept
Carnatic101  Jane        Fall 06    Music
Reggae203    Bob         Summer 06
Topology101  Mary        Spring 06  Math
History105   Alice                  History

Keys
- Primary key: a minimal subset of fields that uniquely identifies a tuple
  - sid is the primary key for Students
  - cid is the primary key for Courses
- Foreign key: a connection between tables
  - Courses (cid, instructor, quarter, dept)
  - Students (sid, name, login, age, gpa)
  - How do we express which students take each course?

Example Schema (simplified)
- Courses (cid, instructor, quarter, dept)
- Students (sid, name, gpa)
- Enrolled (cid, sid, grade)

Selection
Select students with gpa higher than 3.3 from S1:

S1 (Students)
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

Result
sid    name   gpa
53666  Jones  3.4
53650  Smith  3.8

Projection
Project name and gpa of all students in S1:

S1
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

Result
name     gpa
Dave     3.3
Jones    3.4
Smith    3.2
Smith    3.8
Madayan  1.8
Guldu    2.0

Combine Selection and Projection
Select name and gpa of students in S1 with gpa higher than 3.3:

S1
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

Result
name   gpa
Jones  3.4
Smith  3.8
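Selection followed by projection can be sketched in a few lines of Python. This is only an illustration: the helper names `select` and `project` are invented here, and S1 is held as a plain list of (sid, name, gpa) tuples copied from the slide.

```python
# S1 as a list of (sid, name, gpa) tuples, copied from the slide.
S1 = [
    (50000, "Dave", 3.3),
    (53666, "Jones", 3.4),
    (53688, "Smith", 3.2),
    (53650, "Smith", 3.8),
    (53831, "Madayan", 1.8),
    (53832, "Guldu", 2.0),
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, columns):
    """Projection: keep the given column positions and drop duplicate rows."""
    result = []
    for t in relation:
        row = tuple(t[i] for i in columns)
        if row not in result:
            result.append(row)
    return result


# Select students with gpa > 3.3, then project onto (name, gpa).
high_gpa = select(S1, lambda t: t[2] > 3.3)
names_and_gpas = project(high_gpa, [1, 2])
print(names_and_gpas)
```

Projecting onto name alone returns five rows, not six, because a relation is a set and the two Smith rows collapse into one.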

Set Operations
- Union (R ∪ S)
  - All tuples in R or S (or both)
  - R and S must have the same number of fields
  - Corresponding fields must have the same domains
- Intersection (R ∩ S)
  - All tuples in both R and S
- Set difference (R - S)
  - Tuples in R and not in S

Set Operations (continued)
- Cross product or Cartesian product (R × S)
  - All fields in R followed by all fields in S
  - One tuple (r, s) for each pair of tuples r ∈ R, s ∈ S

Example: Intersection

First relation
sid    name   gpa
53666  Jones  3.4
53688  Smith  3.2
53700  Tom    3.5
53777  Jerry  2.8
53832  Guldu  2.0

Second relation
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

Intersection
sid    name   gpa
53666  Jones  3.4
53688  Smith  3.2
53832  Guldu  2.0
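Python's built-in sets mirror these operations closely, so the example can be checked by hand. In this sketch the two relations are called `S1` and `S2`. Both names are assumptions, since the slide does not label the tables.

```python
# Relations as sets of (sid, name, gpa) tuples, copied from the slide.
S2 = {
    (53666, "Jones", 3.4),
    (53688, "Smith", 3.2),
    (53700, "Tom", 3.5),
    (53777, "Jerry", 2.8),
    (53832, "Guldu", 2.0),
}
S1 = {
    (50000, "Dave", 3.3),
    (53666, "Jones", 3.4),
    (53688, "Smith", 3.2),
    (53650, "Smith", 3.8),
    (53831, "Madayan", 1.8),
    (53832, "Guldu", 2.0),
}

union = S1 | S2          # tuples in S1 or S2 (or both)
intersection = S1 & S2   # tuples in both S1 and S2
difference = S1 - S2     # tuples in S1 and not in S2

# Cross product: one combined tuple (r, s) for each pair r in S1, s in S2.
cross = {r + s for r in S1 for s in S2}

print(sorted(intersection))
print(len(union), len(difference), len(cross))
```

Union, intersection, and difference are meaningful here only because both relations have the same number of fields with matching domains. The cross product has no such requirement.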

Joins
- Combine information from two or more tables
- Example: students enrolled in courses: S1 ⋈ E, with join condition S1.sid = E.studid

S1
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

E
cid          grade  studid
Carnatic101  C      53831
Reggae203    B      53832
Topology112  A      53650
History105   B      53666

Joins

S1
sid    name     gpa
50000  Dave     3.3
53666  Jones    3.4
53688  Smith    3.2
53650  Smith    3.8
53831  Madayan  1.8
53832  Guldu    2.0

E
cid          grade  studid
Carnatic101  C      53831
Reggae203    B      53832
Topology112  A      53650
History105   B      53666

Result of the join
sid    name     gpa  cid          grade  studid
53666  Jones    3.4  History105   B      53666
53650  Smith    3.8  Topology112  A      53650
53831  Madayan  1.8  Carnatic101  C      53831
53832  Guldu    2.0  Reggae203    B      53832

Note that a join can also be thought of as a cross product followed by a selection.
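The view of a join as cross product plus selection can be written out directly. A minimal sketch, using plain Python lists for S1 and E. The grade 'B' for History105 is taken from the join result on the slide, and the variable names are illustrative.

```python
# S1 rows are (sid, name, gpa); E rows are (cid, grade, studid).
S1 = [
    (50000, "Dave", 3.3),
    (53666, "Jones", 3.4),
    (53688, "Smith", 3.2),
    (53650, "Smith", 3.8),
    (53831, "Madayan", 1.8),
    (53832, "Guldu", 2.0),
]
E = [
    ("Carnatic101", "C", 53831),
    ("Reggae203", "B", 53832),
    ("Topology112", "A", 53650),
    ("History105", "B", 53666),
]

# Step 1: cross product, one row (sid, name, gpa, cid, grade, studid) per pair.
cross = [s + e for s in S1 for e in E]

# Step 2: selection on the join condition S1.sid = E.studid.
joined = [row for row in cross if row[0] == row[5]]

for row in joined:
    print(row)
```

The cross product has 24 rows, of which only 4 survive the selection. That gap is one reason a real DBMS avoids materializing the full cross product when it evaluates a join.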

Relational Algebra Summary
- Algebras are useful for manipulating data types (relations in this case)
- Set-oriented
- Opportunities for optimization
  - Different expressions may do the same thing

Intro to SQL
- CREATE TABLE
  - Create a new table, e.g., students, courses, enrolled
- SELECT-FROM-WHERE
  - List all CS courses
- INSERT
  - Add a new student, course, or enroll a student in a course

Create Table

CREATE TABLE Enrolled (
    studid CHAR(20),
    cid    CHAR(20),
    grade  CHAR(20),
    PRIMARY KEY (studid, cid),
    FOREIGN KEY (studid) REFERENCES Students
)
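To try this statement out, one option is Python's standard `sqlite3` module with an in-memory database. The sketch below makes several assumptions: the Students definition is invented here (the slides never show its CREATE TABLE), and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Assumed definition of Students; only its primary key matters here.
conn.execute(
    "CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), gpa REAL)"
)
conn.execute(
    """
    CREATE TABLE Enrolled (
        studid CHAR(20),
        cid    CHAR(20),
        grade  CHAR(20),
        PRIMARY KEY (studid, cid),
        FOREIGN KEY (studid) REFERENCES Students
    )
    """
)

conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 3.4)")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")

# Enrolling a student who is not in Students violates the foreign key.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'History105', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
print(rejected, enrolled_count)
```

The composite primary key (studid, cid) means a student can appear in Enrolled many times, but only once per course.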

Select-From-Where query
“Find all students who are under 18”

SELECT *
FROM Students S
WHERE S.age < 18

Queries across multiple tables (joins)
“Print the student name and course ID where the student received an ‘A’ in the course”

SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid = E.studid AND E.grade = 'A'
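Both queries can be run against the example data with `sqlite3`. This is a sketch under stated assumptions: Students is cut down to (sid, name, age), the two Smith rows get a NULL age because their ages are missing from the example table, and the grade for History105 is taken from the join result shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), age INTEGER);
    CREATE TABLE Enrolled (studid CHAR(20), cid CHAR(20), grade CHAR(20),
                           PRIMARY KEY (studid, cid),
                           FOREIGN KEY (studid) REFERENCES Students);
    """
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        ("50000", "Dave", 19),
        ("53666", "Jones", 18),
        ("53688", "Smith", None),   # age not given in the example table
        ("53650", "Smith", None),   # age not given in the example table
        ("53831", "Madayan", 11),
        ("53832", "Guldu", 12),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),
        ("53832", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

# "Find all students who are under 18"
under_18 = conn.execute("SELECT * FROM Students S WHERE S.age < 18").fetchall()

# "Print the student name and course ID where the student received an 'A'"
a_grades = conn.execute(
    """
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.studid AND E.grade = 'A'
    """
).fetchall()

print(sorted(under_18))
print(a_grades)
```

Rows with a NULL age are not returned by the first query, because a comparison with NULL is never true in SQL.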

Other SQL features
- MIN, MAX, AVG
  - Find the highest grade in the fall database course
- COUNT, DISTINCT
  - How many students enrolled in CS courses in the fall?
- ORDER BY, GROUP BY
  - Rank students by their grade in the fall database course
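A short `sqlite3` sketch of these features, run on the S1 and E example tables rather than on the course data the slide mentions (which is not part of this deck). The table layout is simplified and the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), gpa REAL)")
conn.execute("CREATE TABLE Enrolled (studid CHAR(20), cid CHAR(20), grade CHAR(20))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        ("50000", "Dave", 3.3), ("53666", "Jones", 3.4), ("53688", "Smith", 3.2),
        ("53650", "Smith", 3.8), ("53831", "Madayan", 1.8), ("53832", "Guldu", 2.0),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"), ("53832", "Reggae203", "B"),
        ("53650", "Topology112", "A"), ("53666", "History105", "B"),
    ],
)

# MIN, MAX: lowest and highest gpa.
gpa_range = conn.execute("SELECT MIN(gpa), MAX(gpa) FROM Students").fetchone()

# COUNT, DISTINCT: number of students and number of distinct names.
counts = conn.execute("SELECT COUNT(*), COUNT(DISTINCT name) FROM Students").fetchone()

# GROUP BY: number of enrollments per grade.
per_grade = conn.execute(
    "SELECT grade, COUNT(*) FROM Enrolled GROUP BY grade ORDER BY grade"
).fetchall()

# ORDER BY: rank students by gpa, highest first.
ranked = [name for (name,) in conn.execute("SELECT name FROM Students ORDER BY gpa DESC")]

print(gpa_range, counts, per_grade, ranked)
```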

Views
- A virtual table defined by a query over base tables
  - Single or multiple tables
- Security: “hide” certain attributes from users
  - Show students in each course but hide their grades
- Ease of use: an expression that is more intuitive to the user
- Views can be materialized to improve query performance

Views
Suppose we often need names of students who got a ‘B’ in some course:

CREATE VIEW B_Students(name, sid, course) AS
    SELECT S.name, S.sid, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.studid AND E.grade = 'B'

name   sid    course
Jones  53666  History105
Guldu  53832  Reggae203
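The view can be created and queried with `sqlite3` as a quick check. This sketch assumes the student name column is called `name`, as in the Students schema given earlier, and uses simplified table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), gpa REAL)")
conn.execute("CREATE TABLE Enrolled (studid CHAR(20), cid CHAR(20), grade CHAR(20))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        ("50000", "Dave", 3.3), ("53666", "Jones", 3.4), ("53688", "Smith", 3.2),
        ("53650", "Smith", 3.8), ("53831", "Madayan", 1.8), ("53832", "Guldu", 2.0),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"), ("53832", "Reggae203", "B"),
        ("53650", "Topology112", "A"), ("53666", "History105", "B"),
    ],
)

# The view stores the query, not the rows; it is evaluated when queried.
conn.execute(
    """
    CREATE VIEW B_Students(name, sid, course) AS
        SELECT S.name, S.sid, E.cid
        FROM Students S, Enrolled E
        WHERE S.sid = E.studid AND E.grade = 'B'
    """
)

b_students = conn.execute("SELECT * FROM B_Students").fetchall()
print(sorted(b_students))

# The view exposes no grade or gpa column, so users of it never see them.
view_columns = [d[0] for d in conn.execute("SELECT * FROM B_Students").description]
print(view_columns)
```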

Indexes
- Idea: speed up access to desired data
- Example: “Find all students with gpa > 3.3”

Types of Indexes
- Clustered vs. Unclustered
  - Clustered: ordering of data records is the same as the ordering of data entries in the index
  - Unclustered: data records are in a different order from the index
- Primary vs. Secondary
  - Primary: index on fields that include the primary key
  - Secondary: other indexes

Example: Clustered Index
Data records sorted by sid; index on sid

Index entries: 50000, 53600, 53800

sid    name     gpa
50000  Dave     3.3
53650  Smith    3.8
53666  Jones    3.4
53688  Smith    3.2
53831  Madayan  1.8
53832  Guldu    2.0

Example: Unclustered Index
Data records sorted by sid; index on gpa

Index entries: 1.8, 2.0, 3.2, 3.3, 3.4, 3.8

sid    name     gpa
50000  Dave     3.3
53650  Smith    3.8
53666  Jones    3.4
53688  Smith    3.2
53831  Madayan  1.8
53832  Guldu    2.0
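The idea of an unclustered index can be imitated in plain Python: the records stay sorted by sid, and a separate sorted list of (gpa, position) entries points back into them. This is a toy model, not how a DBMS stores an index, and the names `gpa_index` and `students_with_gpa_above` are invented for the sketch.

```python
from bisect import bisect_right

# Data records, sorted by sid as on the slide.
table = [
    (50000, "Dave", 3.3),
    (53650, "Smith", 3.8),
    (53666, "Jones", 3.4),
    (53688, "Smith", 3.2),
    (53831, "Madayan", 1.8),
    (53832, "Guldu", 2.0),
]

# Index entries sorted by gpa; each one holds the position of its record.
gpa_index = sorted((row[2], pos) for pos, row in enumerate(table))
gpa_keys = [gpa for gpa, _ in gpa_index]


def students_with_gpa_above(threshold):
    """Binary-search the index, then follow the positions into the table."""
    start = bisect_right(gpa_keys, threshold)
    return [table[pos] for _, pos in gpa_index[start:]]


print(students_with_gpa_above(3.3))
```

The matching records sit at scattered positions in the table. In an unclustered index each match may cost a separate record fetch, whereas in a clustered index the matches lie next to each other.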

Comments on Indexes
- Indexes can significantly speed up query execution
- But inserts become more costly
- May have high storage overhead
- Need to choose attributes to index wisely!
  - What queries are run most frequently?
  - What queries could benefit most from an index?
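In SQL an index is usually created with CREATE INDEX, and most systems can report whether a query uses it. The sketch below uses `sqlite3`. The index name `idx_students_gpa` is made up, and the text printed by EXPLAIN QUERY PLAN differs between SQLite versions, so it is shown rather than relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        ("50000", "Dave", 3.3), ("53650", "Smith", 3.8), ("53666", "Jones", 3.4),
        ("53688", "Smith", 3.2), ("53831", "Madayan", 1.8), ("53832", "Guldu", 2.0),
    ],
)

# Secondary index on gpa (hypothetical name).
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")

query = "SELECT sid FROM Students WHERE gpa > 3.3"

# Ask SQLite how it intends to run the query; the wording is version-dependent.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)

result = sorted(conn.execute(query).fetchall())
print(result)

index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Students'"
    )
]
```

The index changes how the query is executed, not what it returns. Dropping it would give the same rows, only found by scanning the whole table.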

Summary: Why are RDBMS useful?
- Data independence: provides an abstract view of the data, without details of storage
- Efficient data access: uses techniques to store and retrieve data efficiently
- Reduced application development time: many important functions already supported
- Centralized data administration
- Data integrity and security
- Concurrency control and recovery

So, why don’t scientists use them?
“I tried to use databases in my project, but they were just too [slow | hard-to-use | expensive | complex]. So I use files.”
Gray and Szalay, Where Rubber Meets the Sky: Bridging the Gap Between Databases and Science

Some other limitations of RDBMS
- Arrays
- Hierarchical data

Example: Taxonomy of Organisms
- Hierarchy of categories: kingdom - phylum - class - order - family - genus - species
- How would you design a relational schema for this?

Animals
  Chordates
    Vertebrates
      birds
      reptiles
      mammals
  Arthropods
    insects
    spiders
    crustaceans
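One common answer, offered here as a sketch and not as the design the slides intend, is an adjacency list: each row stores a reference to its parent, and a recursive query walks the tree. The table name `Taxon`, its columns, and the parent-child links (read off the diagram) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Adjacency list: every taxon points at its parent; the root has no parent.
conn.execute(
    """
    CREATE TABLE Taxon (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES Taxon(id)
    )
    """
)
conn.executemany(
    "INSERT INTO Taxon VALUES (?, ?, ?)",
    [
        (1, "Animals", None),
        (2, "Chordates", 1),
        (3, "Arthropods", 1),
        (4, "Vertebrates", 2),
        (5, "insects", 3),
        (6, "spiders", 3),
        (7, "crustaceans", 3),
        (8, "birds", 4),
        (9, "reptiles", 4),
        (10, "mammals", 4),
    ],
)


def subtree(root_name):
    """Return the names of root_name and all of its descendants, sorted."""
    query = """
        WITH RECURSIVE descendants(id, name) AS (
            SELECT id, name FROM Taxon WHERE name = ?
            UNION ALL
            SELECT t.id, t.name
            FROM Taxon t JOIN descendants d ON t.parent_id = d.id
        )
        SELECT name FROM descendants
    """
    return sorted(name for (name,) in conn.execute(query, (root_name,)))


print(subtree("Arthropods"))
```

The recursive query is needed because the depth of the hierarchy is not fixed in the schema, which illustrates why hierarchical data is listed as awkward for relational systems.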