Advanced Computing Data Bases: Basic concepts. Based in part on open access material from: Database Management Systems by Raghu Ramakrishnan and Johannes Gehrke.


Advanced Computing Data Bases: Basic concepts
Based in part on open access material from: Database Management Systems by Raghu Ramakrishnan and Johannes Gehrke; Databases by Timothy Griffin; and Scientific Data Management by Laura Bright & Bill Howe.

What is a database?
"A database is a collection of information stored (usually in a computing system) in a systematic way, such that users (usually a computer program) can consult it to answer questions." (See Wikipedia, The Free Encyclopedia.)
Databases are found in almost every medium to large system in the world today, because they let an application keep working as the volume of data grows. As the expectations of users and the complexity of applications continue to increase, most application systems need a database to store and manage information.
Databases have grown larger and more powerful as technology has advanced. It is now common to have databases that are terabytes in size and hosted on multiple servers.
Data is stored in database tables, and SQL is used to access or modify this data.

Database management systems
A database is a collection of (usually persistent) data that models the entities of interest by means of their properties.
To handle large amounts of data, computers use a Database Management System (DBMS): a software system that supports the creation, population, and querying of a database.
Aims in the access to data: efficiency, concurrency, security.

Database Tables
Tables are the basic structures used to store data within a database. Each table consists of rows and columns. Rows are called records (the entities to be modelled), and columns are called fields (their properties of interest). As the number of records grows, more rows are added to the table.
A record is a collection of fields that describe the relevant aspects of a single unit of data stored in the table. Each column holds the values of one field across the different records and has an associated data type.

Data types
Each field in a table has an associated data type. The type fixes how the field is represented and stored in binary form inside the computing system. This makes it possible to locate a record efficiently in the serialized version of the table.
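
To make the link between fixed data types and fast record access concrete, here is a minimal sketch in Python. It is not how any particular DBMS stores its tables: the field widths, the `RECORD` layout, the helper names, and the three sample rows are all assumptions chosen for illustration. The point is only that when every field has a fixed binary size, record number i starts at byte i × (record size).

```python
import struct

# Hypothetical fixed-width layout for one Students record (an assumption):
# sid: 8 bytes, name: 20 bytes, login: 16 bytes, age: 4-byte int, gpa: 8-byte float.
# "<" means little-endian with no padding between fields.
RECORD = struct.Struct("<8s20s16sid")

def pack_student(sid, name, login, age, gpa):
    # struct pads short byte strings with zero bytes up to the declared width.
    return RECORD.pack(sid.encode(), name.encode(), login.encode(), age, gpa)

def read_record(table_bytes, i):
    # Every record has the same size, so record i starts at offset i * RECORD.size.
    sid, name, login, age, gpa = RECORD.unpack_from(table_bytes, i * RECORD.size)
    strip = lambda raw: raw.rstrip(b"\0").decode()
    return (strip(sid), strip(name), strip(login), age, gpa)

# Made-up rows, serialized one after another into a single byte string.
table = b"".join([
    pack_student("s001", "Ann", "ann@cs", 19, 3.5),
    pack_student("s002", "Bo", "bo@cs", 20, 3.25),
    pack_student("s003", "Cy", "cy@cs", 21, 2.75),
])

# Jump straight to the third record without reading the first two.
third = read_record(table, 2)
print(RECORD.size, third)
```

With variable-length fields this direct offset computation would no longer work, which is one reason the type of each field is declared up front.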

Relationships
As the data in a database grows, it may be more efficient to store it in several tables.
When data types become more complex, or when the properties attached to a field are better expressed as records in another table, the records of that other table are referenced from fields in the records of the first table.
The referencing mechanism works by defining one of the fields of the referenced record as a key. The referencing table then stores the value of that key in one of its own fields.

Relational Database
When data needs to be expressed in several related tables, we call it a relational database. The software used to manage it is a Relational Database Management System (RDBMS).
A relational database consists of a number of tables and a single schema (the definition of the tables and their attributes).
Example: Students (sid, name, login, age, gpa)
- Students identifies the table
- sid, name, login, age, gpa identify the attributes
- sid is the primary key

Programming over Database Management Systems (DBMSs)
Layers, from top to bottom:
- Applications go here
- DBMS
- Raw resources (and their binary code)
Database abstractions allow the interface between these layers to be cleanly defined, which lets applications and data management systems be implemented separately.

Three-level architecture
- External level: External Schema 1, External Schema 2, …, External Schema n
- Conceptual level: Conceptual Schema
- Physical level: Internal Schema
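
One way to picture the three levels is with a table and a view in SQLite, driven from Python. This is only a sketch under assumptions: the view name `StudentDirectory` and the sample row are invented here, and SQLite is used simply because it ships with Python. The table definition plays the role of the conceptual schema, the view plays the role of one external schema, and the physical level is whatever file or memory layout SQLite chooses, which the code never touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical definition of the data.
conn.execute(
    "CREATE TABLE Students "
    "(sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.execute("INSERT INTO Students VALUES ('s001', 'Ann', 'ann@cs', 19, 3.5)")  # made-up row

# External level: one of possibly many user-specific views.
# This hypothetical directory view hides age and gpa.
conn.execute("CREATE VIEW StudentDirectory AS SELECT sid, name, login FROM Students")

cur = conn.execute("SELECT * FROM StudentDirectory")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
print(view_columns, view_rows)
```

An application written against `StudentDirectory` keeps working if columns are later added to `Students`, which is the kind of independence the layered architecture is meant to provide.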

An Example Table
Students (sid: string, name: string, login: string, age: integer, gpa: real)
Columns: sid | name | login | age | gpa
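
The Students schema above could be declared roughly as follows. This is a sketch using Python's built-in `sqlite3` module, not the tool used in the original course; the mapping of string/integer/real to SQLite's `TEXT`/`INTEGER`/`REAL` and all the sample rows are assumptions for illustration. It also shows what declaring sid as the primary key buys: a second record with the same sid is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid   TEXT PRIMARY KEY,   -- string, unique identifier
        name  TEXT,               -- string
        login TEXT,               -- string
        age   INTEGER,            -- integer
        gpa   REAL                -- real
    )
""")

# Made-up records; each tuple is one row of the table.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [("s001", "Ann", "ann@cs", 19, 3.5),
     ("s002", "Bo", "bo@cs", 20, 3.25)],
)

# A second record with an existing sid violates the primary key.
try:
    conn.execute("INSERT INTO Students VALUES ('s001', 'Dup', 'dup@cs', 20, 3.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT sid, name FROM Students ORDER BY sid").fetchall()
print(duplicate_rejected, rows)
```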

Another example: Courses
Courses (cid, instructor, quarter, dept)
cid          instructor  quarter    dept
Carnatic101  Jane        Fall 06    Music
Reggae203    Bob         Summer 06  Music
Topology101  Mary        Spring 06  Math
History105   Alice       Fall 06    History

Keys
Primary key: a minimal subset of fields that uniquely identifies a tuple
- sid is the primary key for Students
- cid is the primary key for Courses
Foreign key: a field that refers to the primary key of another table, creating a connection between tables
Courses (cid, instructor, quarter, dept)
Students (sid, name, login, age, gpa)
How do we express which students take each course?

Many to many relationships
In general, we need a new table: Enrolled (cid, grade, studid)
studid is a foreign key that references sid in the Students table; in the same way, cid refers to a course in Courses.
Enrolled:
cid          grade  studid
Carnatic101  C      53831
Reggae203    B      53832
Topology112  A      53650
History105   B      53666
Students:
sid | name | login
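
The Enrolled table can be sketched in SQLite as below. The course ids and student ids are taken from the slides, but the student names, the third enrollment row, and the reduced column lists are assumptions made to keep the example short. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce references
conn.executescript("""
    CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, instructor TEXT);
    CREATE TABLE Enrolled (
        cid    TEXT REFERENCES Courses(cid),
        grade  TEXT,
        studid TEXT REFERENCES Students(sid),
        PRIMARY KEY (cid, studid)          -- one grade per student per course
    );
""")
conn.executemany("INSERT INTO Students VALUES (?, ?)",
                 [("53831", "Student A"), ("53832", "Student B")])  # names are made up
conn.executemany("INSERT INTO Courses VALUES (?, ?)",
                 [("Carnatic101", "Jane"), ("Reggae203", "Bob")])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("Carnatic101", "C", "53831"),
                  ("Reggae203", "B", "53832"),
                  ("Reggae203", "A", "53831")])  # last row is an assumption

# An enrollment for a student that does not exist is rejected.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('Reggae203', 'A', '99999')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# "Which students take each course?" becomes a join through Enrolled.
who_takes_what = conn.execute("""
    SELECT c.cid, s.name
    FROM Enrolled e
    JOIN Students s ON s.sid = e.studid
    JOIN Courses  c ON c.cid = e.cid
    ORDER BY c.cid, s.name
""").fetchall()
print(dangling_rejected, who_takes_what)
```

Because each student can appear in many Enrolled rows and each course can too, the single extra table captures the many-to-many relationship without duplicating student or course data.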

Indexes
When a table will be searched using different fields, it is convenient to have its records pre-ordered according to those fields. Since the records can be stored in only one order, quick access is achieved through an additional table that links the ordering values (fields) to the keys of the records in the related table.
Idea: speed up access to the desired data.
- "Find all students with gpa > 3.3" may need to scan the entire table if there is no index
- An index consists of a set of entries, one per search key, pointing to the locations of the matching records
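
The idea of an index as "an additional table ordered by the search field" can be sketched in plain Python. This is a toy model, not the B-tree or hash structures a real DBMS would use; the `students` dictionary, its values, and the function names are assumptions for illustration.

```python
import bisect

# The "table": sid -> (name, gpa). All values are made up.
students = {
    "s1": ("Ann", 3.9),
    "s2": ("Bo", 3.3),
    "s3": ("Cy", 2.8),
    "s4": ("Di", 3.4),
}

# The "index": entries (gpa, sid) kept sorted by gpa, i.e. search key -> record key.
gpa_index = sorted((gpa, sid) for sid, (_, gpa) in students.items())
gpa_keys = [gpa for gpa, _ in gpa_index]

def scan_gpa_above(threshold):
    # Without an index, every record must be examined.
    return sorted(sid for sid, (_, gpa) in students.items() if gpa > threshold)

def index_gpa_above(threshold):
    # Binary search finds the first entry with gpa > threshold;
    # everything after it qualifies, so no other record is touched.
    start = bisect.bisect_right(gpa_keys, threshold)
    return sorted(sid for _, sid in gpa_index[start:])

print(index_gpa_above(3.3))
```

Both functions return the same answer; the difference is that the scan does work proportional to the table size, while the index lookup does a logarithmic search plus one step per matching record.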

Types of Indexes
Clustered vs. unclustered
- Clustered: the ordering of the data records is the same as the ordering of the data entries in the index
- Unclustered: the data records are in a different order from the index
Primary vs. secondary
- Primary: an index on fields that include the primary key
- Secondary: any other index

Example: Clustered Index
[Figure: table with columns sid, name, gpa, sorted by sid; names shown: Dave (sid 50000), Smith, Jones, Smith, Madayan, Guldu]

Example: Unclustered Index
[Figure: the same table with columns sid, name, gpa, sorted by sid, with a separate index on gpa; names shown: Dave (sid 50000), Smith, Jones, Smith, Madayan, Guldu]
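
The difference between the two examples can be checked with a small Python sketch. The records and helper names below are made up for illustration and are not the values from the slides. The table is stored sorted by sid; an index is clustered exactly when walking it in key order visits the stored records in their physical order.

```python
# Records as laid out in storage, sorted by sid (made-up values).
records = [
    (101, "Ann", 3.1),
    (102, "Bo", 3.9),
    (103, "Cy", 2.7),
    (104, "Di", 3.5),
]

def build_index(field):
    # Index entries are (search key, position of the record in storage),
    # sorted by search key.
    return sorted((rec[field], pos) for pos, rec in enumerate(records))

def is_clustered(index):
    # Clustered: following the index in order never jumps backwards in storage.
    positions = [pos for _, pos in index]
    return positions == sorted(positions)

sid_index = build_index(0)   # index on sid
gpa_index = build_index(2)   # index on gpa

print(is_clustered(sid_index), is_clustered(gpa_index))
print([pos for _, pos in gpa_index])
```

For the gpa index the positions come out in a scattered order, which is why a range query through an unclustered index may touch many different places in storage.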

Comments on Indexes
- Indexes can significantly speed up query execution
- But they make inserts more costly, because every index must be updated along with the table
- They may have high storage overhead
- Need to choose the attributes to index wisely!
  - What queries are run most frequently?
  - What queries could benefit most from an index?

Summary: Why are RDBMS useful?
- Data independence: provides an abstract view of the data, without details of storage
- Efficient data access: uses techniques to store and retrieve data efficiently
- Reduced application development time: many important functions are already supported
- Centralized data administration
- Data integrity and security
- Concurrency control and recovery

So, why don’t scientists use them?
"I tried to use databases in my project, but they were just too [slow | hard-to-use | expensive | complex]. So I use files."
Gray and Szalay, Where the Rubber Meets the Sky: Bridging the Gap Between Databases and Science
Some other limitations of RDBMS:
- Arrays
- Hierarchical data

Example: Taxonomy of Organisms
Hierarchy of categories: kingdom, phylum, class, order, family, genus, species
How would you design a relational schema for this?
Animals
  Chordates
    Vertebrates
      birds
      reptiles
      mammals
  Arthropods
    insects
    spiders
    crustaceans
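
One possible answer, offered only as a sketch, is an adjacency list: a single table in which each category stores the name of its parent. The table name `Taxon`, the column names, and the exact parent/child assignments below are assumptions (the tree is read as Animals > Chordates > Vertebrates > birds, reptiles, mammals, and Animals > Arthropods > insects, spiders, crustaceans). Walking up the hierarchy then needs a recursive query, which SQLite supports through `WITH RECURSIVE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each row points at its parent category; the root has no parent.
conn.execute(
    "CREATE TABLE Taxon (name TEXT PRIMARY KEY, parent TEXT REFERENCES Taxon(name))"
)
conn.executemany("INSERT INTO Taxon VALUES (?, ?)", [
    ("Animals", None),
    ("Chordates", "Animals"),
    ("Arthropods", "Animals"),
    ("Vertebrates", "Chordates"),
    ("birds", "Vertebrates"),
    ("reptiles", "Vertebrates"),
    ("mammals", "Vertebrates"),
    ("insects", "Arthropods"),
    ("spiders", "Arthropods"),
    ("crustaceans", "Arthropods"),
])

def lineage(name):
    # Follow parent links from the given category up to the root.
    rows = conn.execute("""
        WITH RECURSIVE chain(name, parent, depth) AS (
            SELECT name, parent, 0 FROM Taxon WHERE name = ?
            UNION ALL
            SELECT t.name, t.parent, chain.depth + 1
            FROM Taxon t JOIN chain ON t.name = chain.parent
        )
        SELECT name FROM chain ORDER BY depth
    """, (name,)).fetchall()
    return [r[0] for r in rows]

mammal_lineage = lineage("mammals")
print(mammal_lineage)
```

The need for a recursive query to answer a simple "what is above this node?" question illustrates why hierarchical data is listed among the awkward cases for relational systems.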