

Databases From A to Boyce Codd

What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in which multiple trajectories of interaction with the information are possible. When thinking about “the database” at this level of abstraction, a collection of discrete objects (e.g., a bunch of books) and a particular structured representation of those objects (e.g., a set of catalog records) are not so different. Manovich says exactly this.

What is a database? In contrast, in the computational universe, the difference between a set of files and a set of relational tables is significant, in terms of the operations that can be performed on each. For computational purposes, a database is a set of structured data; Brookshear includes the requirement that the data be structured in a multidimensional way, so that it can be presented “from a variety of perspectives.”

What is a database? Are these databases? In what way? The books in my office. The PCL. YouTube. The Internet Movie Database. The Web. What is useful about calling any of these things databases or not?

What is a database? Even in the computational universe, a “database” can exist at multiple levels of abstraction: a conceptual model of a database (as presented through entity-relationship diagrams) can be implemented via different logical models (e.g., as different tables in a relational database, or potentially as objects in an object database, or...), and these can be stored and accessed differently (for example, distributed over many servers)...

Data modeling with ER diagrams Entity-relationship diagrams depict a conceptual model (schema) of data by specifying entities (things), attributes (properties of the things) and relationships between the things. ERDs document semantics, not implementation. There are many forms of notation for ERDs (Chen, the optional reading, is an early one).

Data modeling with ER diagrams There is often no “right answer” when determining entities, attributes, and relationships; different entities might be defined, or entities might be defined as relationships instead. It can be difficult to predict consequences of one modeling choice vs. another. Similar decisions and consequences occur when translating a conceptual data model to a logical model for implementation (and then in the actual implementation).

Data modeling with ER diagrams Let’s take a simple example to play with ERDs. We want to model the idea that students take courses and instructors teach courses. One way to do this is to have three entities: students, courses, and instructors.

Silly ER example #1
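To make the three-entity model concrete, here is a minimal sketch using Python dataclasses. The attribute names (name, title) and the relationship lists (taught_by, taken_by) are assumptions for illustration; the slides only name the three entities and the two relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Student:
    name: str  # hypothetical attribute


@dataclass
class Instructor:
    name: str  # hypothetical attribute


@dataclass
class Course:
    title: str  # hypothetical attribute
    # "instructors teach courses" relationship
    taught_by: list[Instructor] = field(default_factory=list)
    # "students take courses" relationship
    taken_by: list[Student] = field(default_factory=list)


# Placeholder instances, not data from the slides
course = Course("Databases")
course.taught_by.append(Instructor("Instructor A"))
course.taken_by.append(Student("Student B"))
```

Note that this is already an implementation choice: the relationships live inside Course here, but they could just as well be stored separately.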

Data modeling with ER diagrams Now, we could also do this by having two entities: courses and people. In this model, people would have two roles, instructor and student (Chen has an example of a marriage as a relationship between two people).

Silly ER example #2
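The two-entity version can be sketched the same way. In this sketch the role is recorded on each person-course relationship, so one person can be an instructor in one course and a student in another. The names Person, participants, and people_in_role are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    name: str  # hypothetical attribute


@dataclass
class Course:
    title: str
    # Each relationship instance pairs a person with the role they play
    participants: list[tuple[Person, str]] = field(default_factory=list)

    def people_in_role(self, role: str) -> list[Person]:
        """Return everyone related to this course in the given role."""
        return [person for person, r in self.participants if r == role]


# Placeholder data: the same person plays different roles in different courses
person_a = Person("Person A")
databases = Course("Databases")
cataloging = Course("Cataloging")
databases.participants.append((person_a, "instructor"))
cataloging.participants.append((person_a, "student"))
```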

Data modeling with ER diagrams If we are interested in capturing other information about students and instructors besides their names, then we probably want to separate them (e.g., we might want to track how many credits students have and how many requirements they have completed, and these don’t apply to instructors). If there were overlap between students and instructors, we might have some redundancy going on. We might want to keep this in mind.

Back to silly ER example #1

Data modeling with ER diagrams Say we want to add the idea of grades to this model. Instructors assess student performance in courses by assigning them grades. How might we model grades? Are they entities? Are they attributes of some existing entity? Are they a relationship between entities? Let’s think about it.

Does this model work?
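One possible answer, not necessarily the one in the diagram, is to treat a grade as an attribute of the relationship between a student and a course, since the same student gets different grades in different courses. A small sketch under that assumption, with made-up names and grades:

```python
# Grade stored on the (student, course) relationship.
# All names and grades below are hypothetical.
enrollments = {
    ("Student A", "Databases"): "A",
    ("Student A", "Cataloging"): "B",
    ("Student B", "Databases"): "B",
}


def grades_for(student: str) -> dict[str, str]:
    """Collect one student's grades, keyed by course."""
    return {
        course: grade
        for (name, course), grade in enrollments.items()
        if name == student
    }
```

If grade were an attribute of the student alone, there would be nowhere to say which course it belongs to.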

From ERD to database Databases can be implemented in different ways. Brookshear describes relational databases and object-oriented databases. In each case, the database would be implemented in a database management system (DBMS), which would hide details of the actual data composition and storage and such from application programs that use the data.

Relational databases In a relational database, entities are described as rows in a table. The table is called a relation. The table columns are attributes. Each table row is also called a tuple.
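The vocabulary maps directly onto what a relational DBMS shows you. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the student table and its columns are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table is the relation; its columns are the attributes
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, name TEXT, credits INTEGER)"
)

# Each inserted row is a tuple describing one entity (placeholder data)
conn.executemany(
    "INSERT INTO student (name, credits) VALUES (?, ?)",
    [("Student A", 12), ("Student B", 30)],
)

cursor = conn.execute("SELECT * FROM student ORDER BY student_id")
attributes = [column[0] for column in cursor.description]
tuples = cursor.fetchall()
```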

A relation Here is a relation that contains information about peer reviewers for an academic conference. There are some problems with this relation.

A relation One immediate problem is that we have multiple values for a single attribute. That’s going to make it hard to figure out how many papers are really assigned to Barb Chen (is that one number or two numbers?), or to reassign papers, etc.
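A rough sketch of why the multi-valued cell hurts. Only the reviewer name Barb Chen comes from the slides; the schools, cities, and paper numbers are placeholders, and storing the papers as a comma-separated string is an assumption about how the cell is filled in.

```python
# One tuple per reviewer, with several paper numbers crammed into one cell
reviewers = [
    {"name": "Barb Chen", "school": "School X", "city": "City X", "papers": "12, 47"},
    {"name": "Reviewer B", "school": "School Y", "city": "City Y", "papers": "30"},
]

barb_rows = [row for row in reviewers if row["name"] == "Barb Chen"]

# Counting rows says 1, which is not the number of papers assigned
row_count = len(barb_rows)

# To get the real answer we have to parse the cell ourselves
paper_count = sum(
    len([p for p in row["papers"].split(",") if p.strip()]) for row in barb_rows
)
```

The database can count rows for us, but it cannot look inside the cell, so every query about papers needs custom parsing.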

Revised relation We can fix this by making additional tuples so that no cell has multiple values. But then we have some redundancy in the relation; every time we assign a new paper, we need to input that name, school and city information as well. Redundancy can cause problems with data integrity.
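Here is a small sketch of how that redundancy can damage integrity. The values are placeholders (only the name Barb Chen is from the slides): with one tuple per paper, a change made to one row and missed in another leaves the relation contradicting itself.

```python
# One tuple per assigned paper, so reviewer details repeat
rows = [
    {"name": "Barb Chen", "school": "School X", "city": "City X", "paper": 12},
    {"name": "Barb Chen", "school": "School X", "city": "City X", "paper": 47},
]

# Counting papers is now easy
paper_count = sum(1 for row in rows if row["name"] == "Barb Chen")

# But an update applied to only one of the duplicated rows...
rows[0]["school"] = "School Z"

# ...leaves two different answers to "which school is Barb Chen at?"
schools = {row["school"] for row in rows if row["name"] == "Barb Chen"}
```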

Revised relations To fix this redundancy problem, we need to split up the table into multiple tables and relate each table with a “foreign key” that links back to the original table. Then we can also include more information about the papers, which is probably necessary anyway.

Revised relations
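The split can be sketched with sqlite3 as follows. The table and column names are assumptions, and the rows are placeholders. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.executescript("""
CREATE TABLE reviewer (
    reviewer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    school      TEXT,
    city        TEXT
);
CREATE TABLE paper (
    paper_id    INTEGER PRIMARY KEY,
    title       TEXT,
    -- foreign key linking each paper back to its reviewer
    reviewer_id INTEGER NOT NULL REFERENCES reviewer(reviewer_id)
);
""")

# Reviewer details are entered once, however many papers are assigned
conn.execute("INSERT INTO reviewer VALUES (1, 'Barb Chen', 'School X', 'City X')")
conn.executemany(
    "INSERT INTO paper VALUES (?, ?, ?)",
    [(12, "Paper 12", 1), (47, "Paper 47", 1)],
)
count = conn.execute(
    "SELECT COUNT(*) FROM paper WHERE reviewer_id = 1"
).fetchone()[0]

# The DBMS refuses a paper that points at a reviewer who does not exist
try:
    conn.execute("INSERT INTO paper VALUES (99, 'Orphan paper', 42)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```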

There’s still a problem with that original table, though; the city attribute isn’t really about the reviewer, is it? It’s about the school. It might not be likely that schools will change cities, but it’s still good practice to keep all the attributes in a table about the entity specified by the primary key (in this case, the reviewer).

Revised relations
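A sketch of moving city out of the reviewer table, again with assumed table names and placeholder rows. Once city lives with the school, a change of city is a single update that every reviewer at that school picks up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE school (
    school_id INTEGER PRIMARY KEY,
    name      TEXT,
    city      TEXT               -- city is a fact about the school
);
CREATE TABLE reviewer (
    reviewer_id INTEGER PRIMARY KEY,
    name        TEXT,
    school_id   INTEGER REFERENCES school(school_id)
);
INSERT INTO school VALUES (1, 'School X', 'City X');
INSERT INTO reviewer VALUES (1, 'Barb Chen', 1), (2, 'Reviewer B', 1);
""")

# One update, in one place
conn.execute("UPDATE school SET city = 'City Z' WHERE school_id = 1")

cities = conn.execute("""
    SELECT r.name, s.city
    FROM reviewer AS r
    JOIN school AS s ON s.school_id = r.school_id
    ORDER BY r.reviewer_id
""").fetchall()
```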

So that’s good. But there was still redundancy in those Paper and Expertise tables. Also we have deletion problems...if a paper isn’t assigned to a reviewer, does that mean it can’t exist in the database? How would we fix that?

Revised relations
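One common way to handle the deletion problem, which may or may not match the tables on the slide, is to keep papers in their own table and record assignments in a separate linking table. The table names and rows below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviewer (reviewer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE paper    (paper_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE assignment (
    paper_id    INTEGER REFERENCES paper(paper_id),
    reviewer_id INTEGER REFERENCES reviewer(reviewer_id),
    PRIMARY KEY (paper_id, reviewer_id)
);
INSERT INTO reviewer VALUES (1, 'Barb Chen');
INSERT INTO paper VALUES (12, 'Paper 12'), (47, 'Paper 47');
INSERT INTO assignment VALUES (12, 1);
""")

# Paper 47 exists even though nobody has been assigned to it
unassigned = [row[0] for row in conn.execute("""
    SELECT p.paper_id
    FROM paper AS p
    LEFT JOIN assignment AS a ON a.paper_id = p.paper_id
    WHERE a.reviewer_id IS NULL
""")]

# Removing an assignment does not delete the paper
conn.execute("DELETE FROM assignment WHERE paper_id = 12")
remaining = conn.execute("SELECT COUNT(*) FROM paper").fetchone()[0]
```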

Normalization Wowee, that’s a lot of tables there! Do we really want to do that? Yes. And no. On the one hand, it’s good to minimize data redundancy, because this can lead to problems with updating. On the other hand, performing lots of Join operations to put information together again can be inefficient, from a performance perspective.

Database operations Databases are powerful because we can reassemble data in myriad ways. Basic relational database operations include Select, Project, and Join.

Database operations Select extracts rows (tuples) from a relation. Project extracts columns (attribute values). Join combines multiple relations into a new relation.
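The three operations are simple enough to sketch in plain Python, treating a relation as a list of dictionaries. This is a teaching sketch, not how a DBMS implements them; the function names and sample rows are assumptions, and the join shown matches on a single shared attribute.

```python
def select(relation, predicate):
    """Select: keep the rows (tuples) that satisfy a condition."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Project: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def join(left, right, attribute):
    """Join: combine rows from two relations that agree on one attribute."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if l[attribute] == r[attribute]
    ]


# Placeholder relations
reviewers = [
    {"reviewer_id": 1, "name": "Barb Chen"},
    {"reviewer_id": 2, "name": "Reviewer B"},
]
papers = [
    {"paper_id": 12, "reviewer_id": 1},
    {"paper_id": 47, "reviewer_id": 1},
    {"paper_id": 30, "reviewer_id": 2},
]

joined = join(reviewers, papers, "reviewer_id")
barb = select(joined, lambda row: row["name"] == "Barb Chen")
result = project(barb, ["paper_id"])
```

Each operation takes relations in and gives a relation back, which is why they can be chained.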

Query languages Database query languages, such as SQL, may implement the basic operations in different ways. A SQL statement may perform all three operations—Join, Select, and Project.
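For instance, a single SQL query can do all three at once. The sketch below runs one through sqlite3 against assumed reviewer and paper tables with placeholder rows; the comments mark which clause plays which role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviewer (reviewer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE paper (
    paper_id    INTEGER PRIMARY KEY,
    title       TEXT,
    reviewer_id INTEGER REFERENCES reviewer(reviewer_id)
);
INSERT INTO reviewer VALUES (1, 'Barb Chen'), (2, 'Reviewer B');
INSERT INTO paper VALUES
    (12, 'Paper 12', 1), (47, 'Paper 47', 1), (30, 'Paper 30', 2);
""")

query = """
SELECT p.title                                     -- Project: one column
FROM reviewer AS r
JOIN paper AS p ON p.reviewer_id = r.reviewer_id   -- Join: two relations
WHERE r.name = ?                                   -- Select: matching rows
ORDER BY p.paper_id
"""
titles = [row[0] for row in conn.execute(query, ("Barb Chen",))]
```

Confusingly, the SQL keyword SELECT mostly does the job of Project, while WHERE does the job of the relational Select.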

Wait, what about objects? Indeed, an object-oriented database is a different model from a relational one.

Object-oriented databases Object-oriented databases can be: More flexible. Better integrated with applications. However, there are many standard tools for managing relational databases, which can make them easier to administer.

Very large databases For tremendous datasets such as Facebook’s, performance and storage become tremendously important. Facebook wrote its own system, Cassandra, that is optimized for distributed storage and speed of “massive” data. (This is a “NoSQL” database model.) “Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format.”

So...what then? Is learning MySQL stupid? No; but it’s important to remember where MySQL lies in the realm of “the database.” For example, conceptual models may be stable where logical models and implementation details change.