Chitu Okoli Associate Professor in Business Technology Management


BTM 382 Database Management Chapter 2: Data models Chapter 12-12: CAP Chapter 14-2a: Hadoop Chitu Okoli Associate Professor in Business Technology Management John Molson School of Business, Concordia University, Montréal

Structure of BTM 382 Database Management. Week 1: Introduction and overview (ch1: Introduction). Weeks 2-6: Database design (ch3: Relational model; ch4: ER modeling; ch6: Normalization; ERD modeling exercise; ch5: Advanced data modeling). Week 7: Midterm exam. Weeks 8-10: Database programming (ch7: Intro to SQL; ch8: Advanced SQL; SQL exercises). Weeks 11-13: Database management (ch2,12,14: Data models; ch13: Business intelligence and data warehousing; ch9,14,15: Selected managerial topics).

Review of Chapters 2, 12, 14: Data models What is a data model? How have data models developed over the years? What is the Object-Oriented Data Model (OODM), and when is it useful? What is Big Data, and how does NoSQL resolve the major Big Data challenges? Which data models should we use for which situations?

Models and data models

What is a model? A model is a simplified way to describe or explain a complex reality. A model helps people communicate and work simply yet effectively when talking about and manipulating complex real-world phenomena.

Scientific models Image sources: http://www.redorbit.com/education/reference_library/space_1/universe/2574692/geocentric_model/ http://hendrianusthe.wordpress.com/2012/06/21/heliocentric-vs-geocentric/

Conceptual models Image sources: http://info563.malagaclasses.info/strategy-it-2/ http://fivewhys.wordpress.com/2012/05/22/business-model-innovation/

Importance of Data Models. Data models: are a communication tool; give an overall view of the database; organize data for various users; are an abstraction for the creation of a well-designed database.

The Evolution of Data Models

Obsolete models: Hierarchical and network models

The Relational Model. Uses key concepts from mathematical relations (tables). “Relational” in “relational model” refers to “tables” (mathematical relations), not “relationships”. Table (relation): intersections of rows (each holding values of various data types) and columns (each holding values of a single data type). Relations have well-defined operations (queries) for combining their data. Selecting (reading) and joining (combining) data are defined on mathematical principles. Implemented in a relational database management system (RDBMS). The relational model was originally too demanding for 1970s computing power; as computing power increased, the simplicity of the model prevailed.
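To make "selecting" and "joining" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and invoice tables, their columns, and the rows are hypothetical and invented only for illustration; they do not come from the course material.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE invoice (
        inv_id  INTEGER PRIMARY KEY,
        cust_id INTEGER NOT NULL REFERENCES customer(cust_id),
        total   REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO invoice VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 80.0);
""")

# Selecting: read only the rows that satisfy a condition.
big_invoices = conn.execute(
    "SELECT inv_id, total FROM invoice WHERE total > 40 ORDER BY inv_id"
).fetchall()

# Joining: combine rows from two tables on a shared column.
joined = conn.execute("""
    SELECT c.name, i.inv_id, i.total
    FROM customer AS c
    JOIN invoice AS i ON i.cust_id = c.cust_id
    ORDER BY i.inv_id
""").fetchall()

print(big_invoices)
print(joined)
```

Each column holds one data type, each row mixes several, and the join is expressed declaratively rather than as a hand-written loop.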

The Entity Relationship Model. Enhancement of the relational model. Relations (tables) become entities. Very detailed specification of relationships and their properties. Entity relationship diagram (ERD): uses graphic representations to model database components. Many variations for notation exist; in this class, we use the Crow’s Foot notation.

The Object-Oriented Data Model (OODM)

The Object-Oriented Data Model (OODM). Tries to reconcile the ER model with object-oriented programming (OOP). The ER model’s view of data (tables) and programmers’ view of data (objects in OOP) are completely different. This mismatch can sometimes make database programming painful, especially for very complex data structures. An OODM uses OOP concepts to store data: objects represent nouns (entities or records); objects have attributes (properties or fields) with values (data); objects have methods (operations or functions); classes group similar objects using a hierarchy and inheritance. In an OODBMS, data retrieval and storage closely mirror the data structures that programmers use, so programming complex objects is much easier than with the ER model. More advanced forms support the Extended Relational Data Model, Object/Relational DBMS, and XML data structures.
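The OOP concepts listed above (objects, attributes, methods, classes, inheritance) can be sketched in plain Python. The Vehicle and Truck classes and their fields are hypothetical examples, and this sketch shows only the programmer's view of the data, not how any particular OODBMS stores it.

```python
class Vehicle:
    """A class groups similar objects; each instance is one object (entity)."""

    def __init__(self, plate, mileage):
        # Attributes (properties or fields) with values (data).
        self.plate = plate
        self.mileage = mileage

    def drive(self, km):
        # A method (operation) that changes the object's own data.
        self.mileage += km
        return self.mileage


class Truck(Vehicle):
    """Inheritance: a Truck is a Vehicle with one extra attribute."""

    def __init__(self, plate, mileage, payload_kg):
        super().__init__(plate, mileage)
        self.payload_kg = payload_kg


truck = Truck("ABC 123", 1000, 5000)
new_mileage = truck.drive(50)
print(truck.plate, new_mileage, truck.payload_kg)
```

In a relational design the same data would typically be split across tables and reassembled with joins, which is the mismatch the slide describes.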

OODBMS vs. RDBMS https://youtu.be/kORTgvfHl4g


Big Data and NoSQL

Explaining Big Data https://youtu.be/7D1CQ_LOizA

Big Data: Volume, Velocity, Variety. Volume: huge amounts of data (terabytes and petabytes), especially from the Internet. Velocity: organizations need to process the huge amounts of data rapidly, just as fast as with smaller databases. Variety: many different types of data, much of it unstructured and even changing in structure.

How do you handle Big Data? Where RDBMSs run into trouble. Solution 1: Scale up, that is, use more powerful, expensive servers. But RDBMSs are very computationally intensive: Big Data would require much faster, more capable, more expensive computers, and even that is not good enough. Solution 2: Scale out, that is, use many cheap distributed servers. But an RDBMS is slow with distributed processing. Consistency is the biggest problem: guaranteeing consistency (which an RDBMS is great at) is slow, and slow infrastructure is not good enough for Big Data.

What is NoSQL? https://youtu.be/qUV2j3XBRHc

NoSQL databases to the Big Data rescue. “NoSQL” means non-relational or non-RDBMS; it is also read as “Not only SQL”, since a few in fact do support SQL. It is not one model; it is many different models that are not relational data models. Scale out (many cheap distributed servers) instead of scale up. High scalability: support distributed database architectures. High availability: rapid performance for big data, including unstructured and sparse data. Fault tolerance: continue to work even if some servers in the cluster fail. The emphasis is on high performance (speed) rather than transaction consistency.
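One common way to scale out is to spread keys across servers by hashing. The slides do not name a specific partitioning scheme, so the sketch below is an assumption: ShardedStore and node_for are hypothetical names, and each "node" is just an in-memory dictionary standing in for a cheap server.

```python
import hashlib


def node_for(key, node_count):
    """Map a key to a node index with a stable hash (an illustrative choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class ShardedStore:
    """Toy key-value store whose data is spread over several nodes."""

    def __init__(self, node_count):
        # Each dict stands in for one inexpensive server.
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, value):
        self.nodes[node_for(key, len(self.nodes))][key] = value

    def get(self, key):
        return self.nodes[node_for(key, len(self.nodes))].get(key)


store = ShardedStore(4)
for i in range(100):
    store.put(f"user:{i}", {"visits": i})

# Every key lives on exactly one node, so load is divided among the nodes.
print([len(node) for node in store.nodes])
print(store.get("user:42"))
```

Adding capacity then means adding nodes rather than buying a bigger machine, although real systems must also handle rebalancing and replication, which this sketch omits.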

Types of NoSQL databases Also see: Picking the Right NoSQL Database Tool Image sources: https://www.linkedin.com/pulse/20140823125259-38485481-nosql-databases-where-i-can-use?trk=sushi_topic_posts http://www.monitis.com/blog/2011/05/22/picking-the-right-nosql-database-tool/

Disadvantages of NoSQL. Complex programming is often required: “NoSQL” means you lose the ease of use and structural independence of SQL. There is often no built-in implementation of relationships in the database, so you might have to program relationships yourself in code. Data might sometimes be inconsistent: there is no guarantee of transaction integrity, and entity integrity and referential integrity are not guaranteed. The data you retrieve at any given moment might be inaccurate, but it will eventually become correct. This is the price to pay for rapid performance in a distributed database.
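The point about programming relationships yourself can be sketched with two plain dictionaries standing in for key-value collections. The keys, field names, and the orders_with_customer helper are hypothetical; no real NoSQL product's API is being shown.

```python
# Two hypothetical key-value collections; nothing links them except our code.
customers = {
    "cust:1": {"name": "Ada"},
    "cust:2": {"name": "Grace"},
}
orders = {
    "order:10": {"cust_key": "cust:1", "total": 50.0},
    # This order points at a customer that does not exist. Nothing stops it,
    # because the store does not enforce referential integrity.
    "order:11": {"cust_key": "cust:9", "total": 25.0},
}


def orders_with_customer(orders, customers):
    """Hand-written 'join': look up each order's customer in application code."""
    result = []
    for order_key, order in sorted(orders.items()):
        customer = customers.get(order["cust_key"])
        name = customer["name"] if customer else None
        result.append((order_key, name, order["total"]))
    return result


rows = orders_with_customer(orders, customers)
print(rows)
```

The dangling reference in order:11 surfaces only when the application notices the missing customer, which is the kind of integrity check an RDBMS would perform on its own.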

The CAP theorem for distributed databases. CAP stands for: Consistency (all nodes see the same data); Availability (a request always gets a response, whether success or failure); Partition tolerance (even if a node fails or loses contact with the others, the system can still function). A distributed database can guarantee only two of the three CAP characteristics at the same time. Over time, it will eventually provide all three, but it cannot guarantee all three at any given moment. NoSQL databases are distributed, and so the CAP theorem restricts them to providing BASE, not ACID. Image source: PRWEB
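A toy simulation can show the trade-off: two replicas stay available during a partition, serve stale data for a while, and converge once the partition heals. The Cluster and Replica classes are hypothetical, and the "highest version wins" rule is a simplifying assumption, not a description of how any real database resolves conflicts.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)


class Cluster:
    """Two replicas that keep answering requests even when partitioned."""

    def __init__(self):
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def _other(self, replica):
        return self.b if replica is self.a else self.a

    def write(self, replica, key, value):
        version = replica.data.get(key, (0, None))[0] + 1
        replica.data[key] = (version, value)
        # Replication reaches the other node only if the network is intact.
        if not self.partitioned:
            self._other(replica).data[key] = (version, value)

    def read(self, replica, key):
        return replica.data.get(key, (0, None))[1]

    def heal(self):
        """End the partition and reconcile: the highest version wins (assumed rule)."""
        self.partitioned = False
        for key in set(self.a.data) | set(self.b.data):
            newest = max(
                self.a.data.get(key, (0, None)),
                self.b.data.get(key, (0, None)),
                key=lambda entry: entry[0],
            )
            self.a.data[key] = newest
            self.b.data[key] = newest


cluster = Cluster()
cluster.write(cluster.a, "stock", 5)      # replicated to both nodes
cluster.partitioned = True                # the network splits
cluster.write(cluster.a, "stock", 4)      # only replica a sees this write
stale = cluster.read(cluster.b, "stock")  # b still answers, with old data
cluster.heal()
fresh = cluster.read(cluster.b, "stock")  # consistent again after healing
print(stale, fresh)
```

Replica b never refuses a request, so availability and partition tolerance are kept while consistency is given up temporarily.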

ACID versus BASE. A relational database guarantees the ACID properties: Atomicity, Consistency, Isolation, Durability. In short, a set of SQL statements (called a transaction) will either completely work or completely fail, with no halfway success, and the result will not corrupt the database. The price to pay: results might be somewhat slow. A NoSQL database does not guarantee ACID; it only guarantees the BASE properties: Basically Available, Soft state, Eventually consistent. In short, at any given moment, not everything might be consistent, but the database will eventually become consistent. In return, these imperfect results are delivered fast.
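The "completely work or completely fail" behaviour can be demonstrated with Python's built-in sqlite3 module. The account table, the names, and the transfer helper are hypothetical; the sketch illustrates atomicity only and does not exercise isolation or durability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances (a hypothetical business rule).
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
after_failure = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
print(failed, after_failure)
print(succeeded, after_success)
```

In the failed transfer the credit to bob is applied first and then undone by the rollback, so no partial result is ever visible.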


Summary of data models

Distributed Database Spectrum Table 12.8 Sacrifices availability to ensure consistency and isolation

Historical outline of data models

Which data model should you use? Hierarchical or network models: obsolete; no one uses these any longer. Entity-relationship model: almost always; 90% or more of professional database situations. Object-oriented database: when you have very complex data structures, you need rapid performance, and it helps achieve organizational objectives (Source: Barry & Associates, Inc); when data structures are so complex that organizing data as tables causes headaches in programming retrieval and storage. NoSQL: when you have vast amounts of unstructured data and you need rapid performance; when speed is more important than data consistency. Popularity ranking of DBMSs: http://db-engines.com/en/ranking

Summary of Chapters 2, 12, 14: Data models. A data model is an abstract way of thinking about how data is organized. Although the relational model has become the dominant data model, it cannot solve all database challenges. The Object-Oriented Data Model is useful for complex data coupled with object-oriented programming. Big Data is data with high volume, velocity and variety. NoSQL generally handles big data better than relational databases, but it sacrifices consistency for speed. No single data model is the best for all situations, so we should understand the pros and cons of each model.

Sources Most of the slides are adapted from Database Systems: Design, Implementation and Management by Carlos Coronel and Steven Morris. 11th edition (2015) published by Cengage Learning. ISBN 13: 978-1-285-19614-5 Other sources are noted on the slides themselves