CS 405G: Introduction to Database Systems

CS 405G: Introduction to Database Systems
24 NoSQL
Reuses some slides by Jennifer Widom
Chen Qian, University of Kentucky

Summary
- Tree-based indexes: O(log N) for search and update; support range queries
- Hash-based indexes: best for equality searches, O(1); cannot support range searches
- Static and dynamic
Chen Qian @ University of Kentucky, 3/14/2018
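
The contrast in this summary can be sketched in a few lines of Python. This is only an illustration under assumptions: a dict stands in for a hash index, a sorted key list searched with `bisect` stands in for a tree-based index, and the records and function names are made up. Real indexes are disk-based structures, and the sorted list here models search cost only, not update cost.

```python
import bisect

# Hypothetical records keyed by an integer ID (made up for illustration).
records = {17: "Ann", 42: "Bob", 58: "Cho", 73: "Dee", 91: "Eli"}

# Hash-style index: expected O(1) equality lookup, but the keys carry
# no useful order, so a range query would have to scan everything.
hash_index = dict(records)

# Tree-style stand-in: keys kept in sorted order, searched by binary search.
sorted_keys = sorted(records)


def equality_lookup(key):
    """Equality search through the hash-style index."""
    return hash_index.get(key)


def range_lookup(low, high):
    """Return records with low <= key <= high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [records[k] for k in sorted_keys[start:end]]


print(equality_lookup(42))
print(range_lookup(40, 75))
```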

NoSQL Systems: Motivation
NoSQL: The Name
- “SQL” = Traditional relational DBMS
- Recognition over past decade or so: Not every data management/analysis problem is best solved using a traditional relational DBMS
- “NoSQL” = “No SQL” = Not using traditional relational DBMS
- “No SQL” ≠ “Don’t use SQL language”
- “NoSQL” = “Not Only SQL”

NoSQL Systems: Motivation
Not every data management/analysis problem is best solved using a traditional DBMS.
A Database Management System (DBMS) provides efficient, reliable, convenient, and safe multi-user storage of and access to massive amounts of persistent data.

NoSQL Systems: Motivation
Alternative to traditional relational DBMS
- Flexible schema
- Quicker/cheaper to set up
- Massive scalability
- Relaxed consistency → higher performance & availability
- No declarative query language → more programming
- Relaxed consistency → fewer guarantees

NoSQL Systems: Motivation
Example #1: Web log analysis
Each record: UserID, URL, timestamp, additional-info
Task: Load into database system
- Data cleaning
- Data extraction
- Verification
- Schema
None of the above is needed for NoSQL!

NoSQL Systems: Motivation
Example #1: Web log analysis
Each record: UserID, URL, timestamp, additional-info
Task: Find all records for…
- Given UserID
- Given URL
- Given timestamp
- Certain construct appearing in additional-info

NoSQL Systems: Motivation
Example #1: Web log analysis
Each record: UserID, URL, timestamp, additional-info
Separate records: UserID, name, age, gender, …
Task: Find average age of users accessing a given URL
May not require strict consistency.

NoSQL Systems: Motivation
Example #2: Social-network graph
Each record: UserID1, UserID2
Separate records: UserID, name, age, gender, …
Task: Find all friends of friends of friends of … friends of a given user
- Large number of joins? Not efficient at all!
- A specially designed graph database may be better
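
To make the friends-of-friends task concrete, here is a minimal Python sketch that answers it with a breadth-first traversal over an in-memory adjacency map instead of repeated self-joins. The friendship pairs and the name `within_hops` are assumptions made for illustration; friendships are treated as undirected.

```python
from collections import defaultdict, deque

# Hypothetical friendship records (UserID1, UserID2), treated as undirected.
friend_pairs = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 6)]

adjacency = defaultdict(set)
for a, b in friend_pairs:
    adjacency[a].add(b)
    adjacency[b].add(a)


def within_hops(start, max_hops):
    """Return users reachable from start in 1..max_hops friendship steps."""
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        user, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in adjacency[user]:
            if friend not in seen:
                seen.add(friend)
                reached.add(friend)
                frontier.append((friend, dist + 1))
    return reached


print(sorted(within_hops(1, 2)))  # friends and friends of friends of user 1
```

Each extra hop is one more level of traversal here, whereas in a relational formulation it would be one more self-join of the friendship table.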

NoSQL Systems: Motivation
Example #3: Wikipedia pages
- Large collection of documents
- Combination of structured and unstructured data
Task: Retrieve introductory paragraph of all pages about U.S. presidents before 1900
- Mix of structured and unstructured data

NoSQL Systems: Motivation
Alternative to traditional relational DBMS
- Flexible schema
- Quicker/cheaper to set up
- Massive scalability
- Relaxed consistency → higher performance & availability
- No declarative query language → more programming
- Relaxed consistency → fewer guarantees

NoSQL Systems Overview

NoSQL Systems: Overview
Several incarnations
- MapReduce framework: OLAP
- Key-value stores: OLTP
- Document stores
- Graph database systems

NoSQL Systems: Overview
MapReduce Framework
- Originally from Google; Hadoop is the open-source implementation
- No data model; data stored in files
- User provides specific functions: map(), reduce()
- System provides data processing “glue”, fault-tolerance, scalability

NoSQL Systems: Overview
Map and Reduce Functions
- Map: Divide problem into subproblems
- Reduce: Do work on subproblems, combine results

NoSQL Systems: Overview MapReduce Architecture

NoSQL Systems: Overview
MapReduce Example: Web log analysis
Each record: UserID, URL, timestamp, additional-info
Task: Count number of accesses for each domain (inside URL)
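
A minimal single-process sketch of this example in Python. It is not Hadoop code: `run_mapreduce` is a hypothetical stand-in for the grouping "glue" that a real framework provides (along with distribution and fault-tolerance), and the log records are made up.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical log records: (UserID, URL, timestamp, additional-info).
log = [
    ("u1", "http://example.com/a", 1, ""),
    ("u2", "http://example.com/b", 2, ""),
    ("u1", "http://example.org/", 3, ""),
]


def map_record(record):
    """map(): emit one (domain, 1) pair for a log record."""
    _user, url, _timestamp, _info = record
    yield urlparse(url).netloc, 1


def reduce_domain(domain, counts):
    """reduce(): combine all values emitted for one domain."""
    return domain, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Stand-in for the framework: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


print(run_mapreduce(log, map_record, reduce_domain))
```

Only `map_record` and `reduce_domain` correspond to what the user writes; everything in `run_mapreduce` is the system's job.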

NoSQL Systems: Overview
MapReduce Example (modified #1)
Each record: UserID, URL, timestamp, additional-info
Task: Total “value” of accesses for each domain based on additional-info

NoSQL Systems: Overview
MapReduce Framework
- No data model; data stored in files
- User provides specific functions
- System provides data processing “glue”, fault-tolerance, scalability

NoSQL Systems: Overview
MapReduce Framework
Schemas and declarative queries are missed
- Hive: schemas, SQL-like query language
- Pig: more imperative but with relational operators
- Both compile to “workflow” of Hadoop (MapReduce) jobs

NoSQL Systems: Overview
Key-Value Stores
Extremely simple interface
- Data model: (key, value) pairs
- Operations: Insert(key,value), Fetch(key), Update(key), Delete(key)
Implementation: efficiency, scalability, fault-tolerance
- Records distributed to nodes based on key
- Replication
- Single-record transactions, “eventual consistency”
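
The interface and the key-based placement can be sketched as a toy in-process store. Everything here is an assumption made for illustration: the class name `TinyKVStore`, hashing with SHA-256, and placing replicas on the following nodes in ring order. Update takes the new value as an explicit second argument. Writes reach all replicas synchronously, so the sketch does not model eventual consistency.

```python
import hashlib


class TinyKVStore:
    """Toy key-value store: each record is placed on nodes chosen by key hash."""

    def __init__(self, num_nodes=3, replicas=2):
        self.nodes = [dict() for _ in range(num_nodes)]
        self.replicas = replicas

    def _node_ids(self, key):
        # Stable hash, so placement is the same on every run.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.nodes)
        # Primary node first, then the following nodes in ring order.
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def insert(self, key, value):
        for n in self._node_ids(key):
            self.nodes[n][key] = value

    def fetch(self, key):
        # Read the primary copy only.
        return self.nodes[self._node_ids(key)[0]].get(key)

    def update(self, key, value):
        if key not in self.nodes[self._node_ids(key)[0]]:
            raise KeyError(key)
        self.insert(key, value)

    def delete(self, key):
        for n in self._node_ids(key):
            self.nodes[n].pop(key, None)


store = TinyKVStore()
store.insert("user:1", "Ann")
store.update("user:1", "Ann B.")
print(store.fetch("user:1"))
store.delete("user:1")
print(store.fetch("user:1"))
```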

NoSQL Systems: Overview
Key-Value Stores
Extremely simple interface
- Data model: (key, value) pairs
- Operations: Insert(key,value), Fetch(key), Update(key), Delete(key)
- Some allow (non-uniform) columns within value
- Some allow Fetch on range of keys
Example systems
- Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, …

NoSQL Systems: Overview
Document Stores
Like Key-Value Stores except value is document
- Data model: (key, document) pairs
- Document: JSON, XML, other semistructured formats
- Basic operations: Insert(key,document), Fetch(key), Update(key), Delete(key)
- Also Fetch based on document contents
Example systems
- CouchDB, MongoDB, SimpleDB, …
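
What sets a document store apart from a plain key-value store is fetching by document contents. A minimal Python sketch, assuming JSON-like documents held as dicts and matching on top-level fields only; `TinyDocStore` and its `find` method are hypothetical names and do not reflect the API of any listed system.

```python
import json


class TinyDocStore:
    """Toy document store: (key, document) pairs with content-based fetch."""

    def __init__(self):
        self._docs = {}

    def insert(self, key, document):
        # Round-trip through JSON to check the document is serializable
        # and to store an independent copy.
        self._docs[key] = json.loads(json.dumps(document))

    def fetch(self, key):
        return self._docs.get(key)

    def delete(self, key):
        self._docs.pop(key, None)

    def find(self, **criteria):
        """Return keys of documents whose top-level fields match all criteria."""
        return sorted(
            key
            for key, doc in self._docs.items()
            if all(name in doc and doc[name] == value
                   for name, value in criteria.items())
        )


docs = TinyDocStore()
docs.insert("p1", {"name": "Ann", "age": 30})
docs.insert("p2", {"name": "Bob", "age": 30, "city": "Lexington"})
docs.insert("p3", {"name": "Cho"})  # documents need not share a schema
print(docs.find(age=30))
print(docs.find(city="Lexington"))
```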

NoSQL Systems: Overview
Graph Database Systems
Data model: nodes and edges
- Nodes may have properties (including ID)
- Edges may have labels or roles
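
The data model above can be sketched with two small Python dataclasses. The names (`Node`, `Edge`, `TinyGraph`, `neighbors`) and the sample data are assumptions made for illustration and are not taken from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int
    target: int
    label: str


class TinyGraph:
    """Toy property graph: nodes carry properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source, target, label):
        self.edges.append(Edge(source, target, label))

    def neighbors(self, node_id, label):
        """Follow outgoing edges with the given label for one step."""
        return [self.nodes[e.target] for e in self.edges
                if e.source == node_id and e.label == label]


g = TinyGraph()
g.add_node(1, name="Ann")
g.add_node(2, name="Bob")
g.add_node(3, name="Databases")
g.add_edge(1, 2, "friend")
g.add_edge(1, 3, "likes")
print([n.properties["name"] for n in g.neighbors(1, "friend")])
```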

NoSQL Systems: Overview
Graph Database Systems
Interfaces and query languages vary
- Single-step versus “path expressions” versus full recursion
Example systems
- Neo4j, FlockDB, Pregel, …
RDF “triple stores” can map to graph databases

NoSQL Systems: Overview
“NoSQL” = “Not Only SQL”
Not every data management/analysis problem is best solved exclusively using a traditional DBMS
Current incarnations
- MapReduce framework
- Key-value stores
- Document stores
- Graph database systems