5/27/2014 Stephen Frein. About Me Director of QA for Comcast.com Adjunct for CCI https://www.linkedin.com/in/stephenfrein

Presentation transcript:

5/27/2014 Stephen Frein

About Me Director of QA for Comcast.com Adjunct for CCI

Stuff We'll Talk About
– Traditional (relational) databases
– What is NoSQL?
– Types of NoSQL databases
– Why would I use one?
– Hands-on with Mongo
– Cluster considerations

Relational Databases
– Well-defined schema with regular, “rectangular” data
– Use SQL (Structured Query Language)

Relational Databases
Transactions* meet ACID criteria:
– Atomic – all or nothing
– Consistent – no defined rules are violated, and all users see the same thing when complete
– Isolated – in-progress transactions can’t see each other, as if these were serialized
– Durable – database won’t say work is finished until it is written to permanent storage
*sets of logically related commands – “units of work”
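
To make "all or nothing" concrete, here is a toy sketch in JavaScript. It is not how a real database implements transactions (those rely on logs and locking); the `transfer` function, the account names, and the no-negative-balance rule are all assumptions made up for illustration.

```javascript
// Toy illustration only: real databases use logs and locks, not object copies.
function transfer(accounts, from, to, amount) {
  const draft = { ...accounts }; // work on a private copy
  draft[from] -= amount;
  draft[to] += amount;
  // Consistency: a defined rule (hypothetical) must hold before we commit.
  if (draft[from] < 0) {
    throw new Error("rule violated: balance cannot go negative");
  }
  return draft; // "commit": the caller swaps in the new state
}

let accounts = { alice: 100, bob: 20 };

// Both updates are applied together.
accounts = transfer(accounts, "alice", "bob", 30);
console.log(accounts);

// This transfer breaks the rule, so neither update is applied.
try {
  accounts = transfer(accounts, "alice", "bob", 500);
} catch (err) {
  console.log(err.message);
}
console.log(accounts);
```

After the failed second transfer, `accounts` still holds the state committed by the first one: the unit of work either happens completely or not at all.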

The Next Challenger
Relational databases dominant, but have had various challengers over the years
– Object-oriented
– XML
These have faded into niche use – relational, SQL-based databases have been flexible / capable enough to make newcomers rarely worth it
NoSQL is next wave of challenger

What is NoSQL?
Hard to say…
“…an ill-defined set of mostly open source databases, mostly developed in the early 21st century, and mostly not using SQL.” - Martin Fowler

Loose Characterization
– Don’t store data in relations (tables)
– Don’t use SQL (or not only SQL)
– Open source (the popular ones)
– Cluster friendly
– Relaxed approach to ACID
– Use implicit schemas
↑ Not true all the time

Why Use NoSQL?
Productivity
o May be a good fit for the kind of data you have and the pace of your development
o Operations can be very fast
Large Scale Data
o Works well on clusters
o Often used for mega-scale websites

At What Cost?
Dropping ACID
o BASE (contrived, but we’ll go with it)
  – Basically Available
  – Soft state
  – Eventually consistent
Data Store Becomes Dumber
o Have to do more in the app
o No “integration” data stores
Standardization
o No common way to address various flavors
o Learning curve

Flavors of NoSQL
Key-value: use key to retrieve chunk of data that app must process (Riak, Redis)
– Fast, simple
– Example use: session state
Document: irregular structures but can still search inside each document (Mongo, Couch)
– Flexibility in storage and retrieval
– Example use: content management

What Does Irregular Look Like?
Products:
  Product A: Name, Description, Weight
  Product B: Name, Description, Volume
  Product C: Name, Description
    Sub-Product X: Name, Description, Weight
    Sub-Product Y: Name, Description, Duration
      Sub-Sub-Product Z: Name, Description, Volume
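
One way to picture this irregular catalog is as nested JavaScript objects, the same shape a document store would hold. This is a sketch under assumptions: the field values are invented, the `subProducts` field name is hypothetical, and nesting Sub-Sub-Product Z under Sub-Product Y is a guess at the slide's hierarchy.

```javascript
// Hypothetical product documents mirroring the slide; all values are made up.
const products = [
  { name: "Product A", description: "Boxed item", weight: 2.5 },
  { name: "Product B", description: "Bottled item", volume: 0.75 },
  {
    name: "Product C",
    description: "Bundle",
    subProducts: [
      { name: "Sub-Product X", description: "Part", weight: 1.0 },
      {
        name: "Sub-Product Y",
        description: "Service",
        duration: 30,
        subProducts: [
          { name: "Sub-Sub-Product Z", description: "Refill", volume: 0.1 },
        ],
      },
    ],
  },
];

// Recursively collect the names of all products, at any depth, that have a given field.
function namesWithField(items, field) {
  return items.flatMap((item) => [
    ...(field in item ? [item.name] : []),
    ...namesWithField(item.subProducts ?? [], field),
  ]);
}

console.log(namesWithField(products, "weight"));
console.log(namesWithField(products, "volume"));
```

No two branches share the same set of fields, which is awkward to model as one rectangular table but needs no special handling here.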

Flavors of NoSQL
Graph: stores nodes and relationships (Neo4j)
– Natural and fast for graph data
– Example use: social networks
Column family: multi-dimensional maps with versioning (Cassandra, HBase)
– Work well for extremely large data sets
– Example use: search engine

Productivity
– Can store “irregular” data readily
– Less set-up to get started – database infers structures from commands it sees
– Can change record structure on the fly
– Adding new fields or changing fields only has to be done in application, not application and database

Mongo Demo
We'll use MongoDB to show off some NoSQL properties
– Create a database
– Store some data
– Change structure on the fly
– Query what we saved
Go to [URL missing]
We’ll enter commands here

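Demo Code
Enter the following (one at a time) at the prompt:

// save() is the shell helper from the time of this talk; newer Mongo shells use insertOne() instead
steve = {fname: 'Steve', lname: 'Frein'};
db.people.save(steve);
db.people.find();                     // list everything in "people"
suzy = {fname: 'Susan', lname: 'Queen', age: 30};
db.people.save(suzy);                 // extra field "age", no schema change needed
db.people.find();
db.people.find({fname:'Steve'});      // match on a field both documents have
db.people.find({age:30});             // match on a field only Susan has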

Notice
– The colon-value format used to enter data is called JSON (JavaScript Object Notation)
– You didn’t define structures up front – these were created on the fly as you saved the data (the save command)
– Steve and Susan had different structures, but both could be saved to “people”
– Mongo knew how to handle both structures – it could search for age (and return Susan) even though Steve had no age defined
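
The behaviour noted here can be imitated without a database. The sketch below is a toy in-memory stand-in written in plain JavaScript; the `save` and `find` functions are hypothetical helpers that mimic the demo, not MongoDB's implementation or API.

```javascript
// A toy in-memory "collection"; this is NOT how MongoDB works internally.
const people = [];

// Store a document as-is, whatever fields it happens to have.
function save(doc) {
  people.push({ ...doc });
}

// Return documents whose fields equal every key/value pair in the query.
// A document that lacks a queried field simply does not match.
function find(query = {}) {
  return people.filter((doc) =>
    Object.entries(query).every(([key, value]) => doc[key] === value)
  );
}

save({ fname: "Steve", lname: "Frein" });
save({ fname: "Susan", lname: "Queen", age: 30 });

console.log(find({ fname: "Steve" })); // Steve's document only
console.log(find({ age: 30 }));        // Susan's document only; Steve has no age
```

Because a missing field just fails the comparison, documents with different structures can sit in the same collection and still be queried together.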

Consider
– How fast you can move and refine your database if structures are malleable and dynamically defined by the data you enter
– How you could shoot yourself in the foot with such flexibility

Ow – My Foot!
If you wrote code like this:

emp1 = {firstname: 'Steve', lastname: 'Smith'};
db.employees.save(emp1);
emp2 = {firstname: 'Billy', last_name: 'Smith'};
db.employees.save(emp2);

Then you tried to run a query:

db.employees.find({lastname:'Smith'});

You’d be missing Billy (last_name vs lastname):

[ {"_id" : {"$oid" : "529bdefacc f"}, "lastname" : "Smith", "firstname" : "Steve" } ]
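
Since the database will not catch this kind of typo, the check has to live in the application. The following is one minimal way to do that, sketched in plain JavaScript; the `EXPECTED_FIELDS` list and the `unexpectedFields` helper are hypothetical names, not part of Mongo.

```javascript
// Hypothetical application-side guard: the agreed field names for an employee.
const EXPECTED_FIELDS = new Set(["firstname", "lastname"]);

// Return any field names on the document that are not on the agreed list.
function unexpectedFields(doc) {
  return Object.keys(doc).filter((key) => !EXPECTED_FIELDS.has(key));
}

const emp1 = { firstname: "Steve", lastname: "Smith" };
const emp2 = { firstname: "Billy", last_name: "Smith" };

console.log(unexpectedFields(emp1)); // empty: safe to save
console.log(unexpectedFields(emp2)); // flags "last_name" before it reaches the database
```

Running a check like this before every save trades a little of the flexibility for protection against silently mismatched documents.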

Scalability
– NoSQL databases scale easily across server clusters
– Instead of one big server, add many commodity servers and share data across these (cost, flexibility)
– Relational harder to scale across many servers (largely because of consistency issues that NoSQL doesn't emphasize)

CAP Theorem
Consistency – All nodes have the same information
Availability – Non-failed nodes will respond to requests
Partition Tolerance – Cluster can survive network failures that separate its nodes into separate partitions
PICK ANY TWO

CAP Theorem [figure]

In Practice
– If you will be using a distributed system (context in which CAP is discussed), you will be balancing consistency and availability
– Questions of degree – not binary
– Can sometimes specify the balance on a transaction-by-transaction basis (as opposed to whole system level)
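
The slides do not say how the balance is specified. One common mechanism, used by quorum-based stores, is to choose per operation how many replicas must acknowledge a write and how many are consulted on a read. The sketch below assumes that model; `quorumsOverlap` and its parameter names are illustrative, not any product's API.

```javascript
// n: replicas holding each record
// w: replicas that must acknowledge a write
// r: replicas consulted on a read
// If r + w > n, every read set shares at least one replica with every write set,
// so a read is guaranteed to reach a replica that has the latest acknowledged write.
function quorumsOverlap(n, w, r) {
  return r + w > n;
}

console.log(quorumsOverlap(3, 2, 2)); // true: leans toward consistency
console.log(quorumsOverlap(3, 1, 1)); // false: faster and more available, may read stale data
console.log(quorumsOverlap(3, 3, 1)); // true: cheap reads, but a write needs every replica up
```

Lower values of `w` and `r` keep the operation working when nodes are unreachable; higher values wait for more nodes and so give up some availability for consistency.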

NoSQL and Clusters
Replication: Same data copied to many nodes (eventually)
o self-managed when given replication factor
Sharding: Different nodes own different ranges of data
o auto-sharded and invisible to clients
Can combine the two
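
A rough sketch of combining the two, assuming range-based sharding on the first letter of a key and a replication factor of 2. The node names, range boundaries, and the `nodesFor` helper are all hypothetical; real systems choose and rebalance ranges automatically.

```javascript
// Hypothetical four-node cluster.
const nodes = ["node-a", "node-b", "node-c", "node-d"];

// Each shard owns keys whose first letter is <= upTo (and above the previous shard's limit).
const shards = [
  { upTo: "g", owner: 0 },
  { upTo: "n", owner: 1 },
  { upTo: "t", owner: 2 },
  { upTo: "\uffff", owner: 3 }, // catch-all for everything after "t"
];

const REPLICATION_FACTOR = 2;

// Return the nodes that should hold a copy of the record with this (non-empty) key.
function nodesFor(key) {
  const first = key[0].toLowerCase();
  const shard = shards.find((s) => first <= s.upTo); // sharding: pick the owning range
  // Replication: the owner plus the next nodes in ring order hold copies.
  return Array.from(
    { length: REPLICATION_FACTOR },
    (_, i) => nodes[(shard.owner + i) % nodes.length]
  );
}

console.log(nodesFor("Frein")); // node-a and node-b
console.log(nodesFor("Smith")); // node-c and node-d
console.log(nodesFor("Zed"));   // node-d and node-a (wraps around)
```

A client only supplies the key; which nodes hold the data is decided by the cluster, which is what keeps sharding invisible to clients.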

Distributed Processing
– NoSQL clusters support distributed data processing
– Basic approach: Send the algorithm to the data (e.g., MapReduce)
  o Map – process a record and convert it to key-value pairs
  o Reduce – aggregate key-value pairs with the same key
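
The two phases can be sketched on a single machine with a word count, the usual teaching example. This is an illustration of the idea only: in a real cluster the map step runs on the nodes that hold the records and the grouping happens across the network. The `map`, `reduce`, and `mapReduce` functions are hypothetical names.

```javascript
// Map: process one record and convert it to key-value pairs (word, 1).
function map(record) {
  return record
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => [word, 1]);
}

// Reduce: aggregate all values that share the same key.
function reduce(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Single-process driver: map every record, group the pairs by key, reduce each group.
function mapReduce(records) {
  const groups = new Map();
  for (const record of records) {
    for (const [key, value] of map(record)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(key, values);
  }
  return result;
}

console.log(mapReduce(["no sql", "not only sql"]));
```

For the two sample records the counts come out as no: 1, sql: 2, not: 1, only: 1.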

MapReduce Visualized [figure]

Learn More

Wrap-up
Questions?
Thanks!