Is NoSQL the Future of Data Storage? By Gary Short Developer Express.

Presentation transcript:

Is NoSQL the Future of Data Storage? By Gary Short Developer Express

Introduction
Gary Short
Technical Evangelist for Developer Express
C# MVP

Where Does NoSQL Originate?
1998 – Open-source relational database
Didn’t expose an SQL interface
Created by Carlo Strozzi
– Said the NoSQL movement “departs from the relational model altogether...” and “...should have been called ‘NoREL’”

More Recently...
Eric Evans reintroduced the term in 2009
– Johan Oskarsson (last.fm)
– Event to discuss open-source distributed databases
The label covers a growing number of datastores
– Open source
– Non-relational
– Distributed
– (Often) don’t guarantee ACID

Atlanta 2009
No:sql(east) conference
Billed as “conference of no-rel datastores”
Worst tag line ever
– SELECT fun, profit FROM real_world WHERE rel=false

Not Anti-RDBMS

Key Attributes of NoSQL Databases
Don’t require fixed table schemas
Non-relational
(Usually) avoid join operations
Scale horizontally
– Adding more nodes to a storage system
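To make "scale horizontally" concrete, here is a minimal sketch of spreading keys across storage nodes by hashing. The function name `node_for` and the sample keys are invented for illustration and do not come from the talk. Simple modulo placement like this moves most keys when the node count changes, which is why real systems often use more elaborate schemes.

```python
import hashlib


def node_for(key, node_count):
    """Pick a node index for a key using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Hypothetical keys; each one always lands on the same node for a given node count.
keys = ["user:1", "user:2", "user:3", "user:4"]
placement = {key: node_for(key, 3) for key in keys}
```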

What Does the Taxonomy Look Like?

Document Store
Apache Jackrabbit
CouchDB
MongoDB
SimpleDB
XML Databases
– MarkLogic Server
– eXist

Document What?
Okay, think of a web page...
– Relational model requires a column per tag
– Lots of empty columns
– Wasted space
Document model just stores the pages as is
– Saves on space
– Very flexible
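The contrast between the two models can be sketched in a few lines. The page records and field names below are hypothetical. The sketch flattens schema-free documents into fixed-width rows and counts the empty cells a column-per-field table would carry.

```python
# Hypothetical page records: each document carries only the fields it has.
pages = [
    {"url": "/home", "title": "Home", "body": "Welcome"},
    {"url": "/about", "title": "About", "author": "gary", "tags": ["team"]},
]

# A relational table would need one column for every field seen anywhere.
columns = sorted({key for page in pages for key in page})

# Flatten each document into a fixed-width row; missing fields become None.
rows = [[page.get(column) for column in columns] for page in pages]

# Count the cells that exist only to satisfy the fixed schema.
empty_cells = sum(cell is None for row in rows for cell in row)
```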

Graph Storage
AllegroGraph
Core Data
Neo4j
DEX
FlockDB

Which Means?
Graph consists of
– Nodes (‘stations’ of the graph)
– Edges (lines between them)
FlockDB
– Created by the Twitter folks
– Nodes = Users
– Edges = Nature of relationship between nodes
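As a rough illustration of nodes and edges, the sketch below models users as nodes and "follows" relationships as edges. The user names and the tuple layout are assumptions for the example, not FlockDB's actual data format.

```python
from collections import defaultdict

# Hypothetical edges: (source user, relationship, destination user).
edges = [
    ("alice", "follows", "bob"),
    ("alice", "follows", "carol"),
    ("bob", "follows", "alice"),
]

following = defaultdict(set)  # node -> nodes it points at
followers = defaultdict(set)  # node -> nodes pointing at it

for source, relationship, destination in edges:
    if relationship == "follows":
        following[source].add(destination)
        followers[destination].add(source)
```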

Key/Value Stores
On disk
Cache in RAM
Eventually Consistent
– Weak Definition: “If no updates occur for a period, eventually all updates will propagate through the system and all replicas will be consistent”
– Strong Definition: “For a given update and a given replica, eventually either the update reaches the replica or the replica retires”
Ordered
– Distributed Hash Table allows lexicographical processing
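The weak definition of eventual consistency can be illustrated with a toy simulation. The `Replica` class, the version numbers, and the `anti_entropy` function are all assumptions made for this sketch. It uses a simple "highest version wins" rule, which is only one of several ways real stores reconcile replicas.

```python
class Replica:
    """Toy replica holding key -> (version, value); highest version wins."""

    def __init__(self):
        self.data = {}

    def write(self, key, version, value):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def merge_from(self, other):
        # Pull every entry from another replica, keeping the newest versions.
        for key, (version, value) in other.data.items():
            self.write(key, version, value)


def anti_entropy(replicas):
    """One full exchange round: every replica pulls from every other one."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.merge_from(source)


replicas = [Replica(), Replica(), Replica()]
replicas[0].write("cart", 1, ["book"])
replicas[2].write("cart", 2, ["book", "pen"])

# Before propagation the replicas disagree.
diverged = len({repr(replica.data) for replica in replicas}) > 1

# With no further updates, propagation brings all replicas to the same state.
anti_entropy(replicas)
converged = all(replica.data == replicas[0].data for replica in replicas)
```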

Object Databases
Db4o
GemStone/S
InterSystems Caché
Objectivity/DB
ZODB

Okay, Got It. Now Let’s Compare Some Real-World Scenarios

You Need Constant Consistency
You’re dealing with financial transactions
You’re dealing with medical records
You’re dealing with bonded goods
Best you use an RDBMS

You Need Horizontal Scalability
You’re working across defined timezones
You’re aggregating large quantities of data
You’re maintaining a chat server (Facebook chat)
Use NoSQL

Up in the Clouds Baby
If you are using Azure or AWS
– Compare costs of Azure Storage or SimpleDB to SQL Azure or Elastic RDBMS
Could be cheaper for your scenario

It’s all About the iPhone!

Frequently Written Rarely Read
Think web counters and the like
Every time a user comes to a page = ctr++
But it’s only read when the report is run
Use NoSQL (key-value storage)
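The write-heavy counter pattern looks roughly like this. An in-memory `Counter` stands in for the key-value store, and the names `record_visit` and `run_report` are invented for the sketch.

```python
from collections import Counter

# In-memory stand-in for a key-value store: page path -> hit count.
hits = Counter()


def record_visit(page):
    """Called on every page view: a cheap increment keyed by page (ctr++)."""
    hits[page] += 1


def run_report():
    """Called rarely: read all counters, busiest pages first."""
    return hits.most_common()


for page in ["/home", "/home", "/about", "/home"]:
    record_visit(page)
```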

I Got Big Data!
Think weather stats
Satellite images
Maps
Use NoSQL (something like Hadoop)

Binary Baby!
If you are
– YouTube
– Flickr
– Twitpic
– Spotify
NoSQL (Amazon S3)

Here Today Gone Tomorrow
Transient data like...
– Web sessions
– Locks
– Short-term stats
Shopping cart contents
Use NoSQL (Memcache)
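Transient data is usually stored with an expiry time. The class below is a hypothetical in-memory sketch of that idea, not the Memcache API. The clock is injectable so the behaviour can be checked without waiting.

```python
import time


class ExpiringStore:
    """Toy key-value store whose entries vanish after a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expiry time, value)

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._items[key]
            return default
        return value
```

A web session would be stored with something like `store.set("session:abc", data, ttl_seconds=1800)` and simply disappears once the time-to-live has passed.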

Data Replication
Same data in two or more locations
– Music library
Web browser
iPhone app
NoSQL (CouchDB)

Hit me Baby One More Time!
High Availability
– High number of important transactions
Online gambling
Pay per view
– Ahem!
Online auction
NoSQL (Cassandra – automatic clustering)

Give me a Real World Example
Twitter – The challenges
Needs to store many graphs
– Who you are following
– Who’s following you
– Who you receive phone notifications from, etc.
To deliver a tweet requires rapid paging of followers
Heavy write load as followers are added and removed
Set arithmetic (intersection of users)

What Did They Try?
Relational databases
Key-value storage of denormalized lists
Did it work?
– Nope!
Either good at
– Handling the write load
– Or paging large amounts of data
– But not both

What Did They Need?
Simplest possible thing that would work
Allow for horizontal partitioning
Allow write operations to
– Arrive out of order
– Or be processed more than once
Failures should result in redundant work
– Not lost work!

The Result was FlockDB
Stores graph data
Not optimised for graph traversal operations
Optimised for large adjacency lists
– List of all edges in a graph
– Key is the edge, value is a set of the node end points
Optimised for fast reads and writes
Optimised for page-able set arithmetic
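"Page-able set arithmetic" can be pictured as intersecting two follower sets and walking the result one page at a time. The follower data, function names, and cursor scheme below are assumptions for the sketch and are not FlockDB's interface.

```python
# Hypothetical follower sets keyed by user; values are follower IDs.
followers = {
    "alice": {1, 2, 3, 5, 8},
    "bob": {2, 3, 4, 8, 9},
}


def mutual_followers(first, second):
    """Set arithmetic: users who follow both accounts."""
    return followers[first] & followers[second]


def page(user_ids, after=0, count=2):
    """Return one page of IDs greater than the cursor, plus the next cursor."""
    remaining = [user_id for user_id in sorted(user_ids) if user_id > after]
    chunk = remaining[:count]
    next_cursor = chunk[-1] if len(remaining) > count else None
    return chunk, next_cursor


mutual = mutual_followers("alice", "bob")
first_page, cursor = page(mutual)
second_page, end_cursor = page(mutual, after=cursor)
```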

How Does it Work?
Stores graphs as sets of edges between nodes
Data is partitioned by node
– All queries can be answered by a single partition
Write operations are idempotent
– Can be applied multiple times without changing the result
And commutative
– Changing the order of operands doesn’t change the result
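One common way to get writes that are both idempotent and commutative is to stamp each operation and keep only the newest one per edge. The sketch below assumes that approach. The tuple layout, the state names, and the tie-break rule are invented for illustration and are not taken from the FlockDB source.

```python
from itertools import permutations


def apply_write(store, write):
    """Apply one edge operation, keeping only the newest operation per edge."""
    source, destination, state, timestamp = write
    key = (source, destination)
    current = store.get(key)
    # Ties on timestamp break on state, so arrival order can never matter.
    if current is None or (timestamp, state) > (current[1], current[0]):
        store[key] = (state, timestamp)
    return store


def replay(writes):
    """Build a store from scratch by applying writes in the given order."""
    store = {}
    for write in writes:
        apply_write(store, write)
    return store


# Hypothetical operations: (source, destination, state, timestamp).
writes = [
    ("alice", "bob", "added", 1),
    ("alice", "bob", "removed", 2),
    ("alice", "carol", "added", 3),
]

expected = replay(writes)
# Commutative: every arrival order produces the same store.
order_independent = all(replay(order) == expected for order in permutations(writes))
# Idempotent: processing every write twice changes nothing.
repeat_safe = replay(writes + writes) == expected
```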

Commutative Writes Help Bring up Partitions
Partition can receive write traffic immediately
Receive dump of data in the background
Live for read as soon as the dump is complete
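Why commutativity helps here can be shown with a small sketch: a new partition takes live writes first and receives the older data dump afterwards, yet ends up identical to a partition that saw the data in the original order. The `merge` function and sample data are hypothetical, and the sketch assumes each edge's operations carry distinct timestamps.

```python
def merge(store, writes):
    """Apply edge operations, keeping the newest timestamp for each edge."""
    for source, destination, state, timestamp in writes:
        key = (source, destination)
        if key not in store or timestamp > store[key][1]:
            store[key] = (state, timestamp)
    return store


# Older data already held by an existing partition.
dump = [("alice", "bob", "added", 1), ("alice", "carol", "added", 2)]
# Writes that arrive while the new partition is being brought up.
live = [("alice", "bob", "removed", 5), ("dave", "alice", "added", 6)]

new_partition = merge({}, live)   # accepts write traffic immediately
merge(new_partition, dump)        # background dump lands later

# Reference result: the same operations applied in their original order.
reference = merge(merge({}, dump), live)
```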

Performance?
Currently store 13 billion edges
20K writes / second
100K reads / second

Lessons Learned?
Use aggressive timeouts
– Cut a client loose after timeout expired
– Let it try again on another app server
Use same code path for error and normal ops
– Error requests are periodically retried
Instrument
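The "aggressive timeout, then try another server" idea might look like this from the calling side. The function name, the servers-as-callables convention, and the thread-pool approach are all assumptions for the sketch. The talk does not describe how Twitter implemented it.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_failover(servers, request, timeout_seconds):
    """Try each server in turn, abandoning any that exceeds the timeout."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    try:
        for server in servers:
            future = pool.submit(server, request)
            try:
                return future.result(timeout=timeout_seconds)
            except FutureTimeout:
                # Cut this attempt loose and move on to the next server.
                continue
        raise TimeoutError("no server answered within the timeout")
    finally:
        # Do not block on abandoned calls that are still running.
        pool.shutdown(wait=False)
```

Because writes are idempotent, a request that timed out on one server and was also completed there does no harm when it is repeated on another.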

Punchline?
Under all the bells and whistles...
– It’s MySQL

So is this the Future? Yes! And No!

Questions? Contact me –

Coming up…
P/X001 Understanding and Preventing SQL Injection Attacks, Kevin Kline
P/L001 SSIS Fieldnotes, Darren Green
P/L002 The (Geospatial) Shapes of Things to Come, Simon Munro
P/L005 End to End Master Data Management with SQL Server Master Data Services, Jeremy Kashel
P/T007 Understanding Microsoft Certification in SQL Server, Chris Testa-O'Neill
#SQLBITS