Jeremy Shafer Temple University

Presentation transcript:

NoSQL & Hadoop. Jeremy Shafer, Temple University

What is NoSQL? Before we can answer that question, we need to know something about A.C.I.D. transactions. The ideas behind A.C.I.D. transactions were formalized in the 1970s, and the acronym A.C.I.D. was coined in 1983.

A.C.I.D. transactions
Atomic - requires that each transaction be "all or nothing".
Consistent - ensures that any transaction will bring the database from one valid state to another.
Isolated - concurrent execution of transactions results in the same system state that would be obtained if the transactions were executed serially.
Durable - once a transaction has been committed, it will remain so, even in the event of power loss, crashes or errors.
Imagine a system designed to send money from one bank to another … how important would each of these be?
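To make the bank question concrete, here is a minimal sketch of an "all or nothing" transfer using Python's built-in sqlite3 module. The table name `accounts`, the account names, and the helper functions `make_bank`, `transfer` and `balances` are made up for illustration; they are not from the slides. The transfer credits the receiver first and debits the sender second, so that when the debit is rejected you can see the earlier credit being rolled back.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    # The CHECK constraint is the "valid state" rule: no negative balances.
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("bank_a", 100), ("bank_b", 50)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK rule, so the credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_bank()
print(transfer(conn, "bank_a", "bank_b", 30), balances(conn))
print(transfer(conn, "bank_a", "bank_b", 500), balances(conn))
```

The second transfer asks for more money than bank_a holds. Without atomicity, bank_b would keep a credit that bank_a never paid for. This sketch only touches atomicity and consistency; isolation and durability need concurrent clients and a database on disk to demonstrate.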

CAP Theorem

NoSQL challenges these assumptions. In the 70s, that seemed like a bad idea. In the 80s, that seemed like a bad idea. In the 90s, that seemed like a bad idea. But in the late 2000s, NoSQL started to gain notable traction in the IT community. Why?

Big Data: Velocity, Variety, Volume. Thanks, Gartner research!!

So… what is NoSQL?
NoSQL - an umbrella term for a loosely defined class of non-relational data stores.
Started to emerge in 2009.
No predefined schema and/or tolerant of schema changes.
IT / Social Media companies needed techniques for dealing with large volumes of distributed data.

What is NoSQL? (Continued)
The data structures used by NoSQL databases (meant to hold multiple, complex values) differ from those used in relational databases. This makes some operations faster in NoSQL and some faster in relational databases.
Consider NoSQL technologies if you are:
willing to accept the possibility of data loss for the sake of speed/performance
dealing with tons of user-generated content in varied formats
willing to accept the possibility of some inconsistency in exchange for a system that can scale up to hundreds of thousands of users
NoSQL is important, but also a bit over-hyped.
Watch the "What is NoSQL" and "Exploring possibilities" videos.
For a laugh, see: https://www.youtube.com/watch?v=b2F-DItXtZs (Warning: Not family-friendly language!)
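The point about data structures is easier to see with a small example. The sketch below uses plain Python lists and dictionaries, not a real database. The sample data, the field names, and the helpers `to_document` and `posts_tagged` are all invented for illustration; the `_id` key only imitates the convention used by document stores such as MongoDB. It shows the same content in two shapes: flat relational-style rows linked by keys, and one nested document holding multiple, complex values.

```python
import json

# Relational shape: flat rows, one fact per row, linked by keys.
users = [{"user_id": 1, "name": "Ada"}]
posts = [
    {"post_id": 10, "user_id": 1, "text": "Hello"},
    {"post_id": 11, "user_id": 1, "text": "Second post"},
]
tags = [
    {"post_id": 10, "tag": "intro"},
    {"post_id": 11, "tag": "news"},
    {"post_id": 11, "tag": "update"},
]


def to_document(user_id):
    """Assemble one user's rows into a single nested document."""
    user = next(u for u in users if u["user_id"] == user_id)
    return {
        "_id": user_id,
        "name": user["name"],
        "posts": [
            {
                "text": p["text"],
                "tags": [t["tag"] for t in tags if t["post_id"] == p["post_id"]],
            }
            for p in posts
            if p["user_id"] == user_id
        ],
    }


def posts_tagged(tag):
    """Answer a question that cuts across users from the flat rows."""
    return sorted(t["post_id"] for t in tags if t["tag"] == tag)


doc = to_document(1)
print(json.dumps(doc, indent=2))
print(posts_tagged("news"))
```

Fetching everything about one user is a single lookup once the document exists, while the flat rows need to be joined each time. A question like "which posts carry this tag?" goes the other way: it is a simple filter over the rows but would mean walking through every document. This sketch does not measure speed; it only shows why the shape of the data favors different operations.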

Contenders
MongoDB (Open Source, MongoDB Inc.)
CouchDB (Open Source, Apache Software Foundation)
Cassandra (Open Source, originally developed at Facebook, now an Apache Software Foundation project)
Hadoop HIVE (Open Source, Apache Software Foundation)
None of these technologies has an "easy" user interface. They are, however, advanced… cutting edge. Being willing to sacrifice some ease-of-use for the sake of competitive advantage is an important business trade-off!
Watch CAP theorem video clip

Why I like HIVE
Redundancy built on HDFS, and the general approach makes sense (to me).
Leverages knowledge of the SQL syntax.
This is a "Not Only" SQL solution.

A little experiment

Hey… I thought this was supposed to be fast?

Where to from here? In the future, organizations will use many data technologies. Data professionals will need to be familiar with these different approaches and know how to match them to different problems. Learning the concepts is an important first step, but to really understand you'll need to experiment with these new technologies, even if only on a small scale.

If I were you…
Learn some Linux: http://www.ee.surrey.ac.uk/Teaching/Unix/
Watch and work through the "Up and Running with NoSQL databases" tutorial at http://lynda.temple.edu
Explore Hortonworks.com tutorials: http://hortonworks.com/tutorials
If you are interested in HIVE see: http://hortonworks.com/blog/hive-cheat-sheet-for-sql-users/
If you are going to explore MongoDB or CouchDB, learn some JavaScript. See JavaScript Essential Training at http://lynda.temple.edu and take MIS3502!
Let me know if you think there should be a "big data" course by emailing me at jeremy@temple.edu