Published by Joseph Powers. Modified over 6 years ago.
1
NoSQL: Know Your Enemy
Shelly Noll, SRT Solutions, Ann Arbor, MI
@shellynoll
2
Disclaimer
- There is a lot of disagreement about this topic.
- Everything I say could be wrong, depending on who you ask.
- Even if it's right today, it will probably be wrong soon.
3
What is NoSQL?
A database management system with the following features:
- Queries do not use SQL
- Doesn't guarantee ACID properties
- Fault-tolerant, distributed architecture
History of the term:
- Coined by Carlo Strozzi in 1998 to describe a database he created that did not expose a SQL interface.
- Co-opted in 2009, when Eric Evans from Rackspace and Johan Oskarsson from Last.fm organized an event to discuss the growing trend of open-source, distributed databases.
4
CAP Theorem
- Consistency: all nodes see the same data at the same time.
- Availability: every request receives a success/failure response.
- Partition tolerance: the system operates despite failure of part of the system.
A distributed system can satisfy any two of these guarantees at the same time, but not all three.
Notes: CAP and ACID vs BASE are a couple of basic theories we need to talk about to understand the difference between relational and NoSQL databases.
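The trade-off above can be sketched with a toy two-replica store. This is only an illustration under stated assumptions: `TwoNodeStore`, `Unavailable`, and the "CP"/"AP" mode names are hypothetical and do not come from any real database.

```python
class Unavailable(Exception):
    """Raised when the store refuses a request to stay consistent."""


class TwoNodeStore:
    """Toy store with replicas 'a' and 'b'; clients write through replica 'a'."""

    def __init__(self, mode):
        self.mode = mode              # "CP" or "AP" (hypothetical labels)
        self.nodes = {"a": {}, "b": {}}
        self.partitioned = False      # True when 'a' cannot reach 'b'

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Keep replicas identical by giving up availability.
                raise Unavailable("cannot reach replica b")
            # AP: stay available, accept that replica b is now stale.
            self.nodes["a"][key] = value
            return
        # No partition: both replicas get the write.
        self.nodes["a"][key] = value
        self.nodes["b"][key] = value

    def read(self, node, key):
        return self.nodes[node].get(key)
```

During a partition, the CP variant rejects the write and both replicas keep the old value, while the AP variant accepts it and the two replicas disagree until they can be reconciled.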
5
ACID vs BASE
ACID:
- Atomicity: all or nothing
- Consistency: data must adhere to the schema and rules
- Isolation: no transaction interferes with another
- Durability: permanency
BASE:
- Basically available: an application works basically all the time
- Soft state: it does not have to be consistent all the time
- Eventual consistency: it will be in some known state eventually
Notes: Instead of the ACID properties found in relational databases, NoSQL has something different. What is the opposite of an acid? NoSQL databases exhibit BASE properties.
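Eventual consistency can be illustrated with a small sketch. It assumes a last-write-wins rule keyed on a caller-supplied version number, which is only one of several reconciliation strategies real systems use; `Replica` and `anti_entropy` are hypothetical names.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""

    def __init__(self):
        self.store = {}

    def put(self, key, value, version):
        current = self.store.get(key)
        # Last write wins: keep the entry with the highest version.
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def get(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Push every replica's entries to every replica so they converge."""
    for src in replicas:
        for key, (version, value) in list(src.store.items()):
            for dst in replicas:
                dst.put(key, value, version)
```

Between a write and the next `anti_entropy` pass, replicas may return different answers (soft state). After the pass they agree on the highest-versioned value.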
6
ACID vs BASE
ACID:
- Strong consistency
- Isolation
- Focus on "commit"
- Nested transactions
- Conservative (pessimistic)
- Difficult to change schema
BASE:
- Weak consistency
- Best effort
- Approximate answer OK
- Aggressive (optimistic)
- Simpler
- Faster
- Easier to change
Notes: Consistency means the data adheres to the rules. Isolation means transactions do not interfere with each other.
Source: Dr. Eric A. Brewer (2000)
7
Why Did This Happen?
Data-related reasons:
- Avoidance of unneeded complexity
- Avoidance of object-relational mapping
- Avoidance of making schema changes
Performance-related reasons:
- Higher throughput
- Horizontal scalability and running on commodity hardware
- Complexity and cost of setting up database clusters
Notes:
- Complexity: consider Twitter. You have users, status updates, relationships between users, direct messages, and not much else.
- Object-relational mapping: object-oriented programmers have to create a layer in their applications that takes the data from the database and transforms it into objects the application can use. This also creates overhead in syncing the state of the objects in memory with the entities in the database, which is expensive and time-consuming. NoSQL APIs look more like the objects programmers use.
- NoSQL compromises reliability for better performance.
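The object-relational mapping point can be sketched as follows. The `StatusUpdate` class and the helper names are hypothetical, and JSON stands in for whatever encoding a particular document store uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StatusUpdate:
    user: str
    text: str


def from_row(row):
    """Relational style: hand-written mapping from a (user, text) row tuple."""
    return StatusUpdate(user=row[0], text=row[1])


def to_document(update):
    """Document style: the object's fields serialize almost directly."""
    return json.dumps(asdict(update))


def from_document(doc):
    """Rebuild the object from its stored document."""
    return StatusUpdate(**json.loads(doc))
```

With a row, the application has to know column positions and keep them in step with the schema. With a document, the stored shape already mirrors the object.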
8
MongoDB example
9
Database Types
- Key-Value
- Graph
- Document Store
- Column Store
10
Database type disagreement
Sources compared: Stephen Yen, Ken North, Rick Cattell, Jonathan Ellis, Wikipedia
- Amazon SimpleDB: Entity-Attribute-Value Data Store; Document Store
- Apache Hadoop: Tabular
- Cassandra: Wide Columnar Store; Extensible Record Store; Columnfamily; Eventually-Consistent Key-Value Store
- Google Bigtable: Key-Value Store
- HBase
- HyperTable
- Redis: Data-Structures Server; Collection; Key-Value Cache
11
Key-Value
- Data is stored in a schema-less way with a key and a value
- Limited querying capability
- Values can usually be of any data type, or could be a serialized object
Variations:
- Eventually consistent
- Hierarchical
- Ordered
- Key-value cache (in RAM or on disk)
Examples: Memcached, Redis, Riak (Basho), Voldemort
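A minimal in-memory sketch of the key-value model described above. `KeyValueStore` is a hypothetical class, and JSON is used as an assumed serialization format; real stores each have their own client API.

```python
import json


class KeyValueStore:
    """Schema-less store: the only way to find a value is by its key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store keeps an opaque serialized blob and never inspects it.
        self._data[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._data.get(key)
        return default if raw is None else json.loads(raw)

    def delete(self, key):
        self._data.pop(key, None)
```

Note what is missing: there is no way to ask for "all users named Ann" without reading every key, which is the limited querying capability listed on the slide.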
12
Graph
- Based on graph theory
- Data is stored as nodes (entities), properties, and edges (relationships)
- Allows for calculations between nodes:
  - Shortest distance between nodes
  - Analysis of relationships
Examples: Bigdata, CloudGraph, Neo4j, OrientDB
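The "shortest distance between nodes" calculation can be sketched with a breadth-first search over an undirected, unweighted edge list. This is a generic illustration, not the query API of any graph database named above; `shortest_path` is a hypothetical helper.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    # Build an adjacency map from (node, node) relationship pairs.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorted only to make the traversal order deterministic.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

For example, with friendships alice-bob, bob-carol, carol-dave, alice-erin, and erin-dave, the search from alice to dave goes through erin rather than through bob and carol.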
13
Document Store
- Stores document-oriented or semi-structured data
- Documents may be encoded as XML, YAML, JSON, BSON, PDF, MS Word, MS Excel, etc.
- Documents are not required to adhere to a standard schema
- Offers a query language to retrieve documents based on content
Examples: Amazon SimpleDB, Apache CouchDB, Lotus Notes, MongoDB
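A minimal sketch of retrieving schema-free documents by content, assuming documents are plain dictionaries and queries are simple field-equality matches. `DocumentStore` and its `find` method are hypothetical; real document stores offer much richer query languages.

```python
class DocumentStore:
    """Holds dictionaries that need not share a common set of fields."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        self._docs.append(dict(doc))  # store a copy

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            doc for doc in self._docs
            if all(k in doc and doc[k] == v for k, v in criteria.items())
        ]
```

Unlike the key-value case, the store looks inside the value, so documents with different shapes can live side by side and still be queried by field.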
14
Column store
- Stores data in a tabular format
Different names for the exact same thing:
- Wide Columnar Store
- ColumnFamily
- Tabular
- Entity-Attribute-Value Data Store
- Extensible Record Store
- Multivalue
- BigTable
Examples: Apache Hadoop, Cassandra, Google Bigtable, HBase, HyperTable
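One common reading of the column-family model is a nested map: row key, then column family, then column name, then value. The sketch below assumes that reading. `ColumnFamilyStore` is a hypothetical class and does not reproduce the API of any product listed above.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Sparse table: each row holds only the columns actually written."""

    def __init__(self):
        # row key -> column family -> column name -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return a copy of one family's columns for a row (empty if absent)."""
        return dict(self._rows.get(row_key, {}).get(family, {}))
```

Two rows in the same family can carry entirely different columns, which is why these stores are often described as tabular but flexible.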
15
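Popular databases
Fields: type; created by; language; used by
- Bigtable: column store; Google; Google File System
- Cassandra: column store; created by Apache; language Java; used by Netflix, Twitter, Constant Contact, Reddit, Digg
- CouchDB: document store; language Erlang; used by various Facebook applications
- Hadoop: column store; used by Yahoo!
- HBase: column store; used by Facebook's messaging platform
- HyperTable: column store; created by Zvents; language C++; used by Baidu
- Memcached: key-value store; created by Danga; language C; used by LiveJournal, YouTube, Reddit, Zynga, Facebook, Twitter
- MongoDB: document store; created by 10gen; used by MTV Networks, Craigslist, Foursquare
- Neo4j: graph; created by Neo Technology; used by Adobe, Cisco
- Redis: key-value store; created by VMware; language ANSI C; used by GitHub, Craigslist, Blizzard, Digg, Twitter, Flickr, Stack Overflow
- Riak: key-value store; created by Basho; language Erlang, C, C++, JavaScript; used by Comcast, Mozilla, AOL, Ask.com
- SimpleDB: document store; created by Amazon
- Voldemort: key-value store; created by LinkedIn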
16
Comparisons
                      Performance  Scalability      Flexibility  Complexity
Key-Value Stores      High         High             High         None
Column Stores         High         High             Moderate     Low
Document Stores       High         Variable (High)  High         Low
Graph Databases       Variable     Variable         High         High
Relational Databases  Variable     Variable         Low          Moderate
Source: Ben Scofield (2010)
17
Where wouldn't you use NoSQL?
- Data is critical to the function of the business/application
- Data has a strong and/or slowly changing schema
- Need true transactional capabilities
- Need data mining capabilities
- Set-based updates
Examples:
- Banking apps
- Healthcare apps
- Enterprise apps
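What "true transactional capabilities" and "set-based updates" buy you can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the account names, and the helper functions are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def add_interest(conn, amount):
    """Set-based update: one statement changes every matching row."""
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + ?", (amount,))


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the debit fails, the credit that already ran is rolled back, so money is never created or lost. That guarantee is the kind of thing a banking app cannot trade away for throughput.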
18
Where would you use NoSQL?
- Heavy read/write
- Single-user
- Simple, non-structured data
- Lack of interconnected data
- Doesn't matter if it takes a while to get the data consistent
- Data is not critical
Examples:
- Social networking apps
- Mobile apps
19
Future of NoSQL
- UnSQL
  - A query language for NoSQL databases
  - Does not have a data definition language
- Acquisition of NoSQL databases by larger companies
  - Similar to what happened in the BI space, where IBM, Microsoft, and HP acquired smaller players
20
Shelly Noll, SRT Solutions, Ann Arbor, MI
Twitter: @shellynoll