Published by Stephanie Francis. Modified over 8 years ago.
1
Just insert..it's NOSQL Assem Ragab
2
Agenda ● NOSQL in others' sight ● Life evolves! ● New business generation ● RDBMS limitations Vs new Business needs ● Who believes in NOSQL? ● What NOSQL introduces ● CAP Theorem
3
NOSQL in others' sight “The NoSQL databases are beginning to feel like an ice cream store that entices you with a new flavor of the month”...Thomas Kurian, Executive Vice President, Product Development Oracle
4
NOSQL in others' sight “I kinda hate the term 'NoSQL'. I much prefer the term Not Only SQL. Best is non-relational”...Kenny Gorman, X-Database Architect at Hi5 Networks
5
NOSQL in others' sight “Facebook was at one point spending $1M per month for specialized database hardware to serve their pictures”... Facebook executive
6
NOSQL in others' sight “EMC is using a mixture of traditional databases and newfangled NoSQL data stores to analyze public perception of the company and its products” “The NoSQL technologies are useful in summarizing a huge data set, while SQL can then be used for a more detailed analysis”...Subramanian Kartik, distinguished EMC engineer
7
NOSQL in others' sight “As Netflix moved into the cloud, we needed to find the appropriate mechanisms to persist and query data within our highly distributed infrastructure. Our goal is to build fast, fault tolerant systems at Internet scale. We realized that in order to achieve this goal, we needed to move beyond the constraints of the traditional relational model”...Yury Izrailevsky, Director of Cloud and Systems Infrastructure at Netflix
8
NOSQL in others' sight “When you’re storing every transaction for 800 million users and handling more than 60 million queries per second, your database environment had better be something special”...Domas Mituzas, Database engineer at Facebook, Data & Performance Engineer at Wikipedia
9
Life evolves!
11
New business generation ● Cloud computing ● Increased number of internet users ● Social networks became a trend ● Internet became a fertile field for money makers and entrepreneurs
12
Who believes in NOSQL?
13
Facebook ● More than 900,000,000 users ● Average user has 137 friends ● People spend 8 billion minutes on the site every day ● There are 3.5 billion pieces of content shared weekly. 2.5 billion photos are uploaded every month, and 1.2 million photos are served up every second ● “Today we have somewhere in the neighborhood of 30,000 servers”...Jeff Rothschild, vice president of technology at Facebook
14
Twitter Twitter by numbers here
15
Netflix ● American provider of on-demand Internet streaming media in the United States, Canada, Latin America, etc. ● 23 million Netflix streaming subscribers in total ● 2 billion hours spent by Netflix members watching streamed video
16
Others ● Google has 91 million searches per day ● Google was processing 20,000 terabytes of data (20 petabytes) a day. This large-scale computing capability is a big part of Google’s competitive advantage over Yahoo, Microsoft, and everyone else. ● Amazon has 59 million active customers ● And more than 42 terabytes of data (1 terabyte = 1,000 gigabytes)
17
RDBMS limitations Vs new Business needs ● RDBMS shows many pitfalls in the context of these new business requirements ● - ACID (Atomicity, Consistency, Isolation and Durability) ● - Normalization ● - Impossible to do joins efficiently at such scale ● - Single point of failure ● - Relational schema isn't always the best choice
18
RDBMS limitations Vs new Business needs ● RDBMS shows many pitfalls in the context of these new business requirements ● - To work efficiently you need to spend much more ● - You know you have a big data problem when your hardware budget is growing exponentially ● - Scalability: Relational databases are designed to run on a single machine, so to scale, you need to buy a bigger machine
19
RDBMS limitations Vs new Business needs - Scaling in RDBMS
20
What NOSQL introduces ● “Not Only SQL” instead of “No SQL” ● Most NOSQL DBMSs are open source ● Massive write performance and availability ● Easier administration and operations
21
What NOSQL introduces ● Run in RAM ● Schema migration [ trivial and gradual ] ● No single point of failure ● Distributed systems support
22
CAP Theorem ● Requirements for distributed systems ● - Consistency: The system is in a consistent state after any operation. All clients see the same data [strong Vs eventual consistency] ● - Availability: The system is always on, with no downtime. All clients find some available replica ● - Partition tolerance: The system continues to function even when split into subsets
23
CAP Theorem ● CAP theorem (E. Brewer, N. Lynch): ● “You can satisfy at most 2 out of the 3 requirements”
24
CAP Theorem ● CA: ● - Single cluster (2 PCs) ensures all nodes are always in contact. But what happens in case of partitioning? ● CP: ● - Some data may be inaccessible (availability sacrificed), but the rest is still consistent and accurate ● AP: ● - Some of the returned data may be inaccurate
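The CP and AP trade-offs above can be sketched with a toy two-replica store. This is only an illustration under stated assumptions: `ToyStore`, `Unavailable` and the `partitioned` flag are hypothetical names, and the CP mode here refuses every request during a partition, which is stricter than "some data may be inaccessible".

```python
class Unavailable(Exception):
    """Raised when the toy store refuses a request to stay consistent."""


class ToyStore:
    """Hypothetical two-replica store that behaves as CP or AP under a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]   # two replicas, each a plain dict
        self.partitioned = False   # True when the replicas cannot talk

    def write(self, replica_index, key, value):
        if not self.partitioned:
            # Healthy cluster: every replica receives the write.
            for replica in self.replicas:
                replica[key] = value
        elif self.mode == "CP":
            # CP: give up availability rather than let replicas diverge.
            raise Unavailable("write rejected during partition")
        else:
            # AP: accept the write on the reachable side only.
            self.replicas[replica_index][key] = value

    def read(self, replica_index, key):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("read rejected during partition")
        # AP (or healthy cluster): answer from the local replica,
        # which may be stale while the partition lasts.
        return self.replicas[replica_index].get(key)
```

With an AP store, a write made on one side of the partition is not visible on the other side, so a client there may read an older value; with a CP store the same read raises `Unavailable` instead.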
25
Let's recap all together ● New business requirements ● RDBMS doesn't always fit ● NOSQL introduces a new mindset and capabilities to serve new business features
26
Open discussion
27
Cassandra ● Developed by Facebook (inbox), now Apache ● P2P ● - Every node is aware of all other nodes in the cluster ● Design goals ● - High availability ● - Eventual consistency ● - Incremental scalability
28
Cassandra ● Data model ● - Column: the smallest increment of data. It is a triplet that contains a name, a value and a timestamp. ● - Super Column: a SuperColumn contains a name and a value, where the value is a map containing an unbounded number of Columns ● - Column Family (CF): a ColumnFamily is a structure that contains an unbounded number of Rows. Let's see more
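The data model above can be pictured as nested maps. The sketch below is an assumption-laden illustration, not Cassandra's actual API: the class names mirror the slide, while the `insert` helper and its "newest timestamp wins" rule are added here to show what the timestamp in each Column could be used for.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Column:
    """Smallest unit of data: a (name, value, timestamp) triplet."""
    name: str
    value: str
    timestamp: int


@dataclass
class SuperColumn:
    """A name plus a map of column name -> Column."""
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)


@dataclass
class ColumnFamily:
    """A name plus a map of row key -> {column name -> Column}."""
    name: str
    rows: Dict[str, Dict[str, Column]] = field(default_factory=dict)

    def insert(self, row_key: str, column: Column) -> None:
        # Hypothetical helper: keep the column with the newest timestamp.
        row = self.rows.setdefault(row_key, {})
        current = row.get(column.name)
        if current is None or column.timestamp >= current.timestamp:
            row[column.name] = column
```

For example, inserting `Column("email", "b@x", 2)` and then `Column("email", "c@x", 1)` under the same row key leaves the value `"b@x"` in place, because the later insert carries an older timestamp.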
29
Cassandra ● Cluster membership ● - Gossip: every node gossips with 1-3 other nodes about the state of the cluster (merging incoming info with its own) ● - Changes in the cluster (node in/out)
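The "merge incoming info with its own" step can be sketched as follows. This is a simplified illustration, not Cassandra's gossip protocol: the view layout (node name mapped to a version number and a status), `merge_views`, `gossip_round` and the `fanout` parameter are all assumptions made for the example.

```python
import random
from typing import Dict, Optional, Tuple

# A node's view of the cluster: node name -> (version, status).
View = Dict[str, Tuple[int, str]]


def merge_views(own: View, incoming: View) -> View:
    """Return a view that keeps, per node, the entry with the higher version."""
    merged = dict(own)
    for node, (version, status) in incoming.items():
        if node not in merged or version > merged[node][0]:
            merged[node] = (version, status)
    return merged


def gossip_round(views: Dict[str, View], fanout: int = 2,
                 rng: Optional[random.Random] = None) -> None:
    """Each node sends its view to up to `fanout` randomly chosen peers."""
    rng = rng or random.Random()
    for sender in list(views):
        peers = [n for n in views if n != sender]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            views[peer] = merge_views(views[peer], views[sender])
```

Repeating `gossip_round` spreads news of a node joining or leaving without any central coordinator, since every node only talks to a few peers per round.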
30
Cassandra ● Writes ● - Client sends request to a random node in the cluster ● - Data replicated to N nodes in the cluster (configuration) ● Reads ● - Single read (return the first response) ● - Quorum read (return the value that most nodes agreed on)
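The difference between a single read and a quorum read can be sketched like this. It is a toy model following the slide's wording, with replicas as plain dicts; `write`, `single_read` and `quorum_read` are hypothetical helpers, and a real system would also compare timestamps rather than only count matching values.

```python
from collections import Counter
from typing import Dict, List, Optional

Replica = Dict[str, str]


def write(replicas: List[Replica], key: str, value: str, n: int) -> None:
    """Replicate the value to the first n replicas (n is the configured factor)."""
    for replica in replicas[:n]:
        replica[key] = value


def single_read(replicas: List[Replica], key: str) -> Optional[str]:
    """Return the first response received, even if it is stale."""
    for replica in replicas:
        if key in replica:
            return replica[key]
    return None


def quorum_read(replicas: List[Replica], key: str) -> Optional[str]:
    """Return the value that a majority of replicas agree on, if any."""
    votes = Counter(r[key] for r in replicas if key in r)
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count > len(replicas) // 2 else None
```

If one of three replicas still holds an old value, `single_read` may return it when that replica answers first, while `quorum_read` returns the value held by the other two.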
31
Facebook with Cassandra (2009, 50 TB data, 150 nodes)