CSE 775 – Distributed Objects Bekir Turkkan & Habib Kaya


BIG DATA Project

Project Details
- Research on new database trends
- Comparison of the systems
- Implementation of a project on MongoDB

Outline
- History of database management systems
- What does NoSQL mean?
- Why NoSQL database systems?
- Types of NoSQL database systems
- Data models for widely used NoSQL DBs
- Query models of NoSQL
- MongoDB demo

History
- 1970s: SQL is invented
- 1990s: Object-oriented databases tried to gain ground
- 2000s: NoSQL databases came to market (Google's Bigtable, Amazon's Dynamo)

Current Estimated Usage
- Number of mentions of the system on websites
- General interest in the system
- Frequency of technical discussions about the system
- Number of job offers in which the system is mentioned
- Number of profiles in professional networks in which the system is mentioned
- Relevance in social networks
Rankings

What Does NoSQL Mean?
Not Only SQL, implying that there is more than one storage mechanism available when designing a software product or solution.
Common observations:
- Not using the relational model
- Running well on clusters (scalable)
- Mostly open source
- Built for the 21st century web estates
- Schema-less

Why NoSQL?

Pros and Cons of SQL
Pros:
- Persistent data
- Concurrency
- Integration
- (Mostly) standard model
Cons:
- Relation
- Certain model
- Scalability
- Performance
- Clustering

Scalability for SQL Systems
- Scale up: use a more powerful SQL Server
- Scale out: use more SQL Servers
Scale-up options:
- Replace the server with a faster one or one with more memory
- Switching from a 2-socket to a 4-socket server: doubles the licensing cost
- Switching from a 4-socket to an 8-socket server: prices get serious
- Switching from 8 to 16 or more sockets: requires a different license, which costs around $60,000 per socket
Scale-out options:
- Using bidirectional or merge replication
- Putting several read-only SQL Servers behind a load balancer
- Using third-party scale-out products
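The "read-only servers behind a load balancer" option on the slide above can be illustrated with a small sketch. This is not how any particular load balancer is implemented; it only shows round-robin dispatch, and the class name and server names are hypothetical.

```python
import itertools


class ReadReplicaPool:
    """Round-robin dispatcher over read-only replicas (hypothetical names)."""

    def __init__(self, replicas):
        # itertools.cycle repeats the replica list endlessly, in order.
        self._cycle = itertools.cycle(replicas)

    def next_replica(self):
        """Return the replica that should serve the next read query."""
        return next(self._cycle)


pool = ReadReplicaPool(["sql-read-1", "sql-read-2", "sql-read-3"])
picked = [pool.next_replica() for _ in range(4)]
print(picked)
```

Each read goes to the next server in turn, so read load is spread evenly, while writes (not shown) would still go to a single primary.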

Advantages of NoSQL DBs
- Cost effective for technical infrastructure
- Scalable (good for massive data)
- Good scale-out architectures (use commodity servers)
- Better performance (suitable for clustering)
- Suitable for agile development
- No need for a waterfall development method
- Object-oriented programming is the norm

NoSQL DB System Types
4 major models are widely used:
- Wide Column Store / Column Families: Hadoop/HBase (Java), Cassandra (CQL), MapR (type of Hadoop)
- Document Store: MongoDB (BSON), CouchDB (JSON)
- Key Value / Tuple Store: Riak (JSON), DynamoDB (auto scalable)
- Graph Databases: Neo4j (many APIs), InfiniteGraph (Java)
- More

Data Model: Document Model
- Stores data in documents (JSON-like documents)
- Each record and its associated data are stored together in the same document
- Each document can contain different fields, which helps when modeling unstructured and polymorphic data
- Allows querying on any field, and the document data model maps naturally to objects in modern programming languages
- Useful for a wide variety of applications due to the flexibility of the data model
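A minimal sketch of the document model described above, using plain Python dicts in place of a real document store. The collection contents and the `find` helper are hypothetical and only illustrate two points from the slide: documents in one collection may carry different fields, and any field can be used as a query criterion.

```python
# Two documents in the same collection with different sets of fields.
people = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Lin", "phones": ["555-0100"],
     "address": {"city": "Syracuse"}},
]


def find(collection, **criteria):
    """Return the documents whose top-level fields equal every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


print(find(people, name="Lin"))
print(find(people, email="ada@example.com"))
```

Because each document is already a dict, it maps directly onto an application object without a join or a translation layer.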

Graph Model
- Uses graph structures with nodes, edges, and properties to represent data
- Data is modeled as a network of relationships between specific elements
- Useful for systems in which relationships are the core of the data, such as social networks
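The graph model above can be sketched with nodes, labeled edges, and properties held in ordinary Python structures. The sample people, the `FRIEND` label, and both helper functions are hypothetical; a real graph database would expose its own query language for this kind of traversal.

```python
# Nodes carry properties; each edge is (source, label, target, properties).
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "dave": {"kind": "person"},
}
edges = [
    ("alice", "FRIEND", "bob", {"since": 2012}),
    ("bob", "FRIEND", "carol", {"since": 2013}),
    ("alice", "FRIEND", "dave", {"since": 2014}),
    ("dave", "FRIEND", "carol", {"since": 2014}),
]


def neighbors(node, label):
    """Nodes connected to `node` by an edge with this label (undirected)."""
    found = set()
    for source, edge_label, target, _props in edges:
        if edge_label != label:
            continue
        if source == node:
            found.add(target)
        elif target == node:
            found.add(source)
    return found


def friends_of_friends(node):
    """People two FRIEND hops away who are not already direct friends."""
    direct = neighbors(node, "FRIEND")
    indirect = set()
    for friend in direct:
        indirect |= neighbors(friend, "FRIEND")
    return indirect - direct - {node}


print(friends_of_friends("alice"))
```

The friends-of-friends question is a typical social network query: it follows relationships directly instead of joining tables.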

Key Value Model
- The most basic type of NoSQL database system
- Every item in the database is stored as an attribute name, or key, together with its value
- The value is opaque to the database, although some tools, such as Riak, can attach metadata and enable searching
- Does not enforce a set schema across key-value pairs
- Useful for representing polymorphic and unstructured data
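As a rough illustration of the key-value model above, the sketch below stores values as raw bytes that the store never inspects. `KeyValueStore` and the key names are hypothetical stand-ins, not the API of Riak or DynamoDB.

```python
import json


class KeyValueStore:
    """Toy key-value store: values are opaque bytes, no schema is enforced."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)


kv = KeyValueStore()
# The application decides how to encode values; the store does not care.
kv.put("user:1", json.dumps({"name": "Ada"}).encode("utf-8"))
kv.put("session:9", b"\x00\x01raw-bytes")

user = json.loads(kv.get("user:1").decode("utf-8"))
print(user)
```

Since the store only sees bytes, two keys can hold completely different kinds of data, which is what makes the model suitable for polymorphic and unstructured values.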

Wide Column Stores / Column Families
- Use a distributed, multi-dimensional sorted map to store data
- Each record can vary in the number of columns that are stored, and columns can be nested inside other columns, called super columns
- Columns can be grouped together for access in column families
- Data is retrieved by primary key per column family
- Useful for a narrow set of applications that only query data by a single key value
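The multi-dimensional map behind a wide column store can be approximated as row key, then column family, then column name, then value. The sketch below is a single-process simplification with hypothetical table contents; it omits distribution, timestamps, and super columns.

```python
# row key -> column family -> column name -> value
table = {}


def put(row_key, family, column, value):
    """Write one cell; rows and families are created on demand."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


def get_family(row_key, family):
    """Read all columns of one family for one row, sorted by column name."""
    columns = table.get(row_key, {}).get(family, {})
    return dict(sorted(columns.items()))


put("user#001", "profile", "name", "Ada")
put("user#001", "profile", "email", "ada@example.com")
put("user#001", "activity", "last_login", "2014-11-02")
put("user#002", "profile", "name", "Lin")  # this row has fewer columns

print(get_family("user#001", "profile"))
print(get_family("user#002", "profile"))
```

Access always starts from the row key and a column family, which matches the slide's point that these systems suit applications that query by a single key.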

Examples for Data Models

Query Model: Document Database
- Provides the ability to query on any field within a document
- Provides the ability to analyze data in place (like SQL GROUP BY)
- For updates, some systems provide find-and-modify capabilities, so that values in documents can be updated in a single statement
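The two capabilities listed above, in-place analysis similar to GROUP BY and find-and-modify updates, can be sketched over a list of dicts. The `orders` data and both helper functions are hypothetical; a real document database would run the equivalent operations on the server side.

```python
from collections import defaultdict

orders = [
    {"_id": 1, "customer": "ada", "total": 20, "status": "new"},
    {"_id": 2, "customer": "lin", "total": 5, "status": "new"},
    {"_id": 3, "customer": "ada", "total": 12, "status": "shipped"},
]


def total_by(collection, group_field, amount_field):
    """Sum amount_field per distinct group_field value (like GROUP BY)."""
    sums = defaultdict(int)
    for doc in collection:
        sums[doc[group_field]] += doc[amount_field]
    return dict(sums)


def find_and_modify(collection, criteria, changes):
    """Update the first matching document in one call and return it."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in criteria.items()):
            doc.update(changes)
            return doc
    return None


totals = total_by(orders, "customer", "total")
updated = find_and_modify(orders, {"_id": 2}, {"status": "shipped"})
print(totals, updated)
```

The point of find-and-modify is that the lookup and the change happen in one statement, so the caller does not need a separate read before the write.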

Graph Database
- These systems tend to provide rich query models where simple and complex relationships can be interrogated to make direct and indirect inferences about the data in the system.
- Relationship-type analysis tends to be very efficient in these systems, whereas other types of analysis may be less optimal.

Key Value and Wide Column Databases
- These systems provide the ability to retrieve and update data based only on a primary key.
- Some products provide limited support for secondary indexes.
- To perform an update in these systems, two round trips may be necessary: first find the record, then update it.
- In these systems, the update may be implemented as a complete rewrite of the record, whether a few bytes have changed or the entire record.
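The two-round-trip update described above can be made concrete with a toy store that counts calls. `CountingStore` is a hypothetical stand-in for a remote key-value service, where every `get` or `put` would be one network round trip.

```python
class CountingStore:
    """Toy key-value store that counts each call as one round trip."""

    def __init__(self):
        self._data = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._data.get(key)

    def put(self, key, value):
        self.round_trips += 1
        self._data[key] = value


counting_store = CountingStore()
counting_store.put("user:1", {"name": "Ada", "visits": 1})
trips_before = counting_store.round_trips

# Round trip 1: fetch the whole record by its primary key.
record = dict(counting_store.get("user:1"))
record["visits"] += 1
# Round trip 2: write the whole record back, even though one field changed.
counting_store.put("user:1", record)

trips_for_update = counting_store.round_trips - trips_before
print(trips_for_update)
```

Changing a single counter costs a full read and a full rewrite of the record, which is the overhead the slide contrasts with find-and-modify in document databases.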

Consistency Model
- NoSQL systems typically maintain multiple copies of the data for availability and scalability purposes.
- Consistent systems: writes by the application are immediately visible in subsequent queries.
- Eventually consistent systems: writes are not immediately visible.
- Most applications and development teams expect consistent systems.
- Different consistency models pose different trade-offs for applications in the areas of consistency and availability.
- Eventually consistent systems provide some advantages for writes at the cost of making reads and updates more complex.
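The difference between the two consistency models above can be seen in a small single-process simulation. `ReplicatedStore` is a hypothetical teaching model: real systems replicate over a network and handle failures and conflicts, none of which is modeled here.

```python
class ReplicatedStore:
    """Toy primary with replicas that only catch up when replicate() runs."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = []  # writes not yet copied to the replicas

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read_consistent(self, key):
        """Read from the primary: always sees the latest write."""
        return self.primary.get(key)

    def read_eventual(self, key, replica=0):
        """Read from a replica: may return stale or missing data."""
        return self.replicas[replica].get(key)

    def replicate(self):
        """Apply all pending writes to every replica, in order."""
        for key, value in self._pending:
            for replica in self.replicas:
                replica[key] = value
        self._pending.clear()


db = ReplicatedStore()
db.write("cart:7", ["book"])
stale = db.read_eventual("cart:7")     # replica has not caught up yet
fresh = db.read_consistent("cart:7")   # primary already has the write
db.replicate()
caught_up = db.read_eventual("cart:7")
print(stale, fresh, caught_up)
```

The write returns as soon as the primary has it, which is the write-side advantage; the cost is that a reader hitting a replica must be prepared for stale results until replication completes.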

APIs
- There is no standard for interfacing with NoSQL systems.
- The maturity of the API can have major implications for the time and cost required to develop and maintain the underlying NoSQL system.
- Idiomatic drivers minimize onboarding time for new developers and simplify application development.

Commercial Support and Community Strength
- Choosing a database is a major investment and is difficult to change later
- There is no standard, and there are too many systems on the market
- The best fit for the project's needs has to be found
- Support is an important part of evaluating NoSQL products

MongoDB Demo

MongoDB File Storage
- MongoDB stores data in BSON format; BSON is short for Binary JSON.
- A single BSON document has a maximum size (4 MB in early MongoDB versions, 16 MB in current ones), so larger files are split into chunks and stored using GridFS.
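The chunking idea behind GridFS can be sketched in plain Python. This is not the GridFS implementation: the chunk layout (a sequence number `n` plus the `data` bytes) is only loosely modeled on it, the function names are hypothetical, and the chunk size is left as a parameter because the real default depends on the MongoDB version and driver.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Split a byte string into numbered chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in sequence order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


payload = bytes(range(256)) * 10            # 2560 bytes of sample data
chunks = split_into_chunks(payload, 1000)   # small size chosen for the demo
print([len(chunk["data"]) for chunk in chunks])
```

Because each chunk records its position, the pieces can be stored as separate small documents and fetched in any order before being joined back into the original file.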


Thanks for Listening