A Quality Attribute Framework and Risks Analysis of Adopting NoSQL Databases
Hilda Mackin, Gonzalo Perez, and Charles Tappert

Introduction
• In the past, relational databases were used for every task that a database had to handle.
• Organizations now face a significant problem: how should they choose between traditional SQL databases and NoSQL data stores, which have very different characteristics?
• The framework will help IT departments align perceived risks of NoSQL database adoption with actual risks. It will also help IT managers with database adoption decisions and with identifying the risk factors that affect the new database technologies.
• The Quality Attribute Framework and Risk Analysis of NoSQL databases rates quality attributes such as Availability and Security, which are critical to choosing the right NoSQL database for a given domain and to making better software development and design decisions.
• The framework is developed through a qualitative analysis of risk vectors obtained from surveys of top IT leaders and IT companies.

Purpose of the Research
• This research uses a qualitative method to investigate the attributes that contribute to a NoSQL Quality Attribute Framework and the possible risks in the areas of Security and Availability.
• This article improves and complements the study “Choosing the right NoSQL database for the job: A Quality Attribute Evaluation” in the following aspects:
  • Quality attributes are classified and evaluated by the four NoSQL database types
  • Quality attribute “Availability” and properties derived from it
  • Quality attribute “Security” and properties derived from it
  • Security and Business Availability risks
• Quality attributes of NoSQL databases are evaluated in different scenarios such as Financial, Medical, Retail, and Social Networking.
Contributions to the IT Community
• Provide a coherent way for IT managers to understand NoSQL database quality attributes, and give them some insight into mitigation strategies for current NoSQL database risks.
• Which NoSQL database attributes are generating the most positive and negative impacts on adoption and on minimizing security and business availability risks?
• Do surveys of experts and IT professionals uncover any misalignment between actual and perceived NoSQL database risks?

Defining Problems to Address in the Research
First main problem:
• Which NoSQL database quality attributes are generating the most positive and negative impacts on Security and Business Availability risks?
• Which software quality attributes are required to evaluate and adopt NoSQL databases?
• What are the perceived Business Availability and Security risks associated with these new database technologies in the context of Big Data?
Second main problem:
• Do surveys of experts and IT professionals uncover any misalignment between actual and perceived NoSQL database risks?
• What are the impacts on the organization, in both technical and non-technical aspects?
First sub-problem:
• Are relational and NoSQL databases totally mutually exclusive?
• How do applications intersect between relational and NoSQL databases? Financial, Medical, Retail, and Social Network use cases are presented.
• Where the technologies do not fully overlap but do intersect, why choose one database over the other?

NoSQL Definition
• NoSQL databases were built to deal with the growing amount of complex data required by some types of real-time applications. They favor availability over consistency, scale horizontally, use a distributed architecture, and are typically open source. [6]
Data Model
There are four types of NoSQL databases.
Key-Value Database
• Does not require a schema and offers great flexibility and scalability.
• Does not offer ACID guarantees (Atomicity, Consistency, Isolation, Durability).
• Key-value stores consist of keys and their corresponding values, which allows data to be stored in a schema-less way.
• Examples: Memcached, Redis
• High-volume, media-rich data gathering and storage
• Caching layers for connecting RDBMS and NoSQL databases
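As a rough illustration of the key-value model described above, the following sketch stores unrelated value shapes under string keys with no schema. The class `KeyValueStore` and the sample keys are hypothetical and are not taken from Memcached, Redis, or any other engine named in the slides.

```python
class KeyValueStore:
    """Toy in-memory key-value store: values are opaque and looked up only by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # No schema is checked: any value can be stored under any key.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1", {"name": "Ada"})          # a nested record
store.put("page:/home", "<html>...</html>")   # a cached page fragment
store.put("counter:visits", 42)               # a plain number
```

The only access path is the key, which is what makes this model easy to partition and to use as a caching layer, and also why richer queries need a different data model.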

Columnar Database
• Relational databases are row oriented. In a columnar database the data is stored by column rather than by row.
• Columnar (tabular) stores consist of tables that can have a different schema for each row.
• Such stores do not store null fields the way relational databases do, an approach that scales very poorly as the number of columns grows.
• Examples: Bigtable and HBase. [7]
• Implementations are best suited for:
  • High-volume, incremental data gathering and processing
  • Real-time information exchange (messaging)
  • Frequently changing content serving
Document Database
• Document databases consist of a set of documents (possibly nested), each of which contains fields of data in a standard format such as XML or JSON.
• They allow data to be structured in a schema-less way as collections of such documents.
• Examples: Couchbase, MongoDB, and OrientDB. [7]
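To make the document model concrete, here is a minimal sketch in which a collection is a list of nested dictionaries with differing fields, queried by a dotted field path. The helper `find` and the sample documents are hypothetical and do not reproduce the query API of Couchbase, MongoDB, or OrientDB.

```python
import json

_MISSING = object()  # sentinel for "field not present in this document"

# Two documents in one collection; they do not share the same fields.
collection = [
    {"_id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
    {"_id": 2, "name": "Lin", "tags": ["retail", "vip"]},
]


def find(docs, path, value):
    """Return documents whose (possibly nested) field at dotted `path` equals `value`."""
    matches = []
    for doc in docs:
        current = doc
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                current = _MISSING
                break
        if current is not _MISSING and current == value:
            matches.append(doc)
    return matches


londoners = find(collection, "address.city", "London")
as_json = json.dumps(collection[0], sort_keys=True)  # documents serialize directly to JSON
```

Documents that lack the queried field are simply skipped, which is the practical meaning of "schema-less" here.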

Graph Database
• Graph databases consist of a set of graph nodes linked together by edges.
• Each node contains fields of data, and queries commonly use graph-traversal algorithms to achieve good performance.
• Examples: Neo4J, OrientDB
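The graph model can be sketched with nodes that carry fields and an adjacency list for edges, queried by a breadth-first traversal. This is an illustrative assumption about what a graph-traversal query looks like, not the query language of Neo4J or OrientDB; all names and data are invented.

```python
from collections import deque

# Each node carries its own fields of data.
nodes = {
    "alice": {"age": 34},
    "bob": {"age": 29},
    "carol": {"age": 41},
    "dave": {"age": 25},
}

# Directed edges stored as an adjacency list.
edges = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": [],
    "dave": [],
}


def reachable(start, max_hops):
    """Breadth-first traversal: map each node within `max_hops` edges of `start` to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


friends_of_friends = reachable("alice", 2)
```

The traversal touches only nodes connected to the start node, so unrelated data ("dave" here) is never visited.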

Work Plan
• Present the previous study’s comparison of NoSQL engines and their quality attributes. Then categorize these NoSQL engines into the four NoSQL database types and add three additional engines.
• Then consolidate the quality attributes of the two examples of each type to arrive at a direct comparison of the four NoSQL database types. Finally, enlarge the set of attributes to include security and develop a framework for a qualitative survey.
• The previous study identified several desirable quality attributes for evaluating NoSQL databases: Availability, Consistency, Durability, Maintainability, Read Performance, Recovery Time, Reliability, Robustness, Scalability, Stabilization Time, and Write Performance. It also selected some popular NoSQL databases: Aerospike, Cassandra, Couchbase, CouchDB, HBase, MongoDB, and Voldemort. That study resulted in a summary table (Table 1) to aid software engineers and architects in their decision process when selecting a NoSQL database according to certain quality attributes.
Table 1. Quality attributes for popular NoSQL engines (redrawn from [3]). Legend: “G” = Great, “+” = Good, “A” = Average, “-” = Mediocre, “B” = Bad, and “?” = Unknown/NA.

Eight popular databases used by enterprises were selected based on literature research and on data collected from the survey and interviews:
• Cassandra (Columnar)
• HBase (Columnar)
• MongoDB (Document)
• CouchDB (Document)
• Redis (Key Value)
• Voldemort (Key Value)
• Neo4J (Graph)
• OrientDB (Graph)

• A quality-attribute-focused survey, classified by NoSQL database type (Key Value, Columnar, Document, and Graph), compares databases with regard to their suitability for each quality attribute.
• Expert opinions and positions
• Survey of enterprises and experts
• Anonymity of participants
• Iterations
• Controlled feedback
• Statistical aggregation of group responses
• Research tasks, inputs, and outputs
• The result of this methodology is a “Quality Attribute Framework and Risks Analysis of Adopting NoSQL Databases”, which will aid software engineers and architects in their decision process when selecting a NoSQL database according to their software quality attribute requirements.
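The "statistical aggregation of group responses" and "iterations" steps listed above could be sketched as follows. The slides do not say which statistics are used, so the median, the max-minus-min spread, the 1 to 5 rating scale, and the agreement threshold are all assumptions, and the ratings are invented sample data.

```python
from statistics import median


def aggregate_round(responses):
    """Summarize one survey round.

    `responses` maps an attribute name to the anonymous 1-5 ratings it received.
    """
    summary = {}
    for attribute, ratings in responses.items():
        summary[attribute] = {
            "median": median(ratings),
            "spread": max(ratings) - min(ratings),
        }
    return summary


def needs_another_round(summary, max_spread=1):
    """Attributes whose ratings still disagree by more than `max_spread` (assumed threshold)."""
    return [name for name, stats in summary.items() if stats["spread"] > max_spread]


# Invented ratings from four participants, for illustration only.
round_one = {
    "Availability": [4, 5, 4, 4],
    "Security": [2, 5, 3, 1],
}
summary = aggregate_round(round_one)
still_open = needs_another_round(summary)
```

In this sketch the summary would be fed back to participants as controlled feedback, and only the attributes in `still_open` would be asked again in the next iteration.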

Security Quality Attributes
Security is defined as the set of software and management tools that protect the database against attacks, hackers, and viruses:
• Authentication (password and user login)
• Authorization (read and write permissions on tables, creation of users, administrative functions)
• Encryption (three levels: data at rest, client-to-server communication, server-to-server connection)
• Auditing (audit log). The enterprise editions of MongoDB and Cassandra offer data-at-rest encryption and auditing.
• Data integrity
• Confidentiality
• Documentation
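The difference between authentication and authorization in the list above can be shown with a small sketch. It uses only the Python standard library; the role names, the permission sets, and the iteration count are hypothetical and do not describe how MongoDB, Cassandra, or any other engine implements security.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Authentication: derive a salted hash so the plain password is never stored."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the hash for a login attempt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Authorization: hypothetical roles mapped to the actions they may perform.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "writer": {"read", "write"},
    "admin": {"read", "write", "create_user"},
}


def is_authorized(role, action):
    """Unknown roles get no permissions."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

Authentication answers "who is this user", while authorization answers "what may this user do", which is why the framework rates them as separate attributes.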

Additional Quality Attributes
Some additional attributes will be addressed in the study as part of the envisioned contribution:
• Attributes related to ease of development when using NoSQL databases
• Most popular NoSQL database
• Most mature NoSQL database
• Attributes related to querying possibilities
• Concurrency control
• Latency
• Attributes related to automatic conflict resolution
• Version tested

Eight popular NoSQL engines are categorized into the four NoSQL database types. Five of the engines are taken from Table 1 (Voldemort, MongoDB, CouchDB, Cassandra, HBase), and three additional engines (Redis, Neo4J, and OrientDB) are added so that two examples illustrate each of the four types, as shown in Table 2. The attributes for the engines from Table 1 come from reference [3], the attributes for Redis and Neo4J come from reference [4], and the attributes for OrientDB were created in this study so that each type has two example engines.
Table 2. Quality attributes for popular NoSQL engines (redrawn from [3]). Legend: “G” = Great, “+” = Good, “A” = Average, “-” = Mediocre, “B” = Bad, and “?” = Unknown/NA.

NoSQL Database Types
The quality attributes of the two examples of each type are consolidated to arrive at a direct comparison of the four NoSQL database types, as shown in Table 3. The attribute values for the two example engines of each type were averaged conservatively, by leaning toward the lower rating on the 5-point scale.
Table 3. Averaged quality attributes for the four NoSQL database types. Legend: “G” = Great, “+” = Good, “A” = Average, “-” = Mediocre, “B” = Bad, and “?” = Unknown/NA.
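One way to read "averaging conservatively by leaning toward the lower rating" is to map the five legend symbols to positions on an ordered scale, take the mean of the two engines, and round down. The sketch below follows that reading. The function name is hypothetical, and returning "?" whenever either rating is unknown is an assumption, since the slides do not say how unknown values were handled.

```python
import math

# Legend symbols ordered from worst to best on the 5-point scale.
SCALE = ["B", "-", "A", "+", "G"]


def conservative_average(first, second):
    """Average two ratings, rounding toward the lower rating.

    Assumption: if either rating is "?", the result is "?".
    """
    if "?" in (first, second):
        return "?"
    mean = (SCALE.index(first) + SCALE.index(second)) / 2
    return SCALE[math.floor(mean)]


# Example: one engine rated Great and the other Good averages down to Good.
combined = conservative_average("G", "+")
```

Rounding down means a type is never rated higher than its weaker example engine would justify when the two engines differ by a single step.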

Quality Attribute Framework
Table 4. Quality attributes for popular NoSQL engines. Legend: “G” = Great, “+” = Good, “A” = Average, “-” = Mediocre, “B” = Bad, and “?” = Unknown/NA.

Conclusion
• There are a large number of NoSQL data stores, which can be classified into four types of NoSQL databases. However, there is no software quality framework that helps managers decide which database is the most appropriate for their use case.
• The diversity of data stores makes it challenging to differentiate them and to gain a perspective on which database is best suited, which raises issues and opportunities for future research.
• Sophisticated security and privacy provisions are needed and should be explored. At the corporate level, companies and institutions need to bring in software technology that offers security features and capabilities for expansion.
• The development of a framework for quality attribute evaluation would benefit NoSQL adoption in the long term.
• The proposed framework will help IT departments align perceived risks of NoSQL database adoption with actual risks.