
ISSUES: THE CLOUD AND DATABASES

WHAT KIND OF DATA MANAGEMENT IS A GOOD FIT WITH THE CLOUD?
Analytical data management: data attributes
- Far more reads than writes, so security and privacy less of an issue
- Tend to have far greater data needs, so there is a need for more servers
- The size of the data set grows over time and does not stabilize, so a better fit with expanding cloud server availability
- Analytical applications often want data from multiple sources, and availability is much better in a cloud environment

MORE ON ANALYTICAL PROCESSING
Analytical data management: system attributes
- Shared nothing works better when access is mostly reads
- ACID transactions do not need to be enforced, as there is no need for a single, global state for all users
- Generally, statistical results are okay even if some very secure data is not discovered
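
The shared-nothing, read-mostly pattern above can be sketched as a scatter-gather aggregation. This is a minimal illustration and not the presenter's design: the `PARTITIONS` data, the region names, and the function names are all hypothetical, and threads stand in for separate servers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the private data of one node.
# No node reads another node's rows (shared nothing).
PARTITIONS = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 1), ("north", 4)],
]


def local_sum(rows):
    """Aggregate one partition using only that partition's rows."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def scatter_gather(partitions):
    """Run local_sum on every partition, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_sum, partitions))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


RESULT = scatter_gather(PARTITIONS)
```

Because the work is read-only, the nodes need no locks and no global transaction state; the only coordination is the final merge.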

WHAT IS NEEDED FOR A NEW GENERATION OF CLOUD DBS?
- Focus on making use of broad parallelism and on a shifting/expanding set of servers
- Looser notion of fault tolerance, as there is often no need to restart a query when it is interrupted or when one of its branches is killed
- Need to be able to operate on data in multiple formats, encryptions, attribute domains, namespaces, schemas, and database products: heterogeneity!
- Must be able to sit underneath business intelligence systems
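
One way to read the "looser notion of fault tolerance" point is that a query can return an answer built from the branches that survived, along with a note on how much of the data it covered. The sketch below assumes that reading; `run_branches`, `ok_branch`, and `killed_branch` are hypothetical names, and the branches are plain functions rather than remote workers.

```python
def run_branches(branches):
    """Sum the results of branches that succeed; count the ones that fail.

    Returns (total, coverage), where coverage is the fraction of branches
    that contributed to the total.
    """
    if not branches:
        return 0, 0.0
    total, failed = 0, 0
    for branch in branches:
        try:
            total += branch()
        except RuntimeError:
            failed += 1  # tolerate the lost branch instead of restarting
    coverage = (len(branches) - failed) / len(branches)
    return total, coverage


def ok_branch(value):
    """Build a branch that returns a fixed partial result."""
    return lambda: value


def killed_branch():
    """Simulate a branch whose server disappeared mid-query."""
    raise RuntimeError("branch killed")


TOTAL, COVERAGE = run_branches(
    [ok_branch(40), killed_branch, ok_branch(60), ok_branch(20)]
)
```

Reporting coverage next to the result lets the caller decide whether a partial answer is good enough for a statistical question or whether the query should be rerun.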

HYBRID DATABASES: IS THIS THE ANSWER?
- Folks don’t want to learn/buy/program new data management products
- But folks do want commercial grade systems with professional support
- Would make the transition from transaction apps to analytical apps easier, as with relational data warehousing
- But would we end up with an inelegant mess?

WHAT ABOUT OBJECT DATABASES? A RETURN?
- Blending a host language with a query language makes sense when queries involve complex calculations
- It is easy to extend an o-o language with statistical procedures
- The encapsulation of o-o languages is a good match with the wide and independent distribution of data in a cloud environment
- O-O procedures could be built and deployed by distributed volunteers

MORE ON O-O DBS
- Partial results could be maintained and kept up to date, with batch updating of raw data only infrequently
- We know how to build multiple language interfaces to accommodate multiple o-o languages
- O-O databases are a good match with service-based interfaces (see diagram on page 29)
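
The idea of maintaining partial results and folding in raw data in infrequent batches can be sketched with a small encapsulated object. This is an illustrative assumption about what such an object might look like; the `RunningStats` class and its methods are hypothetical and do not come from any particular object database.

```python
class RunningStats:
    """Keeps a partial result (count and sum) instead of the raw data."""

    def __init__(self):
        self._count = 0
        self._total = 0.0

    def apply_batch(self, values):
        """Fold one infrequent batch of raw values into the partial result."""
        for value in values:
            self._count += 1
            self._total += value

    @property
    def count(self):
        return self._count

    @property
    def mean(self):
        """Current mean, or None if no data has arrived yet."""
        if self._count == 0:
            return None
        return self._total / self._count


STATS = RunningStats()
STATS.apply_batch([2, 4])   # first batch
MEAN_AFTER_FIRST = STATS.mean
STATS.apply_batch([6])      # later batch; earlier raw values are not rescanned
MEAN_AFTER_SECOND = STATS.mean
```

Callers only see `apply_batch` and `mean`, so the internal state could live on any server, which is the encapsulation argument the slides make for o-o approaches.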

OBJECT-ORIENTED DBS: RELEVANT RESEARCH & DEV.
- Adaptive query processing and optimization in real time
- Parallel and distributed database technology
- Massively parallel systems
- Shared nothing systems
- Data management stream technology

PROBLEM: MOST BUSINESS DATA RIGHT NOW IS IN A RELATIONAL FORMAT
- We don’t have truly massively parallel and distributed query models for relational data
- We don’t have truly massively parallel and distributed data partitioning for relational data
- To perform efficient and fluid analytical processing of data in the cloud, we would need to create new links quickly, but we won’t have a focused, fixed schema as we do in standard relational systems
- Object extensions to relational systems don’t include method encapsulation, only expanded domains
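
For readers unfamiliar with what "data partitioning for relational data" involves, the simplest form is hash partitioning of rows on a key column. The sketch below shows only that basic technique, under assumed names (`partition_for`, `partition_rows`, the `ORDERS` rows); it does not address the harder problems the slide raises, such as joins across partitions or repartitioning as servers come and go.

```python
import zlib


def partition_for(key, n_partitions):
    """Map a key to a partition number with a hash that is stable across runs."""
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


def partition_rows(rows, key_column, n_partitions):
    """Split rows (dicts) into n_partitions lists by hashing key_column."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[partition_for(row[key_column], n_partitions)].append(row)
    return parts


# Hypothetical relational rows.
ORDERS = [
    {"order_id": 1, "customer_id": "c1", "amount": 30},
    {"order_id": 2, "customer_id": "c2", "amount": 15},
    {"order_id": 3, "customer_id": "c1", "amount": 20},
    {"order_id": 4, "customer_id": "c3", "amount": 45},
]

PARTS = partition_rows(ORDERS, "customer_id", 3)
```

All rows for one customer land on the same partition, so a per-customer aggregate can be computed locally; a query keyed on a different column would need the data shuffled again.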

MORE CLOUD ISSUES: CENTRALIZED CONTROL?
- Is the cloud trusted or anonymous?
- Trusted, provider-specific commercial cloud solutions are much safer, centrally managed, and optimized as a single network, not as a mesh of networks
- In many environments, even trusted, centralized environments, many machines are not properly managed and are controlled by immediate users
- People don’t like their machines being co-opted, and so trust is not enough to guarantee dependability

MORE ON THE CLOUD: OTHER APPLICATIONS?
- Is analytical processing the only likely application?
- There are many data sharing applications
- There are many applications for selling access to bulk data
- Data mining is a more focused form of analytical processing, but demands a very precise level of heterogeneity resolution and integration in the case of most medical and financial applications (and others)

DATA MINING
Kinds of data (from Data Mining by Han and Kamber)
- Relational dbs
- Data warehouses
- Transaction processing systems
- Object-relational dbs
- Time sequence and temporal dbs
- Spatial dbs
- Text dbs
- Multimedia dbs
- Legacy dbs
- Data streams
- The Web…

HETEROGENEITY IN DATABASES: DATA MINING IMPLICATIONS
- Note how broad the “Web” is on the previous slide
- Includes countless hand-rolled dbs
- Includes databases hidden by web development frameworks like Ruby on Rails
- Includes data accessible only via specific APIs
- Includes data accessible via XML and XPath/XQuery technology
- Includes data stored in proprietary databases for applications like CAD, finance, animation, geography
- The heterogeneity problem will only be solved by widespread collaboration on unifying standards
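
To make the heterogeneity problem concrete, the sketch below pulls the same kind of record out of three differently shaped sources and maps each into one common form. Everything in it is a made-up assumption for illustration: the field names (`patient_id`, `AgeYears`), the sample data, and the target schema. Real sources also differ in units, encodings, and meaning, which this does not attempt to resolve.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Three hypothetical sources holding the same kind of record in different shapes.
JSON_SOURCE = '[{"patient_id": 1, "age": 34}]'
CSV_SOURCE = "ID,AgeYears\n2,51\n"
XML_SOURCE = '<patients><patient id="3"><age>27</age></patient></patients>'


def from_json(text):
    """Adapter for the JSON source: renames patient_id to id."""
    return [{"id": str(r["patient_id"]), "age": int(r["age"])}
            for r in json.loads(text)]


def from_csv(text):
    """Adapter for the CSV source: different column names, string values."""
    return [{"id": row["ID"], "age": int(row["AgeYears"])}
            for row in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    """Adapter for the XML source: id is an attribute, age a child element."""
    root = ET.fromstring(text)
    return [{"id": p.get("id"), "age": int(p.findtext("age"))}
            for p in root.findall("patient")]


# One unified list in the common schema {"id": str, "age": int}.
UNIFIED = from_json(JSON_SOURCE) + from_csv(CSV_SOURCE) + from_xml(XML_SOURCE)
```

Each new source needs its own hand-written adapter here, which is why the slides argue that the problem is only really solved by shared standards rather than by per-source glue code.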

MORE ON THE CLOUD: THE FUTURE OF TRANSACTION PROCESSING?
- Will the rigidly centralized notion of OLTP survive?
- Corporations are adapting to the cloud incrementally and using middleware to leverage their own clouds
- With global business comes global data processing across time zones, which is often managed in a widely distributed fashion
- There are large corporations that handle financial and retail transactions for other companies
- Are people warming to the idea of managing their personal and small business data in the cloud, including document and other services?

BUT THE CLOUD IS PROCESS-CENTRIC AND NOT DATA-CENTRIC
- Is the process-centric vs. data-centric issue about to reawaken?
- The process folks kind of lost…
- Data is seen more and more as a valuable resource, even if it is only “sold” indirectly
- More of us are buying multimedia data
- There are actually 3 models: process-centric, data-centric, and encapsulated
- Some argue that the cloud is actually an encapsulated model and that, in fact, data movement is difficult to optimize due to the dynamic nature of the network
- Object-oriented databases…?