The Top 10 Reasons Why Federated Can’t Succeed And Why it Will Anyway.


But First…
- What is our purpose as a community?
  - Produce (wonderful) new ideas
  - Structure the field
  - Educate the workforce

A Brief History of Federation
- Many attempts since
  - Functional
  - Relational
  - Object-oriented
  - Logic-based
  - XML
- Still not solved (think of last night)
- And never will be?

Number 10: Robustness
- Systems fail
- Sources slow or unavailable
- In a distributed system, more pieces => more failures
- Users don’t like failures
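To put a rough number on "more pieces => more failures", here is a minimal sketch. It assumes sources fail independently, and the 99% availability figure is hypothetical; neither comes from the slides.

```python
def all_sources_available(availabilities):
    """Probability that every source answers, assuming independent failures."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


# Hypothetical figure: each source is up 99% of the time.
for n in (1, 5, 20):
    print(n, "sources:", round(all_sources_available([0.99] * n), 3))
```

Under these assumptions, a query that needs every source succeeds less often as sources are added, even when each source is individually reliable.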

Number 9: Security
- Different systems have different security mechanisms
  - Hard to create a single coherent view of permissions
- Distributed systems are more vulnerable
  - More points of failure
  - Hard to make security guarantees
- Data is often the corporate jewels
  - It must be protected

Number 8: Updates
- Recording change isn’t always an UPDATE
  - Application semantics must be accounted for
  - Application APIs must be reckoned with
- ACIDity isn’t always achievable
  - Not all data sources display ACID properties
    - Varying degrees of support
  - Strong transaction semantics not always possible or appropriate
    - And always painful
  - Changes to multiple sources must be coordinated
    - Requirements for consistency vary
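One common way to coordinate changes across several sources is a two-phase commit. The sketch below is illustrative only: the Participant class and its methods are hypothetical stand-ins, not an API from the talk, and it ignores timeouts and coordinator failure, which are much of what makes this painful in practice.

```python
class Participant:
    """Hypothetical stand-in for a data source that can vote on a change."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: the source promises to commit, or refuses.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:  # phase 2: everyone commits
            p.commit()
        return True
    for p in participants:  # phase 2: everyone rolls back
        p.rollback()
    return False
```

A source that cannot take part in the prepare step at all (one with no transaction support, say) does not fit this scheme, which is the point of the "varying degrees of support" bullet.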

Number 7: Configurability
- Many architectures possible
  - Even with pre-existing sources, many choices
  - Little or no guidance on tradeoffs
- Lots of code to install
  - Federation engine, data source clients
  - Often choices here
- Lots of connections to define
  - Need tooling to support

Number 6: Administration
- Monitoring is hard
  - Not all sources have facilities to track events
  - Variety of mechanisms for different events, and different sources
  - Not always APIs
- Tuning is difficult
  - Need to understand what must change
  - Need to take appropriate actions
- Repairing is painful
  - Distributed debugging
  - Different vendors to deal with for fixes

Number 5: Semantic heterogeneity
- Hard to identify commonalities
  - Same terms, different meanings
  - Different terms, same meaning
  - Different structures representing different interpretations
- Can’t integrate data effectively without them
  - Can’t make sensible queries
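A small sketch of what "same terms, different meanings" and "different terms, same meaning" look like once written down as mappings. The source names, field names, and shared vocabulary are all hypothetical, chosen only for illustration.

```python
# Hypothetical mappings from each source's field names to a shared vocabulary.
# "cust_name" and "client" are different terms with the same meaning;
# "revenue" is the same term with a different meaning in each source.
SOURCE_MAPPINGS = {
    "crm": {"cust_name": "customer_name", "revenue": "annual_revenue_usd"},
    "billing": {"client": "customer_name", "revenue": "monthly_revenue_usd"},
}


def to_shared_terms(source, record):
    """Rename a source record's fields to the shared vocabulary.

    Fields without a mapping are dropped rather than guessed at.
    """
    mapping = SOURCE_MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


print(to_shared_terms("crm", {"cust_name": "Acme", "revenue": 1200}))
print(to_shared_terms("billing", {"client": "Acme", "revenue": 100}))
```

Writing the mapping is the easy part; the slide's point is that someone first has to discover that the two "revenue" fields do not mean the same thing.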

Number 4: Insufficient Metadata
- Need metadata to integrate, configure, administer and query
- Every data source has different metadata
  - No uniform standard
  - Not always collected
- Tools to examine and exploit missing

Number 3: Performance (Data Movement)
- Distributed queries involve moving data
- Geographic distribution is common
  - WAN is slow
- Large data volumes common
  - Large numbers of objects
  - Large objects
- Caching isn’t a complete answer
  - Changes can be frequent and hard to track
  - Storage is not unlimited
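A back-of-the-envelope sketch of why data movement dominates. The cost model (one round trip plus payload divided by bandwidth) is a deliberate simplification, and the row counts, row size, and link speed are hypothetical. Filtering at the source before shipping is shown as one standard mitigation; the slides do not name it.

```python
def transfer_seconds(rows, bytes_per_row, bandwidth_bytes_per_s, round_trip_s):
    """Rough time to ship a result over a link: one round trip plus payload time."""
    return round_trip_s + (rows * bytes_per_row) / bandwidth_bytes_per_s


# Hypothetical numbers: 10 million rows of 200 bytes over a 1 MB/s WAN link,
# versus shipping only the 1% of rows that survive a filter applied at the source.
ship_everything = transfer_seconds(10_000_000, 200, 1_000_000, 0.1)
ship_filtered = transfer_seconds(100_000, 200, 1_000_000, 0.1)

print(f"ship everything: {ship_everything:.1f} s")
print(f"filter at source: {ship_filtered:.1f} s")
```

The gap scales with the fraction of data that can be eliminated before it crosses the network, which is why where each part of a query runs matters so much.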

Number 2: Performance (Complexity)
- Decision-support applications do complex queries
  - Many choices for how to execute
  - Big differences in performance among choices
- Need data from diverse sources
  - May not have enough power in source
  - Performance at sources may vary
- Need expensive functions of data
  - Function may not be implemented everywhere
  - Flowing the data to the function is expensive

Number 1: Performance (Pathlength)
- Simple queries (OLTP-like) incur huge overheads
  - Processing and networking costs
- Simple queries are common
  - Easier to write
  - Automatically produced
  - Workflows

So Why Will Federated Succeed?
- It has to
  - Integration one of the top IT issues
- And it’s not going away
  - Alternatives are expensive and/or painful
    - Write it by hand
    - EAI/Workflow
    - Consolidation (warehouse, data marts…)

So Why Will Federated Succeed? (2)
- Simple scenarios exist
  - Don’t need OLTP, high security, great robustness, … for all applications
  - Customers know their data, or must learn anyway
  - Needs are so great, compromise is possible

So Why Will Federated Succeed? (3)
- Progress on technology being made
  - 20 years of distributed query processing
  - Plumbing in place
    - Commit protocols
    - Reliable messaging
    - Connectivity infrastructure
  - XML (basic community agreement)
    - XML data format
    - XML schema
    - Web services
  - We’re getting closer

What would we do if it ever did work?
- Retire
- Integrate the web?
  - Data grids
  - Data Google
- P2P database?

For Discussion
- Is research in this area warranted?
- What are the most important research topics?
  - Did we miss any?