An Overview of Issues in P2P Database Systems. Presented by Ahmed Ataullah, Wednesday, November 29th, 2006.

2. Why mix P2P and databases

More and more intelligent mobile devices
  - Storage capacities of 8 gigabytes and beyond are becoming the norm
  - Most devices are multipurpose and do more than just storage
  - These nodes can often connect independently to other multipurpose devices

P2P systems have a 'network effect'
  - No special infrastructure required to join (usually)
  - No requirements of availability and reliability
  - Community orientation

Some motivating P2P database examples
  - Provincial health care network
  - Travel agents (worldwide)

3. P2PDBMS: a generally accepted definition

Unmanaged distributed database system
  - Number of nodes > 10^6
  - Most nodes (at least half) are offline at any given time
  - Nodes can leave at any given time and join from different locations

Nodes are independent local database systems as well
  - They have a local schema and may contribute some local resources (data, processing power, bandwidth, etc.)

4. Widely accepted assumptions

No central control
  - No standard schema (FNAME == FIRST_NAME)
  - No standardized local DBMS

Goal-centric communities
  - Peers are co-operative
      - Some work related to game theory has been done with the contrary assumption
  - Location-dependent and location-independent scenarios are treated differently by applications

No reliability, serializability or correctness guarantees; best effort is acceptable
  - Virtually no access control
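The "no standard schema" assumption (FNAME == FIRST_NAME) can be made concrete with a small sketch. It assumes each peer publishes a partial mapping from its local attribute names to a shared vocabulary. The peer ids, the `LNAME`/`SURNAME` attributes, and the function name are hypothetical and not taken from the presentation.

```python
# Hypothetical per-peer attribute mappings onto a shared vocabulary.
# Only the FNAME / FIRST_NAME pair comes from the slide; the rest is invented.
PEER_MAPPINGS = {
    "peer_a": {"FNAME": "first_name", "LNAME": "last_name"},
    "peer_b": {"FIRST_NAME": "first_name", "SURNAME": "last_name"},
}


def to_shared(peer_id, row):
    """Rename a peer-local row's attributes into the shared vocabulary.

    Attributes with no known mapping keep their local name, since mappings
    in an unmanaged network are likely to be partial.
    """
    mapping = PEER_MAPPINGS.get(peer_id, {})
    return {mapping.get(attr, attr): value for attr, value in row.items()}


row_a = to_shared("peer_a", {"FNAME": "Ada", "LNAME": "Lovelace"})
row_b = to_shared("peer_b", {"FIRST_NAME": "Ada", "SURNAME": "Lovelace"})
```

Under these assumptions the two rows become comparable once renamed, while a row from an unknown peer passes through unchanged.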

5. P2P Database Management Systems

What it boils down to…
  - File sharing, formalized and taken up a notch
  - Our objective is to port everything from the relational world (tables, constraints, foreign keys, materialized views, triggers, etc.) into a highly scalable and loosely connected network of database systems

Why is that so difficult?

6. The Query Processing Nightmare

    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'

Schema issues
  - Schemas may not agree
  - Knowledge may not be consistent: Toronto = YYZ, and London = LHR or LGW, etc.

Correctness
  - A correct answer requires looking at every peer
  - Not possible? Alternative solutions?

Response time
  - Most accurate answer up to a certain point in time
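The trade-off between correctness and response time can be illustrated with a minimal best-effort sketch. It assumes a peer is either a list of flight rows or `None` when offline, uses a budget on the number of peers scanned as a stand-in for a time limit, and relies on a hypothetical city-to-airport table. The flight identifiers and prices are invented for illustration.

```python
# Hypothetical city-to-airport knowledge; a real peer may hold a partial table.
AIRPORTS = {"TORONTO": {"YYZ"}, "LONDON": {"LHR", "LGW"}}


def normalize(place):
    """Return the set of airport codes a city name or code may refer to."""
    place = place.upper()
    return AIRPORTS.get(place, {place})


def best_effort_min_price(peers, origin, destination, budget):
    """Scan at most `budget` peers and return (best_row, peers_contacted).

    The budget stands in for a response-time limit: the answer is the best
    one found so far, not a guaranteed global minimum.
    """
    origins, destinations = normalize(origin), normalize(destination)
    best, contacted = None, 0
    for rows in peers[:budget]:
        if rows is None:  # offline peer: skipped, so completeness is lost
            continue
        contacted += 1
        for row in rows:
            matches = (normalize(row["origin"]) & origins
                       and normalize(row["destination"]) & destinations)
            if matches and (best is None or row["price"] < best["price"]):
                best = row
    return best, contacted


# Invented sample data: three online peers and one offline peer.
PEERS = [
    [{"origin": "YYZ", "destination": "LHR", "price": 700, "flight": "F1"}],
    None,
    [{"origin": "Toronto", "destination": "London", "price": 650, "flight": "F2"}],
    [{"origin": "YYZ", "destination": "LGW", "price": 500, "flight": "F3"}],
]
```

With a budget of three peers this sketch settles on the 650 fare. Raising the budget to four finds the cheaper 500 fare, which is the sense in which the answer is only the most accurate one available at a given point.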

7. The Query Processing Nightmare

    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'

Data placement issues
  - A correct answer may have to be derived
  - May require coordination among peers

Local vs. remote processing
  - Dynamic coordination rules
  - Which is more available: bandwidth or processing power?

Cyclic nature of networks
  - Query propagation and update requests (and all other algorithms) have to be bounded
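One common way to bound query propagation in a cyclic overlay is a hop limit (TTL) combined with a record of peers already visited. The sketch below is a generic illustration of that idea, not a method described in the presentation. The overlay layout and function name are assumptions.

```python
from collections import deque


def flood_query(neighbors, start, ttl):
    """Return the set of peers reached from `start` within `ttl` hops.

    `neighbors` maps a peer id to the ids it is connected to. The visited
    set stops a cyclic overlay from forwarding the same query forever, and
    the TTL bounds how far the query travels.
    """
    visited = {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if hops_left == 0:
            continue  # hop budget exhausted: do not forward further
        for nxt in neighbors.get(peer, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, hops_left - 1))
    return visited


# Hypothetical four-peer ring: a -> b -> c -> d -> a.
RING = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["a"]}
```

On this ring, a TTL of 2 reaches only `a`, `b` and `c`. A large TTL still terminates, because the query is never forwarded to a peer that has already seen it.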

8. The Query Optimization Nightmare

    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'

Redundancy issues
  - Same flight and price but different date?

Materialized views
  - How often do we update these views?

Update propagation
  - A problem for offline peers (push/pull strategy)

Inserts and deletes
  - Is every item unique?
  - Ownership model
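The push/pull strategy for offline peers might look roughly like the following sketch. It assumes every update carries a monotonically increasing version number and that a rejoining peer can pull from any reachable peer. The `Peer` class and the `push` and `pull` functions are hypothetical names, and conflict resolution beyond "newest version wins" is ignored.

```python
class Peer:
    """A peer holding versioned copies of shared items (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.store = {}  # key -> (version, value)

    def apply(self, key, version, value):
        """Keep the update only if it is newer than the local copy."""
        if key not in self.store or self.store[key][0] < version:
            self.store[key] = (version, value)


def push(source, peers, key, version, value):
    """Apply an update locally, then push it to every peer currently online."""
    source.apply(key, version, value)
    for peer in peers:
        if peer.online:
            peer.apply(key, version, value)


def pull(peer, source):
    """On rejoining, copy anything newer from a reachable peer."""
    for key, (version, value) in source.store.items():
        peer.apply(key, version, value)
```

A peer that was offline during a push simply misses it and catches up with `pull` when it rejoins. How often to pull, and from whom, is the open design question the slide raises.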

9. Other issues which need attention

    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'

Semantic optimization
  - Not very well studied
  - Must have a well-designed model

Fairness
  - Can one agent lie about his/her ticket prices?
  - Incentives and detection mechanisms

Access control
  - Can it be offered at a fine granularity? Consequences?

10. Conclusion (lessons learnt)

P2P database systems are more than just database engines with networking modules above them

A lot more work can be done in various sub-areas
  - A minor tweak or assumption change can often lead to surprisingly different results
  - Interesting ideas like semantic query optimization, fine-grained access control, and fairness- and control-related issues have not been addressed
  - The need to do so has perhaps also not been recognized