Research Directions in Databases Technological Education Institution of Larisa in collaboration with Staffordshire University Larisa 2005-2006 Dr. Theodoros.

Slides:



Advertisements
Similar presentations
Database Architectures and the Web
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Peer to Peer and Distributed Hash Tables
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Technion –Israel Institute of Technology Computer Networks Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi.
ICS (072)Database Systems: A Review1 Database Systems: A Review Dr. Muhammad Shafique.
0 General information Rate of acceptance 37% Papers from 15 Countries and 5 Geographical Areas –North America 5 –South America 2 –Europe 20 –Asia 2 –Australia.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-6600: Internet Protocols Informal Quiz #13: P2P and Sensor Networks Shivkumar Kalyanaraman:
Information Retrieval in Practice
Technical Architectures
Rheeve: A Plug-n-Play Peer- to-Peer Computing Platform Wang-kee Poon and Jiannong Cao Department of Computing, The Hong Kong Polytechnic University ICDCSW.
Chapter 2 Database Environment.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
Topics in Reliable Distributed Systems Lecture 2, Fall Dr. Idit Keidar.
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
Object Naming & Content based Object Search 2/3/2003.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
1 Seminar: Information Management in the Web Gnutella, Freenet and more: an overview of file sharing architectures Thomas Zahn.
Chapter 2 Database Environment Pearson Education © 2014.
What Can Databases Do for Peer-to-Peer Steven Gribble, Alon Halevy, Zachary Ives, Maya Rodrig, Dan Suciu Presented by: Ryan Huebsch CS294-4 P2P Systems.
Dr. Kalpakis CMSC 461, Database Management Systems Introduction.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Overview of Search Engines
Peer-to-Peer Databases David Andersen Advanced Databases.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Database Environment 1.  Purpose of three-level database architecture.  Contents of external, conceptual, and internal levels.  Purpose of external/conceptual.
Evaluating Centralized, Hierarchical, and Networked Architectures for Rule Systems Benjamin Craig University of New Brunswick Faculty of Computer Science.
1 P2P Computing. 2 What is P2P? Server-Client model.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
1 P2P File-Sharing Solution CS654 – Software Architecture course project Guide: T V Prabhakar Members: S Pavan Kumar – Y1306 D V Janardhan Rao – Y
Peer-to-Peer Data Integration Using Distributed Bridges Neal Arthorne B. Eng. Computer Systems (2002) Supervisor: Babak Esfandiari April 12, 2005 Candidate.
Peer-to-Peer Networks University of Jordan. Server/Client Model What?
M1G Introduction to Database Development 6. Building Applications.
Chapter 1 : Introduction §Purpose of Database Systems §View of Data §Data Models §Data Definition Language §Data Manipulation Language §Transaction Management.
Session-8 Data Management for Decision Support
CSS/417 Introduction to Database Management Systems Workshop 4.
©Silberschatz, Korth and Sudarshan1.1Database System Concepts Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition.
ICS (072)Database Systems: An Introduction & Review 1 ICS 424 Advanced Database Systems Dr. Muhammad Shafique.
Chapter 1 Introduction Yonsei University 1 st Semester, 2015 Sanghyun Park.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Bayu Adhi Tama, M.T.I 1 © Pearson Education Limited 1995, 2005.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
Scalable Hybrid Keyword Search on Distributed Database Jungkee Kim Florida State University Community Grids Laboratory, Indiana University Workshop on.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Fall CSE330/CIS550: Introduction to Database Management Systems Prof. Susan Davidson Office: 278 Moore Office hours: TTh
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
Chapter 2 Database Environment.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Lecture On Introduction (DBMS) By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
Postgraduate Module Enterprise Database Systems Technological Educational Institution of Larisa in collaboration with Staffordshire University Larisa
Christoph F. Eick: Final Words COSC Topics Covered in COSC 3480  Data models (ER, Relational, XML)  Using data models; learning how to store real.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Research Directions in Enterprise Database Systems
Introduction to DBMS Purpose of Database Systems View of Data
Peer-to-Peer Data Management
CHAPTER 3 Architectures for Distributed Systems
Chapter 2 Database Environment Pearson Education © 2009.
EE 122: Peer-to-Peer (P2P) Networks
Chapter 1 Database Systems
Database Environment Transparencies
Introduction to DBMS Purpose of Database Systems View of Data
Chapter 1 Database Systems
Presentation transcript:

Research Directions in Databases Technological Education Institution of Larisa in collaboration with Staffordshire University Larisa Dr. Theodoros Mitakos

Topics that fit the interests of database research include the following algorithms complexity concurrency constraints data integration data mining data modeling data on the Web data streams data warehouses distributed databases information retrieval knowledge bases logic multimedia physical design privacy query languages query optimization real-time data recovery scientific data security semantic Web semi-structured data spatial data temporal data transactions updates views Web services XML

Cutting edge areas Benchmarking and performance evaluation Data cleaning and integration Database monitoring and tuning Data privacy and security Data warehousing and decision-support systems Embedded, sensor, mobile databases and applications Managing uncertain and imprecise information Peer-to-peer data management Query processing and optimization

The reason of data integration is to make easier the access to independable heterogeneous sources of information such as simple databases, enterprise databases connected with intranests or Web resources. Different systems have been implemented in order to allow the users to extract useful information. The architecture of these systems includes a central mediator who receives the queries from the users and is also responsible for the answer that will be given. Additionally a wrapper in the information source is used to translate the data and handle localy the query. The last few years researchers from the database community recognize the need of data integration in a peer to peer environment. Such applications are supply-chain-management, B2B integration, Web-based file sharing services (Napster, Gnutella, Morpheus), biotechnology, astronomy etc. Introduction

P2P (peer to peer nodes) Distributed query processing has been a topic of database research since the late 1970's. In recent years, the problem has been revisited in the setting of peer-to-peer (P2P) file sharing systems, which have focused on a point in the design space that is quite different from traditional database research. P2P filesharing applications demand extreme scalability and federation, involving orders of magnitude more machines than even the most ambitious goals of distributed database systems proposed in the literature. P2P filesharing networks knit together hundreds of thousands of unmanaged computers across the globe into a unied query system. The data in filesharing consists of simple files stored at the end-hosts; the names of the files are queried in situ without transmitting them to any centralized repository. A P2P filesharing network typically runs thousands of concurrent keyword queries over the names of the files in the network. Separately, it supports point-to-point downloads of actual file content between peers.

P2P (peer to peer nodes) Popular P2P filesharing systems like Gnutella and Kazaa are based on very simple designs, and there is controversy over their effectiveness. These systems connect peer machines into an ad-hoc, unstructured network. Query processing proceeds in a very simple fashion known as “flooding”: a node transmits a keyword query to its P2P network neighbors, who forward the query on recursively for a finite number of hops known as the “time-to-live” (TTL) of the query. Any node with a matching filename reports back to the query source, which displays a list of matching file locations and properties. Given the scale of these networks, flooding-based schemes are not exhaustive; a given query will visit only a small fraction of nodes in the network. As a result, these networks provide no guarantees on query recall, and often fail to return matches that actually exist in the network.

P2P (peer to peer nodes) Recently, researchers have been focusing significant attention on alternative structured P2P networks that can support content-based routing. These networks are able to ensure that all messages labeled with a given “key” are routed to a particular machine. This allows the network to coordinate global agreement on the location of particular items to be queried: keyed items are routed by the network to a particular node, and key-based lookups are routed to that same node. This functionality is provided without any need for centralized state, and works even as machines join and leave the P2P network. Content-based routing is akin to the “put()/get()” interface of hash tables, and these networks have thus been dubbed Distributed Hash Tables (DHTs). DHTs have matured rapidly in recent years via both theoretical results and system prototypes. Unlike the popular unstructured P2P networks, DHTs can in principle provide full recall from a network of connected peers. To date, there has been little agreement on the best design for query processing in P2P filesharing systems.

Converting XML data to relational data To this direction converting semistructured data to relational data have to be studied. It is obvious that the effective management of semistructured data is needed so that efficient querying to be possible.

Studying different architectures for metadata management Centralized architecture: A central server keeps track of all sources metadata changes. The raw data are distributed. Every time a source refreshes its data the change should be sent to the server. Distributed architecture: There is no central server. On the contrary all the sources intercommunicate to disseminate the change that happened to the data. Hierarchical Architecture: Some sources are used as local servers when global predicates are involved. These sources will report the change to the global server.

Management of dynamic modifiable schema Schema plays a very important role to semistructured data management. Especially when these data are XML encoded. In XML there are languages that describe the schema (like DTD and XML schema). Researchers are concerned with problem of dynamic modifiable schema since relational databases era. Nowadays the problem is more intense since changes to the schema appear very often. Multidimensional XML is a formalish that represents changes but it needs extentions.

Directions Bibliographic one that concludes in a tutorial (coupled with a tool to retrieve the articles). Research (e.g. P2P semantics and models, XML-MXML extentions, find and compare most active search engines) Implementation (an e-learning module, an MXML tool, a P2P tool)