1 Open DHT: A Public DHT Service for Developers of Distributed Applications University of California, Irvine Presented By : Ala Khalifeh

Presentation transcript:

1 Open DHT: A Public DHT Service for Developers of Distributed Applications. University of California, Irvine. Presented By: Ala Khalifeh. Estimated Time: 25 Minutes (Note: Presented)

2 Open DHT: A Public DHT Service for Developers of Distributed Applications. Who did the survey: Ala’ Khalifeh. Overview  Website/URLs/references: The Open DB website: OpenDHT: A Public DHT Service and Its Uses, Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu, UC Berkeley and Intel Research, SIGCOMM’05, August 21–26,  Application/Issues: A framework proposed by Intel to enable researchers to develop new distributed applications.

3 Generic system description: Assumption  A DHT is a variation on a hash table, a classic data structure used to efficiently store and retrieve data.  In a hash table, data is stored as key-value pairs: a key is a label or description of the data, such as “John’s age,” and a value is the data associated with the key, say “33.”

4 Assumption (cont’d) A distributed hash table (DHT) spreads this data structure across thousands or even millions of computers connected to the Internet. When a user queries the system for a given key, the system retrieves the associated value from one of the computers where it is stored and returns it to the user. The advantage is that an application built atop a DHT seamlessly inherits the properties of its underlying DHT:  scaling, robustness and self-organization.
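
The idea of spreading one hash table over many machines can be illustrated with a small sketch. It is not taken from the slides: the `ToyRing` class, the use of SHA-1, and the "first node clockwise from the key" ownership rule are assumptions chosen for illustration, in the style of consistent hashing.

```python
import hashlib
from bisect import bisect_right


def key_id(key: str) -> int:
    """Map a string to a 160-bit integer identifier using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class ToyRing:
    """Hypothetical identifier ring: a key is owned by the first node whose
    identifier follows the key's identifier, wrapping around at the end."""

    def __init__(self, node_names):
        # Sort nodes by their hashed identifier to form the ring.
        self._ring = sorted((key_id(name), name) for name in node_names)
        self._ids = [node_id for node_id, _ in self._ring]

    def owner(self, key: str) -> str:
        # bisect_right finds the first node id strictly greater than the key id;
        # the modulo wraps keys past the last node back to the first node.
        index = bisect_right(self._ids, key_id(key)) % len(self._ring)
        return self._ring[index][1]


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.owner("John's age"))  # name of the node that would store this pair
```

Under this rule, removing a node that does not own a key leaves that key's owner unchanged, which is one reason such schemes cope well with nodes joining and leaving.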

5 System Architecture and Deployment (Open DHT: A Public, Shared DHT Infrastructure)  Given this promising research output, in 2004 Intel launched the Open DHT project.  The project focuses on easing the deployment of distributed applications that use DHTs.  Open DHT researchers proposed building a publicly accessible shared DHT service.

6 Before and After Open DHT. Infrastructure Problem: To deploy a DHT, a developer must gain access to a large set of distributed machines to host the DHT. This is a difficult prospect for those who don’t have access to such infrastructure. PlanetLab: PlanetLab is a testbed for deploying planetary-scale services that is open only to computer scientists in academia and industry. Open DHT runs on PlanetLab’s hosts, and thus extends PlanetLab’s reach beyond the research community to a broader community of developers.

7 Open DHT Deployment Model. Deployment: Rather than require that each developer deploy a DHT for every application, the Open DHT project proposes an alternate deployment model. Under this model, a single DHT (namely, Open DHT) is shared across multiple applications, thus amortizing the cost of deployment.

8  Deployment (Open DHT Deployment Model) (Continued) Requirements  Simple, Flexible, Secure. The Open DHT service offers a simple put and get interface. Any Internet-connected computer can store (put) key-value pairs in Open DHT, and any Internet-connected computer can retrieve (get) the value stored under a particular key. In contrast to the typical “build your own” DHT model, clients of Open DHT do not need to run a DHT node in order to use the service. Instead, they can issue put and get operations to any Open DHT node, which processes the operations on their behalf.

9  Deployment (Open DHT Deployment Model) (Continued) By using Open DHT as a highly available naming and storage service, developers can ignore the complexities of deploying and maintaining a DHT and focus instead on the development of the application itself.

10 System Architecture (cont’d)

11 System Architecture (cont’d) Each PlanetLab node is a Linux host that runs the open-source Bamboo DHT implementation. Each node in the Open DHT deployment holds a portion of the DHT's total key-value store on its local disk. It answers put and get requests for that portion of the DHT's total key-value store, and routes requests for the remaining key-value entries to other DHT member nodes.
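
The division of labor described on this slide, where each node answers requests for its own portion of the store and routes the rest, can be sketched as follows. This is a simplified illustration, not Bamboo's actual routing: `ToyNode`, `build_cluster`, and the modulo rule for picking the responsible node are hypothetical, and every node here knows all peers, so a request is forwarded at most once.

```python
import hashlib


def key_id(key: bytes) -> int:
    """Hash a key to an integer identifier (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(key).digest(), "big")


class ToyNode:
    """Hypothetical DHT member: stores its own share, forwards everything else."""

    def __init__(self, name: str):
        self.name = name
        self.store = {}   # stand-in for the node's portion kept on local disk
        self.peers = []   # shared list of all member nodes, filled in by build_cluster

    def _responsible(self, key: bytes) -> "ToyNode":
        # Simplified placement rule: identifier modulo the number of nodes.
        return self.peers[key_id(key) % len(self.peers)]

    def put(self, key: bytes, value: bytes) -> None:
        node = self._responsible(key)
        if node is self:
            self.store[key] = value   # this node holds the key's portion
        else:
            node.put(key, value)      # route the request to the responsible node

    def get(self, key: bytes):
        node = self._responsible(key)
        if node is self:
            return self.store.get(key)
        return node.get(key)


def build_cluster(names):
    """Create nodes that all share the same membership list."""
    nodes = [ToyNode(name) for name in names]
    for node in nodes:
        node.peers = nodes
    return nodes


nodes = build_cluster(["n0", "n1", "n2"])
nodes[0].put(b"discid-1234", b"Artist / Title")   # enter through one node
print(nodes[2].get(b"discid-1234"))               # read back through another
```

A real deployment would route over several hops with only partial knowledge of the membership; the sketch keeps only the local-or-forward decision.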

12 System Architecture (cont’d) Each node also serves as a gateway into the DHT for clients. An Open DHT client communicates with the DHT through the gateway of its choice using Sun RPC or XML RPC over TCP. Because of this, the service is easy to access from virtually every programming language and from behind almost all NATs.
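
To give a feel for what an XML RPC request to a gateway might look like, the sketch below only builds the request body with Python's standard `xmlrpc.client` module and sends nothing. The slides do not specify the method signature, so the parameter list used here (hashed key, value, time-to-live in seconds, application name), the SHA-1 hashing of the key, and the helper name `build_put_request` are all assumptions.

```python
import hashlib
import xmlrpc.client


def build_put_request(key: str, value: bytes, ttl_seconds: int, app_name: str) -> str:
    """Marshal a hypothetical 'put' call into an XML-RPC request body."""
    # Assumption: the key is hashed to a fixed-size binary identifier first.
    hashed_key = hashlib.sha1(key.encode("utf-8")).digest()
    params = (
        xmlrpc.client.Binary(hashed_key),  # binary values travel as base64
        xmlrpc.client.Binary(value),
        ttl_seconds,
        app_name,
    )
    return xmlrpc.client.dumps(params, methodname="put")


body = build_put_request("John's age", b"33", 3600, "example-app")
print(body)  # XML document that a client would POST to its chosen gateway
```

Because the request is plain XML carried over HTTP on top of TCP, any language with an XML-RPC library can produce it, which matches the slide's point about easy access. Actually sending it would use something like `xmlrpc.client.ServerProxy` pointed at a gateway address, which the slides do not give.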

13 Sample Applications: FOOD (FreeDB Over Open DHT). A sample application that could improve the performance and availability of freedb.org. The FreeDB project is an audio CD song title indexing system used to label the tracks of MP3 files: the client queries the database and, as a result, the system displays the artist information, CD title, track list and some additional information. Today, FreeDB is a free service run by a small number of volunteers. The service indexes titles for over a million CDs and serves over four million read requests per week across ten widely dispersed “mirrors.” Mirroring is labor-intensive and limited in flexibility.

14 Sample Applications. Implementing FreeDB Over Open DHT was a trivial undertaking: FOOD is implemented in just 84 lines of Perl. Moreover, measurements show that FOOD outperforms the existing FreeDB service in terms of data availability and latency. In addition to FOOD, Open DHT researchers have implemented simple prototypes of an instant messaging application (123 lines of C++), a DHT-based file system (531 lines of Java), and a multicast service (781 lines of Java) that allows a user to efficiently transmit content to large numbers of receivers.

15 Sample Applications

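16 Categorization-1 (table; cell layout not recoverable from the transcript). Functional Criteria: Initialization, Topology, Storage, Search, Download. The degree of decentralization: Centralized, Hybrid, Decentralized. Surviving cell values: Mesh, Distributed, Yes, Yes, Yes.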

17 Performance Evaluation: Simulation  Efficiency – results. Measurement Setup: A May 1, 2004 snapshot of the FreeDB database, containing a total of 1.3 million discids, was stored in OpenDHT. To compare the availability of data and the latency of queries in FreeDB and FOOD, both systems were queried for a random CD every 5 seconds. The FreeDB measurements span October 2–13, 2004, and the FOOD measurements span October 5–13.

18 Performance Evaluation  Availability and Latency. During the measurement interval, FOOD offered availability superior to that of FreeDB. Only one request out of 27,255 requests to FOOD failed, where each request was tried exactly once, with a one-hour timeout. This fraction represents a 99.99% success rate, as compared with a 99.9% success rate for the most reliable FreeDB mirror, and 98.8% for the least reliable one.
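
The FOOD success rate follows directly from the two counts on this slide. As a quick check of the arithmetic (the function name `success_rate` is just an illustrative helper):

```python
def success_rate(total_requests: int, failed_requests: int) -> float:
    """Return the percentage of requests that succeeded."""
    return 100.0 * (total_requests - failed_requests) / total_requests


# One failure out of 27,255 FOOD requests, as reported on the slide.
food_rate = success_rate(27_255, 1)
print(f"{food_rate:.3f}%")  # prints "99.996%"
```

The exact value is about 99.996%, which the slide reports as 99.99%.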

19 Performance Evaluation  Availability and Latency(cont’d) Comparing the full legacy version of FreeDB against FOOD, we observe that over 70% of queries complete with lower latency on FOOD than on FreeDB, and that for the next longest 8% of queries, FOOD and FreeDB offer comparable response time.  For the next 20% of queries, FOOD has less than a factor of two longer latency than FreeDB. Only for the slowest 2% of queries does FOOD offer significantly greater latency than FreeDB.

20  Security: Authentication. While OpenDHT does not currently support client authentication, there have been essentially no requests for such authentication from users. However, we believe this apparent lack of concern for security is most likely due to these applications being themselves in the relatively early stages of deployment.