High Throughput Computing on P2P Networks
Carlos Pérez Miguel

Overview
– High Throughput Computing
– Motivation
– All things distributed: peer-to-peer
  – Non-structured overlays
  – Structured overlays
– P2P Computing
– Cassandra
– HTC over Cassandra
– Eventual consistency
– Experiments
– Future Work
– Conclusions

High Throughput Computing
– Concept introduced by the Condor team in 1996
– In contrast to HPC, which optimizes a single application, HTC optimizes the execution of a set of applications
– Figure of merit: the number of completed tasks per time unit
– Tasks are independent of one another
– Examples: Condor, Oracle Grid Engine (Kalimero), BOINC

Functioning
– N worker nodes, one master node
– Users interact with the master node
– The master manages pending tasks and idle workers using a queuing system
– Tasks are (usually) executed in FIFO order (see the sketch below)
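To make the model concrete, here is a minimal sketch of such a master in Python (hypothetical names; this is not Condor's implementation): the master keeps a FIFO queue of pending tasks and a queue of idle workers, pairing them in arrival order.

```python
from collections import deque

class Worker:
    def __init__(self, name):
        self.name = name

    def run(self, task):
        print(f"{self.name} executes {task}")

class Master:
    """Sketch of a centralized HTC master: one FIFO queue of tasks,
    one queue of idle workers, matched in arrival order."""
    def __init__(self):
        self.pending = deque()   # tasks waiting to run, FIFO
        self.idle = deque()      # workers waiting for a task

    def submit(self, task):
        self.pending.append(task)
        self._dispatch()

    def worker_idle(self, worker):
        self.idle.append(worker)
        self._dispatch()

    def _dispatch(self):
        # Pair tasks and workers in arrival order
        while self.pending and self.idle:
            self.idle.popleft().run(self.pending.popleft())

master = Master()
master.worker_idle(Worker("w1"))
master.submit("task-1")          # prints: w1 executes task-1
master.submit("task-2")          # queued until another worker goes idle
```

Note how every submission and every status query flows through this single object: that is exactly the bottleneck the next slide motivates removing.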

Motivations
– Limitations of this model:
  – The master node may become a scalability bottleneck
  – A failure in the master affects the whole system
– Is it possible to distribute the capabilities of the master node among all system nodes?
– How? (Which technology can help?)

All things distributed: peer-to-peer
– Distributed systems in which all nodes have the same role
– Nodes are interconnected, defining an application-level virtual network: an overlay network
– The overlay is used to locate other nodes and the information they hold
– Two types of overlays: structured and non-structured

Non-structured overlays
– Nodes are interconnected randomly
– Searches in the overlay are performed by flooding (see the sketch below)
– Efficient for finding popular content
– Cannot guarantee that every point of the system is reachable
– Not efficient in terms of the number of messages
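A minimal simulation of TTL-bounded flooding over a random overlay (hypothetical structure, in the style of Gnutella-like systems) illustrates both properties: popular objects are found quickly, but message count grows with the fan-out, and objects beyond the TTL horizon are missed.

```python
import random

def flood_search(overlay, start, key, ttl=4):
    """Forward the query to every neighbour until the key is found or
    the TTL expires. Returns (node holding the key or None, #messages)."""
    frontier, seen, messages = [start], {start}, 0
    for _ in range(ttl):
        nxt = []
        for node in frontier:
            if key in overlay[node]["data"]:
                return node, messages
            for nb in overlay[node]["neighbours"]:
                messages += 1                    # one message per edge crossed
                if nb not in seen:
                    seen.add(nb)
                    nxt.append(nb)
        frontier = nxt
    return None, messages                        # key may exist, yet be unreachable

# 20 nodes, each wired to 3 random peers
overlay = {i: {"neighbours": random.sample([j for j in range(20) if j != i], 3),
               "data": set()} for i in range(20)}
overlay[13]["data"].add("object-x")
print(flood_search(overlay, start=0, key="object-x"))
```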

Non-structured overlays (II)

Structured overlays
– Nodes are interconnected following some kind of regular structure
– Each node has a unique ID of N bits, defining a 2^N keyspace
– This keyspace is divided among the nodes (see the sketch below)
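A sketch of the keyspace division, assuming a Chord-style successor rule (the slides do not name a specific overlay): hash each node's name to an N-bit ID on a ring, and make each node responsible for the keys between its predecessor's ID and its own.

```python
import hashlib
from bisect import bisect_left

N = 16                                   # bits of the ID space: 2**N positions

def to_id(name):
    """Map a name onto the 2**N keyspace by hashing."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** N)

node_ids = sorted(to_id(f"node-{i}") for i in range(8))

def successor(key_id):
    """The first node clockwise from the key owns it (Chord-style rule);
    this is how the keyspace gets divided among the nodes."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]   # wrap around the ring

print(successor(to_id("some-object")))
```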

Structured overlays (II)
– Each object in the system has an ID and a position in the keyspace
– A distance-based routing protocol is used
– This permits reaching any point of the system with O(log n) messages

Distributed Hash Tables
– Provide a hash-table-like user API:
  – put(ID, object)
  – get(ID)
– Fast access to distributed information
– Used for file distribution, user communication, VoIP, video streaming
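The resulting user API can be sketched as follows (an illustrative wrapper, not a real client library); the `_route` helper stands in for the overlay's distance-based O(log n) lookup from the previous slide.

```python
import hashlib

class DHT:
    """Hash-table-like facade over a set of nodes. Replication and the
    actual overlay routing are elided for brevity."""
    def __init__(self, node_ids):
        self.stores = {n: {} for n in node_ids}   # node ID -> local store

    def _route(self, key):
        # Stand-in for the overlay's distance-based routing
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        ids = sorted(self.stores)
        return ids[h % len(ids)]

    def put(self, key, obj):
        self.stores[self._route(key)][key] = obj

    def get(self, key):
        return self.stores[self._route(key)].get(key)

dht = DHT(node_ids=[17, 42, 99])
dht.put("video-7", "chunk-bytes")
print(dht.get("video-7"))                # served by whichever node owns the key
```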

P2P Computing
– Must be seen by the user as a single resource pool
– Users should be able to submit jobs from any node in the system
– The system stores job information, permitting progress even when the user is not connected
– A FIFO order should be guaranteed
– DHTs are suitable for this purpose

DHTs for P2P Computing
– Must provide scalability under adverse conditions
– Must provide persistence (through replication)
– Replicas are synchronized by consensus algorithms
– Load-balancing algorithms are also needed

DHTs for P2P Computing (II)
– In 2007 Amazon presented Dynamo, a P2P DHT with persistence, scalability, O(1) access and eventual consistency
– From Dynamo, many alternatives have been proposed: Riak, Scalaris, Memcached, ...
– Facebook proposed Cassandra in 2009, with the same Dynamo capabilities and Google BigTable's data model

Cassandra
– Developed by Facebook and Twitter since 2009
– Has been released to the Apache Software Foundation
– Written in Java, with multi-language client libraries
– Pros: fault-tolerant, decentralized, scalable, durable
– Cons: eventual consistency

Cassandra’s Data Model
– DHTs store (key, value) pairs
– Cassandra stores (key, (value, value, ...)) tuples across different tables
– These tables are called ColumnFamilies (CF) or SuperColumnFamilies (SCF)
– CFs are 4-dimensional tables; SCFs are 5-dimensional tables

Column Families

WaitingQueue ColumnFamily:

  JobID | Name  | Owner | Binary
  ------+-------+-------+--------
  1     | Task1 | User1 | URL
  2     | Task2 | User2 | URL
  3     | Task3 | User1 | URL
  N     | TaskN | User3 | URL

SuperColumn Families

Queues SuperColumnFamily:

  Waiting:
    Job1 -> (Task1, User1), Job2 -> (Task2, User2), ..., JobN -> (TaskN, UserN)
  Running:
    Job1 -> (Task1, User1), Job2 -> (Task2, User2), ..., JobN -> (TaskN, UserN)
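In Python terms, the two structures above can be pictured as nested maps (an illustration of the shape only, not Cassandra's API): a ColumnFamily is addressed by keyspace, table, row and column; a SuperColumnFamily adds one more level.

```python
# ColumnFamily: keyspace -> CF -> row key -> column -> value  (4 keys + value)
cf = {
    "HTC": {
        "WaitingQueue": {
            "1": {"Name": "Task1", "Owner": "User1", "Binary": "URL"},
            "2": {"Name": "Task2", "Owner": "User2", "Binary": "URL"},
        }
    }
}

# SuperColumnFamily: keyspace -> SCF -> row key -> supercolumn -> column -> value
scf = {
    "HTC": {
        "Queues": {
            "Waiting": {"Job1": {"Name": "Task1", "Owner": "User1"}},
            "Running": {"Job2": {"Name": "Task2", "Owner": "User2"}},
        }
    }
}

print(cf["HTC"]["WaitingQueue"]["1"]["Owner"])          # -> User1
print(scf["HTC"]["Queues"]["Waiting"]["Job1"]["Name"])  # -> Task1
```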

HTC over Cassandra
– A batch queue system has been implemented on top of Cassandra’s data model
– It lets idle workers decide which task to run, in FIFO order (see the sketch below)
– Users can:
  – Submit jobs
  – Check jobs’ status
  – Retrieve jobs’ results
– Using Cassandra as the underlying data store allows disconnected operation
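A sketch of the idle-worker side (hypothetical names; the real system issues these reads and writes against the Cassandra tables above): the worker takes the oldest Waiting job and moves it to Running. Because Cassandra offers no atomic read-modify-write, two workers may claim the same job — the collision problem discussed two slides below.

```python
def claim_next_job(queues, worker_id):
    """FIFO claim: move the oldest Waiting job to Running. Without
    atomic operations this pop-then-insert is racy across workers."""
    waiting = queues["Waiting"]
    if not waiting:
        return None
    job_id = min(waiting, key=int)        # lowest job ID = oldest = FIFO
    job = waiting.pop(job_id)
    job["Worker"] = worker_id
    queues["Running"][job_id] = job
    return job_id

queues = {
    "Waiting": {"1": {"Name": "Task1", "Owner": "User1"},
                "2": {"Name": "Task2", "Owner": "User2"}},
    "Running": {},
}
print(claim_next_job(queues, "node-7"))   # -> '1'
```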

HTC over Cassandra (II)
– The system stores:
  – Job information: name, owner, binaries
  – User information
  – Queue information
– The system is fully reconfigurable at run time, permitting an unlimited number of queues with different policies

Eventual Consistency
– All changes to an object eventually reach all of its replicas
– The CAP theorem implies that it is impossible to provide these three properties at the same time:
  – Consistency
  – Availability
  – Partition tolerance
– Cassandra has chosen availability and partition tolerance over consistency
– In a failure-free scenario, Cassandra provides low latency

Eventual Consistency (II)
– This choice implies the impossibility of atomic operations in Cassandra
– In our HTC system, collisions may happen when several nodes try to execute the same task
– We have implemented partial solutions that reduce the probability of a collision (sketched below):
  – QUORUM consistency for all I/O operations
  – An extra queue where idle nodes compete for the waiting task
– Together these reduce the collision probability from 30% to 4%
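A sketch of the quorum part (generic majority quorums, not Cassandra's internals): with N replicas, reads and writes each contact a majority, so R + W > N, any read set intersects any write set, and the newest version is always among the answers.

```python
import random

def majority(n):
    return n // 2 + 1                     # R = W = majority, so R + W > N

def quorum_write(replicas, key, version, value):
    for rep in random.sample(replicas, majority(len(replicas))):
        rep[key] = (version, value)

def quorum_read(replicas, key):
    answers = [rep.get(key)
               for rep in random.sample(replicas, majority(len(replicas)))]
    # The read quorum overlaps every write quorum, so the highest
    # version among the answers is the latest committed write.
    return max((a for a in answers if a is not None), default=None)

replicas = [{} for _ in range(5)]
quorum_write(replicas, "job-1", version=1, value="Waiting")
quorum_write(replicas, "job-1", version=2, value="Running")
print(quorum_read(replicas, "job-1"))     # -> (2, 'Running')
```

This eliminates stale reads, but not the read-modify-write race between two claimants, which is why collisions become less likely rather than impossible.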

Experiments
– We have performed experiments to evaluate the system
– A 20-node cluster has been used for this purpose:
  – Each node has a Pentium 4 processor with Hyper-Threading
  – 1.5–2 GB of RAM per node
– Each node represents one user in the system
– A workload generator was used to produce a list of jobs for each user

Metrics
– Bounded slowdown: the waiting time of a job plus its running time, normalized by the running time and bounded below for very short jobs (see the formula below)
– System utilization
– Scheduling time: the time idle nodes need to schedule a waiting job
– Collisions detected
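For reference, the usual formula for bounded slowdown in the scheduling literature (assuming the slides follow it) is:

```latex
\mathrm{BSLD}(j) \;=\; \max\!\left( \frac{T_{\text{wait}}(j) + T_{\text{run}}(j)}{\max\!\left(T_{\text{run}}(j),\, \tau\right)},\; 1 \right)
```

where T_wait and T_run are the job's waiting and running times and τ is a small threshold (commonly 10 seconds) that prevents very short jobs from producing huge slowdown values.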

System Load

Bounded Slowdown

Scheduling Time

Collisions

Future Work
– Find a viable solution to the eventual-consistency problem
– Develop a workflow system with MapReduce tasks
– Reputation systems to classify nodes’ behavior

Conclusions
– HTC over P2P is possible
– A prototype has been developed
– Preliminary experiments have been performed, showing good performance levels

QUESTIONS?