Resource Management in Volunteer Computing Grids: An analysis of the different approaches to maximizing throughput on a BOINC grid. Presented by Geoffrey Oxholm and Beata Chrulkiewicz.

Presentation transcript:

Resource Management in Volunteer Computing Grids
An analysis of the different approaches to maximizing throughput on a BOINC grid
Presented by Geoffrey Oxholm and Beata Chrulkiewicz
CS-575 Position Paper Presentation, Fall 2007

Volunteer Grids
- A type of grid computer
  - Decentralized, volunteer nodes
- Supercomputing for free
  - 1.1 PetaFLOPS vs. 360 TeraFLOPS
- Unreliable nodes
  - Users can disconnect their computers at any time
  - Amount of donated resources is subject to change
  - Evil jerks can upload malicious data

Berkeley Open Infrastructure for Network Computing
- Duplicate work to ensure validity
  - R: the “Redundancy Factor”
- Validate computation results. If the validation fails, repeat the computation.
  - Validation methods:
    - Majority Voting: more than R/2 nodes must agree
    - M-First Voting: first M nodes must agree
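The two validation rules named on this slide can be sketched in a few lines of Python. This is only an illustration, not BOINC's validator code: the function names are hypothetical, results are assumed to be directly comparable values, and M-first voting is read here as "accept the first value that M nodes have reported", which is one possible interpretation of the slide.

```python
from collections import Counter


def majority_vote(results, redundancy):
    """Return the value reported by more than redundancy/2 nodes, else None.

    `results` holds the values returned so far; `redundancy` is R, the
    number of replicas issued for the work unit (assumed names).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > redundancy / 2 else None


def m_first_vote(results, m):
    """Return the first value that m nodes agree on, else None.

    `results` must be in arrival order, so faster nodes vote first.
    """
    counts = Counter()
    for value in results:
        counts[value] += 1
        if counts[value] == m:
            return value
    return None


# A None return means validation failed and the computation is repeated.
print(majority_vote(["a", "a", "b"], redundancy=3))
print(m_first_vote(["a", "b", "b", "a"], m=2))
```

Under these assumptions, M-first voting can finish before all R replicas report, while majority voting needs a strict majority of the R replicas issued.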

Success and Limitations of BOINC
- With proper configuration, high throughput can be achieved
- Still quite difficult to get volunteers
- Proper configuration is difficult
- Fixed configurations cannot account for constantly changing grid characteristics

Fix: User Encouragement
- Feedback and reward
- Each node generates statistics
- Teams can be formed
- Sense of pride in commitment
- Encourages users to donate more time, resources
Image: Team OCUK total credit. Go team!

Fix: Maximizing Configuration Through Usage Simulation
- Enumerate a set of possible configurations
- Test configurations in a fraction of the time
- Avoid disturbing volunteers by simulating
- Zero in on an effective configuration
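As a rough illustration of this approach, the sketch below enumerates candidate redundancy factors and scores each one against a simulated grid instead of real volunteers. Everything in it is an assumption made for the example: the node model (each replica is independently wrong with probability `error_rate`), the worst case in which all wrong replicas agree with each other, the majority-voting rule, and the names `simulate` and `best_configuration`.

```python
import random


def simulate(redundancy, error_rate, work_units, rng):
    """Simulate one configuration; return (throughput, bad_rate).

    throughput: correctly validated work units per replica executed.
    bad_rate:   fraction of work units where a wrong majority was accepted.
    """
    good = bad = 0
    for _ in range(work_units):
        correct = sum(rng.random() >= error_rate for _ in range(redundancy))
        wrong = redundancy - correct
        if correct > redundancy / 2:
            good += 1
        elif wrong > redundancy / 2:  # worst case: wrong replicas collude
            bad += 1
    return good / (work_units * redundancy), bad / work_units


def best_configuration(candidates, error_rate, max_bad_rate,
                       work_units=10_000, seed=0):
    """Pick the candidate with the best throughput whose bad_rate is acceptable."""
    rng = random.Random(seed)
    scores = {r: simulate(r, error_rate, work_units, rng) for r in candidates}
    ok = [r for r in candidates if scores[r][1] <= max_bad_rate]
    best = max(ok, key=lambda r: scores[r][0]) if ok else None
    return best, scores


best, scores = best_configuration([1, 3, 5], error_rate=0.1, max_bad_rate=0.05)
print(best, scores)
```

The point of the sketch is the shape of the search: the cost of trying a configuration is a simulation run, not wasted volunteer time. A realistic simulator would also need to model node availability and speed.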

Fix: Dynamic Redundancy Through Reliability Prediction
- Wait for a minimum number of nodes before assigning work
- Choose nodes which have higher reliability
- Higher reliability means less need for redundancy
- Successful completion yields higher reliability rating for the node
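The four steps on this slide could be combined roughly as follows. This is a sketch under stated assumptions, not the method from any particular paper: ratings are treated as success probabilities in [0, 1], the rating update is a simple exponential moving average, and replicas are added until the chance that every chosen node fails drops below a `tolerance`. The names `update_reliability`, `assign_replicas`, `min_pool`, and `tolerance` are hypothetical.

```python
def update_reliability(rating, success, weight=0.1):
    """Move a node's rating toward 1.0 on success and toward 0.0 on failure."""
    target = 1.0 if success else 0.0
    return (1 - weight) * rating + weight * target


def assign_replicas(ratings, min_pool, tolerance):
    """Choose nodes for one work unit, most reliable first.

    ratings:   dict mapping node id to reliability in [0, 1]
    min_pool:  wait (return None) until at least this many nodes are available
    tolerance: stop adding replicas once P(all chosen nodes fail) <= tolerance
    """
    if len(ratings) < min_pool:
        return None  # not enough nodes yet; keep waiting
    chosen = []
    p_all_fail = 1.0
    by_reliability = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
    for node, rating in by_reliability:
        chosen.append(node)
        p_all_fail *= 1.0 - rating
        if p_all_fail <= tolerance:
            break
    return chosen


# One highly reliable node is enough; a weaker pool needs more replicas.
print(assign_replicas({"a": 0.99, "b": 0.9, "c": 0.5}, min_pool=3, tolerance=0.05))
print(assign_replicas({"a": 0.6, "b": 0.5, "c": 0.7}, min_pool=3, tolerance=0.15))
```

Here the redundancy factor is no longer a fixed R: it shrinks to one replica when a trusted node is available and grows when only unproven nodes are in the pool.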

Evaluation
- User Encouragement
  - Encourages cheating
  - Does nothing to maximize efficient use of resources
- Usage Simulation
  - Still requires researchers to configure system
  - Static configuration fails to match dynamic grid
- Reliability Rating
  - Subject to further exploitation
  - Further minimizes the value of slow nodes, working against incentives

Conclusion
- Build on existing methods
  - Continue to encourage users
  - Create a starting point by using simulation
  - Update reliability system to avoid conflict with system of incentives
- Develop new technologies
  - Blacklist malicious nodes
  - Develop a more comprehensive reliability system which uses past schedules to predict future availability

Questions?
Geoff Oxholm, Beata Churkiewicz