P2P technologies, PlanetLab, and their relevance to Grid work Matei Ripeanu The University of Chicago.

Slides:



Advertisements
Similar presentations
GT 4 Security Goals & Plans Sam Meder
Advertisements

Distributed Systems basics
PlanetLab Architecture Larry Peterson Princeton University.
High Performance Computing Course Notes Grid Computing.
1 On Death, Taxes, & the Convergence of Peer-to-Peer & Grid Computing Adriana Iamnitchi Duke University “Our Constitution is in actual operation; everything.
A Computation Management Agent for Multi-Institutional Grids
The Community Authorisation Service – CAS Dr Steven Newhouse Technical Director London e-Science Centre Department of Computing, Imperial College London.
Seminar Grid Computing ‘05 Hui Li Sep 19, Overview Brief Introduction Presentations Projects Remarks.
CoreGRID Workpackage 5 Virtual Institute on Grid Information and Monitoring Services Authorizing Grid Resource Access and Consumption Erik Elmroth, Michał.
Workload Management Workpackage Massimo Sgaravatto INFN Padova.
Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © Chapter 1, pp For educational use only.
Peer-to-Peer Networking By: Peter Diggs Ken Arrant.
1-2.1 Grid computing infrastructure software Brief introduction to Globus © 2010 B. Wilkinson/Clayton Ferner. Spring 2010 Grid computing course. Modification.
A. Frank 1 Internet Resources Discovery (IRD) Peer-to-Peer (P2P) Technology (1) Thanks to Carmit Valit and Olga Gamayunov.
Sergey Belov, LIT JINR 15 September, NEC’2011, Varna, Bulgaria.
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
4b.1 Grid Computing Software Components of Globus 4.0 ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4b.
CSPP 54001: Large-Scale Networked Systems Week 5: P2P Technologies and Applications Matei Ripeanu.
SESSION 9 THE INTERNET AND THE NEW INFORMATION NEW INFORMATIONTECHNOLOGYINFRASTRUCTURE.
Workload Management Massimo Sgaravatto INFN Padova.
P2P technologies, PlanetLab, and their relevance to Grid work Matei Ripeanu The University of Chicago.
Globus Computing Infrustructure Software Globus Toolkit 11-2.
Middleware for P2P architecture Jikai Yin, Shuai Zhang, Ziwen Zhang.
New Challenges in Cloud Datacenter Monitoring and Management
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Cloud computing Tahani aljehani.
Grid Toolkits Globus, Condor, BOINC, Xgrid Young Suk Moon.
Gordon Kass CEO & President 919/ x26 Porivo Technologies Inc. Measuring end-to-end web performance.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED.
Version 4.0. Objectives Describe how networks impact our daily lives. Describe the role of data networking in the human network. Identify the key components.
DISTRIBUTED COMPUTING
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
1 School of Computer, National University of Defense Technology A Profile on the Grid Data Engine (GridDaEn) Xiao Nong
Grid Security Issues Shelestov Andrii Space Research Institute NASU-NSAU, Ukraine.
Grid Resource Allocation and Management (GRAM) Execution management Execution management –Deployment, scheduling and monitoring Community Scheduler Framework.
Peer-to-Pee Computing HP Technical Report Chin-Yi Tsai.
1 Introduction to Middleware. 2 Outline What is middleware? Purpose and origin Why use it? What Middleware does? Technical details Middleware services.
The Grid System Design Liu Xiangrui Beijing Institute of Technology.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
Objectives Functionalities and services Architecture and software technologies Potential Applications –Link to research problems.
The Anatomy of the Grid Mahdi Hamzeh Fall 2005 Class Presentation for the Parallel Processing Course. All figures and data are copyrights of their respective.
1 4/23/2007 Introduction to Grid computing Sunil Avutu Graduate Student Dept.of Computer Science.
Copyright © 2002 Intel Corporation. Intel Labs Towards Balanced Computing Weaving Peer-to-Peer Technologies into the Fabric of Computing over the Net Presented.
Middleware for Grid Computing and the relationship to Middleware at large ECE 1770 : Middleware Systems By: Sepehr (Sep) Seyedi Date: Thurs. January 23,
June 24-25, 2008 Regional Grid Training, University of Belgrade, Serbia Introduction to gLite gLite Basic Services Antun Balaž SCL, Institute of Physics.
Ames Research CenterDivision 1 Information Power Grid (IPG) Overview Anthony Lisotta Computer Sciences Corporation NASA Ames May 2,
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
GRIDS Center Middleware Overview Sandra Redman Information Technology and Systems Center and Information Technology Research Center National Space Science.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Ruth Pordes November 2004TeraGrid GIG Site Review1 TeraGrid and Open Science Grid Ruth Pordes, Fermilab representing the Open Science.
Scalable Grid system– VDHA_Grid: an e-Science Grid with virtual and dynamic hierarchical architecture Huang Lican College of Computer.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
Introduction to Grids By: Fetahi Z. Wuhib [CSD2004-Team19]
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
7. Grid Computing Systems and Resource Management
Globus and PlanetLab Resource Management Solutions Compared M. Ripeanu, M. Bowman, J. Chase, I. Foster, M. Milenkovic Presented by Dionysis Logothetis.
Securing the Grid & other Middleware Challenges Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.
GraDS MacroGrid Carl Kesselman USC/Information Sciences Institute.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Dynamic Creation and Management of Runtime Environments in the Grid Kate Keahey Matei Ripeanu Karl Doering.
An Architectural Approach to Managing Data in Transit Micah Beck Director & Associate Professor Logistical Computing and Internetworking Lab Computer Science.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI Services for Distributed e-Infrastructure Access Tiziana Ferrari on behalf.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Grid Computing.
Introduction to Cloud Computing
The Globus Toolkit™: Information Services
The Anatomy and The Physiology of the Grid
The Anatomy and The Physiology of the Grid
Grid Computing Software Interface
Presentation transcript:

P2P technologies, PlanetLab, and their relevance to Grid work Matei Ripeanu The University of Chicago

Why don’t we build a huge supercomputer? Top500 supercomputer list over time: Zipf distribution: Perf(rank) ≈ rank -k

Trend Increasingly interesting to aggregate the capabilities of the machines in the tail of this distribution. A virtual machine that aggregates the last 10 in Top500 would rank 32 nd in ’95 but 14 th in ‘03 Grid and P2P computing are, in part, results of this trend: Grids focus: infrastructure enabling controlled, secure resource sharing (for a relatively small number of resources) P2P focus: scale, deployability using integrated stacks. Challenge: design services that offer the best of both worlds complex, secure services, that deliver controlled QoS, are scalable and can be easily deployed.

Roadmap P2P Impact Applications Mechanisms PlanetLab Why is this interesting for you? Compare resource management in PlanetLab and Globus Wrap-Up

P2P Definition(s) Def 1: “A class of applications that take advantage of resources (e.g., storage, cycles, content) available at the edge of the Internet.” (‘00) Edges often turned off, without permanent IP addresses, etc. Def 2: “A class of decentralized, self-organizing distributed systems, in which all or most communication is symmetric.” (IPTPS’02) Lots of other definitions that fit in between

P2P impact today (1) Widespread adoption KaZaA – 360 million downloads (1.3M/week) one of the most popular applications ever! leading to (almost) zero-cost content distribution: … is forcing companies to change their business models … might impact copyright laws FastTrack2,460,120 eDonkey1,987,097 Overnet1,261,568 iMesh803,420 Warez440,289 Gnutella389,678 MP2P267,251 Sources: July ‘04

P2P impact today (2) P2P – file-sharing - generated traffic may be the single largest contributor to Internet traffic today Driving adoption of consumer broadband Internet2 Internet2 traffic statistics Source: July ‘04

P2P impact today (3) A huge pool of underutilized resources lays around, users are willing to donate these resources $1.5M / year in additional power which can be put to work efficiently (at least for some types of applications) TotalLast 24 hours Users 4,236,09023,365 Results received 764M1.13M Total CPU Time 1.3M years1.3K years Floating point operations 51.4 TFLOPS Source: website, Oct. 2003

Roadmap P2P Impact Applications Mechanisms PlanetLab Why is this interesting for you? Compare resource management in PlanetLab and Globus Wrap-Up

Applications: Number crunching Examples: UnitedDevices, DistributedScience many others Approach suitable for a particular class of problems. Massive parallelism Low bandwidth/computation ratio Error tolerance, independence from solving a particular task Problems: Centralized. How to extend the model to problems that are not massively parallel? Relevant to Grid space: Ability to operate in an environment with limited trust and dynamic resources

Applications: File sharing The ‘killer’ application to date Too many to list them all: Napster, FastTrack (KaZaA, iMesh), Gnutella (LimeWire, BearShare), Overnet Relevant to Grid space: Decentralized control Building a (relatively) reliable, data-delivery service using a large, heterogeneous set of unreliable components. Chunking, erasure codes Bytes transferred0.8 TB/day on average Number of download sessions230,000/day Number of local users≥ 10,000 Number of unique files~300,000 Source: Israeli ISP, data collection 1/15-2/13/2003 FastTrack (Kazaa) load at a small ISP

Applications: Content Streaming Streaming: the user ‘plays’ the data as as it arrives source Oh, I am exhausted! Client/server approach P2P approach Possible solution: The first few users get the stream from the server New users get the stream from the server or from users who are already receiving the stream Relevant to Grid space: offload part of the server load to consumers to improve scalability

Applications: Performance benchmarking Problem: Evaluate the performance of your service (Grid service, HTTP server) form end-user perspective Multiple views on your site performance Relevance to Grid space: Grid clients are heterogeneous, geographically dispersed Benchmark services for this set of consumers

End-to-end Performance Benchmarking Back-end Infrastructure Firewall Network Landscape Backbone ISP Consumer User 3 rd party content Web server App server Backbone Regional Network Enterprise Provider ISP Major Provider Local ISP T1 Corporate User Corporate Network Component Testing Datacenter Monitoring End-to-end Web Performance Testing Database Slide source:

Many other P2P applications … Backup storage (HiveNet, OceanStore) Collaborative environments (Groove Networks) Web serving communities (uServ) Instant messaging (Yahoo, AOL) Anonymous Censorship-resistant publishing systems (Ethernity, Freenet) Spam filtering

Mechanisms To obtain a resilient system: integrate multiple components with uncorrelated failures, and use data and service replication. To improve delivered QoS: move service delivery closer to consumer, integrate multiple providers with uncorrelated demand curves (reduces over-provisioning for peak loads) To generate meaningful statistics, to detect anomalies: provide views from multiple vantage points To improve scalability: Use decentralized (local) control, unmediated interactions

Roadmap P2P Impact Applications Mechanisms PlanetLab Why is this interesting for you? Compare assumptions and resource management mechanisms in PlanetLab and Globus Wrap-Up

PlanetLab Testbed to experiment with your networked applications. >400 nodes, >150 sites, PlanteLab consortium: 80+ universities, Intel, HP View presented to users: a distributed set of VMs Allocation unit: a slice = a set of virtual machines (VM), one VM at each node. 452 nodes 162 sites 450 research projects VMM Slice K OS Slice K OS Slice K OS Slice K OS

PlanetLab usage examples Stress-test your Grid services (Globus RLS) GSLab: a playground to experiment with grid- services ‘Better-than-Internet’ services: Resilient Overlays Multipath TCP (mTCP) Multicast Overlays VMM OS User account

Why should you find PlanetLab interesting? 1.Open, large-scale testbed for your P2P applications or Grid services 2.Solves a similar problem to Grids/Globus: building virtual organizations (or resource federations) Grids: testbeds (deployments of hardware and software) to solve computational problems. PlanetLab: testbed to play with CS applications Main problem for both: enable resource sharing across multiple administrative domains

Roadmap  Summarize starting assumptions on: user communities, applications, resources and attempt to explain differences  Look at mechanisms to build VO.

Assumptions: User communities PlanetLab: users are CS scientists that experiment with and deploy infrastructure services. Globus: users from a more diverse pool of scientists that are interested to run efficiently their (end-user) applications. Implication: functionality offered OS User applications Globus PL Services PlanetLab.

Assumptions: Application characteristics Different view on geographical resource distribution: PlanetLab services: « distribution is a goal » leverage multiple vantage points for network measurements, or to exploit uncorrelated failures in large sets of components Grid applications: « distribution is a fact of life » resource distribution: a result of how the VO was assembled (due to administrative constraints). Implication: mechanism design for resource allocation

Assumptions: Resources PlanetLab mission as testbed for a new class of networked services allows for little HW/SW heterogeneity. Globus supports a large set architectures; sites with multiple security requirements Implications: complexity, development speed

Assumptions: Resource ownership Goal: individual sites retain control over their resources PlanetLab limits the autonomy of individual sites in a number of ways: VO admins: Root access, Remote power button Sites: Limited choice of OS, security infrastructure Globus imposes fewer limits on site autonomy Requires fewer privileges (also can run in user space, ) PlanetLab emphasizes global coordination over local autonomy to a greater degree than Globus Implications: ease to manage and evolve the testbed PlanetLab Globus Individual site autonomy Control at VO level

Building Virtual Organizations Individual node/site functionality Mechanisms at the aggregate level Security infrastructure Delegation mechanisms Resource allocation and scheduling Resource discovery, monitoring, and selection.

Delegation mechanisms: Identity delegation Globus Implementation based on delegated X.509 proxies PlanetLab None Broker/scheduler usage scenario: User A sends a job to a broker service which, in turn, submits it to a resource. The resource manager makes authorization decisions based on the identity that originated the job (A). Broker X.509/SSL Delegated identity

Delegation mechanisms: Delegating rights to use resources PlanetLab Individual nodes managers hand out capabilities: akin to time- limited reservations Capabilities can be traded Extra layer to: provide secure transfer, prevent double spending, offer external representation GGF/Globus WS-Agreements protocols: To represent ‘contracts’ between providers and consumers. Local enforcement mechanism is not specified The two efforts are complementary! Broker/scheduler usage scenario: User A acquires capabilities from various brokers then submits the job. Delegated usage rights Broker Resource usage rights Job descriptions + Usage rights acquired

Global resource allocation and scheduling PlanetLab Globus Users Application Managers Brokers / Agents Node Managers Nodes (Resources) Resource usage delegation Sends capabilities (leases) Identity delegation Sends job descriptions

VO Compute Center VO (policies) VO Users Services (running on user’s behalf) Access rights Access CAS or VOMS issuing SAML or X.509 ACs SSL/WS-Security with Proxy Certificates Multiple VOs

Wrap-up Scale & volatility Functionality & infrastructure Grids P2P Convergence: Large, Dynamic, Self-Configuring Grids  Large scale  Intermittent resource participation  Local control, Self-organization  Weaker trust assumptions  Infrastructures to support diverse applications  Diversity in shared resources

Fin Links to papers, tech-reports, slides: Distributed Systems UChicago Thank you.

User process #1 Proxy Authenticate & create proxy credential GSI (Grid Security Infrastruc- ture) Gatekeeper (factory) Reliable remote invocation GRAM (Grid Resource Allocation & Management) Reporter (registry + discovery) User process #2 Proxy #2 Create process Register GT2 in One Slide Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation ( = Globus Toolkit services) l Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, … Other service (e.g. GridFTP) Other GSI- authenticated remote service requests GIIS: Grid Information Index Server (discovery) MDS-2 (Monitor./Discov. Svc.) Soft state registration; enquiry