The software infrastructure of SETI@home II

The software infrastructure of SETI@home II
David P. Anderson
Space Sciences Laboratory, U.C. Berkeley

Public-resource computing
- Diagram labels: home PCs; your computers (academic, business)
- Challenges:
  - low bandwidth at client
  - costly bandwidth at server
  - firewall/NAT issues
  - sporadic connection
  - untrustworthy, insecure clients
  - server security
  - heterogeneity
  - must recruit participants
- Advantages:
  - scale
  - free growth
  - public education
  - no institutional policy issues

Achievements of SETI@home
- 1,000,000 years of CPU time in 3 years
- Sustained 30 TeraFLOPs
- 1.5E21 floating-point operations
- 3,600,000 users in 226 countries
- 40 Terabytes of data processed
- 3 billion "events" detected
- Solved scaling, security problems

SETI@home II
- Broadband pulse search on existing data
- Parkes observatory:
  - Southern sky
  - Multi-beam receivers
  - Wider frequency band
- Use KL transform
- Data archival on clients

SETI@home software shortcomings
- Monolithic client and server
- Limited communication model
- Limited computation/data model
- Ad hoc accounting model

PRC platform goals
- Diagram labels: projects (Research lab X, University Y, Public project Z); applications; resource pool
- Participants install one program, select projects, specify constraints
- Projects are autonomous
- Advantages of a shared platform:
  - Better instantaneous resource utilization
  - Better long-term resource utilization
  - Faster/cheaper for projects, software is better
  - Easier for projects to get participants
  - Participants learn more

Distributed computing platforms
- Academic and open-source:
  - Globus
  - Cosm
  - XtremWeb
  - Jxta
- Commercial:
  - Entropia
  - United Devices
  - Avaki

BOINC (Berkeley Open Infrastructure for Network Computing)
- Overall structure
- Storage model
- Computation model
- Programming interface
- Operational interface
- Participant's view

Overall structure
- Project:
  - BOINC DB (MySQL)
  - Project work manager lib
  - Scheduling server (C++)
  - Web interfaces (PHP)
  - Data servers (HTTP)
- Participant:
  - Core agent (C++)
  - App agents

Storage model
- Files: input, output, executables
- Created by client or project
- Files are immutable
- File transfer by HTTP
- File attributes:
  - Name
  - URL list
  - Persistent
  - Upload-when-present
  - Executable
  - MD5 checksum
  - Digital signature
- Example:
    <file_info>
      <name>protein_db.12</name>
      <persistent/>
      <url>http://a.b/c</url>
      <url>ftp://x.y/z</url>
      <md5_cksum>fw7398h</md5_cksum>
      <nbytes>4782747</nbytes>
    </file_info>
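The file description above can be read and checked with a few lines of Python. This is only a minimal sketch, assuming the element names shown on the slide; `parse_file_info` and `verify_download` are hypothetical helper names, not part of BOINC, and the size-plus-MD5 check is one plausible way a client could confirm an immutable file arrived intact.

```python
import hashlib
import xml.etree.ElementTree as ET

# The <file_info> example from the slide, with the closing tag corrected.
FILE_INFO_XML = """
<file_info>
  <name>protein_db.12</name>
  <persistent/>
  <url>http://a.b/c</url>
  <url>ftp://x.y/z</url>
  <md5_cksum>fw7398h</md5_cksum>
  <nbytes>4782747</nbytes>
</file_info>
"""


def parse_file_info(xml_text):
    """Turn a <file_info> element into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "urls": [u.text for u in root.findall("url")],
        # Flag attributes are empty elements: present means True.
        "persistent": root.find("persistent") is not None,
        "md5_cksum": root.findtext("md5_cksum"),
        "nbytes": int(root.findtext("nbytes")),
    }


def verify_download(info, data):
    """Check downloaded bytes against the declared size and MD5 checksum."""
    return (
        len(data) == info["nbytes"]
        and hashlib.md5(data).hexdigest() == info["md5_cksum"]
    )


info = parse_file_info(FILE_INFO_XML)
```

Because files are immutable, a file name plus a checksum is enough to decide whether a cached copy can be reused; the URL list only says where a fresh copy may be fetched.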

File management
- Implicit:
  - Executables, input and output files are transferred pursuant to computation
- Explicit:
  - Clients report persistent files
  - Scheduling server maintains DB of files on active hosts
  - Project can request upload, download, delete

Workunits
- Represents inputs to a computation
- Components:
  - Cmdline args, environment vars
  - Expected resource usage
  - Description of input files
- Example:
    <file_info>
      <name>out123</name>
      <url>http://…</url>
    </file_info>
    <workunit>
      <file_assoc>
        <file_name>out123</file_name>
        <app_name>input</app_name>
      </file_assoc>
    </workunit>

Results
- Represents results of a computation
- Components:
  - Which host did the computation
  - Exit status
  - Stderr output
  - CPU time
  - Output file description
- Template:
    <file_info>
      <name>out123</name>
      <generated_locally/>
      <upload_when_present/>
      <url>http://…</url>
    </file_info>
    <result>
      <file_assoc>
        <file_name>out123</file_name>
        <fd>1</fd>
      </file_assoc>
    </result>
- Actual:
    <file_info>
      <name>out123</name>
      <url>http://…</url>
      <md5_cksum>182aed847</md5_cksum>
    </file_info>

Work sequences (long computations with big footprints)
- Results can be linked into sequences
- Result is sent to host that handled predecessor
- If result times out, sequence is shifted to another host
- Upload state
- Check for abort
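The host-affinity rule for work sequences can be sketched as a small function. This is an illustration only: `assign_host` and its parameters are hypothetical names, and choosing the first other available host on timeout is an assumption, since the slide does not say how the replacement host is picked.

```python
def assign_host(sequence_host, timed_out, available_hosts):
    """Pick the host for the next result in a work sequence.

    sequence_host: host that handled the predecessor, or None for a new sequence.
    timed_out: True if the previous result on that host timed out.
    available_hosts: candidate host ids, in preference order (assumed).
    """
    # Normal case: keep the sequence on the host that already holds its state.
    if sequence_host is not None and not timed_out:
        return sequence_host
    # New sequence or timeout: shift to some other host, if one exists.
    for host in available_hosts:
        if host != sequence_host:
            return host
    return None
```

For example, `assign_host("h1", False, ["h2", "h3"])` keeps the sequence on `"h1"`, while `assign_host("h1", True, ["h1", "h2"])` moves it to `"h2"`.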

Hosts and scheduling
- Host measurements:
  - CPU performance (integer/FP/memory)
  - RAM, cache, disk free/total
  - On/idle/connected statistics
  - Network bandwidth statistics
- Workunit properties:
  - RAM/disk/computation requirements
- Scheduling policy:
  - Client: project quotas; high/low water marks
  - Server: workunit feasibility test; prioritization
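A rough sketch of how the two halves of the scheduling policy might look follows. Everything here is an assumption layered on the slide: the field names, the use of a per-workunit deadline, and the reading of water marks as seconds of queued work are illustrative choices, not BOINC's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Host:
    fp_ops_per_sec: float   # measured floating-point speed
    ram_free_bytes: int
    disk_free_bytes: int
    on_fraction: float      # fraction of time the host is on and usable


@dataclass
class Workunit:
    est_fp_ops: float       # expected computation
    ram_bytes: int          # expected RAM requirement
    disk_bytes: int         # expected disk requirement
    deadline_sec: float     # assumed: time allowed before the result times out


def is_feasible(host, wu):
    """Server side: can this host plausibly finish the workunit in time?"""
    if wu.ram_bytes > host.ram_free_bytes or wu.disk_bytes > host.disk_free_bytes:
        return False
    effective_rate = host.fp_ops_per_sec * host.on_fraction
    if effective_rate <= 0:
        return False
    return wu.est_fp_ops / effective_rate <= wu.deadline_sec


def needs_work(queued_sec, low_water_sec):
    """Client side: ask for work once the queue drops below the low water mark."""
    return queued_sec < low_water_sec


def work_to_request(queued_sec, high_water_sec):
    """Client side: request enough work to refill up to the high water mark."""
    return max(0.0, high_water_sec - queued_sec)
```

The gap between the two water marks lets a sporadically connected client fetch a batch of work per connection instead of contacting the server for every workunit.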

Accounting and result validation
- Standardized unit of credit (CPeUro?):
  - CPU time * (int+FP+mem)
- Result validation (optional):
  - Compare redundant results, flag incorrect results
- Granted credit:
  - Minimum of claimed credit among correct results
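The credit formula and the "minimum of claimed credit among correct results" rule can be illustrated as follows. This is a sketch under stated assumptions: results are compared by an output digest, and "correct" is taken to mean agreeing with a strict majority of the redundant results; the slide does not specify how results are compared, and the function names are hypothetical.

```python
from collections import Counter


def claimed_credit(cpu_time_sec, int_score, fp_score, mem_score):
    """Credit claimed by a host: CPU time * (int + FP + mem benchmark scores)."""
    return cpu_time_sec * (int_score + fp_score + mem_score)


def validate_and_grant(results):
    """Compare redundant results for one workunit.

    results: list of (host_id, output_digest, claimed_credit) tuples.
    Returns (correct_host_ids, flagged_host_ids, granted_credit).
    granted_credit is None when no strict majority agrees (assumed rule).
    """
    if not results:
        return [], [], None
    counts = Counter(digest for _, digest, _ in results)
    digest, votes = counts.most_common(1)[0]
    if votes * 2 <= len(results):
        return [], [], None
    correct = [r for r in results if r[1] == digest]
    flagged = [r[0] for r in results if r[1] != digest]
    # Granting the minimum claim removes the incentive to inflate claims.
    granted = min(r[2] for r in correct)
    return [r[0] for r in correct], flagged, granted
```

With results `[("h1", "aa", 120.0), ("h2", "aa", 100.0), ("h3", "bb", 500.0)]`, hosts `h1` and `h2` agree, `h3` is flagged, and the granted credit is `100.0`; the inflated claim of `500.0` has no effect.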

Programming interfaces
- Application:
  - May be multi-file; any executable
- API for interaction with core client (optional):
  - Checkpoint/restart: MFILE class
  - Graphics: render to shared memory
- Software development tools:
  - Version management
  - Web-based bug tracking

Operational interfaces
- Operations:
  - Add/manage app versions
  - Create workunits/results
  - Query results
  - Query client problems
- Interfaces:
  - C++ libraries
  - Scriptable apps
  - Web-based

Participant preferences
- Examples:
  - Work only while computer idle
  - Confirm before connecting
  - Don't work if running on batteries
  - High, low water marks
  - Limits on disk space, bandwidth
  - Application-specific preferences
- List of projects + authenticators + % allocation
- Edited via Web interface
- Can define multiple "preference sets"

Participation
- Initial project registration:
  - Create account on project web site
  - Authenticator is emailed
  - Install core client, enter authenticator
- Subsequent projects:
  - Add project to preferences on home site

Core client
- Goals:
  - Concurrent communicate/compute
  - Obey user preferences
  - Application, screensaver or service
  - Multi-platform; multiprocessor-capable
- FSM structure (diagram labels):
  - main loop: poll
  - file transfers
  - scheduler requests
  - HTTP transactions
  - active sockets: select()
  - running applications: wait()
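The polled-FSM idea can be sketched as a single-threaded loop that gives every state machine one chance to advance per pass. This is a toy illustration, not the core client's code: `CountdownFsm` is a hypothetical stand-in for a file transfer or scheduler request, and a real loop would block in `select()` on active sockets and `wait()` on running applications rather than spin.

```python
class CountdownFsm:
    """Hypothetical stand-in for a file transfer or scheduler request."""

    def __init__(self, name, steps):
        self.name = name
        self.steps_left = steps

    @property
    def done(self):
        return self.steps_left == 0

    def poll(self):
        """Advance by at most one step; return True if the state changed."""
        if self.done:
            return False
        self.steps_left -= 1
        return True


def run_main_loop(fsms, max_iterations=1000):
    """Poll every FSM in turn until a full pass makes no progress."""
    log = []
    for _ in range(max_iterations):
        progressed = False
        for fsm in fsms:
            if fsm.poll():
                progressed = True
                log.append(fsm.name)
        if not progressed:
            break
    return log
```

Because each `poll()` does a bounded amount of work and never blocks, communication and computation bookkeeping can interleave in one thread, which is one way to meet the "concurrent communicate/compute" goal without threads.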

Conclusion
- BOINC features:
  - Multiproject, multi-app open PRC platform
  - Simple/small but general
- BOINC status:
  - Mostly feature-complete
  - Client runs on Linux, Solaris, Windows, MacOS X
  - http://boinc.sourceforge.net
- Projects:
  - SETI@home Arecibo (later this year)
  - Other SETI@home (Parkes etc.)
  - Climate modeling, other science projects
  - Genetic art