ACAT, Amsterdam, 26-4-2007. Einstein@Home and BOINC. Bruce Allen, Director, Max Planck Institute for Gravitational Physics, Hannover.



What is Einstein@Home? Science: a public distributed search for pulsars in data from the LIGO and GEO gravitational wave detectors. Outreach: a cornerstone of the American Physical Society's World Year of Physics 2005 activities. It uses BOINC (Berkeley Open Infrastructure for Network Computing) to distribute the work. Development began in spring 2004, mostly at the University of Wisconsin - Milwaukee and at the Max Planck Institute for Gravitational Physics. Launched February 19, 2005, it is now one of the largest distributed computing projects in the world, with clients available for Windows, Mac, and Linux. It currently provides about 80 Tflop/s of CPU power, 24x7 (the equivalent of a $20M computer plus a $7k/day electric bill). The LIGO Scientific Collaboration's "Continuous Wave Search Group" uses Einstein@Home as its primary 'first pass' search platform.

Fundamental physics: gravitational waves were predicted by Einstein's General Theory of Relativity (1916). There has been no direct detection, though many efforts starting in the 1960s used resonant-mass detectors. There are four principal types of (astrophysical) sources; Einstein@Home is looking for one of them: continuous gravitational waves from isolated spinning neutron stars (pulsars). A blind search is computationally very difficult because the Earth's rotation about its axis and its orbit about the Sun modulate the signal. Filtering a year of data for all possible waveforms would saturate all the computers on the planet.
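The orbital modulation that makes the blind search hard can be illustrated with a one-dimensional toy model of the Earth's Doppler shift. The constants are standard values (orbital speed ~29.8 km/s, speed of light, one year in seconds); the geometry is grossly simplified for illustration:

```python
import math

def observed_frequency(f0, t, v_orb=2.98e4, c=2.998e8, year=3.156e7):
    """Frequency of a fixed-frequency source as seen from Earth, with the
    line-of-sight velocity oscillating over one orbit (toy 1-D model)."""
    v_los = v_orb * math.cos(2 * math.pi * t / year)
    return f0 * (1.0 + v_los / c)

# A 300 Hz signal drifts by roughly f0 * v_orb / c over the year:
f_max = observed_frequency(300.0, 0.0)
f_mid = observed_frequency(300.0, 3.156e7 / 4.0)
print(f_max - f_mid)  # about 0.03 Hz
```

A drift of 0.03 Hz over a year of data spans thousands of frequency bins, and it differs for every sky position, which is why the number of templates explodes.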

Detectors: LIGO is one of a new generation of interferometric gravitational wave detectors. Cost ~500M USD. Sensitive range: 40 Hz - 3 kHz. Lasers measure the distance between mirrors hanging in a vacuum using interferometry. Gravitational waves make the mirrors "swing" and perturb the interference pattern. The fractional motion ΔL/L is extraordinarily small.

How does the Einstein@Home search work? The basic method is matched filtering. Instrument data is distributed in the frequency domain; there is currently about 100 GB of data, with five mirror servers used for distribution. About 70,000 host machines are active at any time. A typical host machine gets an MB-scale chunk of data, sufficient for tens to hundreds of hours of computation, and searches it for signals. Typical work units are about 12 CPU hours long. The file returned from a host is compressed, with an average size of 130 kB, and the output is compared with that of another host machine for validation. The total returned data set (from 100 million CPU hours) is a few TB.
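The matched-filtering method can be sketched in miniature: slide a known waveform template across noisy data and report the best-correlating offset. The template, noise level, and offset below are invented for illustration:

```python
import math
import random

def matched_filter(data, template):
    """Slide the template across the data and return the offset with the
    largest normalized correlation -- the essence of matched filtering."""
    norm = math.sqrt(sum(x * x for x in template))
    best_offset, best_score = 0, float("-inf")
    for off in range(len(data) - len(template) + 1):
        seg = data[off:off + len(template)]
        score = sum(a * b for a, b in zip(seg, template)) / norm
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset, best_score

# Bury a pseudo-random template at offset 40 in Gaussian noise, recover it:
random.seed(0)
template = [random.choice([-1.0, 1.0]) for _ in range(64)]
data = [0.3 * random.gauss(0.0, 1.0) for _ in range(256)]
for i, t in enumerate(template):
    data[40 + i] += t
print(matched_filter(data, template)[0])  # 40
```

The real search does this in the frequency domain over an enormous bank of pulsar templates, which is where the CPU hours go.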

Example results (S3 analysis): the analyzed band shows no evidence of strong pulsar signals in the sensitive part of the sky, apart from the hardware and software injections. There is nothing "in our backyard". Outliers are consistent with instrumental lines, and all significant artifacts away from r·n = 0 are ruled out by follow-up studies. (Plots shown with and without injections.)

How big is BOINC? There are currently about 30 BOINC projects, consuming ~450 Tflop/s, 24x7. There is an active developer community (~20 people) with message boards, SVN archives, mailing lists, etc.

How does Einstein@Home work? Participant view: download and install BOINC (takes about one minute), enter the project URL, and create a password when queried. Optionally:
-Use the BOINC manager to track work and progress
-Set preferences about when BOINC runs, and resource limits
-Sign up for multiple BOINC projects
-Assign resource shares to different BOINC projects
-Participate in project message boards
-Create a profile
-Form or join a team
-Chase credits
Developer view: BOINC is an open-source project based at Berkeley.
-Create or port a science application (use the BOINC libraries for I/O)
-Write a screensaver (OpenGL)
-Build and optimize the application code on different platforms
-Set up a project server
-Write custom back-end components for the project server: work unit generator, validator, assimilator
-Attract users and do science!
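The resource-share mechanism above can be illustrated with a toy scheduler: give the next chunk of CPU to whichever attached project's completed work lags its share the most. This debt-style policy is a simplification for illustration, not the BOINC client's actual scheduling algorithm:

```python
def pick_project(projects, shares, work_done):
    """Pick the project whose fraction of completed work lags its
    resource share the most (simplified debt-based policy)."""
    total_share = sum(shares.values())
    total_work = sum(work_done.values()) or 1.0
    def debt(p):
        return shares[p] / total_share - work_done[p] / total_work
    return max(projects, key=debt)

projects = ["einstein", "seti"]            # hypothetical attachments
shares = {"einstein": 75, "seti": 25}      # user-assigned resource shares
done = {"einstein": 10.0, "seti": 10.0}    # CPU hours delivered so far
print(pick_project(projects, shares, done))  # einstein is owed more work
```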

Einstein@Home participants.

The Einstein@Home screensaver shows: the GEO 600 Hannover, LIGO Hanford, and LIGO Livingston detector positions; the current search point and its coordinates; known pulsars; known supernova remnants; the user name, the user's total credits, the machine's total credits, and the team name; and the current work unit's % complete.

A BOINC server is a set of daemons around a database (work, results, users, hosts, forums, ...):
-Web server: web (PHP) pages
-Scheduler: cgi script (subprocess)
-File upload handler: cgi script (subprocess)
-Validator (daemon): compares results to identify the correct one(s)
-Transitioner (daemon): endless loop, generating new results as needed for workunits with failed or lost results
-Assimilator (daemon): endless loop, collecting correct results
-File deleter (daemon): deletes input files that are no longer needed
-Database purger (daemon): deletes rows from the work and results tables when they are no longer needed
-Work generator (daemon): makes more work as needed
The work generator, validator, and assimilator are the project-specific parts.
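The transitioner's role can be sketched as follows. The state strings and quorum logic are simplified stand-ins for the real daemon's database-driven machinery:

```python
def transition(workunit, quorum=2):
    """One pass of a simplified transitioner: make sure enough result
    instances are pending or valid to eventually reach the quorum."""
    valid = sum(1 for r in workunit["results"] if r == "valid")
    pending = sum(1 for r in workunit["results"] if r == "pending")
    while valid + pending < quorum:
        workunit["results"].append("pending")  # issue a replacement replica
        pending += 1
    return workunit

# One replica failed on a host, so a replacement is generated:
wu = {"results": ["failed", "valid"]}
print(transition(wu)["results"])  # ['failed', 'valid', 'pending']
```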

Einstein@Home servers: three servers run the project: a project server (dual Xeon) running the BOINC daemons, a database server (quad Opteron) with 32 GB of memory, and a file storage server (24-disk SATA RAID with 8 TB usable). The OS is Linux (FC3/FC4). There are three identical spare backup servers. Total hardware cost was about $50k (including spares), with an internal GB switch, hot-swap SATA disks, and two 3 kVA UPS systems.

How do users get credit? Credit is only granted once the work has been validated by automatic comparison with identical work performed by other users.
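A minimal sketch of this validation policy, assuming the results are lists of floating-point numbers compared within a tolerance (the real Einstein@Home validator is considerably more involved):

```python
def validate(result_a, result_b, tol=1e-6):
    """Grant credit only if two independently computed results agree
    element-by-element within a floating-point tolerance."""
    if len(result_a) != len(result_b):
        return False
    return all(abs(a - b) <= tol for a, b in zip(result_a, result_b))

print(validate([1.0, 2.0], [1.0, 2.0 + 1e-9]))  # True: credit granted
print(validate([1.0, 2.0], [1.0, 2.5]))         # False: no credit
```

A tolerance is needed because different hosts, compilers, and CPUs produce slightly different floating-point results for the same work unit.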

How does BOINC work? The database contains a work table and a result table. A client machine contacts the scheduler and is sent data from a row of the work table: URLs for the executable program and data files, command-line arguments, estimated run times, etc. When the program has run, the science results are returned in a data file; metadata (exit status, CPU time, ...) are kept in a row of the result table. The BOINC daemons function as a state machine, continuing to send out additional work as needed until a valid result is found.
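The result lifecycle just described can be sketched as a small state machine. The state and event names below are illustrative, not the actual BOINC server-state enums:

```python
# Simplified result lifecycle (illustrative names, not BOINC's enums):
TRANSITIONS = {
    ("unsent", "scheduler_send"): "in_progress",
    ("in_progress", "client_report"): "returned",
    ("in_progress", "deadline_passed"): "timed_out",
    ("returned", "validator_ok"): "valid",
    ("returned", "validator_fail"): "invalid",
}

def step(state, event):
    """Advance one result through its lifecycle; unknown events no-op."""
    return TRANSITIONS.get((state, event), state)

s = "unsent"
for ev in ["scheduler_send", "client_report", "validator_ok"]:
    s = step(s, ev)
print(s)  # valid
```

Timed-out or invalid results are what trigger the transitioner to issue fresh replicas until a valid result exists.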

Einstein@Home took ~8 months to develop.
Application software:
-Adding BOINC API calls to the application: 1 week
-Making the application checkpoint/restart: 2 weeks
-Adding upload/download file compression: 1 week
-Writing/testing/debugging the screensaver: 1 month
-Building on Windows (non-POSIX!): 2 months, mostly because the code requires some libraries designed for automake/autoconf builds on Unix systems
Server software:
-Building and testing the validator: 2 months
-Developing the BOINC locality scheduler (sends work to users who already hold a given data file): 1 month
Server hardware:
-Setting up and burning in the servers: 1 month
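The checkpoint/restart item above follows a common pattern: periodically persist progress so an interrupted run resumes where it left off. A minimal sketch, where the file name and checkpoint interval are invented for illustration (a real BOINC application would checkpoint when the BOINC API says it is time to):

```python
import json
import os

CHECKPOINT = "checkpoint.json"  # hypothetical checkpoint file name

def search(n_templates):
    """Filter templates, resuming from the last checkpoint if one exists."""
    start = 0
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            start = json.load(f)["next_template"]
    for i in range(start, n_templates):
        # ... filter the data against template i here ...
        if i % 100 == 0:  # checkpoint periodically, not on every step
            with open(CHECKPOINT, "w") as f:
                json.dump({"next_template": i + 1}, f)
    return start

if os.path.exists(CHECKPOINT):
    os.remove(CHECKPOINT)      # start the demo from a clean slate
first = search(1000)           # runs from template 0
second = search(1000)          # a rerun resumes from the checkpoint
print(first, second)  # 0 901
```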

BOINC pros and cons.
PROS:
-A distributed computing project does not need to build its own unique infrastructure, and can share CPU and code with other projects.
-Solid second-generation design (SETI@home was the prototype), built on the smart design principle that host machines are unreliable in all possible ways.
-Access to a lot of inexpensive CPU cycles; it is "easy" for users to add hosts.
-Scales to at least 10^6 active hosts (but the database and project servers are ultimately bottlenecks to scaling).
-Public outreach is built in, with well-developed community tools.
-The BOINC API and library are well-designed, well-structured C++, not a hack, and a public open-source project.
CONS:
-It takes months to port and test a reliable application, and typical work deadlines are two weeks: not good for quick-turnaround studies or "trying something out"!
-It is "hard" for projects to add new applications or analyses.
-The biggest host platform (Win32) is not POSIX; even fopen() behaves differently!
-Data bandwidth: no more than about 1 MB of data exchange per CPU hour.
-Applications should fit into small memory (200 MB max) and small disk space (100 MB) for broad appeal.
-You must write an automated validator.
-Fixing bugs on remote hosts can be hard, and the hosts are very heterogeneous.
-The project science must have some real public appeal.