Published by Briana Kelley. Modified over 8 years ago.
1
ACAT, Amsterdam, 26-4-2007
Einstein@Home and BOINC
Bruce Allen, Director, Max Planck Institute for Gravitational Physics, Hannover
2
What is Einstein@Home?
Science: public distributed search for pulsars in data from the LIGO and GEO gravitational wave detectors.
Outreach: cornerstone of the American Physical Society’s World Year of Physics 2005 activities.
Uses BOINC (Berkeley Open Infrastructure for Network Computing) to distribute the work.
Development began in spring 2004, mostly at the University of Wisconsin - Milwaukee and at the Max Planck Institute for Gravitational Physics.
Launch: February 19, 2005. Now one of the largest distributed computing projects in the world.
Einstein@Home is available for Windows, Mac, and Linux.
Currently providing about 80 Tflop/s of CPU power 24x7 (comparable to a $20M computer plus a $7k/day electric bill).
The LIGO Scientific Collaboration’s “Continuous Wave Search Group” is using Einstein@Home as its primary ‘first pass’ search platform.
Web: http://einstein.phys.uwm.edu/
3
Fundamental Physics
Gravitational waves were predicted by Einstein’s General Theory of Relativity (1916).
No direct detection so far, despite many efforts starting in the 1960s using resonant-mass detectors.
Four principal types of (astrophysical) sources.
Einstein@Home is looking for one type: continuous gravitational waves from isolated spinning neutron stars (pulsars).
A blind search is computationally very difficult because the Earth’s rotation about its axis and its orbit about the Sun both modulate the signal.
Filtering a year of data for all possible waveforms would saturate all computers on the planet.
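To give a sense of scale for the modulation described on this slide, the sketch below estimates the largest first-order Doppler shift produced by the Earth's orbital and rotational motion. The speeds are standard round values and the 1 kHz signal frequency is purely illustrative; neither comes from the talk.

```python
# Rough size of the Doppler modulation on a continuous-wave signal.
# Assumed round values (not from the slides): orbital speed ~29.8 km/s,
# equatorial rotation speed ~0.465 km/s, illustrative frequency 1 kHz.
C_KM_S = 299_792.458      # speed of light in km/s
V_ORBIT_KM_S = 29.8       # Earth's orbital speed
V_SPIN_KM_S = 0.465       # Earth's rotation speed at the equator


def max_doppler_shift_hz(f_signal_hz, v_km_s):
    """Largest first-order Doppler shift, f * v / c."""
    return f_signal_hz * v_km_s / C_KM_S


f0 = 1000.0
orbital = max_doppler_shift_hz(f0, V_ORBIT_KM_S)
spin = max_doppler_shift_hz(f0, V_SPIN_KM_S)

# A coherent search over a time T resolves frequencies to about 1 / T.
resolution_hz = 1.0 / (365.25 * 86400)

print(f"orbital shift ~ {orbital:.3f} Hz, rotational shift ~ {spin:.5f} Hz")
print(f"orbital shift spans ~ {orbital / resolution_hz:.1e} frequency bins")
```

Under these assumptions the orbital shift alone covers millions of frequency bins for a year of data. Because the shift depends on where the source is on the sky, each sky position needs its own correction, which is one way to see why a blind search is so expensive.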
4
Detectors
LIGO: one of a new generation of interferometric gravitational wave detectors.
Cost ~ 500M USD.
Sensitive range 40 Hz - 3 kHz.
Lasers measure the distance between mirrors hanging in a vacuum using interferometry.
Gravitational waves make the mirrors “swing” and perturb the interference pattern.
Fractional motion ∆L/L ~ 10^-21.
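As a quick illustration of what a fractional motion of 10^-21 means, the snippet below converts it into an absolute mirror displacement. The 4 km arm length is an assumption (it is LIGO's arm length but is not stated on the slide), and the proton diameter is an approximate value used only for comparison.

```python
# Convert the fractional motion quoted on the slide into a displacement.
ARM_LENGTH_M = 4000.0          # assumed: LIGO's 4 km arms (not on the slide)
STRAIN = 1e-21                 # fractional motion dL/L from the slide
PROTON_DIAMETER_M = 1.7e-15    # approximate, for comparison only

delta_l = STRAIN * ARM_LENGTH_M
fraction_of_proton = PROTON_DIAMETER_M / delta_l

print(f"mirror displacement ~ {delta_l:.1e} m")
print(f"roughly 1/{fraction_of_proton:.0f} of a proton diameter")
```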
5
How does the search work?
Basic method: matched filtering.
Instrument data is distributed in the frequency domain. Currently about 100 GB of data.
Five mirror servers are used for data distribution.
About 70,000 host machines are active at any time.
A typical host machine gets 10 - 30 MB of data, sufficient for tens to hundreds of hours of computation. It searches this data for signals.
Typical work units are about 12 CPU hours long.
The return file from each host is compressed; average size 130 kB.
Output data is compared with that of another host machine for validation purposes.
Total returned data set size (from 100 million CPU hours) is a few TB.
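The idea behind matched filtering can be shown with a toy example: correlate the data with a template for each candidate signal and keep the candidate that gives the largest output. This is only a minimal time-domain sketch with made-up parameters. It is not the Einstein@Home application, which works on frequency-domain data and must also correct for the Doppler modulation; the function names are hypothetical.

```python
import math
import random


def matched_filter_power(data, freq_hz, sample_rate_hz):
    """Correlate data with cosine and sine templates at one frequency.

    Summing both quadratures makes the statistic independent of the
    unknown signal phase.
    """
    c = s = 0.0
    for k, x in enumerate(data):
        phase = 2.0 * math.pi * freq_hz * k / sample_rate_hz
        c += x * math.cos(phase)
        s += x * math.sin(phase)
    return (c * c + s * s) / len(data)


def search(data, candidate_freqs_hz, sample_rate_hz):
    """Return the candidate frequency with the largest filter output."""
    return max(candidate_freqs_hz,
               key=lambda f: matched_filter_power(data, f, sample_rate_hz))


# Toy data: a sinusoid of amplitude 0.5 buried in unit-variance noise.
random.seed(0)
SAMPLE_RATE_HZ = 256.0
N_SAMPLES = 4096
INJECTED_HZ = 37.0
data = [0.5 * math.sin(2.0 * math.pi * INJECTED_HZ * k / SAMPLE_RATE_HZ + 0.3)
        + random.gauss(0.0, 1.0)
        for k in range(N_SAMPLES)]

candidates = [30.0 + i for i in range(16)]   # 30, 31, ..., 45 Hz
best = search(data, candidates, SAMPLE_RATE_HZ)
print("loudest candidate:", best, "Hz")
```

The signal is invisible in any single sample, but the filter output grows with the number of samples while the noise contribution does not, so the injected frequency stands out. A host in the real search repeats this kind of test over a far larger set of templates, which is where the CPU hours go.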
6
Example results (S3 analysis)
The 50-1500 Hz band shows no evidence of strong pulsar signals in the sensitive part of the sky, apart from the hardware and software injections. There is nothing “in our backyard”.
Outliers are consistent with instrumental lines. All significant artifacts away from r.n=0 are ruled out by follow-up studies.
[Figure panels: WITH INJECTIONS, WITHOUT INJECTIONS]
7
How Big?
Einstein@Home and BOINC
Currently there are about 30 BOINC projects.
They consume ~ 450 Teraflops 24x7.
Active developer community (~ 20 people) with message boards, SVN archives, mailing lists, etc.
8
How does BOINC work?
Participant view:
Download and install BOINC (takes about one minute)
Enter project URL
Create password when queried
Optionally:
-Use BOINC manager to track work and progress
-Set preferences about when BOINC runs, and resource limits
-Sign up for multiple BOINC projects
-Assign resource shares for different BOINC projects
-Participate in project message boards
-Create a profile
-Form or join a team
-Chase credits
Developer view:
BOINC is an open-source project based at Berkeley
Create or port a science application (use BOINC libraries for I/O)
Write a screensaver (OpenGL)
Build and optimize application code on different platforms
Set up a project server
Write custom back-end components for project server: work unit generator, validator, assimilator
Attract users and do science!
9
Participants
10
Screensaver
[Figure: annotated screensaver. Labels: GEO-600 Hannover; LIGO Hanford; LIGO Livingston; current search point; current search coordinates; known pulsars; known supernova remnants; user name; user’s total credits; machine’s total credits; team name; current work % complete]
11
BOINC server: a set of daemons
Database: Work, Results, Users, Hosts, Forums, …
Web Server:
-Web (PHP) pages
-cgi script: scheduler (subprocess)
-cgi script: file_upload_handler (subprocess)
Validator (daemon): compares results to identify correct one(s)
Transitioner (daemon): endless loop, generating new results as needed for workunits with failed/lost results
Assimilator (daemon): endless loop, collecting correct results
File deleter (daemon): deletes input files that are no longer needed
Database purger (daemon): deletes rows from the Work and Results tables of the database when no longer needed
Project specific parts
Work generator (daemon): makes more work as needed
12
Servers
Three Einstein@Home servers:
-Project server (dual Xeon) runs BOINC daemons
-Database server (quad Opteron) with 32 GB memory
-File storage (24-disk SATA RAID with 8 TB usable)
OS is Linux (FC3/FC4)
Three identical spare backup servers
Total hardware cost about $50k (including spares)
Internal gigabit switch
Hot-swap SATA disks
Two 3 kVA UPS systems
13
How do users get credit?
Credit is granted only when work has been validated by automatic comparison with identical work performed by other users.
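A minimal sketch of this rule is given below, assuming that each host returns a list of floating-point values and that two results "agree" when they match within a small relative tolerance. The function names, the tolerance, the quorum of two, and the policy of granting the smallest claimed credit are all illustrative assumptions, not the actual Einstein@Home validator.

```python
def results_agree(a, b, rel_tol=1e-5):
    """Hypothetical comparison: same length, values equal within rel_tol."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= rel_tol * max(abs(x), abs(y), 1.0)
               for x, y in zip(a, b))


def grant_credit(results, claimed_credit, min_quorum=2):
    """Grant credit only to hosts whose results agree with each other.

    results:        {host: list of floats returned by that host}
    claimed_credit: {host: credit claimed by that host}
    Returns {host: granted credit}, empty if no quorum exists yet.
    """
    hosts = list(results)
    for ref in hosts:
        agreeing = [h for h in hosts
                    if results_agree(results[ref], results[h])]
        if len(agreeing) >= min_quorum:
            # Assumed policy: everyone in the quorum gets the smallest claim.
            granted = min(claimed_credit[h] for h in agreeing)
            return {h: granted for h in agreeing}
    return {}


results = {
    "host_a": [1.0, 2.5],
    "host_b": [1.0000001, 2.5],   # differs only by rounding
    "host_c": [9.9, 2.5],         # disagrees
}
claims = {"host_a": 40.0, "host_b": 35.0, "host_c": 80.0}
granted = grant_credit(results, claims)
print(granted)
```

In this example host_a and host_b form a quorum and are credited, while host_c gets nothing. A tolerance is used instead of exact equality because different platforms can round floating-point arithmetic differently.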
14
How does BOINC work?
The database contains a work table and a result table.
A client machine contacts the scheduler and is sent data from a row in the work table. This has URLs for the executable program and data files, command line arguments, estimated run times, etc.
When the program has run, the science results are returned in a data file. Metadata (exit status, CPU time,…) is kept in a row in the result table.
The BOINC daemons function as a state machine. They continue to send additional work as needed until a valid result is found.
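The state-machine behaviour can be sketched as follows: a work unit keeps requesting new results until enough successful results agree. The class names, state names, and quorum of two are hypothetical and chosen for illustration; they do not reproduce BOINC's actual database schema or daemon code.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    host: str
    state: str = "IN_PROGRESS"   # IN_PROGRESS -> SUCCESS or ERROR
    output: object = None        # science output, once returned


@dataclass
class WorkUnit:
    name: str
    quorum: int = 2              # assumed number of agreeing results needed
    results: list = field(default_factory=list)
    canonical: object = None     # set once a valid result is found


def results_needed(wu):
    """One pass of a transitioner-like check.

    Returns how many new results should be sent out for this work unit,
    and records the canonical output once a quorum agrees.
    """
    if wu.canonical is not None:
        return 0
    done = [r for r in wu.results if r.state == "SUCCESS"]
    best_group = 0
    for r in done:
        group = sum(1 for other in done if other.output == r.output)
        if group >= wu.quorum:
            wu.canonical = r.output
            return 0
        best_group = max(best_group, group)
    pending = sum(1 for r in wu.results if r.state == "IN_PROGRESS")
    # Optimistically assume every pending result will agree with the
    # largest group seen so far; failed or lost results count for nothing.
    return max(0, wu.quorum - best_group - pending)


wu = WorkUnit("wu_example")
first = results_needed(wu)                 # nothing sent yet
wu.results += [Result("host_a", "SUCCESS", "candidates-v1"),
               Result("host_b", "ERROR")]
second = results_needed(wu)                # one good result, one failed
wu.results.append(Result("host_c", "SUCCESS", "candidates-v1"))
third = results_needed(wu)                 # quorum reached
print(first, second, third, wu.canonical)
```

Running such a check in an endless loop over all work units gives the behaviour described on the slide: a failed or lost result simply causes another copy of the work to be issued.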
15
Einstein@Home took ~ 8 months to develop
Application software:
-Adding BOINC API calls to application: 1 week
-Making the application checkpoint/restart: 2 weeks
-Adding upload/download file compression: 1 week
-Writing/testing/debugging screensaver: 1 month
-Building on Windows (non-POSIX!): 2 months. This was mostly because the code requires some libraries designed for automake/autoconf builds on Unix systems.
Server software:
-Building and testing the validator: 2 months
-Developing BOINC locality scheduler (sends work to users with a given data file): 1 month
Server hardware:
-Setting up and burning in servers: 1 month
16
BOINC Pros and Cons
PROS
A distributed computing project does not need to build its own unique infrastructure. It can share CPU and code with other projects.
Solid second-generation design (SETI@Home was the prototype)
Smart design principle: host machines are unreliable in all possible ways
Access to a lot of inexpensive CPU cycles. “Easy” for users to add hosts.
Scales to at least 10^6 active hosts (but database and project servers are ultimately bottlenecks to scaling)
Public outreach is built-in
Well developed community tools
BOINC API and library are well-designed and well-structured C++. Not a hack
Public open-source project
CONS
It takes months to port and test a reliable application, and typical work deadlines are two weeks: not good for quick turnaround studies or “trying something out”! “Hard” for projects to add new applications or analysis.
Biggest host platform (Win32) is not POSIX. Even fopen() acts differently!
Data bandwidth: no more than 1 MB data exchange per CPU hour
Should fit into small memory (200 MB max) and small disk space (100 MB) for broad appeal
Must write automated validator
Fixing bugs can be hard (on remote hosts)
Very heterogeneous hosts
Project science must have some real public appeal
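The resource limits listed under CONS can be turned into a quick feasibility check for a candidate workload. The sketch below is illustrative only: the function name is hypothetical, the data sizes are taken from the figures quoted earlier in the talk (30 MB of input for roughly 100 CPU hours, 130 kB returned per 12-hour work unit), and the memory and disk figures in the example are made-up placeholders.

```python
def check_boinc_fit(input_mb, output_mb, cpu_hours, memory_mb, disk_mb):
    """Compare a workload with the rules of thumb listed under CONS."""
    mb_per_cpu_hour = (input_mb + output_mb) / cpu_hours
    checks = {
        "bandwidth (<= 1 MB per CPU hour)": mb_per_cpu_hour <= 1.0,
        "memory (<= 200 MB)": memory_mb <= 200,
        "disk (<= 100 MB)": disk_mb <= 100,
    }
    return mb_per_cpu_hour, checks


# Example: 30 MB of input keeps a host busy for about 100 CPU hours,
# and each 12-hour work unit returns about 0.13 MB.
cpu_hours = 100.0
output_mb = 0.13 * cpu_hours / 12.0
rate, checks = check_boinc_fit(
    input_mb=30.0,
    output_mb=output_mb,
    cpu_hours=cpu_hours,
    memory_mb=100,   # placeholder, not from the talk
    disk_mb=50,      # placeholder, not from the talk
)
print(f"data exchange ~ {rate:.2f} MB per CPU hour")
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'too large'}")
```

With these numbers the data exchange comes to roughly a third of a megabyte per CPU hour, comfortably inside the 1 MB guideline.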