Technology for Citizen Cyberscience
Dr. David P. Anderson
University of California, Berkeley
May 2011
Computational science
● Simulation of physical reality
  ● scales from molecules to universe
● Analyzing data from new instruments
  ● LHC, LIGO, SKA, gene sequencers
● Unbounded need for computing power
The Consumer Digital Infrastructure
● 1.5 billion PCs/laptops/tablets
● Graphics Processing Units: 100X CPU speed
● Terabyte-scale storage
● ~10 Mbps Internet bandwidth
● Ideal for scientific computing
Volunteer computing with BOINC
[Diagram: volunteers linked to projects (e.g., CPDN, WCG) by attachments]
How to volunteer
Choose projects
Configure
Community
Graphical interface
Screensaver
Creating a BOINC project
● Install BOINC server software on a Linux box
● Compile apps for Windows/Mac/Linux
● Attract volunteers
  – web site
  – publicity
  – communicate with volunteers
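Client/server communication
[Diagram: the BOINC client talks to the server over HTTP, sending scheduler requests and uploading/downloading files; on the server, Apache runs the scheduler, which uses a MySQL database]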
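The request/reply cycle named on this slide can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not BOINC's actual code or wire format: the `Job` and `Server` classes, the `handle_scheduler_request` method, and the example URLs are all hypothetical, and file upload/download over HTTP is not modelled (the output simply travels with the report).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: int
    input_url: str                       # hypothetical location of the input file
    assigned_host: Optional[int] = None  # host the job was sent to, if any
    result: Optional[str] = None         # output reported by that host


class Server:
    """Toy stand-in for Apache + scheduler + the MySQL job table."""

    def __init__(self, jobs: List[Job]) -> None:
        self.jobs: Dict[int, Job] = {job.job_id: job for job in jobs}

    def handle_scheduler_request(
        self, host_id: int, completed: Dict[int, str], max_jobs: int
    ) -> List[Job]:
        # 1. Record results the client reports, but only for jobs
        #    that were actually assigned to this host.
        for job_id, output in completed.items():
            job = self.jobs.get(job_id)
            if job is not None and job.assigned_host == host_id:
                job.result = output

        # 2. Hand out up to max_jobs jobs that nobody has been given yet.
        new_work: List[Job] = []
        for job in self.jobs.values():
            if len(new_work) >= max_jobs:
                break
            if job.assigned_host is None:
                job.assigned_host = host_id
                new_work.append(job)
        return new_work


# Client side: ask for work, "compute", then report and ask again.
server = Server([Job(i, f"https://example.org/input/{i}") for i in (1, 2, 3)])
work = server.handle_scheduler_request(host_id=42, completed={}, max_jobs=2)
outputs = {job.job_id: f"output-of-{job.job_id}" for job in work}
more = server.handle_scheduler_request(host_id=42, completed=outputs, max_jobs=2)
print([job.job_id for job in work], [job.job_id for job in more])
```

The point of the sketch is that a single request both reports finished work and asks for new work, so the client never needs to accept incoming connections.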
Volunteer computing status
● 40 projects
● 500K volunteers
● 800K computers, 2M cores
● 14 PetaFLOPS
  ● would cost $5 billion/year on Amazon EC2
● potential: > 1 ExaFLOPS
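The status figures imply some per-unit averages. A minimal sketch of that arithmetic, taking the slide's numbers at face value and assuming 1 PetaFLOPS = 10^15 FLOPS and a 365-day year (variable names are illustrative; the slide does not say how the EC2 estimate was derived):

```python
# Figures from the status slide.
total_flops = 14e15        # 14 PetaFLOPS
cores = 2_000_000          # 2M cores
computers = 800_000        # 800K computers
volunteers = 500_000       # 500K volunteers
ec2_cost_per_year = 5e9    # $5 billion/year (the slide's estimate)

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year

gflops_per_core = total_flops / cores / 1e9
cores_per_computer = cores / computers
computers_per_volunteer = computers / volunteers
cost_per_computer_hour = ec2_cost_per_year / computers / HOURS_PER_YEAR
fraction_of_exaflops = total_flops / 1e18

print(f"{gflops_per_core:.1f} GFLOPS per core")                  # 7.0
print(f"{cores_per_computer:.1f} cores per computer")            # 2.5
print(f"{computers_per_volunteer:.1f} computers per volunteer")  # 1.6
print(f"${cost_per_computer_hour:.2f} per computer-hour")        # $0.71
print(f"{fraction_of_exaflops:.1%} of one ExaFLOPS")             # 1.4%
```

These are averages only; they say nothing about how performance is distributed across hosts.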
Some projects
● IBM World Community Grid
● Climateprediction.net
Organizational issues
● Single-scientist projects are a dead end
● Better: umbrella projects
  – e.g.,
● Focus on public, not in-house, resources
Scientific crowdsourcing
● Use volunteers' human skills for scientific tasks
  ● cognition
  ● natural language
  ● knowledge
  ● intuition
  ● creativity
Examples
●
  ● find interstellar dust particles
● GalaxyZoo
  ● classify galaxies
● FoldIt!
  ● fold protein molecules
Implementation
[Diagram: volunteer and server; interactions labeled training, view task, report; server components: Scheduler, Apache, MySQL database]
Software systems
● Commercial
  ● Amazon Mechanical Turk
  ● Clickworkers
● Open source
  ● Bossa (UC Berkeley)
Directions
● Quantifiable accuracy
  ● volunteer calibration, task replication
● Use experts for different tasks
● Combine computing and crowdsourcing
● Task processing by small groups
● Generalized problem-solving
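Volunteer calibration and task replication can be sketched in a few lines. This is an illustrative sketch only, not how Bossa or any other system named here implements them: calibration is assumed to mean scoring a volunteer on tasks with known answers, and replication is assumed to mean accepting an answer only when enough independent volunteers agree. The function names, the `quorum` parameter, and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def volunteer_accuracy(
    answers: Dict[str, str], gold: Dict[str, str]
) -> Optional[float]:
    """Calibration: fraction of known-answer tasks the volunteer got right."""
    scored = [task for task in answers if task in gold]
    if not scored:
        return None  # volunteer has not seen any calibration task yet
    correct = sum(answers[task] == gold[task] for task in scored)
    return correct / len(scored)


def consensus(replies: List[str], quorum: int) -> Optional[str]:
    """Replication: accept the most common answer only if it reaches quorum."""
    if not replies:
        return None
    answer, votes = Counter(replies).most_common(1)[0]
    return answer if votes >= quorum else None


# Known-answer ("gold") tasks mixed in with real ones.
gold = {"t1": "spiral", "t2": "elliptical"}
alice = {"t1": "spiral", "t2": "elliptical", "t3": "spiral"}
bob = {"t1": "spiral", "t2": "spiral", "t3": "spiral"}

print(volunteer_accuracy(alice, gold))                        # 1.0
print(volunteer_accuracy(bob, gold))                          # 0.5
print(consensus(["spiral", "spiral", "elliptical"], quorum=2))  # spiral
print(consensus(["spiral", "elliptical"], quorum=2))            # None
```

A natural extension, not shown here, would weight each reply by the volunteer's calibrated accuracy instead of counting votes equally.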