Volunteer Thinking with Bossa
David P. Anderson
Space Sciences Laboratory
University of California, Berkeley

Scientific workflows
[Diagram: workflow steps (define goals, theorize, design experiments, perform experiments, clean data, analyze data, write papers) and who might carry them out (Herr Professor, post-docs, grad students, lab assistants, computers, volunteers?)]

Volunteer thinking
Tasks that can be divided into lots of self-contained “jobs”
Jobs can be performed over the Internet
Jobs can be done by a significant fraction of the general public (possibly with training)
Currently ~10 projects, 50K participants

Example: The Stardust mission
Where’s the dust?
  – 23K volunteers
  – 43M viewings
  – 64 tracks found

Lessons learned
Motivation
  – Volunteers will do boring tasks; no need to play games
  – scientific goals
  – community, competition
  – keep volunteers informed
Quantifiably high accuracy is possible
  – calibration jobs
  – replication

Fold It!
Space of applications
2D vision
  – pattern recognition (GalaxyZoo, Clickworkers)
  – image analysis (ESP game)
3D manipulation (Fold It!)
Natural language
Real-world knowledge
What else?

Middleware
[Diagram: middleware sits between jobs and the people or computers that perform them]
Middleware functions: identity, accounting, queuing, assignment, validation

What’s different?
People vary
  – aptitude
  – training
Jobs may not be well-defined

Bossa
Reduces the amount of programming and DB design needed to deploy a volunteer thinking project
Policies
  – Bossa doesn’t provide them
  – but it provides mechanisms that make it easy to implement a range of policies

Bossa abstractions
[Diagram relating the entities Project, Application, Job, Instance, and User]

The structure of Bossa
[Diagram: the application supplies callback functions (job_show(), job_issued(), job_finished(), job_timed_out()); Bossa supplies bossa_show_job.php, bossa_job_finished.php, the Bossa API, and the DB]

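The callback structure on this slide can be illustrated with a minimal sketch. It is written in Python rather than Bossa's PHP, and it is not Bossa's actual code: the `Middleware` and `RecordingApp` classes, the instance fields, the order of the calls, and the `(job, inst)` arguments for every callback except job_show() are assumptions made for illustration.

```python
class RecordingApp:
    """Stand-in application. Callback names follow the slide; the
    (job, inst) arguments for the last three are an assumption."""

    def __init__(self):
        self.events = []

    def job_show(self, job, inst):
        self.events.append("job_show")
        return f"<p>Job {job['id']} for {inst['user']}</p>"

    def job_issued(self, job, inst):
        self.events.append("job_issued")

    def job_finished(self, job, inst):
        self.events.append("job_finished")

    def job_timed_out(self, job, inst):
        self.events.append("job_timed_out")


class Middleware:
    """Hypothetical stand-in for the Bossa side of the diagram."""

    def __init__(self, app):
        self.app = app

    def show_job(self, job, user):
        # Plays the role of bossa_show_job.php: create an instance,
        # notify the application, then ask it to render the job.
        inst = {"job_id": job["id"], "user": user, "state": "issued"}
        self.app.job_issued(job, inst)
        return inst, self.app.job_show(job, inst)

    def finish_job(self, job, inst, response):
        # Plays the role of bossa_job_finished.php: record the response,
        # then let the application decide what happens next.
        inst["state"] = "finished"
        inst["response"] = response
        self.app.job_finished(job, inst)

    def time_out(self, job, inst):
        inst["state"] = "timed_out"
        self.app.job_timed_out(job, inst)
```

The point of the split is that the middleware owns the bookkeeping while every project-specific decision is delegated to the application's callbacks.
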
Job and result representation
Job parameters, results stored in “opaque” PHP structures
Callback function to display a job: job_show($job, $inst)
Types of jobs
  – single web page
  – sequence of pages
  – offline app

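As a rough illustration of the "opaque structure" idea, the sketch below (Python, not Bossa's PHP) stores job parameters as a serialized string that only the application interprets, and renders a single-page job from them. The `info` field, the `image_url` and `question` parameters, and the HTML layout are hypothetical.

```python
import json
from html import escape


def make_job(job_id, params):
    # The middleware would store `info` as an uninterpreted string;
    # only the application knows what is inside it.
    return {"id": job_id, "info": json.dumps(params)}


def job_show(job, inst):
    """Render a single-web-page job from its opaque parameters."""
    params = json.loads(job["info"])
    src = escape(params["image_url"], quote=True)
    question = escape(params["question"])
    inst_id = int(inst["id"])
    return (
        '<form method="post">'
        f'<img src="{src}" alt="job image">'
        f"<p>{question}</p>"
        f'<input type="hidden" name="inst_id" value="{inst_id}">'
        '<button name="answer" value="yes">Yes</button>'
        '<button name="answer" value="no">No</button>'
        "</form>"
    )
```

Because the middleware never looks inside `info`, the same storage works for any job type the application defines.
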
Job distribution policy
Project A: few jobs, lots of volunteers
  – do all jobs once, then twice, etc.
Project B: many jobs, limited volunteers
  – do first job N times, then 2nd job, etc.
Bossa: each job has a “priority”
  – adjusted by callback functions
[Diagram: jobs and instances]

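Both policies can be expressed with one priority field, as the following sketch suggests. It is a Python illustration under stated assumptions, not Bossa's implementation: the scheduler rule (highest priority first, ties broken by lowest id, priority 0 means closed) and the names `next_job`, `breadth_first_issued`, and `make_depth_first_issued` are hypothetical.

```python
def next_job(jobs):
    """Pick the open job with the highest priority (ties: lowest id)."""
    open_jobs = [j for j in jobs if j["priority"] > 0]
    if not open_jobs:
        return None
    return max(open_jobs, key=lambda j: (j["priority"], -j["id"]))


def breadth_first_issued(job):
    # Project A: lowering the priority on each issue means every job
    # is done once before any job is done twice.
    job["issued"] += 1
    job["priority"] -= 1


def make_depth_first_issued(n):
    # Project B: keep the priority unchanged until the job has been
    # issued n times, then close it.
    def issued(job):
        job["issued"] += 1
        if job["issued"] >= n:
            job["priority"] = 0
    return issued


def simulate(jobs, on_issue):
    """Issue jobs until none are open; return the order of job ids."""
    order = []
    while True:
        job = next_job(jobs)
        if job is None:
            return order
        on_issue(job)
        order.append(job["id"])
```

With three jobs and two instances each, starting Project A's jobs at priority 2 and Project B's at priority 1, the first policy issues jobs in the order 1, 2, 3, 1, 2, 3 and the second in the order 1, 1, 2, 2, 3, 3.
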
Volunteer assessment
How to assess?
  – training course
  – calibration tasks
  – correctness as determined by replication
Representation
  – may be multidimensional
  – may change during session
Bossa mechanisms
  – opaque data for user
  – calibration jobs
  – Bolt course prerequisite

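One way an application might use the per-user opaque data is to keep running calibration statistics, one entry per skill, so the assessment is multidimensional and can change during a session. The sketch below is a Python illustration only; the record layout and the names `record_calibration` and `accuracy` are hypothetical.

```python
def record_calibration(user_info, skill, correct):
    """Update a user's opaque record after a calibration job.

    `user_info` is a plain dict owned by the application; each skill
    gets its own counters.
    """
    stats = user_info.setdefault(skill, {"seen": 0, "right": 0})
    stats["seen"] += 1
    if correct:
        stats["right"] += 1


def accuracy(user_info, skill):
    """Fraction of calibration jobs answered correctly, or None if the
    user has not seen any calibration jobs for this skill."""
    stats = user_info.get(skill)
    if not stats or stats["seen"] == 0:
        return None
    return stats["right"] / stats["seen"]
```

A project could then weight a volunteer's answers, or choose which jobs to offer, according to the accuracy measured for the relevant skill.
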
Replication policy
Examples
  – fixed replication
  – adaptive replication
Bossa mechanism
  – job_finished() decides whether more instances are needed, sets priority

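An adaptive replication policy could be sketched as follows. This is a Python illustration, not Bossa's PHP API: the quorum rule, the `max_instances` cap, and the job fields (`responses`, `state`, `result`, `priority`) are assumptions chosen for the example.

```python
from collections import Counter


def job_finished(job, inst, quorum=2, max_instances=5):
    """Decide after each finished instance whether more are needed.

    The job is done once `quorum` responses agree; it is abandoned as
    inconclusive after `max_instances` responses without agreement.
    """
    job["responses"].append(inst["response"])
    answer, votes = Counter(job["responses"]).most_common(1)[0]
    if votes >= quorum:
        job["state"] = "done"
        job["result"] = answer
        job["priority"] = 0  # no more instances needed
    elif len(job["responses"]) >= max_instances:
        job["state"] = "inconclusive"
        job["priority"] = 0
    else:
        job["priority"] = 1  # keep issuing instances
```

Fixed replication is the special case in which the callback simply counts instances and closes the job after a set number, regardless of whether the responses agree.
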
Use of “experts”
Alternatives
  – experts do the same job, but better
  – experts do different jobs
Bossa mechanisms
  – users are assigned a “level” (0, 1, ...)
  – jobs have priority P(i) for each level i
Example
  – experts resolve ambiguous jobs
  – job_finished(): if ambiguous, raise P(1)

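The expert example can be sketched with a per-level priority list. The Python below is illustrative only: treating "ambiguous" as "two level-0 volunteers disagree", letting a single expert answer settle the job, and the field names are all assumptions.

```python
EXPERT = 1  # level 0 = ordinary volunteer, level 1 = expert


def next_job(jobs, user_level):
    """Pick the job with the highest priority P(user_level), if any."""
    open_jobs = [j for j in jobs if j["priority"][user_level] > 0]
    return max(open_jobs, key=lambda j: j["priority"][user_level], default=None)


def job_finished(job, inst):
    if inst["user_level"] >= EXPERT:
        # An expert's answer settles the job.
        job["result"] = inst["response"]
        job["priority"] = [0, 0]
        return
    job["responses"].append(inst["response"])
    if len(job["responses"]) < 2:
        return  # wait for a second volunteer
    job["priority"][0] = 0  # ordinary volunteers are finished with it
    if len(set(job["responses"])) == 1:
        job["result"] = job["responses"][0]
    else:
        job["priority"][EXPERT] = 1  # ambiguous: raise P(1)
```

New jobs would start with priority [1, 0], so experts only see jobs on which ordinary volunteers disagreed.
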
Bossa integration
[Diagram: BOINC (volunteer computing), Bolt (teaching, training), and Bossa (volunteer thinking), together with BOINC Basics (accounts, groups, credit, communication); labeled “Current:” and “Soon: Drupal”]

Projects in development
  – (fossils, Ethiopia)
  – Africa satellite image analysis (UC Berkeley)
  – Mars satellite image analysis (U of Hawaii)

Conclusion
Bossa: middleware for volunteer thinking
You provide:
  – job representation and display
  – job distribution and replication policies
  – volunteer assessment
Future directions
  – thinking/computing workflows
  – group jobs
  – jobs as multiplayer online games