Implementation of a small-scale desktop grid computing infrastructure in a commercial domain
Overview
- Background and use of BOINC in the academic context
- Infrastructure constructed
- Quantification of the additional heating produced and power consumed
- Quantification of cost
- Conclusion
Berkeley Open Infrastructure for Network Computing (BOINC)
- Middleware for volunteer computing
- Open-source, NSF-funded development
- Community-maintained
- Server: used by scientists to make “projects”
- Client: runs on consumer devices
  - “attaches” to projects
  - fetches and runs jobs in the background
Relationship between Project, Application, Workunit and Task
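As a rough illustration of the relationship named above, the four concepts can be modelled as nested containers: a project hosts applications, an application has workunits (job descriptions), and each workunit is instantiated as one or more tasks sent to client hosts. This is a minimal sketch only; the class and field names (`host`, `succeeded`, `task_count`) and the example values are hypothetical and are not BOINC's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """One instance of a workunit dispatched to a single client host."""
    host: str
    succeeded: bool = False


@dataclass
class Workunit:
    """A job description; it may be issued to clients as several tasks."""
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Application:
    """A program that the project distributes to process workunits."""
    name: str
    workunits: List[Workunit] = field(default_factory=list)


@dataclass
class Project:
    """The server-side entity that clients attach to."""
    name: str
    applications: List[Application] = field(default_factory=list)

    def task_count(self) -> int:
        # Count every task across all applications and workunits.
        return sum(
            len(wu.tasks)
            for app in self.applications
            for wu in app.workunits
        )


# Hypothetical example data: one workunit issued to two desktops.
wu = Workunit("wu_0001", [Task("desktop-01", succeeded=True), Task("desktop-02")])
project = Project("test_project", [Application("example_app", [wu])])
print(project.task_count())
```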
Steps undertaken
1. Set up a test project in Oxford and successfully build the application.
2. Replicate the test project infrastructure at the commercial partner.
3. Test the sandboxed application with a sample configuration on the test project.
4. Set up a workunit submission script in the test project.
5. Set up the external workunit submission server, with a script on this server that communicates with the workunit submission script in the test project.
6. Link a limited number of pre-determined desktop clients (between 5 and 50 desktops) to the test project.
7. Conduct two limited tests of two use cases.
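The workunit submission step could be sketched as a thin wrapper around BOINC's `create_work` tool, which is run from the project root on the server. The slides do not show the actual script, so this is an assumption-laden illustration: the function name, application name, workunit name and input file name are all hypothetical, and the input files would need to be staged in the project's download hierarchy beforehand.

```python
from typing import List, Sequence


def build_create_work_cmd(
    app_name: str, wu_name: str, input_files: Sequence[str]
) -> List[str]:
    """Assemble the argument list for BOINC's create_work tool.

    The command is only built here, not executed; it is intended to be
    run from the project root directory on the BOINC server.
    """
    return [
        "bin/create_work",
        "--appname", app_name,
        "--wu_name", wu_name,
        *input_files,
    ]


# Hypothetical names, for illustration only.
cmd = build_create_work_cmd(
    "example_app", "build151_wu_00001", ["input_00001.dat"]
)
print(" ".join(cmd))
```

An external submission server could then ask the project server to execute such a command, for example over SSH or a small web endpoint; which mechanism the project actually used is not stated here.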
Desktop client
Application Structure
Test Infrastructure
Machine: Test server
Operating system: Linux
Minimum specification (CPU, RAM, local storage): 8 CPU, 8 GB, 1 TB
Functions: upload service, application service, download service
Environmental Monitoring
Utilisation
Tranche  Start Date  End Date
1        22/5/17     2/6/17
2        25/6/17     29/6/17
3        2/7/17      8/7/17
4        14/7/17
5        17/7/17     18/7/17
6        23/7/17     24/7/17
7        29/8/17     13/9/17
WU Success and Failure
WU Success and failure per host
Environmental Analysis
Energy consumption
- Considers only WUs calculated on the machine, and hence only that system's consumption
- Baseline taken from the period 15/07/17 to 29/08/17
- Only tranches 4 and 7 are considered
- Resulting consumption driven by BOINC utilisation: 426.61 W per WU
- €0.032 per WU (assuming a 35-minute WU length)
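The per-WU figures can be related with a short calculation. The sketch below assumes that 426.61 W is the average additional power drawn for the full 35 minutes of a WU; the electricity tariff is not given on the slide, so the value computed here is only the tariff implied by the two quoted numbers.

```python
POWER_W = 426.61         # additional power attributed to BOINC, from the slide
WU_MINUTES = 35          # assumed WU length, from the slide
COST_PER_WU_EUR = 0.032  # quoted cost per WU, from the slide


def energy_per_wu_kwh(power_w: float, minutes: float) -> float:
    """Energy in kWh used by a load of power_w watts running for `minutes`."""
    return power_w * (minutes / 60.0) / 1000.0


energy_kwh = energy_per_wu_kwh(POWER_W, WU_MINUTES)

# Tariff implied by the quoted cost (an inference, not a figure from the slide).
implied_tariff_eur_per_kwh = COST_PER_WU_EUR / energy_kwh

print(f"Energy per WU: {energy_kwh:.4f} kWh")
print(f"Implied tariff: {implied_tariff_eur_per_kwh:.3f} EUR/kWh")
```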
Temperatures in exemplar rooms
Room temperature difference
PoC: Builds computed
Columns: Date submitted, Date 90% complete, Build #, # of WU, % WU success, % CPU time success, TFlops success, TFlops failure
30/6/17 3/7/17 113 1248 1.12 31 0.06 4.65
13/7/17 117 4016 71 97 10.29 4.46
8/7/17 125 4125 44 6.59 8.62
24/7/17 25/7/17 140 251 80 0.77 0.16
29/8/17 13/9/17 151 27081 42 78 41.84 61.49
Total: 138.93 TFlop overall
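The overall figure can be cross-checked by summing the last two values of each build row, read as TFlops for successful and failed work. The pairing of values to builds below follows the row order on the slide; the variable names are illustrative.

```python
# Build number -> (TFlops success, TFlops failure), as listed on the slide.
builds = {
    113: (0.06, 4.65),
    117: (10.29, 4.46),
    125: (6.59, 8.62),
    140: (0.77, 0.16),
    151: (41.84, 61.49),
}

success_tflop = sum(s for s, _ in builds.values())
failure_tflop = sum(f for _, f in builds.values())
total_tflop = success_tflop + failure_tflop

print(f"Success: {success_tflop:.2f} TFlop")
print(f"Failure: {failure_tflop:.2f} TFlop")
print(f"Total:   {total_tflop:.2f} TFlop")
```

The total agrees with the 138.93 TFlop quoted on the slide.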
Build Completion Rate
Comparison of costs
- Contracted HPC costs €3856 for 1 TFlop year
- Build 151 is taken as the basis for analysing capability over a whole year
  - Start time = 30/08/2017 15:20:00 (1504106454)
  - End time = 13/09/2017 23:59:00 (1505347140)
- Capability as reported by BOINC benchmarking of resources and CPU time provided = 103.32 TFlop
- Running Build 151 on the contracted HPC system would cost €398,420
- If this level of resource were used constantly for a whole year, it would provide 2,290 TFlop
- On the contracted HPC system this would cost €8.83M for a whole year at this capacity
- But systems in a final BOINC project would only be available from 8pm to 6am Mon-Fri and 24 hours a day at weekends
- Therefore we must scale this by 58%, giving a maximum capability of 1335.8 TFlop
*Note that this is not a true comparison because of the additional capability of the HPC system; however, as the service is not available without these features, the partner would still need to pay for them even if they were unused.
- The earlier averaged energy calculations give a cost of €0.032 per WU, so running the 27081 WUs of Build 151 up to 13/09/2017 23:59:00 cost €866.59
- But with the BOINC system we are already using existing systems to provide the capability, and therefore incur no extra system costs other than establishing server capability, which has cost €120k across the Evaluation and PoC phases.
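The arithmetic behind this comparison can be reproduced in a few lines. The sketch assumes the weekday availability window runs overnight from 20:00 to 06:00 (ten hours on each of five nights) plus two full weekend days, which is the reading that reproduces the 58% scaling; all other inputs are the figures quoted on the slide, and the variable names are illustrative.

```python
HPC_EUR_PER_TFLOP_YEAR = 3856   # contracted HPC price
BUILD_151_TFLOP = 103.32        # capability reported by BOINC for Build 151
YEARLY_TFLOP = 2290             # slide's whole-year extrapolation
WU_COUNT = 27081                # WUs in Build 151
EUR_PER_WU = 0.032              # averaged energy cost per WU

# Assumed availability: 10 h on each of 5 weeknights plus 2 full weekend days.
weeknight_hours = 10 * 5
weekend_hours = 24 * 2
availability = (weeknight_hours + weekend_hours) / (24 * 7)

hpc_cost_build = BUILD_151_TFLOP * HPC_EUR_PER_TFLOP_YEAR
hpc_cost_year = YEARLY_TFLOP * HPC_EUR_PER_TFLOP_YEAR
boinc_yearly_tflop = YEARLY_TFLOP * availability
boinc_energy_cost = WU_COUNT * EUR_PER_WU

print(f"Availability fraction: {availability:.4f}")
print(f"HPC cost for Build 151: EUR {hpc_cost_build:,.2f}")
print(f"HPC cost for a full year: EUR {hpc_cost_year:,}")
print(f"BOINC yearly capability: {boinc_yearly_tflop:.1f} TFlop")
print(f"BOINC energy cost for Build 151: EUR {boinc_energy_cost:.2f}")
```

With the rounded inputs, the Build 151 HPC cost comes out slightly below the quoted €398,420, presumably because the slide used an unrounded TFlop value; the other results match the quoted figures.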
Further Work
- Implement automated deployment.
- Set up the production project.
- Link a larger set of desktop clients to the production project.
- Further extend the capability of the external workunit submission process, allowing user submission with relevant permissions.
- Conduct full-scale extended runs of both current and other relevant use cases.
- Document the project setup.
- Consider other possible applications for this BOINC infrastructure.
Conclusions
- Successfully ran 47.5k workunits in a working BOINC infrastructure
- Produced business-relevant results for users at a fraction of the cost of a comparable system
- No discernible impact on temperature within the room studied (0.65 °C is within BMS error)
- Per-WU energy cost of €0.032 (426.61 W for ~35 minutes)
- No new infrastructure required; reusing existing computational systems improves ROI
- Clear plan of the future activity required to deploy in production
- Directly scalable capability
- With HPC -