Cloud Systems Panel at HPDC
Boston, June
Geoffrey Fox
Community Grids Laboratory, School of Informatics
Indiana University
HPDC Cloud Panel Members
Randy Bryant, CMU
Ian Foster, Chicago
Greg Pfister, Consultant
Dennis Quan, IBM
Dan Reed, Microsoft
Moderator: Geoffrey Fox, Indiana University
HPDC Cloud Panel Questions
What is, and equally interestingly what isn't, a cloud?
– Especially with today’s hype: is everything a cloud?
What are the implications of clouds for enterprise data centers?
If TeraGrid evolves to PetaCloud, how would it look?
What are the research issues for clouds?
Can one (or who could) "trust" clouds?
Will cloud interoperability be important; if so, at what interface(s) will it be provided?
What is the killer app for clouds?
Some Conclusions for Scientific Applications
Scientific applications and commercial applications share a set of requirements for cloud computing (customized environments, short-term and on-demand capability provisioning)
But many scientific applications have additional demands for cloud computing (tightly coupled, state management)
The science community needs to take leadership to find the best ways to leverage this new technology and adapt it for scientific use (industry won't do it for us)
Some Panel Comments I
Clouds are already being used in many Web 2.0 applications, which tend to involve little computing
Gmail is stateful, which implies nontrivial cloud logic, whereas traditional web access raises no difficult issues: the user restarts by reloading the web page
Round-robin DNS is a critical technology
Scientific data analysis is a natural cloud application
Grids were hardly mentioned
Clouds are likely to become the dominant enterprise data center technology
Virtualization is critical to cloud implementations
Some Panel Comments II
MPI will only be supported if there is a business model to justify it; it is likely that there is none
MapReduce (Hadoop) has critical fault-tolerance aspects; MPI needs to avoid the tight synchronization that leads to inflexible, non-fault-tolerant implementations
Shared memory (between cores, as needed by OpenMP) will not be supported
Google does not put the fastest CPUs in its data centers; it prefers larger numbers of cheaper, slower, more power-efficient CPUs
Some Panel Comments III
Important emphasis on management tools that give better access to the state of a cloud application
Need to be able to track down sources of failures
High-level parallel (cloud) languages like Dryad or MapReduce will be essential; again, MPI is not relevant because it is too low level
Scaling is a key characteristic of clouds
Many QoS characteristics will be supported, including a guarantee not to run an application in particular countries
Interoperability is not likely to be an early focus
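To make "high level" concrete, here is a minimal single-process sketch of the MapReduce programming model as a word count. It runs sequentially and assumes nothing about Hadoop or Dryad APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. The point is that the programmer writes only the map and reduce logic, with no explicit message passing of the kind MPI requires.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    # Each document is mapped independently; a real framework would run
    # these map calls in parallel across many machines.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["clouds scale", "Clouds need tools"]))
```

In a real deployment the framework, not the user code, would handle distribution, scheduling, and the shuffle between the two phases.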
Some Panel Comments IV
Research issues include better high-level programming models and the use of clouds for science
Although some critical data will not be allowed outside a corporate firewall, the panel did not see “trust” (security) as a critical issue
Two interesting IBM clouds support internal IBM projects (the Innovators cloud) and the IT industry in the city of Wuxi
I/O requirements are high: reading terabytes of data now and soon petabytes
Soon it will be considered quite routine to run queries on the fly against petabytes of data