Clouds in Bioinformatics Rob Knight HHMI and University of Colorado at Boulder.

How we use infrastructure clouds
- New instruments collect vast datasets: investigators care about time-to-result (especially for diagnostics and pathogen discovery), and demand is highly elastic.
- Workflow services (Galaxy, CIPRES, QIIME) deployed to the cloud allow rapid processing of sequence data for tasks including genome assembly and microbial community analysis.
- Of increasing interest: cloud-enabled GUIs that allow biologists to run analyses themselves using vast compute resources.

Appeal of infrastructure clouds?
- Large resources (e.g. EC2) can be used on demand at trivial cost compared with buying a resource that handles peak load (e.g. the cores needed to process 1.6 TB from a HiSeq run overnight; more for interactive use).
- Greatly reduced hardware support and systems administration burden compared with purchasing our own machine.
- Third parties using our software can easily pay for their own compute cycles rather than drawing from our resources or allocations.
- The machine image is easily used by end users with no prior cloud experience (installation on arbitrary clusters is a nightmare).
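
The cost argument on this slide can be made concrete with a small arithmetic sketch. The two functions below are illustrative only: the cost model (purchase price spread evenly over a hardware lifetime plus a flat yearly administration cost) and every parameter name are assumptions, not figures from the talk, and any numbers passed in are placeholders to be replaced with real quotes.

```python
def on_demand_cost_per_year(core_hours_per_run, runs_per_year, price_per_core_hour):
    """Yearly cost of renting cores only while a run is being processed."""
    return core_hours_per_run * runs_per_year * price_per_core_hour


def owned_cost_per_year(peak_cores, purchase_price_per_core,
                        lifetime_years, yearly_admin_cost):
    """Yearly cost of owning enough cores to cover peak load.

    Assumes (hypothetically) that the purchase price is spread evenly
    over the hardware lifetime and that administration is a flat cost.
    """
    return peak_cores * purchase_price_per_core / lifetime_years + yearly_admin_cost


def cheaper_option(on_demand, owned):
    """Name the cheaper option for the given yearly costs."""
    return "on-demand" if on_demand < owned else "owned"
```

The point of the comparison is that the owned machine is paid for whether or not it is busy, while on-demand cost scales with the number of runs, which is what makes elastic demand a good fit for the cloud.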

Cloud Challenges
- Difficult to get large datasets into the cloud: we must currently mail drives; larger pipes are needed!
- Unreliability: we have had to rewrite code to deploy on EC2, mostly to deal with random node and network failures; checkpointing is apparently more critical than on a traditional cluster.
- Existing clouds are suited to only a subset of bio tasks: higher-memory configurations with local storage are especially needed to minimize problems with I/O-bound tasks.
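
The checkpointing point above can be illustrated with a minimal sketch of a checkpointed task farm. This is not the code that was rewritten for EC2; the function names, the JSON checkpoint file, and the choice of `OSError` as the stand-in for a node or network failure are all assumptions made for illustration. Task IDs are assumed to be strings so they survive the JSON round trip.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved {task_id: result} mapping, or an empty dict."""
    if not os.path.exists(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


def save_checkpoint(path, results):
    """Write results to a temp file, then rename it over the checkpoint.

    The rename is atomic, so a crash mid-write cannot corrupt the file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(results, fh)
    os.replace(tmp, path)


def run_with_checkpoints(tasks, worker, path, max_retries=3):
    """Run worker(task_input) for every task not already checkpointed.

    A task that raises OSError (hypothetical stand-in for a node or
    network failure) is retried up to max_retries times. The checkpoint
    is saved after each completed task.
    """
    results = load_checkpoint(path)
    for task_id, task_input in tasks.items():
        if task_id in results:
            continue  # finished in an earlier run; skip it
        for attempt in range(1, max_retries + 1):
            try:
                results[task_id] = worker(task_input)
                break
            except OSError:
                if attempt == max_retries:
                    raise
        save_checkpoint(path, results)
    return results
```

Restarting the same call after a crash picks up from the last saved task, which suits the loosely coupled, long-running tasks described in the talk.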

Compare to other outsourcing
- Academic collaborations: clouds such as EC2 and Magellan are vastly easier to use than a native install on an arbitrary and surprisingly configured cluster. Resource availability is much better, especially near grant deadlines. The disadvantage is cost (commercial) or reliability (current academic systems).
- Advantages relative to TeraGrid (TG): ease of use; actually works for our applications (e.g. taskfarming and other loosely coupled tasks, especially long-running tasks, which are not a priority for TG); responsive administrators; homogeneous across resources in a given cloud. No disadvantage from my perspective.
- Advantages relative to commercial: transparency of how a result was obtained and ability to reproduce the computation. No disadvantage from my perspective.