Canadian Bioinformatics Workshops www.bioinformatics.ca.

Presentation transcript:

Canadian Bioinformatics Workshops


Module 0: Introduction to cloud computing (slides modified with permission from Francis Ouellette)

RNA sequencing and analysis (bioinformatics.ca)
Learning objectives of the course
- Module 0: Introduction to cloud computing
- Module 1: Introduction to RNA sequencing
- Module 2: RNA-seq alignment and visualization
- Module 3: Expression and differential expression
- Module 4: Isoform discovery and alternative expression
Tutorials
- Provide a working example of an RNA-seq analysis pipeline
- Run in a ‘reasonable’ amount of time with modest computer resources
- Self-contained, self-explanatory, portable

Learning objectives of Module 0
- Introduction to cloud computing
- Use of the wiki(s) in this workshop
- How to log into the cloud

[Figure: Disk Capacity vs Sequencing Capacity. Axes: disk storage (MB/$) and DNA sequencing (bp/$). Series: hard disk storage (MB/$), doubling time 14 months; pre-nextgen sequencing (bp/$), doubling time 19 months; nextgen sequencing (bp/$), doubling time 4 months.]

About DNA and computers
- We'll hit the $1000 genome during 2015-?, then need to think about the $100 genome.
- The doubling time of sequencing has been ~5-6 months.
- The doubling time of storage and network bandwidth is ~12 months.
- The doubling time of CPU speed is ~18 months.
- The cost of sequencing a base pair will eventually equal the cost of storing a base pair.
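
To make the gap between these doubling times concrete, here is a minimal sketch that turns a doubling time into a growth factor, using factor = 2^(months / doubling time). The three-year window and the choice of 6 months for sequencing (the upper end of the ~5-6 month range above) are assumptions made for illustration, not figures from the slides.

```python
def growth_factor(months, doubling_time_months):
    """Return how many times a quantity multiplies over `months`."""
    return 2.0 ** (months / doubling_time_months)


# Doubling times in months, as quoted above; 6 is the upper end of the
# ~5-6 month range given for sequencing (an illustrative choice).
DOUBLING_TIMES = {
    "sequencing": 6,
    "storage and network bandwidth": 12,
    "CPU speed": 18,
}

HORIZON_MONTHS = 36  # illustrative three-year window (an assumption)

for name, doubling in DOUBLING_TIMES.items():
    print(f"{name}: x{growth_factor(HORIZON_MONTHS, doubling):.0f}")
# sequencing: x64
# storage and network bandwidth: x8
# CPU speed: x4
```

Under these assumptions, sequencing output per dollar grows 64-fold in three years while storage grows 8-fold, which is why the amount of data outpaces the hardware available to hold and process it.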

What is the general biomedical scientist to do?
- Lots of data
- Poor IT infrastructure in many labs
- Where do they go?
- Write more grants?
- Get bigger hardware?

Amazon Web Services (AWS)
- Effectively unlimited, scalable storage: S3 (Simple Storage Service)
- Compute by the hour: EC2 (Elastic Compute Cloud)
- Ready when you are
- High-performance computing (HPC): multiple football fields of HPC throughout the world
- HPC capacity is expanded one container at a time

Some of the challenges of cloud computing:
- Not cheap!
- Getting files to and from there
- Not the best solution for everybody
- Standardization
- PHI (personal health information) and security concerns
- In the USA: HIPAA act, PSQIA act, HITECH act, Patriot act, CLIA and CAP programs, etc.

Some of the advantages of cloud computing:
- We received a grant from Amazon, so this workshop is supported by an ‘AWS in Education’ grant award.
- There are better ways of transferring large files, and AWS now makes it free to upload files.
- A number of datasets exist on AWS (e.g. genome data).
- Many useful bioinformatics AMIs (Amazon Machine Images) exist on AWS, e.g. cloudbiolinux and CloudMan (Galaxy), and now there is one for this course!
- Many flavors of cloud are available, not just AWS.

In this workshop:
- Some tools (and data) are on your computer, some are on the web, and some are on the cloud.
- You will become efficient at traversing these various spaces, finding the resources you need, and using what is best for you.
There are different ways of using the cloud:
1. Command line (like your own very powerful Unix box)
2. With a web browser (e.g. Galaxy): not in this workshop

Things we have set up:
- Loaded data files to an FTP server.
- We brought up an Ubuntu (Linux) instance and loaded a whole bunch of software for NGS analysis.
- We then cloned this and made separate instances for everybody in the class.
- We’ve simplified the security: you basically all have the same login and file access, and ports are opened. In your own world you would be more secure.

Amazon AWS Management Console: quick walkthrough

For this workshop: all on Wiki! Login: FirstnameLastname. Password: ‘guest’

The main CBW Wiki

The RNA-seq wiki

Macintosh users

Opening a ‘terminal session’

Creating a working directory on your Mac

Obtaining your AWS ‘key’ file from the wiki
- On Mac: Control+
- On Windows
- Save the key file to your new ‘cbw’ directory

Viewing the ‘key’ file once downloaded

Changing file permissions of your ‘key’ file
ls -l (long listing):
    drwx francis staff May 21:25 ../
    1 francis staff May 21:31 CBW.pem
The permission string has three groups: rwx (owner), rwx (group), rwx (world).
- r: read (4)
- w: write (2)
- x: execute (1)
Whichever way you add these three numbers, you can tell which ones were used (6 is always 4+2, 5 is 4+1, 4 is by itself, 0 is none of them, etc.). So chmod 600 means “rw” for the file owner only.
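
The 4/2/1 arithmetic above can be checked with a short sketch that expands a three-digit mode into its rwx string. The function names `digit_to_rwx` and `mode_to_rwx` are hypothetical helpers written for this illustration; they are not part of the workshop material and do not change any file.

```python
# Permission values as given on the slide: read = 4, write = 2, execute = 1.
BITS = ((4, "r"), (2, "w"), (1, "x"))


def digit_to_rwx(digit):
    """Turn one octal digit (0-7) into a three-character rwx string."""
    return "".join(letter if digit & value else "-" for value, letter in BITS)


def mode_to_rwx(mode):
    """Turn a three-digit mode such as '600' into owner/group/world flags."""
    return "".join(digit_to_rwx(int(ch, 8)) for ch in mode)


for mode in ("600", "644", "755"):
    print(mode, mode_to_rwx(mode))
# 600 rw-------
# 644 rw-r--r--
# 755 rwxr-xr-x
```

The first line of output shows why 600 suits a key file: only the owner can read or write it, and group and world get nothing.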

Logging into AWS (Mac): use your assigned student #

Logging into AWS (Windows): follow the complete instructions on the wiki

Copying files from AWS to your computer

So, at this point:
- Your laptop is ready for the workshop.
- If it is not, you know where to get the information you need.
- You know how to use the wiki for this workshop.
- You know where all of the lectures are.
- You have read all of the pre-lecture material.
- If not, you know where the papers are, and you are a speed reader.
- You know how to log in to AWS.

A much more detailed tutorial on AWS cloud computing… -to-AWS-Cloud-Computing

We are on a Coffee Break & Networking Session