Canadian Bioinformatics Workshops bioinformatics.ca.



Module 1 bioinformatics.ca You are free to: Copy, share, adapt, or re-mix; Photograph, film, or broadcast; Blog, live-blog, or post video of; This presentation. Provided that: You attribute the work to its author and respect the rights and licenses associated with its components. Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only: ccZero. Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at;

Web: Workshop announcement mailing list:

Module 1 Cloud Computing with AWS Zhibin Lu & BF Francis Ouellette High-throughput Sequencing June 10-11,

Learning Objectives: Introduction to cloud computing; Use of the wiki in this workshop; How to log into the cloud; Amazon AWS management console

Cloud computing … and a new software paradigm. Data sets are reaching the petabyte scale. Data (and the security rules that come with it) will live somewhere, and you will move your software to it. The software development paradigm will change: no more reading files into RAM, processing them, and then writing output. You need to think about processing streaming data coming from a sequencing machine somewhere on the net.

[Figure: Disk Capacity vs Sequencing Capacity. Axes: Disk Storage (Mbytes/$) and DNA Sequencing (bp/$). Hard disk storage (MB/$): doubling time = 14 mo. Pre-nextgen sequencing (bp/$): doubling time = 19 mo. Nextgen sequencing (bp/$): doubling time = 4 mo.]

About DNA and computers: We now have the ~$1000 genome, but we need to think more about the cost of the analysis. The time for sequencing cost to halve is in the “many months” range. The doubling time of storage and network bandwidth is in the “very small number of years” range. The doubling time of CPU speed is 18 months. The cost of sequencing a base pair will equal the cost of storing a base pair in the next “very small number” of years.
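As a rough worked example of the doubling-time argument, the sketch below compares two exponential trends. It assumes the doubling times shown on the Disk Capacity vs Sequencing Capacity figure as transcribed (14 months for disk storage per dollar, 4 months for next-generation sequencing per dollar). The function names and the target factor of 1000 are illustrative assumptions, not part of the slides.

```shell
#!/bin/sh
# If one quantity doubles every ts months and another every td months (ts < td),
# their ratio grows as 2^(t/ts - t/td), so the ratio itself doubles every
# 1 / (1/ts - 1/td) months.
relative_doubling_months() {
    awk -v ts="$1" -v td="$2" 'BEGIN { printf "%.1f\n", 1 / (1/ts - 1/td) }'
}

# Months until the ratio has grown by a given factor (illustrative helper).
months_to_factor() {
    awk -v ts="$1" -v td="$2" -v f="$3" \
        'BEGIN { printf "%.1f\n", (log(f) / log(2)) / (1/ts - 1/td) }'
}

relative_doubling_months 4 14    # doubling times in months, as read from the figure
months_to_factor 4 14 1000       # 1000 is an arbitrary example factor
```

Under these assumptions, sequencing output per dollar pulls ahead of storage per dollar quickly, which is the reason the slide shifts attention to the cost of storage and analysis.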

What is the general biomedical scientist to do? Lots of data; Poor IT infrastructure in many labs; Where do they go? Write more grants? Get bigger hardware? Look to the sky?

Genomic companies already there! Typical sequencing company pipeline: [Figure: sequence reads (ACGTACGT AAGTTCGG ATGGCGTA GTCCCTTT TTGGGGTG TAGTGAGG CGCTGATT CGGAGAG), labelled “All of the hard work done here!”]

Most people already there! Google Docs; Dropbox; Netflix; Twitter

Amazon Web Services (AWS). Infinite storage (scalable): S3 (Simple Storage Service). Compute per hour: EC2 (Elastic Compute Cloud). Ready when you are. High Performance Computing: multiple football fields of HPC throughout the world. HPC is expanded one container at a time:

Some of the challenges with cloud computing: Not cheap! Getting files to and from there; Not the best solution for everybody; Standardization; PHI: personal health information & security concerns; In the USA: Patriot Act

Some of the advantages with cloud computing: At the CBW, we are supported by an ‘AWS in Education’ grant award from Amazon. There are better ways of transferring large files, and AWS now makes it free to upload files. A number of datasets exist on AWS (e.g. genome data). Many useful bioinformatics AMIs (Amazon Machine Images) exist on AWS, e.g. cloudbiolinux and CloudMan (Galaxy), as well as the CBW AMI. Many flavors of cloud are available, not just AWS.

In this workshop: Some tools (data) are on your computer, some are on the web, and some are on the cloud. You will become efficient at traversing these various spaces, finding the resources you need, and using what is best for you. There are different ways of using the cloud: 1. Command line (like your own very powerful Unix box); 2. With a web browser (e.g. Galaxy)

This is what a 5 MB hard drive looked like in 1956! What will it be in 2056? “Big Data” is a relative term!

Things we have set up: We loaded data files to AWS. We brought up an Ubuntu (Linux) instance and loaded a whole bunch of software for NGS analysis. We then cloned this and made separate instances for everybody in the class. We’ve simplified the security: you basically all have the same login and file access, and the ports are open. In your own world you would be more secure.

SSH (Secure Shell): An encrypted network protocol; Used to connect to a remote machine/server; Server fingerprint; Public key authentication: public key, private key
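To make the public key / private key idea concrete, here is a minimal sketch that creates a throwaway key pair and prints its fingerprint. It assumes an OpenSSH client is installed; the directory and file names (`demo_dir`, `demo_key`) are made up for illustration and are not the workshop's key. The fingerprint printed here is for a user key; ssh shows the same kind of fingerprint for a server's host key the first time you connect.

```shell
#!/bin/sh
# Hypothetical demo: everything is written to a temporary directory,
# nothing here touches ~/.ssh.
demo_dir=$(mktemp -d)

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t: key type, -f: output file, -N '': empty passphrase (demo only), -q: quiet
    ssh-keygen -q -t ed25519 -N '' -f "$demo_dir/demo_key"

    # demo_key is the private key (kept on your machine);
    # demo_key.pub is the public key (the part a server is given).
    ls "$demo_dir"

    # -l prints the fingerprint of a key file: a short hash for comparing keys by eye.
    ssh-keygen -l -f "$demo_dir/demo_key.pub"
else
    echo "ssh-keygen not found; install an OpenSSH client to try this"
fi
```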

For this workshop: all on the wiki! Login: FirstnameLastname; Password: guest


Logging into the cloud

[Screenshots: Mac/Linux and Windows panels. On Mac: Control+]

ls -l (long listing):
drwx francis staff May 21:25 ../
1 francis staff May 21:31 CBWNY.pem
rwx: owner; rwx: group; rwx: world
r: read (4); w: write (2); x: execute (1)
Whichever way you add these 3 numbers, you know which integers were used (6 is always 4+2, 5 is 4+1, 4 is by itself, 0 is none of them, etc.). So, when you have chmod 600, it is “rw” for the file owner only.
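The permission arithmetic can be checked with a short sketch. The helper name `triplet_to_digit` and the scratch file are illustrative assumptions; the scratch file simply stands in for a key file such as CBWNY.pem.

```shell
#!/bin/sh
# Convert one rwx triplet (e.g. "rw-") to its octal digit: r=4, w=2, x=1.
triplet_to_digit() {
    digit=0
    case "$1" in r??) digit=$((digit + 4)) ;; esac
    case "$1" in ?w?) digit=$((digit + 2)) ;; esac
    case "$1" in ??x) digit=$((digit + 1)) ;; esac
    echo "$digit"
}

triplet_to_digit "rw-"   # read + write
triplet_to_digit "r-x"   # read + execute
triplet_to_digit "---"   # no permissions

# Apply mode 600 to a scratch file and show the permission string from ls -l.
scratch=$(mktemp)
chmod 600 "$scratch"
mode_string=$(ls -l "$scratch" | cut -c1-10)
echo "$mode_string"
```

The three digits of a mode such as 600 are read left to right as owner, group, world, so 600 means owner "rw-", group "---", world "---".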

Mac/Linux Windows ssh -i CBWNY.pem

Mac/Linux Windows ssh -i CBWNY.pem ubuntu
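On Mac/Linux, the login step might look roughly like the sketch below. The host is not given in the slides, so `REMOTE_HOST` is a hypothetical placeholder to replace with your own instance address; the variable names are also illustrative. The script prints the command instead of running it, so it is safe to try without network access.

```shell
#!/bin/sh
# Assumptions (not from the slides): REMOTE_HOST is a placeholder, and the
# key file sits in the current directory.
KEY_FILE="CBWNY.pem"
REMOTE_USER="ubuntu"
REMOTE_HOST="your-instance-address"   # hypothetical; replace with the real host

# ssh refuses to use a private key that other users can read,
# so restrict it to the owner first (if the file is present).
if [ -f "$KEY_FILE" ]; then
    chmod 600 "$KEY_FILE"
fi

# -i selects the identity (private key) file; the target is user@host.
ssh_command="ssh -i $KEY_FILE $REMOTE_USER@$REMOTE_HOST"
echo "$ssh_command"
```

Copy the printed line into a terminal once the placeholder has been replaced.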

Mac/Linux Windows From now on, just double-click CBW to log in.


Amazon AWS Management Console

So, at this point: Your laptop is ready for the workshop; You know how to load and view files in IGV; You know where to get the information you need; You know how to use the wiki for this workshop; You know where all of the lectures are; You have read all of the pre-lecture material (if not, you know where the papers are, and you are a speed reader); You know how to log in to AWS

We are on a Coffee Break & Networking Session. Wish you were here? Register for the Canadian Bioinformatics Workshops: