Short Read Workshop Day 1 - Experimental Design Example 1: How to log in to vieques.



Syllabus
Week 1
– Day 1: Intro to Illumina & Library QC. Instructors: Jamie Kershner/Daniel Malmer
– Day 2: Intro Linux. Instructors: Joey Azofeifa/Daniel Malmer
– Day 3: Moving, clusters, notes. Instructors: Joey Azofeifa/Daniel Malmer
– Day 4: Read QC and Read Trimming. Instructors: Amber Sorenson/Daniel Malmer
– Friday: catch-up day
Week 2
– Day 5: Mapping and visualization. Instructors: Jess Vera/Phil Richmond
– Day 6: Resequencing (genome-seq). Instructors: Phil Richmond/Aaron Odell
– Day 7: RNA-seq and differential expression. Instructors: Aaron Odell/Jess Vera
– Day 8: Peak calling, ChIP-seq analysis. Instructors: Amber Sorenson/Tim Read
– Friday: catch-up day

Course structure
Before class:
– Videos
– Firefox does not always work. Use Chrome, Internet Explorer or Safari
During class (1-5 pm):
– Examples with example files (fastq, sam, bam, bed)
After class:
– Homework

You will be learning the “command line”
This is not a GUI
Spelling and capitalization matter!
Google can help you learn “command line unix”
– The names of commands are not always intuitive (you will learn many this week)

Login
Mac or Unix
– Open Terminal: click the magnifying glass in the corner, type "terminal", and hit Enter
– ssh -X
– Type password
– When/if it says “enter passphrase”, just hit Enter twice!
PC
– Open PuTTY
– Type vieques.colorado.edu under the host name
– Hit Open
– Type identikey
– Type password
– When/if it says “enter passphrase”, just hit Enter twice!

Check your access
pwd
ls
mkdir mine
touch temp.txt
ls
ls -lahtr
cd mine
pwd
ls -lahtr
cd ..
pwd
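Here is the same sequence again with a comment on what each command does. This is only a sketch: it runs inside a throwaway directory made with `mktemp -d`, which is an assumption added here so that nothing in your real home directory changes. On vieques you would simply type the commands in your home directory.

```shell
# Work in a scratch directory (an assumption of this sketch, not part of the slide).
scratch=$(mktemp -d)
cd "$scratch"

pwd               # print the directory you are currently in
ls                # list its contents (nothing yet)
mkdir mine        # make a new directory called "mine"
touch temp.txt    # create an empty file called temp.txt
ls                # now lists mine and temp.txt
ls -lahtr         # long listing, all files, human-readable sizes, oldest first
cd mine           # move into the new directory
pwd               # the path now ends in /mine
ls -lahtr         # an empty directory still shows . and ..
cd ..             # go back up one level (note the space before the dots)
pwd               # back where you started
```

If `pwd` prints the same path at the start and at the end, and `ls` shows both `mine` and `temp.txt`, you can read and write in that directory.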

VPN (Virtual Private Network)
If you are off campus, you must have the VPN installed and turned on before you log in to vieques!
– Otherwise the login will just hang.
internet-services/vpn/

Why use a server (cluster)?
Many computers can often do work much faster than one computer can.
– Many programs are built to run on a server/cluster (multi-threading)
Some things take lots of space or memory
Most bioinformatics programs are written for Unix/Linux
– Because this platform is built by programmers for programmers
Often installing is a pain!

Which server can I use?
BioFrontiers
– Vieques
CU Boulder
– RC (Research Computing)
Pro: they deal with upkeep/installing and it's mostly free
Con: under maintenance a lot, built more for physics problems
Off campus
– We don’t know… (Ask your department? Cloud computing?)

Big versus small
The class will use small files for the sake of time. If you are using human or mouse data, you will likely have much larger files.

Coffee Break…
Videos: read-class
VPN: internet-services/vpn/

Your project
What experiment have you done or are you planning to do?
What method are you using?
Which sequencer?
How many lanes will you sequence?
Will you barcode? Inline or in adapter?
Will you do single or paired end?
How many replicates will you do?
What controls are you using?
How much disk space will you use?
Where will you process the data?

Some general guidelines

$$\text{Coverage} = \frac{\text{Reads} \times \text{Read Length}}{\text{Genome size}}$$

Estimated coverage (DNA):
Application             Coverage
DNA de novo assembly    100X
DNA resequencing        ~30X
SNP analysis            10X to 30X

Estimated reads (RNA):
Genome size        Differential expression
Small (<20Mb)      ~10M
Medium (~100Mb)    ~20M
Large (>1Gb)       ~30M
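The coverage formula is easy to check on the command line. The sketch below wraps it in two small shell functions. The function names `coverage` and `reads_needed` and the example numbers are made up for illustration, and shell arithmetic is integer only, so coverage is rounded down and reads needed is rounded up.

```shell
# coverage READS READ_LENGTH GENOME_SIZE
# Prints fold coverage = reads * read length / genome size (rounded down).
coverage() {
    echo $(( $1 * $2 / $3 ))
}

# reads_needed COVERAGE GENOME_SIZE READ_LENGTH
# Rearranges the same formula to print the number of reads required (rounded up).
reads_needed() {
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Hypothetical example: 10 million 100-base reads on a 20 Mb genome.
coverage 10000000 100 20000000

# Hypothetical example: reads for 30X coverage of the same genome with 100-base reads.
reads_needed 30 20000000 100
```

The first call prints 50 (50X coverage) and the second prints 6000000 (6M reads). For paired-end runs, each read of a pair counts as one read of the stated length.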

Common sequencer output
MiSeq:
– V2 kits: 15M clusters/run. Run options: 1x50, 2x150, 2x250
– V3 kits: 25M clusters/run. Run options: 1x150, 2x75, 2x300
NextSeq:
– HO kits: 400M clusters/run. Run options: 1x75, 2x75, 2x150
– MO kits: 130M clusters/run. Run options: 2x75, 2x150
HiSeq v3:
– 200M clusters/lane
– Run options: 1x50, 2x100

In Class Problem 1, RNA-Seq
You are performing an RNA-seq experiment to look at differential expression in mice
– How do you check your RNA input? What are you looking for?
– What are the different considerations for library prep?

In Class Problem 1, RNA-Seq
You are performing an RNA-seq experiment to look at differential expression in mice
– How do you check your RNA input? What are you looking for?
  Concentration: by Qubit for accuracy, also check DNA concentration
  Quality of total RNA: by Bioanalyzer, RIN>8 is generally recommended
– What are the different considerations for RNA library prep?
  Stranded or not?
  How to remove ribosomal RNA?
  What kit?
  Multiplexing?
  How much sequencing do you need to do (#reads/sample)?

In Class Problem 1, RNA-Seq
You are performing an RNA-seq experiment to look at differential expression in mice
– How do you check your RNA input? What are you looking for?
– What are the different considerations for library prep?
You are running 3 different conditions, each with 3 biological replicates
– Which sequencing platform and run type do we need? Can we do this experiment in a single run/lane?
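One way to approach the single run/lane question is to multiply it out, as in the sketch below. It takes ~30M reads per sample from the guidelines table (mouse is a large genome, >1Gb) and the cluster counts from the sequencer output slide. The function name `lanes_needed` is made up for illustration, and the sketch assumes each cluster yields one usable read or read pair and ignores clusters lost to quality filtering.

```shell
# lanes_needed SAMPLES READS_PER_SAMPLE CLUSTERS_PER_LANE
# Prints how many lanes (or runs) are needed, rounded up.
lanes_needed() {
    total=$(( $1 * $2 ))
    echo $(( (total + $3 - 1) / $3 ))
}

samples=$(( 3 * 3 ))   # 3 conditions x 3 biological replicates

lanes_needed "$samples" 30000000 200000000   # HiSeq v3: 200M clusters/lane
lanes_needed "$samples" 30000000 400000000   # NextSeq HO kit: 400M clusters/run
```

Under these assumptions the experiment needs about 270M reads in total, so it takes 2 HiSeq v3 lanes but fits in 1 NextSeq HO run. Fewer reads per sample or a different kit would change the answer.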

In Class Problem 2
You want to resequence the human genome to find a dominant heterozygous mutation.
– How much sequencing coverage do you need?
– What sequencing format is the best?
– How many reads do you need?
– Do you need multiple runs or lanes?
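As a starting point for discussion, the arithmetic can be sketched as below. The inputs are assumptions rather than the official answer: a human genome of about 3 Gb, the ~30X resequencing coverage from the guidelines table, and a paired-end 2x100 HiSeq v3 run at 200M clusters/lane. A heterozygous mutation appears in only about half of the reads at its position, which is one reason to be generous with coverage.

```shell
genome=3000000000         # assumed human genome size, about 3 Gb
cov=30                    # ~30X, from the resequencing row of the guidelines table
read_len=100              # 2x100 run: each read is 100 bases
lane_clusters=200000000   # HiSeq v3: 200M clusters/lane

# Reads = coverage * genome size / read length
reads=$(( cov * genome / read_len ))

# Paired end: each cluster gives two reads, so halve (rounding up).
clusters=$(( (reads + 1) / 2 ))

# Lanes needed, rounded up.
lanes=$(( (clusters + lane_clusters - 1) / lane_clusters ))

echo "$reads reads, $clusters clusters, $lanes lanes"
```

With these numbers the sketch prints 900000000 reads, 450000000 clusters, 3 lanes, so a single HiSeq v3 lane would not be enough.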