RNA Sequencing Day 7 Wooohoooo!

Aaron Odell June 17th 2015

Outline For The Day
Map raw fastq files to reference genome
Convert mapped data to visualization-ready files
“Quantify” mapped data over reference genome genes/annotations
Differential Expression

Data Set
Mouse RNA-Seq (mm10)
PolyA, strand-specific library
Ethanol treatment
2 x 100 paired-end reads = 2 fastq files
mm10 reference genome (.fasta file)
Ensembl mm10 gene annotation file (.gtf)
Due to time and compute constraints, we will only be working with chromosome 19.
Pull up the .gtf to see what it looks like.
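For a first look at a .gtf, it helps to know that every record is one tab-separated line with nine columns: sequence name, source, feature type, start, end, score, strand, frame, and an attributes field. The sketch below builds a tiny made-up sample (the gene and transcript IDs are hypothetical, not taken from the Ensembl file) and tallies the feature types in column 3. On the real annotation you would point the same commands at your copy of chr19.gtf.

```shell
# Tiny hypothetical GTF sample written to a temp file (NOT real Ensembl records).
gtf_sample=$(mktemp)
printf '19\tprotein_coding\tgene\t1000\t5000\t.\t+\t.\tgene_id "GENE_A";\n' > "$gtf_sample"
printf '19\tprotein_coding\texon\t1000\t1200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";\n' >> "$gtf_sample"
printf '19\tprotein_coding\texon\t4800\t5000\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";\n' >> "$gtf_sample"

# Peek at the first records, as you would with the real chr19.gtf.
head -n 2 "$gtf_sample"

# Count how many records there are of each feature type (column 3).
feature_counts=$(cut -f3 "$gtf_sample" | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd' ' -)
echo "$feature_counts"
```

The attributes column (column 9) is where gene_id lives, and gene_id is what a quantification tool typically uses to group exons into genes.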

What Mapping Program To Use?
We need a program that is able to handle spliced reads: Tophat2, GSNAP, STAR, etc.
We will be using Tophat2.
Excellent spliced alignment program evaluation

First things first
Log onto vieques and load all the modules we will need to run the analysis.

module load commands
module load python_2.7.3
module load samtools_0.1.16
module load numpy_1.6.1
module load pysam_0.8.1
module load bowtie_bowtie
module load tophat_2.0.6
module load htseq_0.6.1

Let's begin…
cp -r /projects/sreadgrp/RNA /Users/username/
Edit /Users/username/RNA/SCRIPTS/ILS_E_tophat_samtools.pbs in a text editor
Change all “odella” to your actual username. Example: /Users/odella/ -> /Users/your_username/
Change the address after “#PBS -M” to your actual email address (or, if you don't want the email, do nothing)
From the /Users/username/RNA/SCRIPTS directory:
qsub ILS_E_1_tophat_samtools.pbs
Have half the class run the copy first, wait a minute or two, then have the other half of the class do the copy.
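If you would rather not do the username replacement by hand, a stream edit does the same thing. This is a minimal sketch: the demo script contents below are made up for illustration (the real .pbs file is not shown here), and your_username is a placeholder for your own login name.

```shell
new_user="your_username"   # placeholder: substitute your own login name

# Hypothetical stand-in for the PBS script, just so the example is self-contained.
demo_script=$(mktemp)
cat > "$demo_script" <<'EOF'
#PBS -N ILS_E_demo
cd /Users/odella/RNA/ILS_E
echo "writing results under /Users/odella/RNA/ILS_E"
EOF

# Replace every occurrence of "odella" and write the result to a new file,
# leaving the original untouched until you have checked the output.
edited_script=$(mktemp)
sed "s/odella/${new_user}/g" "$demo_script" > "$edited_script"

# Confirm nothing was missed: grep -c prints the number of lines still matching.
remaining=$(grep -c "odella" "$edited_script" || true)
echo "lines still containing odella: $remaining"
cat "$edited_script"
```

Writing to a separate file first keeps the original around in case the replacement touches something it should not.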

While we wait…
Edit /Users/username/RNA/SCRIPTS/ILS_E_htSeqCount.pbs
Change all “odella” to your actual username
htSeqCount manual
Once /Users/username/RNA/SCRIPTS/ILS_E_tophat_samtools.pbs has run to completion:
qsub /Users/username/RNA/SCRIPTS/ILS_E_htSeqCount.pbs
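The contents of the two PBS scripts are not shown here, so the following is only a guess at the kind of commands such a pipeline usually chains together: align with tophat2, sort and index the alignments with samtools, then count with htseq-count. It is written as a dry run that prints the commands instead of executing them. The Bowtie index name, the fastq paths, and the location of the counts file are all assumptions for illustration.

```shell
# Every path below is a placeholder or an assumption, not the course's actual layout.
sample="ILS_E"
work_dir="/Users/username/RNA"
bowtie_index="$work_dir/GENOME/chr19"          # hypothetical Bowtie index base name
reads_1="$work_dir/FASTQ/${sample}_1.fastq"    # hypothetical fastq location
reads_2="$work_dir/FASTQ/${sample}_2.fastq"    # hypothetical fastq location
gtf="$work_dir/GENOME/chr19.gtf"
out_dir="$work_dir/$sample"

# Dry run: print each step rather than running it.
print_pipeline() {
  # 1. Spliced alignment; TopHat writes accepted_hits.bam into the output directory.
  echo "tophat2 -o $out_dir -G $gtf $bowtie_index $reads_1 $reads_2"
  # 2. Coordinate sort (samtools 0.1.x takes an output prefix, not a file name).
  echo "samtools sort $out_dir/accepted_hits.bam $out_dir/${sample}_sorted"
  # 3. Index the sorted BAM so a genome browser can jump to any region.
  echo "samtools index $out_dir/${sample}_sorted.bam"
  # 4. Count per gene; -r pos tells htseq-count the BAM is position sorted.
  #    The right -s (strandedness) setting depends on the library prep.
  echo "htseq-count -f bam -r pos $out_dir/${sample}_sorted.bam $gtf > $out_dir/${sample}_htSeq.txt"
}

pipeline_steps=$(print_pipeline)
echo "$pipeline_steps"
```

Comparing this printout with the real scripts is a quick way to see which options the course pipeline actually sets.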

For those of you who are bored…
Read through the tophat and htseq manuals.
Take a look at the various files created by our mapping/quantification pipeline. The Tophat manual explains what all these files are and where they come from.
Run this command from the /Users/username/RNA/ILS_E directory:
samtools flagstat ILS_E_sorted.bam
Do you understand the output? Search the internet or ask someone if you don't.

Time to visualize!
Copy bam and genome files to student directory
cp /Users/username/RNA/ILS_E/ILS_E_sorted.bam /projects/sreadgrp/student/username/ILS_E_sorted.bam
cp /Users/username/RNA/ILS_E/ILS_E_sorted.bam.bai /projects/sreadgrp/student/username/ILS_E_sorted.bam.bai
cp /Users/username/RNA/GENOME/chr19.fasta /projects/sreadgrp/student/username/chr19.fasta
cp /Users/username/RNA/GENOME/chr19.fasta.fai /projects/sreadgrp/student/username/chr19.fasta.fai
cp /Users/username/RNA/GENOME/chr19.gtf /projects/sreadgrp/student/username/chr19.gtf

Load the appropriate files
Load Genome from File: chr19.fasta
Load from File: chr19.gtf, ILS_E_sorted.bam

Look at ILS_E_htSeq.txt
Count the number of reads you see mapping in IGV over any given gene and compare to the htSeqCount value.
Does it make sense?
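To make the comparison easier, you can pull single genes out of the counts file from the command line. htseq-count writes one tab-separated line per gene (ID, then count), followed by a few summary lines whose names start with a double underscore, such as __no_feature and __ambiguous. The sketch below uses a small made-up counts file; the gene IDs and numbers are hypothetical.

```shell
# Hypothetical counts in htseq-count's two-column layout (NOT real results).
counts_file=$(mktemp)
printf 'GENE_A\t120\nGENE_B\t0\nGENE_C\t45\n__no_feature\t300\n__ambiguous\t12\n' > "$counts_file"

# Look up the count for one gene of interest.
gene_count=$(awk -F'\t' -v gene="GENE_C" '$1 == gene {print $2}' "$counts_file")
echo "GENE_C count: $gene_count"

# Find the most highly counted gene, skipping the "__" summary lines.
top_gene=$(grep -v '^__' "$counts_file" | sort -k2,2nr | head -n 1 | cut -f1)
echo "top gene: $top_gene"

# Total number of counts assigned to genes.
assigned_total=$(grep -v '^__' "$counts_file" | awk -F'\t' '{sum += $2} END {print sum}')
echo "assigned total: $assigned_total"
```

When comparing against IGV, keep in mind that with paired-end data htseq-count counts each read pair once, so the number of individual reads drawn over a gene can be roughly double the reported count. Reads that overlap more than one gene or that do not map uniquely are also left out of the per-gene totals.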
