Published by Molly Lamb. Modified over 6 years ago.
1
Stubbs Lab Bioinformatics - 3
Review
RNA-Seq Analysis Overview
Alignment using Tophat2
Nov 22, 2016 Joe Troy
2
Agenda
Review of tools and Linux commands
Overview of the RNA-Seq analysis
Aligning short reads (.fastq files) with Tophat2 to create alignment files (accepted_hits.bam)
3
Also, to create bigwigs for UCSC track hubs, we use some UCSC software.
4
Linux commands (review and new)
cp        copy.
          copy file ex: cp oldfile.txt newfile.txt
          copy folder ex: cp -R old_folder new_folder
df -h     see how much disk space is on the server
cd        change to new folder. ex: cd my_new_folder
pwd       print working directory, show the current folder
ls -lh    list contents with details such as size and date (l), show file sizes as human readable (h)
rm        PERMANENTLY remove a file or folder. BE CAREFUL.
          ex: rm my_file.txt removes a file named "my_file.txt" in the current working directory.
          ex: rm -r myfolder removes a folder named "myfolder", and all of its contents, in the current working directory.
          ex: rm *.txt removes all files ending with ".txt".
          ex: rm * removes every file in the current working directory.
screen    Screen allows you to start a "sub-process" on stubbslab.igb.illinois.edu, exit that sub-process while it continues to run (allowing you to disconnect from stubbslab.igb.illinois.edu), and reattach to the process at a later time.
sh        used to start a shell script. ex: sh main_script_tophat_16Gso.sh
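One way to practice these commands without risk is to run them inside a throwaway folder. The sketch below assumes a POSIX shell with mktemp available; the file and folder names are the hypothetical ones from the examples above, not real project files.

```shell
#!/bin/sh
# Practice the review commands in a scratch folder so nothing real is deleted.
set -eu

scratch=$(mktemp -d)          # temporary folder (assumes mktemp is installed)
cd "$scratch"
pwd                           # show the current folder

echo "hello" > oldfile.txt
cp oldfile.txt newfile.txt    # copy a file

mkdir old_folder
cp oldfile.txt old_folder/
cp -R old_folder new_folder   # copy a folder and its contents

ls -lh                        # list contents with human-readable sizes
df -h .                       # free space on the disk that holds this folder

rm newfile.txt                # remove one file
rm -r old_folder              # remove a folder and everything in it
rm *.txt                      # remove every file ending in .txt

remaining=$(ls)               # only new_folder should be left
echo "Remaining: $remaining"

cd /
rm -r "$scratch"              # clean up the scratch folder
```

Note that the options use a plain hyphen (-R, -h); a long dash pasted from a slide will not be recognized by the shell.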
5
RNA-Seq data analysis Context and Overview
6
Step 1: Retrieve and un-compress short read files (sftp command, tar command)
  INPUT: .tgz file(s) from ftp.biotec.illinois.edu
  OUTPUT: .fastq short read files
Step 2: Align reads to genome (Tophat 2 script)
  INPUT: .fastq short read files
  OUTPUT: "accepted_hits.bam" file from each ".fastq" file
Next step: review alignment stats
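The un-compress part of the first step can be sketched as follows. This is an illustration only: the archive is built locally as a stand-in for a real download, every file name is hypothetical, and the sftp line is left as a comment because it needs an account on the server.

```shell
#!/bin/sh
# Sketch of step 1: un-compress a .tgz archive into .fastq files.
# The archive is a locally built stand-in; all file names are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# On the server the archive would first be fetched interactively, e.g.:
#   sftp your_username@ftp.biotec.illinois.edu
# "your_username" is a placeholder, not an account from the slides.

# Build a tiny stand-in archive containing one four-line fastq record.
mkdir reads
printf '@read1\nACGT\n+\nIIII\n' > reads/sample_A.fastq
tar -czf sample_reads.tgz reads
rm -r reads

tar -tzf sample_reads.tgz      # t = list the archive contents without extracting
tar -xzf sample_reads.tgz      # x = extract, z = gzip, f = archive file name

fastq_count=$(find reads -name '*.fastq' | wc -l | tr -d ' ')
echo "Extracted $fastq_count fastq file(s)"

cd /
rm -r "$work"
```

Listing with -t before extracting is a cheap way to see where the files will land.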
7
Terminal is used to access the Linux command line on a Mac
8
Instructions for aligning short reads with tophat2
INSTRUCTION SLIDE 1
Josephs-MacBook-Pro:~ josephtroy$ ssh
password:
Last login: Mon Nov 21 20:15: from c hsd1.il.comcast.net
~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda T 4.2T 156G 97% /
/dev/sda G 14G 77G 16% /var
/dev/sdb M 29M 246M 11% /boot
tmpfs G G 0% /dev/shm
/dev/sdb G 116G 145G 45% /var/lib/mysql
~]$ screen
9
Instructions for aligning short reads with tophat2
INSTRUCTION SLIDE 2
~]$ cd /home/share/example_rna_seq_project_16Gso/
example_rna_seq_project_16Gso]$ ls -1
code_010_tophat2
code_020_alignment_summary_report
code_030_MDS_plots
code_040_create_track_hub_bigwigs
code_050_cpm_means_report
code_060_differential_expression_w_edgeR
fastq_files
output_010_tophat2_RUN_ _092530
example_rna_seq_project_16Gso]$ cd code_010_tophat2/
code_010_tophat2]$ ls
main_script_tophat_16Gso.sh
code_010_tophat2]$ sh main_script_tophat_16Gso.sh
Start of Tophat …
NOW HOLD DOWN THE CONTROL KEY AND PRESS a, THEN PRESS d, TO DETACH FROM THE SCREEN SESSION
10
Instructions for aligning short reads with tophat2
DEMONSTRATION SLIDE 3
~]$ screen -ls
There is a screen on:
pts-2.stubbslab (Detached)
1 Socket in /var/run/screen/S-jmtroy2.
~]$ screen -r 11559
[end of tophat]
code_010_tophat2]$ exit
[end of tophat]
code_010_tophat2]$ screen -ls
No Sockets found in /var/run/screen/S-jmtroy2.
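The slides start screen interactively and detach with Control-a, d. A possible alternative, shown here only as a sketch, is to start the job in a named session that is already detached. The session name "tophat_run" is hypothetical, and "sleep 2" stands in for the tophat script.

```shell
#!/bin/sh
# Sketch: run a long job in a named, detached screen session.
# "tophat_run" is a hypothetical session name; "sleep 2" stands in for the real script.
session=tophat_run

if command -v screen >/dev/null 2>&1; then
    # -d -m starts the session detached, -S gives it a name
    if screen -dmS "$session" sh -c 'sleep 2'; then
        status="started"
    else
        status="could not start"
    fi
    screen -ls || true          # list sessions (exit status varies between versions)
    # screen -r "$session"      # reattach by name; Control-a then d detaches again
else
    status="screen not installed"
fi
echo "$status"
```

Naming the session avoids having to look up the numeric id with screen -ls before reattaching.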
11
Review tophat2 output in Cyberduck
12
align_summary.txt
NOTE: The "Mapped" rate of 99.9% is unusually high because of the way the example fastq files were created for the training exercise: they contain only reads that had already been mapped to chromosome 5.
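When there are many samples, the mapping rate can be pulled out of each summary file with a short script instead of opening the files one at a time. The sketch below runs on a made-up stand-in file; the layout it writes is an assumption about what Tophat2 produces for single-end reads, not a copy of the file shown on the slide.

```shell
#!/bin/sh
# Sketch: extract the overall mapping rate from a Tophat2 align_summary.txt.
# The file written below is a made-up stand-in with an assumed layout.
set -eu

summary=$(mktemp)
cat > "$summary" <<'EOF'
Reads:
          Input     :      1000
           Mapped   :       999 (99.9% of input)
99.9% overall read mapping rate.
EOF

# Print the first field of the line that mentions the overall rate.
rate=$(awk '/overall read mapping rate/ { print $1 }' "$summary")
echo "Overall mapping rate: $rate"

rm "$summary"
```

For a real run, the awk line would be pointed at the align_summary.txt inside the tophat2 output folder of each sample.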
13
/home/share/example_rna_seq_project_16Gso/code_010_tophat2/main_script_tophat_16Gso.sh (1 of 2)
14
/home/share/example_rna_seq_project_16Gso/code_010_tophat2/main_script_tophat_16Gso.sh (2 of 2)