Published by Laureen Wood. Modified over 9 years ago.
1
Trinity College Dublin, The University of Dublin
Data download: bioinf.gen.tcd.ie/GE3M25/project
Get the .fastq.gz file associated with your student ID
http://bioinf.gen.tcd.ie/GE3M25/project
2
GE3M25: Data Analysis, Class 3
Karsten Hokamp, PhD
Genetics TCD, 30/11/2015
3
GE3M25 Data Handling
Module Content:
- Python Programming
- Bioinformatics
- ChIP-Seq analysis
4
Class 3: Project Data
- Download project data
- Quality control
- Trimming
- Read mapping
- Visualisation
5
Next Generation Sequencing - Applications
Source: Xu F, Wang Q, Zhang F, Zhu Y, Gu Q, Wu L, Yang L, Yang X. Impact of Next-Generation Sequencing (NGS) technology on cardiovascular disease research. Cardiovasc Diagn Ther 2012;2(2):138-146.
6
ChIP-Seq Basics
ChIP = Chromatin ImmunoPrecipitation
Chromatin = highly ordered packaging of DNA and histones together
Source: Bio-Rad
7
Chromatin = highly ordered packaging of DNA and histones together
Source: Rosa, S.; Shaw, P. Insights into Chromatin Structure and Dynamics in Plants. Biology 2013, 2, 1378-1410.
8
ChIP-Seq Basics
Immunoprecipitation (IP) is the technique of precipitating a protein antigen out of solution using an antibody that specifically binds to that particular protein.
10
GE3M25 Project
Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast)
2. Quality control (FastQC)
3. Storage of FastQC report file
4. Read mapping (Bowtie2)
5. Generate indexed and sorted BAM file
6. Visualisation (IGV)
7. Store BAM and index files
11
GE3M25 Project
Optional steps in this class:
1. Trimming by quality (UrQt)
2. Trimming for Illumina Universal Adapter (trim_galore)
3. Trimming for other adapters (trim_galore)
4. Other read mapper (BWA)
5. Comparison of results
6. Upload of most suitable BAM and index files
12
Working on the Command Line
Start: Open 'Terminal' from Spotlight or Dock
13
GE3M25 Project Step 1: Download data
1. Browse to bioinf.gen.tcd.ie/GE3M25/project
2. Locate the file with your student ID
3. Click to download
4. Check the Downloads folder for the file
14
GE3M25 Project Step 2: Quality Control with FastQC
1. Download FastQC
2. Load the (compressed) FastQ file
3. Save report
4. Rename to start with full Student ID
15
GE3M25 Project Step 2: Info for project report
1. Data details (# sequences, read length, etc.)
2. Comments on quality aspects
3. Highlight of potential issues
4. Discuss ways to clean up data
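The data details for the report (number of sequences, read lengths) can also be cross-checked outside FastQC. The following is a minimal Python sketch, not part of the original slides; it assumes a standard four-line-per-record FastQ file, and the function name `summarise_fastq` is made up for illustration.

```python
import gzip
from collections import Counter


def summarise_fastq(path):
    """Return (number of reads, Counter of read lengths) for a FastQ file.

    Assumption: every record occupies exactly four lines
    (header, sequence, '+', qualities), i.e. sequences are not wrapped.
    """
    # Read gzip-compressed files transparently, plain text otherwise.
    opener = gzip.open if str(path).endswith(".gz") else open
    n_reads = 0
    lengths = Counter()
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of each four-line record.
            if line_number % 4 == 1:
                n_reads += 1
                lengths[len(line.strip())] += 1
    return n_reads, lengths
```

Usage would look like `n_reads, lengths = summarise_fastq("XXX.fastq.gz")`, with XXX replaced by your own file name; the numbers should agree with the "Basic Statistics" section of the FastQC report.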
16
Quality Information
Conversion of quality score:
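The conversion table from the slide is not preserved in this transcript. As a hedged illustration, the sketch below shows the usual Phred conversion, assuming the Phred+33 (Sanger / Illumina 1.8+) encoding; whether the project data uses this encoding should be checked in the FastQC report. The function names are chosen for this example only.

```python
def phred_from_char(char, offset=33):
    """Convert one FastQ quality character to its Phred score.

    Assumption: Phred+33 encoding (offset 33); older Illumina
    pipelines used offset 64 instead.
    """
    return ord(char) - offset


def error_probability(phred_score):
    """Convert a Phred score Q to the error probability p = 10^(-Q/10)."""
    return 10 ** (-phred_score / 10)


def mean_quality(quality_string, offset=33):
    """Average Phred score over one read's quality string."""
    scores = [phred_from_char(c, offset) for c in quality_string]
    return sum(scores) / len(scores)
```

For example, the character `I` corresponds to Q = 40 under Phred+33, i.e. an expected error rate of 1 in 10,000 for that base call.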
17
GE3M25 Project Step 3: Storage of FastQC report
1. Open the HTML report in a browser
2. Copy and paste information into a Word document, or Ctrl-click to copy images (or use Grab for screenshots)
3. Mail the document to yourself, store it on USB/network, or upload the HTML file through bioinf.gen.tcd.ie/GE3M25/project
18
GE3M25 Project Step 4: Read mapping
1. Download bowtie2 programs and reference sequence: bioinf.gen.tcd.ie/GE3M25/data_handling
2. Switch to Terminal for command line work
3. Extract bowtie2 programs:
    tar zxvf bowtie2.tgz
   Or:
    tar xvf bowtie2.tar
4. Build index:
    ./bowtie2-build S288C_reference_sequence_R64-2-1_20150113.fsa yeast
5. Map reads with default parameters:
    ./bowtie2 -x yeast -U XXX.fastq.gz -p 4 > bowtie2_def.sam
19
GE3M25 Project Step 4
20
GE3M25 Project Step 4: Read mapping
Replace! In the mapping command, XXX stands for the name of your own FastQ file:
    ./bowtie2 -x yeast -U XXX.fastq.gz -p 4 > bowtie2_def.sam
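For anyone who prefers to drive the mapping from Python (the module's programming language) rather than typing the commands, here is a minimal sketch. It only uses the bowtie2 options shown on the slide (`-x`, `-U`, `-p`); the helper names are hypothetical, and it assumes the bowtie2 programs sit in the current directory as in the steps above.

```python
import subprocess


def bowtie2_build_command(reference_fasta, index_name):
    """Argument list for building a bowtie2 index (as in step 4 of the slide)."""
    return ["./bowtie2-build", reference_fasta, index_name]


def bowtie2_map_command(index_name, fastq, threads=4):
    """Argument list for mapping unpaired reads with default parameters."""
    return ["./bowtie2", "-x", index_name, "-U", fastq, "-p", str(threads)]


def run_to_file(command, output_path):
    """Run a command and write its standard output to a file.

    This mirrors the shell redirection '> bowtie2_def.sam';
    check=True raises an error if the program exits with a failure code.
    """
    with open(output_path, "w") as out:
        subprocess.run(command, stdout=out, check=True)
```

A run would then be `run_to_file(bowtie2_map_command("yeast", "XXX.fastq.gz"), "bowtie2_def.sam")`, again with XXX replaced by your own file name.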
21
Working on the Command Line: the Prompt
Parts of the prompt: user, host, directory, symbol
Spaces are important!
23
GE3M25 Project Step 5: Generate indexed and sorted BAM file
Sequence Alignment/Map (SAM) Format
- Standard format for read mapping results
- Can be compressed to save space: binary SAM = BAM format
- Can be indexed for random access
- samtools allows viewing and processing SAM data
24
GE3M25 Project Step 5: samtools
Download from bioinf, chmod and run:
    ls -l samtools
    chmod +x samtools
    ./samtools
25
GE3M25 Project Step 5: samtools view options
26
SAM Format
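The slide's illustration of the SAM format is not preserved here. As a rough substitute, this Python sketch splits one alignment line into the eleven mandatory, tab-separated SAM fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) plus any optional tags. The `SamRecord` type and `parse_sam_line` function are illustrative names, not part of samtools.

```python
from collections import namedtuple

# The eleven mandatory SAM columns, followed by a list of optional tags.
SamRecord = namedtuple(
    "SamRecord",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual tags",
)


def parse_sam_line(line):
    """Parse one SAM alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]
    # Numeric columns are converted to int; everything else stays text.
    return SamRecord(
        qname, int(flag), rname, int(pos), int(mapq), cigar,
        rnext, int(pnext), int(tlen), seq, qual, fields[11:],
    )
```

Applied to a line taken from `bowtie2_def.sam`, `parse_sam_line(line).rname` gives the chromosome the read was mapped to and `.pos` its 1-based leftmost position.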
27
GE3M25 Project Step 5
View SAM file:
    ./samtools view -S bowtie2_def.sam | less
Change into BAM format:
    ./samtools view -bS bowtie2_def.sam > bowtie2_def.bam
Sort BAM file:
    ./samtools sort bowtie2_def.bam bowtie2_def_sorted
Index BAM file:
    ./samtools index bowtie2_def_sorted.bam
29
GE3M25 Project Step 6: Visualisation with IGV (Integrative Genomics Viewer)
1. Download IGV (local copy on bioinf)
2. Unpack (on the command line):
    unzip IGV_2.3.66.app.zip
3. Start by double-click in Finder
4. Load S. cerevisiae (sacCer3) genome
5. Load BAM file
30
GE3M25 Project Step 6: Visualisation with IGV (Integrative Genomics Viewer)
31
Exercises
- Clean your data via trimming
- Run bowtie2 with different parameters
- How do these steps affect the number of mapped reads?
- How do they affect the peaks that you see in IGV?
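One way to compare the number of mapped reads between runs is to count them directly in the SAM output. The sketch below is an illustration, not the method prescribed in the class: it relies on the SAM convention that FLAG bit 0x4 marks an unmapped read, and it assumes one alignment line per read (as produced by the unpaired default run above). The function name `count_mapped` is made up for this example.

```python
def count_mapped(sam_lines):
    """Count mapped and unmapped reads in an iterable of SAM lines.

    Header lines (starting with '@') and blank lines are skipped.
    A read counts as unmapped when bit 0x4 of its FLAG field is set.
    """
    mapped = 0
    unmapped = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if flag & 0x4:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

For instance, `with open("bowtie2_def.sam") as handle: mapped, unmapped = count_mapped(handle)` gives the counts for the default run; repeating this for a trimmed or re-parameterised run makes the comparison explicit. bowtie2 also prints an alignment summary when it finishes, which should agree with these numbers.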
32
GE3M25 Project Step 7: Storage of BAM file
Upload the BAM and .bam.bai files through bioinf.gen.tcd.ie/GE3M25/project
33
Don't forget to log out!