1
Next Generation Sequencing analysis
June 6th, 2017
2
Course instructors: Antonio Marco, Stuart Newman, Vladimir Teif
3
Course plan:
Introductory lecture
Lunch
ChIP-seq practical
RNA-seq practical
Integrative analysis
4
1st Generation Sequencing
5
Microarrays
Affymetrix microarrays
6
2nd (Next) Generation Sequencing
Illumina MiSeq
7
Microarrays and NGS are used for different purposes
8
NGS METHODS AND THEIR APPLICATIONS
Chromatin domains Hi-C Figure adapted from
9
NGS data types
RNA-seq, GRO-seq, CAGE, SAGE, CLIP-seq, Drop-seq: gene expression; non-coding RNA
ChIP-seq, MNase-seq, DNase-seq, ATAC-seq, etc.: protein binding; histone modifications; chromatin accessibility; nucleosome positioning
Bisulfite sequencing: DNA methylation
Hi-C, 3C, 4C, ChIA-PET, etc.: chromatin loops in 3D
Amplicon sequencing: targeted regions; phylogenomics; metagenomics
Whole Genome Sequencing (WGS): de novo assembly (new species or new analyses)
Curated bibliography of NGS methods (~100 methods) can be found at
12
Where to get NGS data?
Do your own experiment
Gene Expression Omnibus (GEO)
Sequence Read Archive (SRA)
European Nucleotide Archive
The Cancer Genome Atlas (TCGA)
Exome Aggregation Consortium (ExAC)
You also have to upload your data!
13
How to analyze NGS data?
Ask a bioinformatician: you need to explain what you want, and for that you need to understand what can be done and how
Do it yourself:
Command line -> become a bioinformatician
Online wrappers -> simpler, but with file size limits
Example of a convenient online tool: Galaxy
14
ChIP-seq experiment workflow
1. Crosslink protein-DNA complexes in situ
2. Isolate nuclei and fragment DNA (sonication or digestion)
3. Immunoprecipitate with antibody against target nuclear protein and reverse crosslinks
4. Release DNA, prepare sequencing library and submit for sequencing
Adapted from
15
ChIP-seq analysis workflow
16
NGS output after sequencing: .fastq files (FASTQ format)
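To make the FASTQ layout concrete, here is a minimal sketch of a reader. It assumes the common four-lines-per-read layout (header starting with "@", sequence, "+" separator, quality string) and Phred+33 quality encoding; the function names and the two toy reads are hypothetical and not part of the course material.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (read_id, sequence, quality) tuples from a FASTQ text stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"Malformed FASTQ header: {header!r}")
        yield header[1:], sequence, quality


def phred33_scores(quality):
    """Convert a quality string to integer scores (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in quality]


# Two toy reads standing in for a real .fastq file.
example = StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n")
records = list(read_fastq(example))
for read_id, sequence, quality in records:
    print(read_id, sequence, phred33_scores(quality))
```

Real files are usually far too large to hold in memory, which is why the reader yields one record at a time.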
17
NGS data after mapping: .bed files (BED format)
Bowtie, BWA, ELAND, Novoalign, BLAST, ClustalW
TopHat (for RNA-seq)
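As an illustration of what a BED file contains, the sketch below parses tab-separated BED lines. It assumes the standard column order (chromosome, 0-based start, exclusive end, then optional name, score and strand); the class name, function name and toy lines are hypothetical.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str = "."
    strand: str = "."


def parse_bed_line(line):
    """Parse one tab-separated BED line into a BedInterval."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"BED line needs at least 3 columns: {line!r}")
    name = fields[3] if len(fields) > 3 else "."
    strand = fields[5] if len(fields) > 5 else "."
    return BedInterval(fields[0], int(fields[1]), int(fields[2]), name, strand)


# Two toy mapped reads in 6-column BED form.
bed_text = "chr1\t100\t150\tread1\t0\t+\nchr1\t120\t170\tread2\t0\t-\n"
intervals = [parse_bed_line(line) for line in bed_text.splitlines()]
lengths = [iv.end - iv.start for iv in intervals]
print(intervals, lengths)
```

Because the end coordinate is exclusive, the length of a region is simply end minus start.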
18
Data view in genome browsers
Jung et al., NAR 2014
UCSC Genome Browser (online)
IGV (install on a local computer)
19
Peak shapes can be different
Park P. J., Nature Genetics, 2009
20
ChIP-seq: reads to peaks/regions
MACS2 (universal)
HOMER (universal)
SICER (histones)
PeakSeq
edgeR
CisGenome
Park P. J., Nature Genetics, 2009
21
RNA-seq: reads to genes/regions
DESeq, edgeR, Cuffdiff
22
DNA methylation data
DMRcaller, Bismark
23
Intersecting genomic regions
BEDTools (command line)
Galaxy (online)
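The idea of intersecting two sets of genomic regions can be sketched in a few lines. This is a naive all-against-all comparison for illustration only, not how BEDTools is implemented; it assumes half-open (chrom, start, end) tuples, and the region lists are made up.

```python
def overlap(a, b):
    """Return the shared part of two (chrom, start, end) regions, or None."""
    if a[0] != b[0]:
        return None  # different chromosomes never overlap
    start, end = max(a[1], b[1]), min(a[2], b[2])
    return (a[0], start, end) if start < end else None


def intersect(regions_a, regions_b):
    """Collect every overlap between two region lists (quadratic time)."""
    hits = []
    for a in regions_a:
        for b in regions_b:
            shared = overlap(a, b)
            if shared is not None:
                hits.append(shared)
    return hits


# Hypothetical ChIP-seq peaks and promoter regions.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
promoters = [("chr1", 150, 250), ("chr2", 300, 400)]
shared_regions = intersect(peaks, promoters)
print(shared_regions)
```

For genome-scale data the regions would normally be sorted first so that each list is scanned only once.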
24
Genomic features are also regions
Is ChIP-seq signal enriched there?
Mattout et al., Genome Biology, 2015
25
Let’s look at many similar regions
deepTools 2.0
26
ChIP-seq heat maps for all genes, scaled with respect to their start (TSS) and end (TES)
deepTools 2.0
27
Cluster heatmaps
deepTools 2.0
28
Comparing cluster heatmaps between two cell conditions
NucTools
29
Histone modifications around TSS
30
NGS data integration
31
Different datasets in several tracks of a genome browser
5mC
Gifford et al., Cell 2013
32
Heat maps again: Signal from data 1 around regions in data 2
Here: nucleosome occupancy around bound CTCF in mouse stem cells
Vainshtein et al., BMC Genomics 2017
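Averaging the signal from one dataset around the regions of another can be sketched as follows. This is a simplified illustration, assuming a per-base signal stored as a list for a single chromosome and region centres given as positions; it is not the method used by deepTools or NucTools, and the numbers are invented.

```python
def aggregate_profile(signal, centers, flank):
    """Average per-base signal in a window of +/- flank around each center."""
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for center in centers:
        lo, hi = center - flank, center + flank + 1
        if lo < 0 or hi > len(signal):
            continue  # skip windows that run off the chromosome
        for offset, value in enumerate(signal[lo:hi]):
            totals[offset] += value
        used += 1
    if used == 0:
        raise ValueError("no window fits inside the signal")
    return [total / used for total in totals]


# Toy signal (data 1) and two region centres (data 2).
signal = [0, 1, 5, 1, 0, 0, 2, 6, 2, 0]
centers = [2, 7]
profile = aggregate_profile(signal, centers, flank=2)
print(profile)
```

Stacking the individual windows as rows instead of averaging them gives the matrix behind a heat map.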
33
Correlation analysis: any 2 datasets can be correlated
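As a minimal sketch of correlating two datasets, the example below computes the Pearson correlation coefficient between signals that are assumed to have already been summarised over the same genomic bins. The three toy signals are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binned read counts for three datasets.
signal_a = [1, 2, 3, 4, 5]
signal_b = [2, 4, 6, 8, 10]
signal_c = [5, 4, 3, 2, 1]
r_ab = pearson(signal_a, signal_b)
r_ac = pearson(signal_a, signal_c)
print(r_ab, r_ac)
```

The bin size is a choice that affects the result, so it should be reported together with the coefficient.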
34
Correlation of regulatory protein binding with gene expression
Pavlaki et al., 2016
35
Gene ontology (GO) analysis
Calo et al. (2015) Nature 518, 249–253
DAVID, GOrilla, GREAT, Enrichr
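The statistical question behind GO analysis, whether a term is over-represented in a gene list, can be illustrated with a one-sided hypergeometric test. This is a generic sketch and not a re-implementation of any of the listed tools, each of which uses its own statistics and corrections; the function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(at least `overlap` annotated genes among `selected` random picks)."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, 3 carry the GO term,
# 3 genes were selected and all 3 carry the term.
p_value = hypergeom_enrichment_p(population=10, annotated=3, selected=3, overlap=3)
print(p_value)
```

In practice many terms are tested at once, so the resulting p-values need a multiple-testing correction.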
36
Motif enrichment analysis
HOMER, MEME
Pavlaki et al., 2016
37
Motif enrichment analysis
MEME-ChIP
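The basic idea of motif enrichment is to compare how often a motif occurs in peak sequences and in background sequences. The sketch below does this for one exact motif string on both strands; real tools such as HOMER and MEME work with position weight matrices and proper statistics, and the motif and sequences here are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fraction_with_motif(sequences, motif):
    """Fraction of sequences containing the motif on either strand."""
    if not sequences:
        raise ValueError("need at least one sequence")
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = sum(
        1 for seq in sequences
        if motif in seq.upper() or rc_motif in seq.upper()
    )
    return hits / len(sequences)


# Hypothetical motif, peak sequences and background sequences.
motif = "TGACG"
peak_seqs = ["AATGACGTTT", "CCAACGTCAG", "GGGGGGGGGG", "TTTGACGTAA"]
background_seqs = ["AAAAAAAAAA", "CCTGACGTCC", "GGGGCCCCGG", "TTTTTTTTTT"]
peak_fraction = fraction_with_motif(peak_seqs, motif)
background_fraction = fraction_with_motif(background_seqs, motif)
print(peak_fraction, background_fraction)
```

A motif counts as enriched when the peak fraction is clearly higher than the background fraction, which a dedicated tool would back with a significance test.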
38
Summary of typical analyses:
Differential peak calling
Differential gene expression
Intersection of different signals
Correlation of different signals
Motif sequence analysis
Gene Ontology analysis
39
Questions?
40
Computer cluster and Linux
NGS data are stored in very large text files.
NGS analysis is usually performed on a computer cluster running Linux.
Why Linux? Because it is free, open-source and very stable, plus historical reasons.
Linux likes working with large text files :)
41
WinSCP: Windows file manager
42
WinSCP: Windows file manager
genome.essex.ac.uk
43
WinSCP: Windows file manager
44
Putty: Linux command line
45
Putty: Linux command line
genome.essex.ac.uk
46
Putty: Linux command line
47
Putty: Linux command line
48
Learning Linux in 5 minutes
There are two options for your work in Linux:
Type your commands one by one in Putty
Write all commands in a file called a “bash file”, then execute this file; all the commands written there will be run in order
We have prepared your bash files; you will just need to execute them
49
5 Linux commands you need
cd DirectoryName – change directory
less FileName – read file FileName
qsub FileName – submit the bash file FileName to the cluster queue for execution
qstat – check the status of submitted jobs
wc -l FileName – count lines in FileName
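Counting lines is useful because the number of lines tells you how many reads a file holds. Note that plain `wc` prints line, word and byte counts, while the `-l` option restricts the output to lines. The sketch below does the same count in Python on a temporary toy file, assuming four lines per read in FASTQ; the helper name is hypothetical.

```python
import os
import tempfile


def count_lines(path):
    """Count lines in a file without loading it all into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


# Write a toy FASTQ file with two reads to a temporary location.
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nIIII\n")
    fastq_path = tmp.name

n_lines = count_lines(fastq_path)
n_reads = n_lines // 4  # assumes 4 lines per read
os.remove(fastq_path)
print(n_lines, n_reads)
```

For a BED file the line count corresponds directly to the number of regions, provided the file has no header lines.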
50
Useful shortcuts
To copy/paste from Windows to Putty: copy with [CTRL]+[C], then right-click in Putty to paste
Anywhere in the Putty command line: the [up] and [down] keys scroll through the command history
Auto-completion of file/directory names: <something-incomplete> [TAB]
When specifying a directory name:
".." (dot dot) refers to the parent directory
"~" (tilde) or "~/" refers to the home directory
51
Additional Linux hints
All commands, usernames, passwords, file and directory names in Linux are case sensitive.
File paths (locations of files) use “/”, not “\”, e.g. “/storage/projects/”.
Avoid using spaces in filenames.
52
Questions?