1
The Web frame for NGS output
2
NGS sequencing: Primary Analysis, Secondary Analysis, Tertiary Analysis
Primary Analysis: base calling / sequence trimming
Secondary Analysis: assembly or reference mapping
Tertiary Analysis: calculate mapping data / expression profile; functional inference
3
Tentative Procedure for RNA-Seq Analysis
Non-model organism
QC: Discard the low-confidence sequences for the 3 groups (three time points). Program: SolexaQA
Assembly: Merge all reads from the 3 groups and assemble them into contigs. Program: Trinity (100 GB RAM requested)
Mapping: Map paired-end reads from each group onto the contigs; annotate the contigs. Program: LAST; BLASTx, InterProScan
Expression: Estimate the expression value (FPKM) of each contig in each group. Program: CummeRbund, an R/Bioconductor package
Functional inference: Functional enrichment analysis in GO and KEGG. Program: because this is a non-model organism, we may have to create the mapping identifiers in KEGG and GO
4
Tentative Procedure for RNA-Seq Analysis
Non-model organism for eel transcriptomics
QC: Discard the low-confidence sequences generated from each library (Hi-seq 200, RNA-seq data, paired-end). Program: SolexaQA
Assembly: Merge all reads from the various libraries and assemble them into contigs. Program: Trinity (100 GB RAM requested)
Mapping: Map paired-end reads from each group onto the contigs; annotate the contigs. Program: LAST; BLASTx, InterProScan
Expression profiling: Estimate the expression value (FPKM) of each contig in each group. Program: CummeRbund, an R/Bioconductor package
Functional inference: Functional enrichment analysis in GO and KEGG. Program: because this is a non-model organism, we may have to create the mapping identifiers in KEGG and GO
5
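Tentative Procedure for RNA-Seq Analysis
Non-model organism
QC: Remove low-quality sequencing results. Program: SolexaQA, SeqTrim
Assembly: From the short-read sequencing results, assemble the putative expressed gene models (merge all reads from the 3 groups and assemble them into contigs). Program: Trinity, MIRA, Velvet, etc.; multiple CPUs with over 100 GB RAM requested
Mapping: Using the assembled long sequences as the reference, place the short reads back onto them (map paired-end reads from each group onto the contigs). Program: Bowtie, LAST
Expression: Compute and compare the expression of the same gene segment across samples, and identify the differentially expressed gene sets (estimate the expression value (FPKM) of each contig in each group). Program: CummeRbund, an R/Bioconductor package; RSeQC
Functional inference: Run functional analysis on the identified gene sets to find the regulatory pathways related to the regeneration mechanism across time points and tissues (functional enrichment analysis in GO and KEGG). Program: because this is a non-model organism, we may have to create the mapping identifiers in KEGG and GO
Validation: Confirm the expression profile of the regeneration-related gene sets by qPCR. Design new experiments that promote or interfere with the regeneration mechanism, then use NGS to uncover finer regulatory detail.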
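The FPKM value used in the Expression step can be sketched as follows. This is a minimal illustration of the standard FPKM definition (fragments per kilobase of contig per million mapped fragments), not the code CummeRbund runs; the function name, the counts, and the library sizes are hypothetical.

```python
def fpkm(fragments: int, contig_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of contig per Million mapped fragments."""
    if contig_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("contig length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (contig_length_bp * total_mapped_fragments)


# Hypothetical numbers for one contig across three groups (time points).
contig_length = 2000  # bp
counts = {"group1": 500, "group2": 1500, "group3": 0}
library_sizes = {"group1": 10_000_000, "group2": 20_000_000, "group3": 15_000_000}

profile = {g: fpkm(counts[g], contig_length, library_sizes[g]) for g in counts}
print(profile)
```

Normalizing by both contig length and library size is what makes the three groups comparable for the same contig.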
6
QC by Graphs in SolexaQA
7
Annotations for each Contig
Start from the contig in FASTA (nucleic acid). Translate the longest ORF into an amino acid sequence. Then perform a sequence search (BLASTp) on NR, KEGG, GO, and Pfam (InterPro).
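The translation step might look like the sketch below. It assumes the standard genetic code, scans all six reading frames, and defines an ORF as the stretch from the first methionine to the next stop codon (or the end of the contig); the slides do not state which ORF definition or tool was used, so the function names and these choices are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    # Unknown codons (for example those containing N) become "X".
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def longest_orf_protein(contig: str) -> str:
    """Return the longest M-initiated peptide over all six reading frames."""
    contig = contig.upper()
    best = ""
    for strand in (contig, reverse_complement(contig)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


print(longest_orf_protein("CCATGGCTTAAGG"))
```

The returned peptide is what would be submitted to BLASTp or a Pfam scan.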
8
Database Structure
Tables: BLASTx, Pfam, KEGG, GO, FPKM
PK = Contig ID
Ref:
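One way to realize this structure is a set of tables that share the contig ID as primary key. The sketch below uses SQLite from the Python standard library; the slides do not name the database engine, and the column names are assumptions loosely based on the output tables shown later in the deck.

```python
import sqlite3

# Hypothetical column sets; only "contig_id as primary key" comes from the slide.
ANNOTATION_TABLES = {
    "blastx": "hit_id TEXT, hit_annotation TEXT, hit_organism TEXT, e_value REAL",
    "pfam": "pfam_hits TEXT",
    "kegg": "ko_id TEXT, definition TEXT, pathway TEXT",
    "go": "go_terms TEXT",
    "fpkm": "cond1 REAL, cond2 REAL, cond3 REAL",
}


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    for name, columns in ANNOTATION_TABLES.items():
        conn.execute(f"CREATE TABLE {name} (contig_id TEXT PRIMARY KEY, {columns})")
    return conn


conn = create_database()
conn.execute(
    "INSERT INTO blastx VALUES (?, ?, ?, ?, ?)",
    ("Contig 1", "BAD", "elongation factor-1 alpha (EF-1alpha)", "Pelodiscus sinensis", 0.0),
)
conn.execute("INSERT INTO fpkm VALUES (?, ?, ?, ?)", ("Contig 1", 190, 200, 3))

# Joining on the shared key gives the per-contig detail view.
row = conn.execute(
    "SELECT b.hit_annotation, f.cond1, f.cond2, f.cond3 "
    "FROM blastx b JOIN fpkm f ON f.contig_id = b.contig_id "
    "WHERE b.contig_id = ?",
    ("Contig 1",),
).fetchone()
print(row)
```

Using the contig ID as the primary key in every table implies one row per contig per table; programs that report several hits per contig would need either a concatenated field or a separate hit table.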
9
Query 1: text-based approach
Menu: Full-text search on annotation tables / Sequence search (BLAST) / Library compare
Search term: Immun
Result: detail for each contig
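A text query over the annotation tables could be sketched with a plain SQL `LIKE` match, as below. The table and column names and the sample rows are hypothetical, and a production site might prefer a dedicated full-text index; this only shows the idea of returning contig IDs whose annotation mentions the search term.

```python
import sqlite3


def search_annotations(conn, term, tables):
    """Return sorted contig IDs whose annotation text contains `term`.

    `tables` maps a table name to the text column to search (trusted names only).
    """
    pattern = f"%{term}%"
    hits = set()
    for table, column in tables.items():
        query = f"SELECT contig_id FROM {table} WHERE {column} LIKE ?"
        hits.update(row[0] for row in conn.execute(query, (pattern,)))
    return sorted(hits)


# Hypothetical annotation rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blastx (contig_id TEXT PRIMARY KEY, hit_annotation TEXT)")
conn.execute("CREATE TABLE go (contig_id TEXT PRIMARY KEY, go_terms TEXT)")
conn.executemany(
    "INSERT INTO blastx VALUES (?, ?)",
    [("Contig 1", "elongation factor-1 alpha"), ("Contig 2", "immunoglobulin heavy chain")],
)
conn.executemany("INSERT INTO go VALUES (?, ?)", [("Contig 3", "innate immune response")])

matches = search_annotations(conn, "Immun", {"blastx": "hit_annotation", "go": "go_terms"})
print(matches)
```

SQLite's `LIKE` is case-insensitive for ASCII text, so a partial term such as "Immun" matches both "immunoglobulin" and "immune".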
10
Query 2: by sequences
Menu: Full-text search on annotation tables / Sequence search (BLAST) / Library compare
Programs: BLASTn / megablast / tBLASTx
Worm Contigs
Reference code:
11
Blast Result Detail for each contig
12
Detail for Each Contig Interpro/ pFAM
13
Query 3: Library Comparison
Menu: Full-text search on annotation tables / Sequence search (BLAST) / Library compare
Dynamic comparison like DDD: choose Pool A and Pool B, submit, and get a P-value
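The P-value for a pool-versus-pool comparison could be computed as in the sketch below. It assumes a two-sided Fisher exact test on a 2x2 table of read counts, which is a common choice for this kind of comparison; the slides do not say which test the site uses, and the function name and example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    a: reads of the contig in Pool A    b: all other reads in Pool A
    c: reads of the contig in Pool B    d: all other reads in Pool B
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x contig reads in Pool A.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical small counts; real libraries would be far larger.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

Exact binomial coefficients keep the arithmetic correct but become slow for libraries with millions of reads, where an approximation or a statistics library would be more practical.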
14
Table for BLASTX output (DB: NR)
Query_ID | Hit ID | Hit_annotation | Hit_organism | Query coverage | E-value
Contig 1 | BAD | elongation factor-1 alpha (EF-1alpha) | Pelodiscus sinensis | 97% | 0.0
Contig 2 | | | | |
Query coverage = matched length / query length
15
Table For KEGG
#seq_id (primary key) | hit_seq | alignment_length | identity (%) | e_value | KO_ID | Definition | pathway | Note
comp3_c0_seq1 | xla:386604 | 449 | 0.84 | | K03231 | elongation factor 1-alpha | ko03013 RNA transport; ko05134 Legionellosis |
Tables For pFam & GO: as the output from each program
16
The Result in one sheet
Contig | Annotation from BLASTx | Results of Pfamscan | GO | KEGG_KO | KEGG Pathway | FPKM_cond1 | FPKM_cond2 | FPKM_cond3
Contig 1 | BAD / elongation factor-1 alpha (EF-1alpha) [Pelodiscus sinensis] | PF00009/GTP_EFTU; PF00010/xxxxxxxx | GO: GTPase activity; GO: GTP binding | K03231/galactose oxidase | ko Galactose metabolism | 190 | 200 | 3
Contig 2 | - | PF / p450 | | | | 378 | 22 | 1000
Contig 3 | CCCC | PPPP | | | | 333 | 45 | 31
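Combining the per-program outputs into one sheet amounts to a join on the contig ID. A minimal sketch, assuming each program's output has already been parsed into a dictionary keyed by contig ID (the function and column names are hypothetical), with "-" standing in for a missing annotation as in the slide:

```python
import csv
import io


def build_sheet(contig_ids, tables, columns, missing="-"):
    """Return one row per contig, pulling each column from its own table."""
    rows = []
    for contig_id in contig_ids:
        row = {"contig": contig_id}
        for column in columns:
            row[column] = tables.get(column, {}).get(contig_id, missing)
        rows.append(row)
    return rows


# Values taken from the example sheet; dictionary layout is an assumption.
tables = {
    "blastx": {"Contig 1": "elongation factor-1 alpha (EF-1alpha) [Pelodiscus sinensis]"},
    "pfam": {"Contig 1": "PF00009/GTP_EFTU"},
    "fpkm_cond1": {"Contig 1": 190, "Contig 2": 378, "Contig 3": 333},
    "fpkm_cond2": {"Contig 1": 200, "Contig 2": 22, "Contig 3": 45},
    "fpkm_cond3": {"Contig 1": 3, "Contig 2": 1000, "Contig 3": 31},
}
columns = ["blastx", "pfam", "fpkm_cond1", "fpkm_cond2", "fpkm_cond3"]
sheet = build_sheet(["Contig 1", "Contig 2", "Contig 3"], tables, columns)

# Write the sheet as tab-separated text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["contig"] + columns, delimiter="\t")
writer.writeheader()
writer.writerows(sheet)
print(buffer.getvalue())
```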
17
Library Compare: 0 hr, 24 hrs, 48 hrs
18
The Way of Redundancy Reduction
Input: 700 million reads
1st Trinity run: 500,000 genes
Refinement: abundance sorting; mapping by Bowtie2 (LAST?), picking the longest one as the reduced set
Final set: 48,000 genes
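The "pick the longest one" step can be sketched as below. The grouping rule is an assumption: sequences are treated as redundant when they share the same prefix in an ID of the form `comp3_c0_seq1`, whereas the slide groups them by read mapping with Bowtie2 or LAST. The function name and the toy sequences are hypothetical.

```python
import re


def reduce_redundancy(contigs):
    """Keep only the longest sequence per group.

    `contigs` maps IDs such as "comp3_c0_seq1" to sequences; the group key
    is the ID with its trailing "_seqN" removed (an assumed grouping rule).
    """
    best = {}
    for seq_id, sequence in contigs.items():
        group = re.sub(r"_seq\d+$", "", seq_id)
        if group not in best or len(sequence) > len(best[group][1]):
            best[group] = (seq_id, sequence)
    return {seq_id: sequence for seq_id, sequence in best.values()}


# Toy sequences for illustration only.
contigs = {
    "comp3_c0_seq1": "ACGTACGT",
    "comp3_c0_seq2": "ACGT",
    "comp7_c1_seq1": "GG",
}
reduced = reduce_redundancy(contigs)
print(sorted(reduced))
```

Swapping the group key for a mapping-derived cluster ID would turn the same loop into the refinement described on the slide.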