Published by Gian Nobles. Modified over 9 years ago.
1
RNA Sequencing for Differentially Expressed Genes
Speaker: TZU-CHUN LO
Advisor: YAO-TING HAUNG
2
Outline: Molecular Central Dogma; RNA Sequencing; Differential Expression Gene; Case–Control Study; Negative Binomial Distribution; Hypothesis Testing; Rice: SNP, QTL, Pathway
3
Molecular Central Dogma
The central dogma of molecular biology describes the flow of genetic information within a biological system: from DNA to RNA to protein.
[Figure: analogy with the labels Forest, Branches, BBQ.]
4
RNA Sequencing
Goal: find differentially expressed genes via the read counts of each gene.
DEG process: alignment, then read counts.
[Figure: DNA with Gene 1 and Gene 2, their exons, the mRNA, and the sequenced reads; reads are mapped back by alignment or spliced alignment.]
5
Differential Expression Gene
We want to find the cold-resistant genes in rice.
To do so, we compare gene expression in the rice genome under two conditions: room temperature and low temperature.
[Figure: Gene 1, Gene 2, and Gene 3 on the rice genome under each condition, with read counts 13 6 45 7 2.]
Cold-resistant differentially expressed genes:
6
Strategy for DEG
Case–control study: two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute.

Condition   Case   Control   Comparison    Judgment
Gene 1      69     71        69 vs. 71     Almost the same
Gene 2      86     56        86 vs. 56     Possible DEG
Gene 3      66     111       66 vs. 111    More likely DEG
...         ...    ...
Gene 4      80     60        80 vs. 60     How to judge?

Each count is just one sample from its condition.
Questions:
Is the observed count representative of the gene? Tool: negative binomial distribution.
How do we decide whether a gene is differentially expressed? Tool: hypothesis test.
7
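Negative Binomial Distribution
The negative binomial (NB) distribution is a count-data distribution that can replace the Poisson distribution when the variance of the counts is larger than the mean, which the Poisson distribution cannot model.
[Equation: the read count of gene i (i = 1~n) in sample j (j = 1~m), for example the count 69, is modeled with a gene abundance parameter, a library size parameter, and a smooth function.]
The smooth function is more complex, so we set it aside here.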
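For reference, the mean-variance relationship that separates the two distributions can be written out. The first two lines are the standard Poisson and NB relationships. The last line is an assumption: the slide's labels (gene abundance parameter, library size parameter, smooth function) match a DESeq-style parameterization, so the symbols q, s, and v are borrowed from that model rather than taken from the slide, and the condition index is dropped for brevity.

```latex
% Poisson: the variance equals the mean
\operatorname{Var}(K) = \mu

% Negative binomial with dispersion \alpha > 0: the variance exceeds the mean
\operatorname{Var}(K) = \mu + \alpha \mu^{2}

% Assumed DESeq-style form for gene i (i = 1..n) in sample j (j = 1..m):
% q_i = gene abundance parameter, s_j = library size parameter, v = smooth function
K_{ij} \sim \mathrm{NB}\left(\mu_{ij}, \sigma_{ij}^{2}\right), \qquad
\mu_{ij} = q_{i}\, s_{j}, \qquad
\sigma_{ij}^{2} = \mu_{ij} + s_{j}^{2}\, v(q_{i})
```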
8
FPKM
An indicator used to represent mRNA expression: Fragments Per Kilobase of transcript per Million mapped reads.
[Figure: genome with Gene 1 and Gene 2; exon lengths: 8 10 7 8 9 bases; 10 4 reads.]
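The definition can be turned into a one-line formula. The sketch below uses the standard FPKM definition; the function name and the input numbers are hypothetical and are not taken from the slide's figure.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


# Hypothetical example: 500 fragments on a 2,000 bp transcript,
# in a library with 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
print(example_value)
```

Dividing by transcript length and by library size makes values comparable across genes of different lengths and samples of different sequencing depths.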
9
FPKM
Before hypothesis testing, we have to get the FPKM and the variance of the FPKM for each gene.

          K (reads)         Var(K)            FPKM              Var(FPKM)
          case   control    case   control    case    control   case    control
Gene 1    69     71         10     6          9.34    14.75     6       3.6
Gene 2    86     56         170    166        22.31   15.37     136     132.8
Gene 3    66     111        362    310        40.48   53.98     120.6   109.3
...
10
Hypothesis Testing
Step 1: You find observations or clues that support a novel idea.
Step 2: Assume the opposing view, the one you want to argue against.
Step 3: Test it and take a stand, based on the p-value.
11
T-test
12
          FPKM              Var(FPKM)         T-test
          case    control   case    control   p-value
Gene 1    9.34    14.75     6       3.6       0.187
Gene 2    22.31   15.37     136     132.8     0.039
Gene 3    40.48   53.98     120.6   109.3     0.014
...
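A t-test on summary statistics like these compares the difference in means with its standard error. The sketch below computes the t statistic and degrees of freedom for Welch's unequal-variance variant. The slides state neither the t-test variant nor the sample size, so both Welch's form and N = 3 replicates per condition are assumptions, and the result is not expected to reproduce the p-values listed above.

```python
import math


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    se2_a = var_a / n_a  # squared standard error of mean A
    se2_b = var_b / n_b  # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


N = 3  # assumption: replicates per condition (not stated on the slide)

# (FPKM case, Var(FPKM) case, FPKM control, Var(FPKM) control) from the table
genes = {
    "Gene 1": (9.34, 6.0, 14.75, 3.6),
    "Gene 2": (22.31, 136.0, 15.37, 132.8),
    "Gene 3": (40.48, 120.6, 53.98, 109.3),
}

for name, (m_case, v_case, m_ctrl, v_ctrl) in genes.items():
    t, df = welch_t(m_case, v_case, N, m_ctrl, v_ctrl, N)
    print(f"{name}: t = {t:.3f}, df = {df:.2f}")
```

A two-sided p-value would then come from the t distribution with df degrees of freedom, which needs a statistics library or a t table.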
13
Result
Investigating and discussing alpha = 0.05 together with the read counts and p-values.
What if alpha = 0.04 or 0.03? We do not know which alpha is best, but we can do some subsequent processing.

If alpha = 0.05   Case   Control   p-value   Result
Gene 1            69     71        0.187     X
Gene 2            86     56        0.039     V
Gene 3            66     111       0.016     V
Gene 4            80     60        0.045     V
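The effect of changing alpha can be checked directly on the four p-values in this table. The sketch below is illustrative; the function and variable names are hypothetical.

```python
# p-values from the result table above
p_values = {
    "Gene 1": 0.187,
    "Gene 2": 0.039,
    "Gene 3": 0.016,
    "Gene 4": 0.045,
}


def significant_genes(p_values, alpha):
    """Return the genes whose p-value is below the significance level alpha."""
    return [gene for gene, p in p_values.items() if p < alpha]


for alpha in (0.05, 0.04, 0.03):
    print(alpha, significant_genes(p_values, alpha))
```

Lowering alpha from 0.05 to 0.04 drops Gene 4, and lowering it to 0.03 also drops Gene 2, so the choice of alpha changes the DEG list.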
14
RNA Sequencing for Rice
Plan: find cold-resistant genes.
Samples:
Japonica (TN67): room temperature (R), low temperature (L)
Indica (IR64): room temperature (R), low temperature (L)
Rice:
Japonica rice (TN67): grains are wide and short, stickier, and chewy, e.g. Penglai rice.
Indica rice (IR64): grains are slender and long, less sticky, and break easily, e.g. Zailai rice.
Zone:
TN67: high latitude or high altitude
IR64: low latitude or low altitude
15
Strategy for DEG
Samples: TN67R, IR64R, TN67L, IR64L
Case–control study: four combinations (A, B, C, D), each comparing different varieties or distinct temperatures.
Result: four sets of differentially expressed genes, one per combination (A, B, C, D).
Negative binomial: infer the probability distribution of the counts from the samples.
Hypothesis test: decide which genes are the DEGs we want.
Subsequent processing: SNP, QTL, pathway.
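The four combinations can be enumerated mechanically. The sketch below assumes that each combination is a pair of samples differing in exactly one factor (variety or temperature); the slide does not say which comparison is labeled A, B, C, or D, so no letters are assigned.

```python
from itertools import combinations

# Each sample is described by (variety, temperature)
samples = {
    "TN67R": ("TN67", "R"),
    "IR64R": ("IR64", "R"),
    "TN67L": ("TN67", "L"),
    "IR64L": ("IR64", "L"),
}


def differs_in_one_factor(a, b):
    """True if two samples differ in variety or temperature, but not both."""
    return sum(x != y for x, y in zip(samples[a], samples[b])) == 1


comparisons = [
    (a, b) for a, b in combinations(samples, 2) if differs_in_one_factor(a, b)
]

for a, b in comparisons:
    print(f"{a} vs. {b}")
```

Pairs that differ in both variety and temperature (for example TN67R vs. IR64L) are excluded, because a difference in expression could not be attributed to a single factor.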
16
SNP
A single-nucleotide polymorphism (SNP) is a sequence variation that occurs when a single nucleotide differs between members of a biological species.
[Figure: Case, Control, and Assembly sequences, with the SNP positions marked:
ATGCCCTCGTAA TTACTGCGT
ATGCGCTCGAAA TTACTCCGT
ATGCCCTCGTAA TTACTGCGT]
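Candidate SNP positions between two aligned sequences of equal length can be found by comparing them base by base. The sketch below uses the sequence fragments from this slide; the function name is hypothetical, positions are 0-based, and real SNP calling works on read alignments with quality filtering rather than on two bare strings.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return the 0-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Sequence fragments shown on the slide
first = mismatch_positions("ATGCCCTCGTAA", "ATGCGCTCGAAA")
second = mismatch_positions("TTACTGCGT", "TTACTCCGT")
print(first, second)
```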
17
QTL
Quantitative traits are phenotypes (characteristics) that vary in degree and can be attributed to polygenic effects (the product of two or more genes).
Quantitative trait loci (QTLs) are stretches of DNA containing or linked to the genes that underlie a quantitative trait.
Example: QT (cold), loci: bases 599~799 of the DNA.
Cold tolerance (29) and pollen fertility (43).
QTL length: ~million bases.
[Figure: genes along the DNA from position 1 to 1000, with the QTL region marked.]
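One way to use a QTL in subsequent processing is to keep only the genes that overlap the QTL interval. The sketch below uses the example interval 599~799 from this slide; the gene names and coordinates are hypothetical, chosen only to illustrate the overlap test.

```python
QTL_START, QTL_END = 599, 799  # example interval from the slide

# Hypothetical gene coordinates (start, end) on a DNA of length 1000
genes = {
    "gene_a": (100, 250),
    "gene_b": (580, 640),
    "gene_c": (700, 760),
    "gene_d": (820, 900),
}


def overlaps(start: int, end: int, qtl_start: int, qtl_end: int) -> bool:
    """True if the closed interval [start, end] intersects [qtl_start, qtl_end]."""
    return start <= qtl_end and end >= qtl_start


genes_in_qtl = [
    name for name, (start, end) in genes.items()
    if overlaps(start, end, QTL_START, QTL_END)
]
print(genes_in_qtl)
```

A gene that only partly overlaps the interval (gene_b here) is kept, since a QTL boundary is approximate.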
18
Pathway
A pathway database is a collection of manually drawn pathway maps representing molecular interaction and reaction networks.
[Figure: Gene No.2, Gene No.55, and Gene No.99 linked to cold resistance in rice.]
19
Conclusion
Review: RNA Sequencing; Differential Expression Gene; Case–Control Study; Negative Binomial Distribution; Hypothesis Testing; Rice: SNP, QTL, Pathway
20
Variance of negative binomial
NB is a count-data distribution that can replace the Poisson distribution when the variance of the counts is larger than the mean.
21
Strategy for DEG
22
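QTL
Another class of traits in organisms varies continuously, is hard to classify, and is easily influenced by the environment. Examples are human height, weight, high blood pressure, and diabetes; rice plant height, yield, and degree of disease resistance; body fat percentage in mice; milk yield in dairy cows; and egg production in chickens. These are called quantitative traits.
A quantitative trait is controlled by many genes. Because every one of these genes affects the trait, the effect of each individual gene is relatively small. The genes that control quantitative traits are called polygenes (minor-effect genes), also known as quantitative trait loci (QTL).
Rice genome size: 430 Mb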
23
QTL
24
Negative binomial distribution
NB is a count-data distribution that can be used to infer a representative count for a gene from its samples.
[Equation: count of gene i in sample j, with a smooth function.]
25
Negative binomial distribution
NB is a count-data distribution that can replace the Poisson distribution when the variance of the counts is larger than the mean.
26
Hypothesis test
Step 1: You find observations or clues that support a novel idea.
Step 2: Assume the opposing view, the one you want to argue against.
Step 3: Test it and take a stand, based on the p-value.
27
Case-control example

Condition   Case   Control   Comparison    Judgment
Gene 1      69     71        69 vs. 71     Almost the same
Gene 2      86     56        86 vs. 56     Possible DEG
Gene 3      66     111       66 vs. 111    More likely DEG
...         ...    ...

Questions:
Is the observed count representative of the gene? Tool: negative binomial.
How do we decide whether a gene is differentially expressed? Tool: hypothesis test.
29
RNA sequencing
[Figure: DNA with Gene 1 and Gene 2, their exons, the mRNA, and the reads, mapped back to the DNA by spliced alignment.]
Reads should be aligned to the regions marked in blue above.
30
RNA sequencing
Spliced alignment: TopHat

          Condition 1: case         Condition 2: control
Sample    1     2     3     ...     1     2     3     ...
Gene 1    75    69    70    ...     73    71    68    ...
Gene 2    101   86    75    ...     31    56    49    ...
Gene 3    28    66    45    ...     120   111   145   ...
...
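The per-gene mean and sample variance of the counts in this table can be computed with Python's standard library. The sketch below uses only the three samples shown per condition (the table elides the rest), so the numbers are illustrative.

```python
from statistics import mean, variance

# Read counts for the three samples shown per condition
counts = {
    "Gene 1": {"case": [75, 69, 70], "control": [73, 71, 68]},
    "Gene 2": {"case": [101, 86, 75], "control": [31, 56, 49]},
    "Gene 3": {"case": [28, 66, 45], "control": [120, 111, 145]},
}

summary = {}
for gene, conditions in counts.items():
    for condition, values in conditions.items():
        # statistics.variance is the sample variance (divides by n - 1)
        summary[(gene, condition)] = (mean(values), variance(values))

for (gene, condition), (m, v) in summary.items():
    print(f"{gene} {condition}: mean = {m:.2f}, variance = {v:.2f}")
```

For Gene 2 and Gene 3 the variance is several times the mean in both conditions, which a Poisson model (variance equal to the mean) cannot describe; this is the situation the negative binomial distribution is meant to handle.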
31
          Reads             Variance
          case   control    case   control
Gene 1    69     71         69     71
Gene 2    86     56         86     56
Gene 3    66     111        66     111
...