1
RNA-RNA interaction A biological crash course and introduction to prediction methods
2
Part I – Biological crash course
Bacteria Plasmid copy control Post-segregational killing systems trans-encoded chromosomal RNAs RNA interference (gene silencing) Translation regulation C. elegans developmental regulation miRNA-miRNA interactions Human telomerase
3
DNA vs. RNA
       Bases     #Strands   Structure
DNA    A,C,G,T   2          Double helix
RNA    A,C,G,U   1 or 2     Stem-loop, pseudoknots, etc.
4
Gene expression Central dogma of molecular biology
5
Translation mRNA -> protein via triplet code
What happens if mRNA is destroyed or otherwise can’t be translated?
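The triplet code can be made concrete with a small sketch. It uses the standard genetic code; the function name `translate` and the example mRNA are illustrative assumptions, not part of the slides.

```python
# Standard genetic code, laid out in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (three bases) to a one-letter amino acid; "*" marks a stop.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: nothing is translated
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical mRNA: AUG GGC UUU UAA, with two flanking bases on each side.
print(translate("GGAUGGGCUUUUAAGG"))  # MGF
```

If the mRNA is degraded, or its start region is blocked, the loop never runs and no protein is made, which is the lever the regulatory RNAs in Part I pull.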
6
Bacteria backgrounder
Single-celled organisms Prokaryotes = no nucleus Multi-cistronic transcripts -> multiple genes transcribed at one time, often with overlapping reading frames
7
Bacterial genetic information
Bacterial chromosome (1) Genome of organism Required for life Plasmids (2) Circular DNA molecules Double-stranded Independently self-replicating Not required for life, often confer selective advantage such as antibiotic resistance
8
Plasmid replication (1),(2) – Genes encoded on plasmid
(3) – Origin of Replication (ORI)
9
Plasmid copy control Recall independent self-replication
Copy number fluctuations are unavoidable Too many -> “runaway”, host dies Too few -> increased risk of plasmid loss Problem: How to control copy count? Solution: negative feedback loop mediated by RNA-RNA interaction
10
R1 copy control Genes: oriR1 – origin of replication
repA – lots of this protein product is required for replication initiation tap – translation of protein product is required for translation of repA protein copA – product is antisense RNA copB – product is a repressor protein (not covered here) Draw it on the blackboard as we go
11
R1 copy control (2) copA – RNA with stem-loop structure
copT – target segment of repA/tap mRNA, also forms a stem-loop structure Single loop-loop interaction
12
R1 copy control (3)
13
R1 copy control (4) copA RNA is unstable; it degrades
If not enough plasmids are producing copA antisense RNA (copy number is too low), more repA protein can be produced Therefore the plasmid can replicate
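The negative feedback loop can be illustrated with a toy model. This is a sketch under strong assumptions, not a kinetic model of R1: copA is taken to track copy number instantly (because it is unstable), replication is inhibited as r / (1 + copA / k), and plasmids are diluted by cell growth at rate d. All parameter names and values are hypothetical.

```python
def simulate_copy_number(n0, r=2.0, d=1.0, k=10.0, dt=0.01, steps=5000):
    """Euler integration of dN/dt = N * (r / (1 + copA / k) - d).

    n0: starting plasmid copy number
    r:  maximum replication rate per plasmid (hypothetical)
    d:  dilution rate from host growth and division (hypothetical)
    k:  copA level that halves the replication rate (hypothetical)
    """
    n = n0
    for _ in range(steps):
        cop_a = n  # assumption: unstable copA tracks the current copy number
        replication = r / (1.0 + cop_a / k)
        n += dt * n * (replication - d)
    return n


# Both a low and a high starting copy number relax to k * (r / d - 1) = 10.
print(round(simulate_copy_number(1.0), 2))
print(round(simulate_copy_number(50.0), 2))
```

Too few plasmids means little copA and faster replication; too many means more copA and slower replication, so fluctuations are pushed back toward the set point.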
14
Post-segregational killing systems
Plasmid self-preservation mechanism Bacterial host losing plasmid results in host death R1 plasmid hok/sok system is the prototype All such systems work similarly
15
R1 hok/sok system hok/sok locus encodes:
hok protein – “host killing” Overlapping reading frame – mok – “modulator of killing” sok RNA – “suppressor of killer” mok must be translated for hok to be expressed mok cannot be translated if sok is present
16
R1 hok/sok system (2) hok mRNA is extremely compact
Many stem-loop structures Flush 5’ – 3’ pairing Highly stable -> long half-life Translationally inert mok segment is both: Translationally active Able to bind sok inhibitor RNA
17
R1 hok/sok system (3) sok RNA is highly unstable
Bacteria with R1 have lots of sok produced sok binds mok, hok is not translated Bacteria which lose R1 have: Lots of stable hok mRNA Quickly degrading sok RNA (low stability) No new sok RNA being produced hok is translated -> the bacterium dies
18
Bacterial chromosomes
Plasmid antisense RNAs are generally cis-encoded, which implies complete Watson-Crick complementarity. Bacterial chromosomes contain trans-encoded antisense RNAs: not necessarily complete complementarity; often stress-related control systems.
(Speaker note: draw cis and trans examples on the board here; worth at least a couple of minutes. Note that cis-acting = cis-encoded = cis-type, and likewise for trans-.)
19
oxyS/fhlA in E. coli oxyS – RNA transcript induced by stress
fhlA – transcriptional activator site oxyS/fhlA complex binds via two loop-loop interactions
20
RNA interference (RNAi)
a.k.a. post-transcriptional gene silencing Double-stranded RNAs are introduced into the cell Complementary to mRNA for a gene Directly introduced in a wet lab, or Produced by the cell itself
21
RNA interference (2) dsRNAs are cleaved into nt segments (“small interfering RNAs”, or siRNAs) by an enzyme called Dicer
22
RNA interference (3) siRNAs are incorporated into RNA-induced silencing complex (RISC)
23
RNA interference (4) Guided by base complementarity of the siRNA, the RISC targets mRNA for degradation
24
RNA interference – why?
Studying gene function: Knock out or inhibit a gene’s normal function Can the organism survive? What phenotypic changes are observed?
Therapeutic suppression: E.g. cancer treatment
25
micro RNA (miRNA) Gene expression regulation
Created by similar process to siRNA Generally prevents binding of ribosome
26
Ex: C. elegans development
lin-4 and let-7 antisense RNAs Regulate larval development in C. elegans One of the two binding sites for lin-41 and let-7 interaction:
27
Human telomerase Telomerase = ribonucleoprotein complex
Ribonucleo = contains ribonucleic acid (RNA) Protein = contains a protein Responsible for maintaining telomere length in eukaryotic chromosomes Main components: Telomerase reverse transcriptase Human telomerase RNA (hTR)
28
Human telomerase (2) Reverse transcriptase
Transcribes RNA to DNA (rather than the usual DNA to RNA) Telomeres – repeated regions at the end of eukaryotic chromosomes hTR is the template for the repeated region
29
Human telomerase (3) hTR 11-nt templating region consists of:
Repeat template: CUAACCC Alignment domain: UAAC Positions telomerase on the DNA strand Provides template for repeat region
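A short sketch shows how the template yields the telomere repeat. It assumes the two segments are contiguous in the order listed (repeat template, then alignment domain, read 5'->3'); the function name is illustrative.

```python
# Base pairing used by a reverse transcriptase: RNA template base -> DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_template):
    """Return the DNA strand (5'->3') copied from an RNA template (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template))


repeat_template = "CUAACCC"
alignment_domain = "UAAC"
template = repeat_template + alignment_domain  # assumed contiguous, 11 nt

dna = reverse_transcribe(template)
print(dna)              # GTTAGGGTTAG
print("TTAGGG" in dna)  # True: the human telomere repeat
```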
30
Human telomerase (4) A – secondary structure of hTR w/ proposed long-range interactions shown via connecting lines B – new model for secondary structure of hTR Catalytic domains: pseudoknot, CR4/CR5, template
31
Loop-loop interaction
Sometimes referred to as “kissing loops” Recall that all of the RNA-RNA interactions discussed so far (except RNAi) involve loop-loop interaction Predicting miRNA transcripts and targets involves loop structure prediction
32
References Couzin, J. (2002) “Breakthrough of the year – Small RNAs make big splash.” Science 298(5602): Lai, E.C., Wiel, C., and Rubin, G.M. (2004) “Complementary miRNA pairs suggest a regulatory role for miRNA:miRNA duplexes.” RNA 10(2): Moss, E.G. (2001) “RNA interference – It’s a small RNA world.” Current Biology 11(19):R Sharp, P.A. (2001) “RNA interference – 2001.” Genes and Development 15(5): Shi, Y. (2003) “Mammalian RNAi for the masses.” TRENDS in Genetics 19(1):9-12.
33
References (2) Ueda, C.T., and Roberts, R.W. (2004) “Analysis of a long-range interaction between conserved domains of human telomerase RNA.” RNA 10(1): Wagner, E.G.H. and Flärdh, K. (2002) “Antisense RNAs everywhere?” TRENDS in Genetics 18(5): Wagner, E.G.H., Altuvia, S., and Romby, P. (2002) “Antisense RNAs in bacteria and their genetic elements.” Advances in Genetics 45:
34
Part II – Prediction
Identifying effective siRNAs: Neural network approach
Identifying targets: Mammalian miRNA target prediction
35
Prediction of siRNAs Sequence properties that make a good antisense RNA an effective gene inhibitor are not well understood Most computational models consider only: RNA structure prediction Motif searches
36
Neural net approach Training set: 490 known siRNA molecules
Input parameters: Base composition mRNA:siRNA binding energy properties 3’ and 5’ binding energy Structure of siRNA (hairpin energy and quality) Target function: efficacy
37
Neural net approach (2)
38
Neural net results 14 inputs, 11 hidden units, 1 output
Success rate of 92% Average prediction of 12 effective siRNAs per 1000 base pairs Stringent (high specificity) Good for designing siRNAs for RNAi
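The 14-11-1 topology can be sketched as a plain feed-forward pass. Only the layer sizes come from the slide; the sigmoid activation, the random weights, and the function names are assumptions for illustration, and no trained parameters are reproduced here.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 14, 11, 1  # layer sizes from the slide


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (placeholders, not trained values)."""
    return {
        "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layer, inputs):
    """One fully connected layer followed by a sigmoid (assumed activation)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(layer["w"], layer["b"])
    ]


def predict_efficacy(hidden, output, features):
    """Map 14 input features to a single efficacy score in (0, 1)."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} features, got {len(features)}")
    return forward(output, forward(hidden, features))[0]


rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)
features = [rng.random() for _ in range(N_INPUTS)]  # stand-in feature vector
score = predict_efficacy(hidden, output, features)

# Weights plus biases for both layers: 14*11 + 11 + 11*1 + 1.
n_params = sum(len(row) for layer in (hidden, output) for row in layer["w"])
n_params += len(hidden["b"]) + len(output["b"])
print(n_params)  # 177
```

With only 177 parameters and 490 training molecules, the network is small enough that overfitting is a concern worth checking with held-out data.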
39
Prediction of miRNA targets
Mammals/vertebrates Lots of known miRNAs Mostly unknown target genes Initial method outline Look at conserved miRNAs Look for conserved target sites
40
micro RNAs in animals 0.5-1.0% of predicted genes encode miRNA
One of the more abundant regulatory classes Tissue-specific or developmental stage-specific expression High evolutionary conservation
41
micro RNAs in plants Finding targets in plants is relatively easy
Look for mRNA transcripts with near-perfect complementarity to known miRNAs Signal-to-noise ratio exceeds 10:1 for Arabidopsis (model plant organism) Naïve approach in C. elegans and D. melanogaster? No more hits than expected by random chance!
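The plant-style search can be sketched as a sliding-window scan for sites that are nearly complementary to the miRNA. The mismatch threshold, the function names, and the sequences below are illustrative assumptions, and G-U pairs are simply counted as mismatches here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence (5'->3') that pairs perfectly with the given RNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def near_perfect_sites(mirna, transcript, max_mismatches=3):
    """Return (start, mismatches) for every window within the mismatch budget."""
    site = reverse_complement(mirna)
    hits = []
    for start in range(len(transcript) - len(site) + 1):
        window = transcript[start:start + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


mirna = "UGACAGAAGAGAGUGAGCAC"  # illustrative 20-nt sequence
site = reverse_complement(mirna)
# Hypothetical transcript: the perfect site with one base changed, plus flanks.
mutated = site[:5] + ("A" if site[5] != "A" else "C") + site[6:]
transcript = "CCCC" + mutated + "GGGG"
print(near_perfect_sites(mirna, transcript))
```

The same scan applied to animal transcripts returns about as many hits for shuffled miRNAs as for real ones, which is why the seed-based approach below is needed.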
42
So what can we use? Pairing to nucleotides 2-8 at the 5’ end of the miRNA Target recognition Target regions enriched for genes involved in transcriptional regulation
43
Goals for algorithm Predict 100s of miRNA targets
Estimate false-positive rates Provide computational and experimental evidence of authenticity Identify common functionality classes other than transcriptional regulator genes
44
TargetScan Algorithm developed by Lewis et al. (2003)
Input: miRNA that is known to be conserved across multiple organisms Orthologous 3’ UTR sequences Cut-off values for two parameters Value for one free parameter
Output: Ranked list of candidate target genes
45
TargetScan (1) Search UTRs in one organism
Bases 2-8 from miRNA = “miRNA seed” Perfect Watson-Crick complementarity No wobble pairs (G-U) 7nt matches = “seed matches” Choice of first organism is arbitrary
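Step 1 can be sketched as an exact substring search. The miRNA below is the let-7a sequence used as an example; the UTR is hypothetical, as are the function names.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Perfect Watson-Crick partner (5'->3') of miRNA bases 2-8."""
    seed = mirna[1:8]  # bases 2-8 in 1-based numbering
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Start positions (0-based) of every 7-nt seed match in the UTR."""
    site = seed_site(mirna)
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return hits


mirna = "UGAGGUAGUAGGUUGUAUAGUU"       # let-7a, used as an example
utr = "GUGCUACCUCACCAAACUACCUCUU"      # hypothetical 3' UTR fragment

print(seed_site(mirna))               # CUACCUC
print(find_seed_matches(mirna, utr))  # [3, 16]
```

Because no wobble pairs are allowed at this stage, a plain string search is sufficient.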
46
TargetScan (2) Extend seed matches Allow G-U (wobble) pairs
Both directions Stop at mismatches
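Step 2 can be sketched as follows. The coordinate convention (miRNA position j pairs with UTR position site_start + 7 - j, since the strands are antiparallel) and the function names are assumptions of this sketch; the UTR is hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairs(mirna_base, utr_base):
    """True for Watson-Crick or G-U wobble pairs."""
    pair = (mirna_base, utr_base)
    return pair in WATSON_CRICK or pair in WOBBLE


def extend_seed_match(mirna, utr, site_start):
    """Extend a seed match in both directions, stopping at the first mismatch.

    site_start is the 0-based UTR position of the 7-nt seed match.
    Returns (lo, hi): the 0-based, inclusive range of paired miRNA positions.
    """
    def utr_index(j):
        # miRNA index 1 pairs with the last base of the site, index 7 with the first.
        return site_start + 7 - j

    lo, hi = 1, 7  # the seed itself (bases 2-8, 0-based indices 1-7)
    while (lo > 0 and utr_index(lo - 1) < len(utr)
           and pairs(mirna[lo - 1], utr[utr_index(lo - 1)])):
        lo -= 1
    while (hi + 1 < len(mirna) and utr_index(hi + 1) >= 0
           and pairs(mirna[hi + 1], utr[utr_index(hi + 1)])):
        hi += 1
    return lo, hi


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # let-7a, used as an example
utr = "GUGCUACCUCACC"             # hypothetical; seed match starts at index 3

print(extend_seed_match(mirna, utr, 3))  # (0, 9)
```

In this example the extension picks up one G-U wobble pair and one A-U pair on the 3' side of the miRNA, then stops at a G-G mismatch.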
47
TargetScan (3) Optimize basepairing Remaining 3’ region of miRNA
35 bases of UTR 5’ to each seed match RNAfold program (Hofacker et al 1994)
48
TargetScan (4) Folding free energy (G) assigned to each putative miRNA:target interaction Ignores initiation free energy RNAeval (Hofacker et al 1994) Can someone tell me what initiation free energy is?
49
TargetScan (5) Z score for each UTR (no match -> Z=1.0)
Term Z-score is misleading – it’s not really a statistic This is one of the cut-off parameters that can be set n = number of seed matches in UTR (may be more than one) Gk = free energy of miRNA:target site interaction of kth seed match T = parameter influencing relative weighting of UTRs with few high affinity target sites against UTRs with lots of low affinity target sites (experimentally determined)
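The formula on this slide did not survive as text. The definitions of n, Gk and T are consistent with a Boltzmann-style sum, Z = sum over k = 1..n of exp(-Gk / T), which is the form given in Lewis et al. (2003). The sketch below treats that as a reconstruction; the function name and the example energies and T value are hypothetical.

```python
import math


def z_score(site_energies, temperature):
    """Z = sum_k exp(-G_k / T) over all seed matches in one UTR.

    site_energies: free energies G_k, one per seed match (more negative = stronger)
    temperature:   the free parameter T that balances few strong sites
                   against many weak ones
    """
    if not site_energies:
        return 1.0  # slide convention: no seed match -> Z = 1.0
    return sum(math.exp(-g / temperature) for g in site_energies)


# Hypothetical energies and T, chosen only to show the weighting effect.
print(z_score([], 10.0))                    # 1.0
print(z_score([-20.0], 10.0))               # one strong site
print(z_score([-10.0, -10.0, -10.0], 10.0)) # three weaker sites
```

A small T rewards a single high-affinity site; a large T lets several low-affinity sites add up to a comparable score.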
50
TargetScan (6) Order UTRs by Z score Assign rank to each UTR
Repeat this process for each of the other organisms with UTR datasets Z_C and R_C are cut-off values – experimentally determined
51
TargetScan (7) UTR i is a predicted target if for all organisms:
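The condition itself is missing from the slide text. Given the cut-offs defined on the previous slide, the natural reading is that UTR i must satisfy Z_i >= Z_C and R_i <= R_C in every organism; the sketch below assumes exactly that, and assumes orthologous UTRs share one identifier across organisms. All names and numbers are hypothetical.

```python
def rank_utrs(z_scores):
    """Map UTR id -> rank, where rank 1 is the highest Z score."""
    ordered = sorted(z_scores, key=lambda utr: z_scores[utr], reverse=True)
    return {utr: rank for rank, utr in enumerate(ordered, start=1)}


def predicted_targets(z_by_organism, z_cutoff, rank_cutoff):
    """UTRs passing both cut-offs in every organism (assumed criterion)."""
    surviving = None
    for z_scores in z_by_organism.values():
        ranks = rank_utrs(z_scores)
        passing = {
            utr for utr, z in z_scores.items()
            if z >= z_cutoff and ranks[utr] <= rank_cutoff
        }
        surviving = passing if surviving is None else surviving & passing
    return surviving or set()


# Hypothetical Z scores for three orthologous UTRs in two organisms.
z_by_organism = {
    "human": {"geneA": 9.0, "geneB": 5.0, "geneC": 1.0},
    "mouse": {"geneA": 8.0, "geneB": 2.0, "geneC": 6.0},
}
print(predicted_targets(z_by_organism, z_cutoff=4.5, rank_cutoff=2))  # {'geneA'}
```

Requiring the cut-offs to hold in every organism is what turns conservation into a filter: geneB and geneC each pass in one organism but not the other.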
52
Datasets nrMamm (mammalian – 79 sequences)
Homologs in human, mouse, and pufferfish Identical between human and mouse, not necessarily pufferfish (fugu) nrVert (vertebrate – 55 sequences) Identical between human, mouse, and fugu Non-redundant: if multiple miRNAs had the same seed, one representative chosen Major criticism of this method – how to choose that one representative?? Simplified signal-to-noise calculation
53
Sample program flow
54
Results for nrMamm nrMamm searched against human, mouse, and rat orthologous 3’ UTRs 451 miRNA:target interactions predicted for 400 unique genes Average 5.7 targets per miRNA Signal:noise ratio of 3.2:1 Very high number of targets per miRNA
55
Results for nrVert Additional search against fugu UTRs
Signal:noise ratio improves to 4.6:1 Relaxed cut-off values 115 predicted miRNA:target interactions for 107 unique genes 2.1 putative targets per miRNA
56
Signal:noise ratio calculation
Signal = number of predicted targets from nrMamm dataset Noise = number of predicted targets from randomly shuffled miRNAs Shuffled control sequences screened to ensure preservation of relevant features – don’t underestimate the noise!
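The calculation can be sketched as below. `count_targets` stands in for a full run of the prediction pipeline on one miRNA and is hypothetical, as are the other names. The shuffle here only preserves base composition; it does not perform the screening of control sequences that the slides call for.

```python
import random
from statistics import mean


def shuffle_mirna(mirna, rng):
    """Random permutation of the miRNA's bases (composition is preserved)."""
    bases = list(mirna)
    rng.shuffle(bases)
    return "".join(bases)


def signal_to_noise(count_targets, mirnas, n_shuffles=10, seed=0):
    """Predicted targets for real miRNAs divided by the mean for shuffled cohorts.

    count_targets: callable taking one miRNA sequence and returning the
                   number of predicted targets (hypothetical pipeline hook)
    """
    rng = random.Random(seed)
    signal = sum(count_targets(m) for m in mirnas)
    noise_runs = [
        sum(count_targets(shuffle_mirna(m, rng)) for m in mirnas)
        for _ in range(n_shuffles)
    ]
    noise = mean(noise_runs)
    if noise == 0:
        return float("inf")  # no shuffled control produced any prediction
    return signal / noise
```

A usage example would pass the real target counter; with a counter that ignores the sequence, the ratio is 1.0 by construction, which is a handy sanity check.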
57
Screening control sequences
Features to consider: Expected frequency of seed matches Expected frequency of matching to 3’ end of miRNA (after seed extension) Observed count of seed matches in UTR datasets Predicted free energies for seed:match interactions
58
Signal:noise results Filled bars are for authentic miRNAs
Open bars show the mean and standard deviation for shuffled sequences nrMamm set used for first two, nrVert used for set including fugu
59
Biological relevance Hypothesis: 5’ conservation of miRNAs is important for mRNA target recognition Highest signal:noise ratio observed when seed positioned close to 5’ end Hypothesis: highly conserved miRNAs are more involved in regulation High degree of conservation -> more predicted targets Membership in large miRNA family -> more predicted targets
60
Experimental verification
15 predicted target sites chosen All with known biological function Representative of the entire list of candidates 11 target sites confirmed Expression of upstream ORF influenced 27% false positives – close correspondence to predicted 30% false positives
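The two percentages can be checked with a few lines of arithmetic. The sketch assumes the predicted false-positive rate is simply noise divided by signal, using the 3.2:1 ratio reported for nrMamm; that interpretation is an assumption, not stated on the slide.

```python
def false_positive_rate_from_snr(signal_to_noise):
    """Assumed relation: fraction of predictions expected to be noise."""
    return 1.0 / signal_to_noise


tested, confirmed = 15, 11
observed = (tested - confirmed) / tested      # 4 of 15 not confirmed
expected = false_positive_rate_from_snr(3.2)  # nrMamm signal:noise ratio

print(f"observed: {observed:.0%}")  # observed: 27%
print(f"expected: {expected:.0%}")  # expected: 31%
```

Under this reading the expected rate is about 31%, which the slide rounds to 30%.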
61
References Chalk, A.M. and Sonnhammer, E.L.L. (2002) “Computational antisense oligo prediction with a neural network model.” Bioinformatics 18(12): Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S., Tacker, M., and Schuster, P. (1994) “Fast folding and comparison of RNA secondary structures.” Monatshefte für Chemie 125: Lewis, B.P., Shih, I., Jones-Rhoades, M.W., and Bartel, D.P. (2003) “Prediction of mammalian microRNA targets.” Cell 115(7):