1
RiboSearch Ben Daniel Ariel Kirshner Naomi
Instructor : Dr. Danny Barash Adaya Cohen
2
Layout
Biological introduction
Method introduction
“The merge strategy”
Results and conclusions
3
RNA
A single-stranded nucleic acid made up of 4 nucleotides:
Purines: adenine (A), guanine (G)
Pyrimidines: cytosine (C), uracil (U)
WC (Watson-Crick) pairs: A-U, G-C
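The pairing rule on this slide can be written down in a few lines. The following is a minimal sketch, not part of RiboSearch; the names `WC_PAIRS`, `is_wc_pair`, and `reverse_complement` are illustrative assumptions.

```python
# Watson-Crick pairs as listed on the slide: A-U and G-C.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_wc_pair(x: str, y: str) -> bool:
    """Return True if the two RNA bases form a Watson-Crick pair."""
    return (x.upper(), y.upper()) in WC_PAIRS


def reverse_complement(rna: str) -> str:
    """Return the strand that would pair with `rna`, read 5' to 3'."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(rna.upper()))
```

A stem can form when one stretch of the sequence equals the reverse complement of another stretch further along the same strand.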
4
Biological introduction
Old scheme
Proteins carry out all biological functions
RNA: only a stage between DNA and protein, with no catalytic function
DNA → RNA → Protein
5
Biological introduction
New scheme
Since the discovery of self-splicing RNAs in the early 1980s, a number of new structural and catalytic RNAs have been discovered.
Recent studies focusing on non-coding and small RNAs have led to the discovery of RNA molecules that possess essential regulatory functions.
DNA → RNA → Protein
6
RNA Secondary Structure
Elements: hairpin, internal loop, bulge loop, junction, stem (double strand), pseudoknot
The secondary structure of many RNAs is more conserved than their sequence
7
Riboswitch
Diagram labels: aptamer, expression platform, coding section, 5’ UTR, 3’ UTR
RNA control elements that regulate gene expression without the participation of proteins
Use a mechanism whereby small molecules bind to the aptamer (box) region, causing a conformational switch
Were found initially in the 5’ UTR of bacterial genes, with successive discoveries in prokaryotes
There is evidence suggesting that riboswitches could also be found in eukaryotes
8
Riboswitch mechanism
Guanine binds to the aptamer region, which causes a conformational change in the expression platform; this change regulates guanine metabolism.
9
G-box
Regulates genes related to purine metabolism and transport
Binds purines
Consists of 2 hairpins and 1 internal junction
10
RiboSearch
Goal: finding G-boxes in eukaryotic genomes
Method: combining existing search methods into one overall package
11
Search Methods
Whiffer – CS department, BGU
RNAMotif – Macke et al., 2001
RNAProfile – Pavesi et al., 2004
STR2 – CS department, BGU
12
Whiffer
Input: a pattern that consists of:
Sequence information
Variable gaps
Base-pairing brackets representing WC pairs
Output: candidate locations that meet the constraints imposed by the method
Example pattern: <<<< [2] TA [5] GTNTCTAC [3] <<<<< [3] CCNNNAA [3] >>>>> [5] >>>>
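To make the three ingredients of the pattern concrete, here is a minimal Python sketch that splits the example pattern into typed tokens. It is not Whiffer's own parser: the function name `tokenize` and the token labels are assumptions, and the slide does not say whether `[n]` means a gap of exactly `n` or of at most `n` bases, so the sketch only records the number.

```python
import re


def tokenize(pattern: str):
    """Split a whitespace-separated pattern into (kind, value) tokens.

    Token kinds (hypothetical labels, not Whiffer terminology):
      "open"  - run of '<', value = number of 5' stem positions
      "close" - run of '>', value = number of 3' stem positions
      "gap"   - '[n]', value = n (exact semantics not given on the slide)
      "seq"   - literal bases, N standing for any base
    """
    tokens = []
    for part in pattern.split():
        if set(part) == {"<"}:
            tokens.append(("open", len(part)))
        elif set(part) == {">"}:
            tokens.append(("close", len(part)))
        elif re.fullmatch(r"\[\d+\]", part):
            tokens.append(("gap", int(part[1:-1])))
        elif re.fullmatch(r"[ACGTUN]+", part):
            tokens.append(("seq", part))
        else:
            raise ValueError(f"unrecognized token: {part!r}")
    return tokens


EXAMPLE = "<<<< [2] TA [5] GTNTCTAC [3] <<<<< [3] CCNNNAA [3] >>>>> [5] >>>>"
```

Calling `tokenize(EXAMPLE)` shows that the pattern opens nine stem positions and closes nine, which is consistent with two nested stems.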
13
Whiffer
Method: uses simple matching based on the constraints, as opposed to dynamic programming.
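As an illustration of what constraint-based matching without dynamic programming can look like, the sketch below scans a DNA sequence for a single hairpin by direct comparison of the two stem strands. It is an assumption-laden toy, not Whiffer's algorithm: the function `find_hairpins` and its parameters are hypothetical, and only one stem with a bounded loop is handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_hairpins(seq: str, stem_len: int, loop_min: int, loop_max: int):
    """Return (start, loop_len) for every window in which the first
    `stem_len` bases pair with the last `stem_len` bases in antiparallel
    orientation, with a loop of loop_min..loop_max bases in between."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len
            if end > len(seq):
                break  # longer loops will not fit either
            five = seq[start:start + stem_len]
            three = seq[end - stem_len:end]
            # Direct base-by-base check; no energy model, no DP table.
            if all(COMPLEMENT.get(a) == b for a, b in zip(five, reversed(three))):
                hits.append((start, loop_len))
    return hits
```

For example, `find_hairpins("GGGCAAAAGCCC", 4, 3, 5)` reports the single hairpin that starts at position 0 with a 4-base loop.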
14
RNAMotif
Input:
Database of nucleotide sequences
Description file that consists of a descriptor section and an optional score section
Output: candidates that meet the conditions of the descriptor and the scoring scheme
15
RNAMotif
Sample descriptor file (a hairpin; the score section computes the G/C fraction over all structure elements and rejects candidates below 0.4):
descr
  h5( minlen=6, maxlen=8 )
    ss( minlen=4, maxlen=6 )
  h3
score
{
  gcnt = 0;
  glen = 0;
  for( i = 1; i <= NSE; i++ ){
    llen = length( se[i] );
    glen = glen + llen;
    for( j = 1; j <= llen; j++ ){
      b = se[i,j,1];
      if( b == "g" || b == "c" )
        gcnt++;
    }
  }
  SCORE = 1.0 * gcnt / glen;
  if( SCORE < .4 )
    REJECT;
}
Diagram labels: h5, ss, h3
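For readers who do not know the RNAMotif scoring language, the same G/C filter can be sketched in Python. This is an illustrative equivalent, not RNAMotif code; `gc_fraction`, `accept`, and the list-of-strings representation of the structure elements are assumptions.

```python
def gc_fraction(elements) -> float:
    """G/C fraction over all structure elements of a candidate.

    `elements` is a list of strings, one per structure element
    (for the hairpin above: the h5 strand, the ss loop, the h3 strand).
    """
    gcnt = 0
    glen = 0
    for element in elements:
        glen += len(element)
        gcnt += sum(1 for base in element.lower() if base in "gc")
    return gcnt / glen if glen else 0.0


def accept(elements, threshold: float = 0.4) -> bool:
    """Mirror of `if( SCORE < .4 ) REJECT;` in the score section."""
    return gc_fraction(elements) >= threshold
```

A candidate such as `["gggccc", "aaaa", "gggccc"]` has a G/C fraction of 0.75 and is kept, while an A/U-only hairpin is rejected.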
16
RNAMotif
Method: two-stage algorithm
Stage I: compilation
Analyzes the specific motif, called a descriptor, and converts it into a search tree based on the helical nesting of the motif
17
RNAMotif
Method: two-stage algorithm
Stage II: DFS
Depth-first search of the tree that was created by the compilation stage
Each time a complete solution to the descriptor is found, the candidate is passed to an optional score section for scoring and ranking
In the absence of a score section, the candidate is accepted
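The idea of a depth-first search over a nested descriptor can be sketched as follows. This is a simplified illustration under stated assumptions, not RNAMotif's implementation: the classes `SS` and `Helix`, the functions `match` and `search`, and the `max_span` parameter are all hypothetical, only strict Watson-Crick pairs are allowed, and helix `minlen` is assumed to be at least 1.

```python
from dataclasses import dataclass, field

WC = {("a", "u"), ("u", "a"), ("g", "c"), ("c", "g")}


@dataclass
class SS:
    """Single-stranded element with a length range."""
    minlen: int
    maxlen: int


@dataclass
class Helix:
    """Helix (h5 ... h3) whose `inner` elements are nested inside it."""
    minlen: int
    maxlen: int
    inner: list = field(default_factory=list)


def match(seq, elements, start, end):
    """Depth-first enumeration of assignments that tile seq[start:end]."""
    if not elements:
        if start == end:
            yield []
        return
    head, rest = elements[0], elements[1:]
    if isinstance(head, SS):
        for n in range(head.minlen, head.maxlen + 1):
            if start + n > end:
                break
            for tail in match(seq, rest, start + n, end):
                yield [("ss", start, n)] + tail
    else:
        for k in range(head.minlen, head.maxlen + 1):
            for close in range(start + 2 * k, end + 1):
                # 5' strand seq[start:start+k] must pair with seq[close-k:close].
                if all((seq[start + i], seq[close - 1 - i]) in WC for i in range(k)):
                    for inside in match(seq, head.inner, start + k, close - k):
                        for tail in match(seq, rest, close, end):
                            yield ([("h5", start, k)] + inside
                                   + [("h3", close - k, k)] + tail)


def search(seq, descriptor, max_span):
    """Try every window of up to max_span bases against the descriptor."""
    seq = seq.lower()
    for start in range(len(seq)):
        for end in range(start + 1, min(len(seq), start + max_span) + 1):
            for solution in match(seq, descriptor, start, end):
                yield start, end, solution
```

With `descriptor = [Helix(3, 3, [SS(4, 4)])]`, `search("aagggaaaacccuu", descriptor, max_span=12)` yields one solution covering positions 2 to 12. Each complete solution would then be handed to the scoring step, if one is defined.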
18
RNAProfile
Input:
Number of distinct hairpins the motif has to contain
Set of unaligned RNA sequences expected to share a common motif
19
RNAProfile
Output: regions that are most conserved throughout the sequences, according to:
Sequence of the regions
Secondary structure that can be formed according to base-pairing and thermodynamic rules
20
RNAProfile
Method: two phases
Phase I: extract from each input sequence a set of candidate regions whose predicted optimal secondary structure contains the number of hairpins given as input
Phase II: compare the selected regions with each other to find the group of most similar ones, formed by one region taken from each sequence
21
Method Summary
Whiffer: combines sequence and structure similarity. Very high specificity, so potential candidates may be ruled out.
RNAMotif: similarity based mostly on structural elements, according to the descriptor.
RNAProfile: similarity based on both sequence and structure. Recommended as a post-processing step.
22
The merge strategy
Input query: sequence and structure (bracket notation), e.g. (((..((((…)))).))
Flow: Input → Parsing → Whiffer, RNAMotif → Candidates
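The parsing step has to turn the bracket notation into something each tool can use. The slides do not describe how RiboSearch does this, so the following is only a sketch of the standard stack-based way to read bracket notation into base-pair positions. The function name `parse_dot_bracket` is an assumption, and the example uses its own short structure string rather than the one on the slide.

```python
def parse_dot_bracket(structure: str):
    """Return the sorted list of (i, j) paired positions (0-based)
    encoded by a dot-bracket string such as "((...))."."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)
```

From the pair list, runs of consecutive nested pairs give the stems and the unpaired stretches give the loops, which is the information a Whiffer pattern or an RNAMotif descriptor needs.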
23
Flow: Candidates → Filtering → RNAProfile (post-processing) → Final candidates
Filtering criteria:
The location is contained within a gene
The gene is relevant to the requested function (purine metabolism)
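The two filtering criteria can be expressed as a small function. This is a hedged sketch only: the `Gene` record, the `filter_candidates` function, and the keyword test on a free-text function annotation are assumptions, since the slides do not say how gene locations and functions were obtained or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record: coordinates plus a function note."""
    name: str
    start: int
    end: int
    function: str


def filter_candidates(candidates, genes, keyword: str = "purine"):
    """Keep candidates (start, end) that lie entirely within a gene
    whose function annotation mentions `keyword`."""
    kept = []
    for c_start, c_end in candidates:
        for gene in genes:
            inside = gene.start <= c_start and c_end <= gene.end
            if inside and keyword in gene.function.lower():
                kept.append((c_start, c_end, gene.name))
                break  # one matching gene is enough
    return kept
```

The surviving candidates are the ones that would be passed on to RNAProfile for post-processing.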
24
Flow: Final candidates → Sequence alignment → Biological experiments
25
Results – prokaryote: Bacillus halodurans
Merge RNAMotif Whiffer 7 4 Candidates 2 True positives 3 5 False positives False negatives
26
Results – eukaryote: Arabidopsis thaliana
Merge RNAMotif Run #2 Run #1 Whiffer - 70000 30 Candidates 11 17 Final candidates
27
Results – eukaryote: Arabidopsis thaliana
Most promising candidates
28
c2__ _
queryGBox CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACTAT 50
c2__ _ _ TTCAGGTC-CATCTTTGGCTAGACCGAAGTCAGATAATTTGGCGTTAT 47
* * * ** * * **** * * *** * ***
queryGBox G
c2__ _ _ AGTCCTGAA 56
29
c3_ _
c3_sequences GGATGAGGAACCAATTGACCCTGGATTTCAAGATT-TACAAAAGAACGTA 49
queryGBox CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTA 37
** *** **** ** *** * ****
c3_sequences AGCATCC
queryGBox AATGTCCGACTATG 51
* ***
30
RiboSearch - Conclusions
Filters false positives
Sequences are far less conserved within eukaryotes than within prokaryotes
The merge strategy is essential when searching eukaryotic genomes
31
Our thanks: Dr. Danny Barash, Adaya Cohen