
1 Chani & Malki present: The OdzFinder
Project adviser: Dr. Ron Wides

2 WANTED
Name: Odz
a.k.a.: Ten-m
Family: pair-rule gene
Length: 10,000 bp

3 Getting to Know Odz …
• Discovered in D. melanogaster in 1994
• Belongs to the pair-rule gene family
• Plays a crucial role in the CNS during fetal development
• Odz protein is expressed in neurons, the developing brain and the hindgut
• Odz protein is expressed during segmentation

4 The Odz Family
Odz gene orthologs have been found in 3 phyla:
• Vertebrates: Ten-m1, Ten-m2, Ten-m3, Ten-m4
• Arthropods: Ten-a, Ten-m
• Nematodes

5 The Odz Protein
• 2731 amino acids
• The only pair-rule gene whose protein product is not a transcription factor!
• Contains 3 domains:
I. extracellular EGF-like repeats
II. tyrosine kinase phosphorylation sites
III. hydrophobic sequences, probably a transmembrane sequence
[Diagram: the ODZ protein, with the EGF-like domain and the intracellular kinase substrate domain labeled]

6 EGF-like Repeats
EGF-like domain:
• 30-40 amino acid residues
• Significant homology to epidermal growth factor (EGF)
• Has been found in single or multiple copies in a number of other proteins
• Generally found in the extracellular domain of membrane proteins or secreted proteins
• Involved in receptor-ligand interactions
• Includes 6 conserved cysteine residues involved in disulfide bonds
Pattern: x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x
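The EGF-like pattern on this slide can be turned into a regular expression and run against a protein sequence. The following is a minimal sketch in Python (the OdzFinder itself is a Perl program, and nothing here is taken from it). It assumes that `x` means any residue, `x(n)` and `x(n,m)` are repeat counts, and `a` stands for an aromatic residue, taken here to be F, W or Y. The function name `pattern_to_regex` and the test sequence are made up for illustration.

```python
import re

# Pattern exactly as written on the slide.
EGF_PATTERN = ("x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-"
               "x(2)-G-a-x(0,21)-G-x(2)-C-x")


def pattern_to_regex(pattern, aromatic="FWY"):
    """Translate the dash-separated pattern into a Python regex string.

    Assumption: 'a' means an aromatic residue (F, W or Y).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if m:
            low, high = m.group(1), m.group(2)
            if low is None:            # bare "x": exactly one residue
                parts.append(".")
            elif high is None:         # "x(4)": fixed count
                parts.append(".{%s}" % low)
            else:                      # "x(0,48)": range
                parts.append(".{%s,%s}" % (low, high))
        elif element == "a":
            parts.append("[%s]" % aromatic)
        else:                          # literal residue such as C or G
            parts.append(re.escape(element))
    return "".join(parts)


EGF_REGEX = re.compile(pattern_to_regex(EGF_PATTERN))

# Hypothetical sequence built by hand to satisfy the pattern.
toy_sequence = "AAAACPPCNGGCSCPCPPGFGKKCE"
has_egf = EGF_REGEX.search(toy_sequence) is not None
```

Because the spacers are so permissive, a pattern like this matches many unrelated cysteine-rich proteins, which is consistent with the EGF-driven false hits described later in the deck.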

7 The lab’s goals:
Genomics:
• Find a broad family of Odz genes
• Build phylogenetic trees to discover the segmentation mechanism
• Run a massive alignment to find conserved regions
• Perform biological in-vivo experiments that change those regions
Proteomics:
• The protein’s role
• How the protein functions
• The protein’s interactions with other proteins (e.g. Notch)

8 Finding Odz Genes
• BLASTing existing databases
• BLASTing new EST libraries
• Extracting DNA from various innocent creatures
[Diagram: Data Bases, EST Libraries and Sequences discovered in the lab feed into the Odz DataBase]

9 Odz Database
• The collected data was organized by Michal Markovitz in a relational database.
• The database consists of 10 different tables. For example:
[Example table not captured in the transcript]

10 2 problems remained:
1. Blast results include many non-Odz hits:
• prokaryotic hits
• non-metazoan hits
• EGF region hits
• low similarity
2. Every day…
• New sequences are added to the existing databases
• New EST libraries are released
We need a program to automatically extract Odz hits from NCBI Blast results!!!

11 The OdzFinder
A Perl program that will automatically extract Odz hits from NCBI Blast results.

12 S.O.F.T - screen Odz Flow Template
[Flowchart. Input: Blast Report, Tax Report. Decision nodes: Prokaryote?, Metazoan?, EGF? (No EGF / All EGF / Mixed EGF), Score>x?, Evalue>y?, Look up table, Combination. Outcome: Odz, followed by Update Database.]

13 Input: Blast Report
• BLASTs are performed on the Odz orthologs
• The results are sent to the OdzFinder program to be filtered.
• The program extracts relevant information from each hit:
>gi|163076235|gb|AC765764.7 Apis mellifera BAC clone RP11-18D7, complete sequence
Length = 184032
Score = 153 bits (328), Expect = 3e-36
Identities = 59/59 (100%), Positives = 59/59 (100%)
Frame = +3 / +3
Query: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179
         IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH
Sbjct: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179
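To make "extracts relevant information from each hit" concrete, here is a small Python sketch that pulls the main fields out of the sample hit on this slide. It is an illustration only, not the OdzFinder's Perl code. The field names, the `parse_hit` function, and the shortcut of taking the first two words of the description as the species are all assumptions.

```python
import re

# The sample hit from the slide (alignment midline and subject line omitted).
SAMPLE_HIT = """\
>gi|163076235|gb|AC765764.7 Apis mellifera BAC clone RP11-18D7, complete sequence
Length = 184032
Score = 153 bits (328), Expect = 3e-36
Identities = 59/59 (100%), Positives = 59/59 (100%)
Frame = +3 / +3
Query: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179
"""


def parse_hit(text):
    """Return a dict of fields from one hit in classic BLAST text format.

    Raises AttributeError if an expected field is missing.
    """
    header = re.search(r"^>gi\|(\d+)\|\w+\|(\S+)\s+(.*)$", text, re.M)
    score = re.search(r"Score\s*=\s*([\d.]+)\s*bits", text)
    expect = re.search(r"Expect\s*=\s*([\d.eE+-]+)", text)
    ident = re.search(r"Identities\s*=\s*(\d+)/(\d+)", text)
    query = re.search(r"^Query:\s*(\d+)\s+\S+\s+(\d+)", text, re.M)

    evalue_text = expect.group(1)
    if evalue_text.startswith("e"):    # BLAST sometimes prints "e-100"
        evalue_text = "1" + evalue_text

    description = header.group(3)
    return {
        "gi": header.group(1),
        "accession": header.group(2),
        # Naive assumption: the description starts with "Genus species".
        "species": " ".join(description.split()[:2]),
        "bit_score": float(score.group(1)),
        "evalue": float(evalue_text),
        "identity_pct": 100.0 * int(ident.group(1)) / int(ident.group(2)),
        "query_start": int(query.group(1)),
        "query_end": int(query.group(2)),
    }


hit = parse_hit(SAMPLE_HIT)
```

The query start and end positions are kept because the later slides compare them with the coordinates of the query's EGF-like region.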

14 Input: Blast Report, Tax Report
• Search for eukaryotic and metazoan results.
• Build a prokaryotic database for possible future use.
• Evolutionary distance becomes relevant when dealing with EGF-like repeats.
• The program will receive the BLAST hit’s Taxonomy Report and manipulate it into a manageable hash table.
• A default Taxonomy Report will be available when BLASTing against ESTs.
Taxonomy Report
Eukaryota .................. 2502 hits 41 orgs [root; cellular organisms]
. Bilateria ................ 2421 hits 33 orgs [Fungi/Metazoa group; Metazoa; Eumetazoa]
.. Coelomata ............... 2396 hits 31 orgs
... Deuterostomia .......... 2322 hits 23 orgs
.... Chordata .............. 2296 hits 22 orgs
..... Euteleostomi ......... 2236 hits 21 orgs [Craniata; Vertebrata; Gnathostomata; Teleostomi]
...... Tetrapoda ........... 2022 hits 14 orgs [Sarcopterygii]
....... Amniota ............ 1908 hits 12 orgs
........ Eutheria .......... 1634 hits 10 orgs [Mammalia; Theria]
Lineage of the sample hit (Apis):
root; cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Hymenoptera; Apocrita; Aculeata; Apoidea; Apidae; Apinae; Apini; Apis
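The taxonomy handling described on this slide might look roughly like the Python sketch below. It splits the semicolon-separated lineage into a list, classifies the hit as prokaryote, metazoan or other eukaryote, and uses the number of shared leading ranks as a crude measure of evolutionary closeness. The function names, the group labels and the shared-rank measure are all assumptions, and the abbreviated chordate lineage is assembled from the Taxonomy Report shown on the slide.

```python
# Lineage of the sample hit, as shown on the slide.
APIS_LINEAGE = (
    "root; cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; "
    "Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; "
    "Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; "
    "Pterygota; Neoptera; Endopterygota; Hymenoptera; Apocrita; Aculeata; "
    "Apoidea; Apidae; Apinae; Apini; Apis"
)

# Abbreviated chordate lineage pieced together from the Taxonomy Report.
CHORDATE_LINEAGE = (
    "root; cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; "
    "Eumetazoa; Bilateria; Coelomata; Deuterostomia; Chordata"
)


def parse_lineage(text):
    """Split a semicolon-separated lineage into a list of taxon names."""
    return [name.strip() for name in text.split(";") if name.strip()]


def classify_lineage(lineage):
    """Coarse grouping used for filtering (labels are hypothetical)."""
    names = set(lineage)
    if names & {"Bacteria", "Archaea"}:
        return "prokaryote"
    if "Metazoa" in names:
        return "metazoan"
    if "Eukaryota" in names:
        return "non-metazoan eukaryote"
    return "unknown"


def shared_depth(lineage_a, lineage_b):
    """Count the leading ranks two lineages share (more = closer)."""
    depth = 0
    for a, b in zip(lineage_a, lineage_b):
        if a != b:
            break
        depth += 1
    return depth


# The "hash table": one lineage list per organism.
lineages = {
    "Apis": parse_lineage(APIS_LINEAGE),
    "Chordata": parse_lineage(CHORDATE_LINEAGE),
}
group = classify_lineage(lineages["Apis"])
closeness = shared_depth(lineages["Apis"], lineages["Chordata"])
```

Here the two lineages agree up to Coelomata and split at Protostomia versus Deuterostomia.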

15 Tenascin-m (odz) includes 8 EGF-like repeats
• The conserved EGF region gave problematic results.
• Many hits appear only due to their similarity to the EGF region.
[Diagram: Query and Subject aligned only over the EGF region ("EGF?"), yet with a high score]

16 There are three possible positions regarding the hit’s relation to the query’s EGF-like region:
I. The hit is completely outside the query’s EGF-region
II. The hit is completely inside the query’s EGF-region
III. The hit is partially in the query’s EGF-region
[Diagrams: Query and Hit bars for each case; the labels 525 and 804 mark the query’s EGF-region]
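The three cases reduce to a comparison of two intervals on the query. The Python sketch below uses the flowchart's labels (No EGF, All EGF, Mixed EGF) for the outcomes. The default region boundaries 525 and 804 are read off the diagram labels on this slide and should be treated as an assumption. The function name is hypothetical.

```python
def classify_egf_position(hit_start, hit_end, egf_start=525, egf_end=804):
    """Classify a hit by where it falls relative to the query's EGF region.

    Coordinates are inclusive positions on the query.
    """
    if hit_end < egf_start or hit_start > egf_end:
        return "No EGF"        # completely outside the EGF-like region
    if hit_start >= egf_start and hit_end <= egf_end:
        return "All EGF"       # completely inside the EGF-like region
    return "Mixed EGF"         # straddles at least one boundary


examples = {
    (3, 179): classify_egf_position(3, 179),
    (600, 700): classify_egf_position(600, 700),
    (500, 600): classify_egf_position(500, 600),
}
```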

17 Get a better picture..

18 Position I: The hit is completely outside the query’s EGF-like region
• Score & e-value are examined
• Low thresholds are set to ensure that very small hits are not missed; sometimes they are translocations
[Flowchart labels: No EGF, Score>x?, Evalue<y?, yes, Odz]
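The Position I check is a pair of threshold tests. In the Python sketch below, the function name and the default values standing in for x and y are placeholders: the slides do not give the actual thresholds.

```python
def passes_thresholds(bit_score, evalue, min_score=40.0, max_evalue=1e-5):
    """Accept a hit when Score > x and Evalue < y.

    min_score (x) and max_evalue (y) are hypothetical placeholder values.
    """
    return bit_score > min_score and evalue < max_evalue


# The sample hit shown earlier: Score = 153 bits, Expect = 3e-36.
accepted = passes_thresholds(153.0, 3e-36)
```

Keeping `min_score` low and `max_evalue` permissive follows the slide's point that short but genuine hits should not be discarded.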

19 Position II: The hit is completely inside the query’s EGF-like region
To prevent acceptance of non-Odz hits that score highly only because of their EGF region, a look up table was established:
• evolutionarily close query & subject: high id % demanded
• evolutionarily distant query & subject: low id % demanded
Look up table example:
Query          Hit                       Odz Ortholog   Odz Paralog
Mus musculus   Homo sapiens              95%            70%
Mus musculus   Drosophila melanogaster   75%            55%
[Flowchart labels: All EGF, Score>x?, Evalue>y?, Look up table, yes, Odz]
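One way to read the look up table is as a map from a (query, hit) species pair to the identity percentages demanded for an ortholog and for a paralog. The Python sketch below encodes the two example rows from the slide. The function name, the return labels and the rule "at or above the threshold passes" are assumptions about how the table is applied.

```python
# (query species, hit species) -> minimum identity % demanded.
# Values are the two example rows from the slide.
IDENTITY_LOOKUP = {
    ("Mus musculus", "Homo sapiens"): {"ortholog": 95.0, "paralog": 70.0},
    ("Mus musculus", "Drosophila melanogaster"): {"ortholog": 75.0, "paralog": 55.0},
}


def classify_all_egf_hit(query_species, hit_species, identity_pct):
    """Decide whether a hit lying entirely in the EGF region looks like Odz."""
    thresholds = IDENTITY_LOOKUP.get((query_species, hit_species))
    if thresholds is None:
        return "unknown pair"          # no row for this species pair
    if identity_pct >= thresholds["ortholog"]:
        return "Odz ortholog"
    if identity_pct >= thresholds["paralog"]:
        return "Odz paralog"
    return "not Odz"


call = classify_all_egf_hit("Mus musculus", "Drosophila melanogaster", 60.0)
```

The same 60% identity is accepted as a paralog for the distant mouse and fly pair but rejected for the close mouse and human pair, which is the behaviour the slide asks for.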

20 Position III: The hit is partially inside the query’s EGF-like region
2 possibilities:
A. False call! An EGF hit with insignificant similarity outside of EGF domains. Treat like II.
B. The Real Thing! EGF with adjacent regions of significant similarity. Treat like I.
Is it more like A or like B?
[Flowchart label: Mixed EGF]
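The slide does not say how "more like A or like B" is decided. As one possible heuristic, the Python sketch below measures how much of the hit lies outside the EGF-like region and routes the hit to the Position I or Position II rules accordingly. Using length as a proxy for "significant similarity", the `min_outside` cutoff of 30 positions, the 525 to 804 region boundaries and the function names are all assumptions.

```python
def outside_length(hit_start, hit_end, egf_start=525, egf_end=804):
    """Number of hit positions (inclusive coordinates) outside the EGF region."""
    total = hit_end - hit_start + 1
    overlap = max(0, min(hit_end, egf_end) - max(hit_start, egf_start) + 1)
    return total - overlap


def resolve_mixed_hit(hit_start, hit_end, min_outside=30):
    """Route a Mixed EGF hit to the rules for Position I or Position II.

    min_outside is a hypothetical cutoff, not a value from the slides.
    """
    if outside_length(hit_start, hit_end) >= min_outside:
        return "treat like I"     # case B: substantial non-EGF flank
    return "treat like II"        # case A: essentially an EGF-only hit


short_flank = resolve_mixed_hit(500, 600)   # 25 positions outside
long_flank = resolve_mixed_hit(400, 600)    # 125 positions outside
```

A fuller version would score the similarity of the flanking segment itself instead of only counting its length.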

21 Update Database: data flow through DBI
DBI:
• A database interface module for Perl
• Enables Perl applications to access multiple database types
• Provides a consistent database interface independent of the actual database being used
[Diagram: Perl Script → DBI → DBD::mysql → MySQL RDBMS]
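The OdzFinder does this step with Perl's DBI and a MySQL driver. As an analogous sketch only, the Python code below uses the standard DB-API, which plays the same role of a driver-independent interface, with the built-in `sqlite3` module standing in for MySQL. The table name `odz_hits` and its columns are hypothetical. The single row is the sample Apis mellifera hit from the Blast Report slide.

```python
import sqlite3


def store_hits(conn, hits):
    """Insert (gi, species, score) rows, replacing rows with the same gi.

    Table and column names are hypothetical, not the lab's schema.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS odz_hits ("
        "gi TEXT PRIMARY KEY, species TEXT, score REAL)"
    )
    # Placeholders keep the values out of the SQL text itself.
    conn.executemany(
        "INSERT OR REPLACE INTO odz_hits (gi, species, score) VALUES (?, ?, ?)",
        hits,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")   # in-memory stand-in for the MySQL server
store_hits(conn, [("163076235", "Apis mellifera", 153.0)])
rows = conn.execute("SELECT species, score FROM odz_hits").fetchall()
```

Because the calling code only talks to the generic interface, swapping the backing database means changing the connection line and, where SQL dialects differ, statements such as `INSERT OR REPLACE`.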

22 Results!
species                   score   gi
Xenopus                   140     49256537
Apis mellifera            637     48096180
Gallus gallus             619     45382362
Homo sapiens              125     42658224
Rattus norvegicus         384     34932761
Mus musculus              463     38087011
Drosophila melanogaster   419     45446084
Caenorhabditis elegans    1604    32565715
Gasterosteus aculeatus    760     41469033

23 Special thanks to our project adviser Dr. Ron Wides For his guidance, patience & Krispy Kreme donuts

