An ontology-supported approach to predict automatically the proteases involved in the generation of peptides Mercedes Arguello Casteleiro 1, Julie Klein.

Slides:



Advertisements
Similar presentations
The use of Ontology in Organising and Managing Protein Family Resources Katy Wolstencroft, University Of Manchester.
Advertisements

RightField The Semantic Annotation of Experimental Data using Spreadsheets, The Semantic Annotation of Experimental Data using Spreadsheets, Katy Wolstencroft,
1 OWL Instance Data Evaluation Li Ding, Jiao Tao, and Deborah L. McGuinness Tetherless World Constellation Computer Science Department.
Annotating Gene Products to the GO Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse.
The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology.
BIOINFORMATICS Ency Lee.
Protein databases Morten Nielsen. Background- Nucleotide databases GenBank, National Center for Biotechnology Information.
Archives and Information Retrieval
Storing and Retrieving Biological Instances with the Instance Store Daniele Turi, Phillip Lord, Michael Bada, Robert Stevens.
1 Organic Molecules Carbohydrates These pictures will give you background help with *Objectives
What is an ontology and Why should you care? Barry Smith with thanks to Jane Lomax, Gene Ontology Consortium 1.
Introduction to Genomics, Bioinformatics & Proteomics Brian Rybarczyk, PhD PMABS Department of Biology University of North Carolina Chapel Hill.
M.W. Mak and S.Y. Kung, ICASSP’09 1 Conditional Random Fields for the Prediction of Signal Peptide Cleavage Sites M.W. Mak The Hong Kong Polytechnic University.
EBI is an Outstation of the European Molecular Biology Laboratory. UniProt Jennifer McDowall, Ph.D. Senior InterPro Curator Protein Sequence Database:
Comprehensive Annotation System for Infectious Disease Data Alexander Diehl University at Buffalo/The Jackson Laboratory IDO Workshop /9/2010.
My contact details and information about submitting samples for MS
Protein Sequence Analysis - Overview Raja Mazumder Senior Protein Scientist, PIR Assistant Professor, Department of Biochemistry and Molecular Biology.
Evgeny Zolin, School of Computer Science, University of Manchester, UK, Andrey Bovykin, Department of Computer Science, University.
BTN323: INTRODUCTION TO BIOLOGICAL DATABASES Day2: Specialized Databases Lecturer: Junaid Gamieldien, PhD
Predicting Missing Provenance Using Semantic Associations in Reservoir Engineering Jing Zhao University of Southern California Sep 19 th,
Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.
Provenance in my Grid Jun Zhao School of Computer Science The University of Manchester, U.K. 21 October, 2004.
Automatic methods for functional annotation of sequences Petri Törönen.
1 Yolanda Gil Information Sciences InstituteJanuary 10, 2010 Requirements for caBIG Infrastructure to Support Semantic Workflows Yolanda.
Protein Structure. Protein Structure I Primary Structure.
Computational Biology and Informatics Laboratory Development of an Application Ontology for Beta Cell Genomics Based On the Ontology for Biomedical Investigations.
Taverna in e-Lico  e-Lico is an EU Project ( ) to create a virtual laboratory for data mining and data-intensive sciences  Main partners: –University.
GO and OBO: an introduction. Jane Lomax EMBL-EBI What is the Gene Ontology? What is OBO? OBO-Edit demo & practical What is the Gene Ontology? What is.
Annotating Gene Products to the GO Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse.
© Copyright 2009 TopQuadrant Inc. Slide 1 TopQuadrant Metrics and QA Support TopBraid Suite Supporting the Complete Semantic Application Lifecycle.
Open Biomedical Ontologies. Open Biomedical Ontologies (OBO) An umbrella project for grouping different ontologies in biological/medical field –a repository.
Biological Databases By : Lim Yun Ping E mail :
IPAW'08 – Salt Lake City, Utah, June 2008 Exploiting provenance to make sense of automated decisions in scientific workflows Paolo Missier, Suzanne Embury,
Spreadsheets to OWL with Populous 8/12/2011 Mikel Egaña Aranguren 3205 School of Computer Science Universidad Politécnica de Madrid (UPM) Boadilla.
Common parameters At the beginning one need to set up the parameters.
Pantelis Topalis and Emmanuel Dialynas.  Ontology content  Data annotation with ontologies  Tools to handle and visualize ontologies OWL – OBO parsers.
Semantic Web Ontology Design Pattern Li Ding Department of Computer Science Rensselaer Polytechnic Institute October 3, 2007 Class notes for CSCI-6962.
The Gene Ontology project Jane Lomax. Ontology (for our purposes) “an explicit specification of some topic” – Stanford Knowledge Systems Lab Includes:
Cell Signaling Ontology Takako Takai-Igarashi and Toshihisa Takagi Human Genome Center, Institute of Medical Science, University of Tokyo.
DAVID R. SMITH DR. MARY DOLAN DR. JUDITH BLAKE Integrating the Cell Cycle Ontology with the Mouse Genome Database.
Pour mieux affirmer ses missions, le Cemagref devient Irstea Catherine ROUSSEY (Irstea), Jean-Pierre CHANET (irstea), Vincent CELLIER (INRA),
Alastair Kerr, Ph.D. WTCCB Bioinformatics Core An introduction to DNA and Protein Sequence Databases.
Delivering the Power of Net-Centric Data to Ensure Mission Success UCore Semantic Layer: A Logically Enhanced (OWL) Version of the UCore Taxonomy Session.
Protein Sequence Analysis - Overview - NIH Proteomics Workshop 2007 Raja Mazumder Scientific Coordinator, PIR Research Assistant Professor, Department.
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud Ali Hasnain et. al Insight Center for Data Analytics National University of Ireland,
Spaghetti: Visualization of Observed Peptides in Tandem Mass Spectrometry Steven Lewis 1, Terry Farrah 1, Eric W Deutsch 1, John Boyle 1 1 Institute for.
Glycan database. Database of molecules Two models (of vocabularies) – Proteins / Nucleic Acids Residues (+ modifications) Genbank / Swissprot – Compounds.
BBN Technologies Copyright 2009 Slide 1 The S*QL Plugin for Cytoscape Visual Analytics on the Web of Linked Data Rusty (Robert J.) Bobrow Jeff Berliner,
Scope of the Gene Ontology Vocabularies. Compile structured vocabularies describing aspects of molecular biology Describe gene products using vocabulary.
Answering Gene Ontology terms to proteomics questions by supervised macro reading in MEDLINE Julien Gobeill 1, Emilie Pasche 2, Douglas Teodoro 2, Anne-Lise.
Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation Bioinformatics, July 2003 P.W.Load,
©CMBI 2008 Databases Data must be in a certain format for software to recognize Every database can have its own format but some data elements are essential.
Protein sequence databases Petri Törönen Shamelessly copied from material done by Eija Korpelainen This also includes old material from my thesis
Tools in Bioinformatics Ontologies and pathways. Why are ontologies needed? A free text is the best way to describe what a protein does to a human reader.
The Cardiovascular Disease Ontology (CVDO) Mercedes Arguello Casteleiro 1, Julie Klein 2 and Robert Stevens 1 1 School of Computer Science, University.
` Comparison of Gene Ontology Term Annotations Between E.coli K12 Databases REDDYSAILAJA MARPURI WESTERN KENTUCKY UNIVERSITY.
Protein databases Henrik Nielsen
DReNIn_O “A high-level ontology for drug repositioning” Joseph Mullen
Enhanced Content Models
Data challenges in the pharmaceutical industry
George Demetriou1, Warren Read2, Goran Nenadic1,
Functional Annotation of Transcripts
Figure 6. Assessment of endosomal acidic insulinase activity after a gel-filtration HPLC protocol and immunodepletion procedures. Endosomal acidic insulinase.
PIR: Protein Information Resource
There are four levels of structure in proteins
Amino Acids An amino acid is any compound that contains an amino group (—NH2) and a carboxyl group (—COOH) in the same molecule.
Ontologies and Databases
Knowledge Representation Part VII Protégé / RDFS / OWL / ++
Semantic-Web, Triple-Strores, and SPARQL
Presentation transcript:

An ontology-supported approach to predict automatically the proteases involved in the generation of peptides Mercedes Arguello Casteleiro 1, Julie Klein 2 and Robert Stevens 1 1 School of Computer Science, University of Manchester, Oxford road, Manchester, United Kingdom. 2 Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Toulouse, France. And Université Toulouse III Paul-Sabatier, Toulouse, France.

Biomedical Background Complex diseases cannot be adequately described by single features: -Interindividual differences -Mechanism multiplicity why omics? Omics: studying “all” molecules collectively

Tools & Applications Peptides represent a new source of very useful biomarkers in kidney and other disease. Peptides can help better understanding the pathophysiological mechanisms : Parental proteins Proteases Protease Parent protein Peptide

Tools & Applications HELP ? Peptides represent a new source of very useful biomarkers in kidney and other disease. Peptides can help better understanding the pathophysiological mechanisms : Parental proteins Proteases

Proteasix Ontology (PxO) reuse ontologies

chemical entity molecular entity Alanine

Proteasix Ontology (PxO) reuse ontologies Human Rat Mouse

Proteasix Ontology (PxO) reuse ontologies molecular_function cellular_component biological_process

Proteasix Ontology (PxO) reuse ontologies Protein amino acid chain proteolytic cleavage product

Current Work: Proteasix Ontology (PxO) UniProt KB proteins (Swiss-prot and TrEMBL)  Organised by Taxons following PRO ontology  Annotated with GO biological_process; GO cellular_component; and GO molecular_function Model cleavage sites patterns for  Exopeptidase activity  Endopeptidase activity

For using Peptides as Biomarkers, we need more data.. and data linkage.. Class: polypeptide_region SubClassOf: biological_region part_of some protein, associated_with some 'proteolytic cleavage product', only_in_taxon some organism OWL (ontologies) OWL (ontologies)

In Swiss-Prot there are proteins annotated with GO: , which is serine-type endopeptidase activity SELECT ?x FROM { ?x rdf:type owl:Class; rdfs:subClassOf [ a owl:Restriction ; owl:onProperty pr:has_function ; owl:someValuesFrom obo:GO_ ]. } SPARQL (queries) SPARQL (queries) For using Peptides as Biomarkers, we need more data.. and data linkage..

Can we retrieve from Swiss-Prot ALL the proteins annotated with any DESCENDANT of Peptidase activity GO_ ? SPARQL (queries) SPARQL (queries) GRAPH {?C rdfs:subClassOf+ obo:GO_ }. For using Peptides as Biomarkers, we need more data.. and data linkage..

Can we retrieve from Swiss-Prot ALL the proteins annotated with any DESCENDANT of Peptidase activity GO_ ? SPARQL (queries) SPARQL (queries) For using Peptides as Biomarkers, we need more data.. and data linkage.. SELECT ?x FROM FROM NAMED { GRAPH {?C rdfs:subClassOf+ obo:GO_ }. { ?x rdf:type owl:Class; rdfs:subClassOf [ a owl:Restriction ; owl:onProperty pr:has_function ; owl:someValuesFrom ?C ]. } }

Concluding remarks TopFIND2 and Proteasix can help to automatically predict modification of protease activity Current work: exploring the benefits of using Proteasix ontology (PxO) in the Semantic-Web version of Proteasix Hard Limits: Prediction always need validation Data available (over-representation of some proteases, other are missing)

 Biomedical background  Comparison of Current Tools & Applications ACKNOWLEDGMENT: to Julie Klein for providing