Information Representation Working Group WG Meeting September 5, 2008.


Agenda
- ICR F2F
- Announcements
- Review point after September 30
- Results of review of “things”/“object classes” by participants
- Next Steps/Tasks

ICR F2F September 23-24, 2008
- Day 1: Tuesday, September 23rd, 2008, 8:00 AM - 6:00 PM
- Day 2: Wednesday, September 24th, 2008, 8:00 AM - 3:00 PM
- The Broad Institute of MIT and Harvard in Cambridge, MA
- More info: Sept08
- Elaine and Li for additional comments on IRWG participant roles

Announcements
- Time change
  - First Friday 3-4pm EST (Telecon info?)
  - Third Friday 2-3pm EST
- Charter review point after Sept 30: October 3 meeting
- Draft DAM deadline for F2F no longer valid
- Align charter deadlines with due dates: Oct 27 and Feb 6
- More items

Proposals for Methods
- CDE-centric approach (Bottom-up)
  - Use caDSR
  - Identify heavily used CDEs using caDSR => Identify “characteristics” first
  - Include standard CDEs
  - Build model based on CDE-related Object Classes and their associations
  - Preserve CDE mappings
- Object Class-centric approach (Top-down)
  - Use caDSR
  - Identify heavily used object classes using caDSR => Identify “things” first
  - Build model in order of Object Classes, associations and attributes
  - Use standard CDEs as applicable
  - Preserve CDE mappings
- Model-centric approach
  - Use information models
  - Use classes and manually curate
  - Use standard CDEs as applicable
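The difference between the bottom-up and top-down proposals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CDE` record, its `usage_count` field, the sample values, and all function names are hypothetical and do not come from caDSR or from the slides. The sketch assumes only that each CDE pairs an object class (a "thing") with a property (a "characteristic").

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CDE:
    """Hypothetical, simplified stand-in for a common data element."""
    object_class: str  # the "thing", e.g. "Gene"
    prop: str          # the "characteristic", e.g. "Symbol"
    usage_count: int   # made-up reuse count, for illustration only


# Illustrative records only; not real caDSR content or usage figures.
SAMPLE_CDES = [
    CDE("Gene", "Symbol", 12),
    CDE("Gene", "Name", 7),
    CDE("Protein", "Sequence", 9),
    CDE("Specimen", "Type", 3),
    CDE("Protein", "Name", 2),
]


def heavily_used_cdes(cdes, min_usage):
    """Bottom-up step: keep CDEs used at least min_usage times, most used first."""
    kept = [c for c in cdes if c.usage_count >= min_usage]
    return sorted(kept, key=lambda c: c.usage_count, reverse=True)


def classes_from_cdes(cdes):
    """Group selected CDEs by object class, preserving the CDE-to-class mapping."""
    model = {}
    for c in cdes:
        model.setdefault(c.object_class, []).append(c.prop)
    return model


def object_classes_by_usage(cdes):
    """Top-down step: total usage per object class, most used first."""
    totals = Counter()
    for c in cdes:
        totals[c.object_class] += c.usage_count
    return totals.most_common()


# Bottom-up: pick characteristics first, then derive the classes they imply.
bottom_up_model = classes_from_cdes(heavily_used_cdes(SAMPLE_CDES, min_usage=7))

# Top-down: rank the things first; attributes would be attached afterwards.
top_down_ranking = object_classes_by_usage(SAMPLE_CDES)
```

In this toy data the two routes agree on the leading classes, but they need not: a class with many rarely used CDEs can rank high top-down while contributing nothing bottom-up.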

Results of review of “things”/“object classes” by participants
Most to least popular:
Chromosome Protein Organization Person Gene Ontology Microarray Nucleotide Sequence Messenger RNA Organism Image Protocol Address Gene Protein Sequence Protocol Single Nucleotide Polymorphism Participant

Results of review of “things”/“object classes” by participants
Biochemical Pathway Scientific Publication Protein Biomarker Gene Expression Microarray Reporter Exon DNA Specimen GenBank Accession Number Specimen Transcript

Results of review of “things”/“object classes” by participants
Pathway UniProtKB Accession Number Data File Protein Domain Species Single Nucleotide Polymorphism Assay Online Mendelian Inheritance in Man Protein Alternative Name Scientific Publication Source UniProtKB Primary Accession Number Genomic Identifier Principal Component Analysis Anatomic Site Single Nucleotide Polymorphism Annotation Messenger RNA Genomic Identifier Orthologous Gene Peptide Molecular Genetic Abnormality Gene Alias Nucleic Acid Hybridization Variation Reporter Experiment Chromosome Location Microarray Reporter Population Group Protein Genomic Identifier Nucleic Acids Biospecimen Gene Biomarker Disease or Disorder Term Exon Microarray Reporter Molecular Specimen Homologous Gene Single Nucleotide Polymorphism Microarray Reporter Histology Database Cross-Reference Gene Genomic Identifier Procedure Study Intron

Results of review of “things”/“object classes” by participants
“Things” of interest based on results received so far (from Elaine):
- Protocol/study/procedure/treatment/experiment
- Data/finding/evidence
- Protein, Gene, RNA
- Pharmacologic substance/compound
- Chromosome/Gene Location
- SNP
- Disease/disorder/phenotype
- Biomarker
- Pathway/Biochemical Pathway
- Publication/Reference
- GO Term/Gene Ontology
- Anatomic site/tissue
- Microarray
- Assay
- Reporter/probe
- Measurements/Quantitation
- Investigator/research personnel/site investigator
- Organization/Institution/Clinical trial site
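The slash-separated groupings above treat several terms as one "thing". A minimal sketch of how such groupings could be used to tally participant responses follows. Treating the first term of each group as the canonical label, the function names, and the sample votes are all assumptions made for this illustration, not part of the working group's process.

```python
from collections import Counter

# A few groupings copied from the slide. Using the first term of each
# group as the canonical label is an assumption of this sketch.
GROUPS = [
    "Protocol/study/procedure/treatment/experiment",
    "Data/finding/evidence",
    "Disease/disorder/phenotype",
    "Pathway/Biochemical Pathway",
    "Publication/Reference",
    "Reporter/probe",
]


def build_synonym_index(groups):
    """Map every term (lowercased) to the first term of its group."""
    index = {}
    for group in groups:
        terms = [t.strip() for t in group.split("/") if t.strip()]
        canonical = terms[0]
        for term in terms:
            index[term.lower()] = canonical
    return index


def tally(votes, index):
    """Count votes per canonical term; unknown terms are counted as given."""
    counts = Counter()
    for vote in votes:
        cleaned = vote.strip()
        counts[index.get(cleaned.lower(), cleaned)] += 1
    return counts


index = build_synonym_index(GROUPS)
# Hypothetical responses, for illustration only.
result = tally(["Study", "experiment", "Probe", "Gene"], index)
```

Here "Study" and "experiment" both count toward "Protocol", "Probe" counts toward "Reporter", and "Gene" is left unchanged because it belongs to none of the listed groups.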

Results of review of “things”/“object classes” by participants
Comments from Sue:
- It is much easier to make the decision if we discuss the use cases from the sub-domains that were selected and how the use cases from the different sub-domains interact with each other, and perhaps decide on one set of core use cases for the DAM from which we could select the core sets of domain objects.
- Apart from the core domain objects, for the LIMS-related objects (Storage, Container, Laboratory, etc.), I think we should reference what caLIMS2 provides. Even though it is at version 0.5 and has not been submitted for silver-level review, it already includes an extensive study of objects used within a lab setting.
- Remove the object classes that are somewhat implementation-specific, e.g. Identifiable Class, Extendable Class, Describable Class, Parameterizable Application.
- For object classes that are similar in nature, can we integrate them? E.g. Measurement, Unit, Data, Data Set, Biology Data Cube, Derived Bioassay Class Data, Measured Bioassay Data, Value, etc.
- From a drug-discovery pipeline perspective, the DAM focuses heavily on the target discovery and validation aspect (gene, protein, pathway, disease, animal models, microarray, etc.) and is missing subdomains that cover hit identification and assay development. Is this on the list of future directions? I know caNanoLab discusses in vitro and in vivo assays from a nanoparticle perspective, but I am not sure whether there are other models that we could already make use of.

Next Steps/Tasks