Analysis Environments For Scientific Communities From Bases to Spaces Bruce R. Schatz Institute for Genomic Biology University of Illinois at Urbana-Champaign.

Presentation transcript:

Analysis Environments For Scientific Communities
From Bases to Spaces
Bruce R. Schatz
Institute for Genomic Biology, University of Illinois at Urbana-Champaign
Baker Center for Bioinformatics, Iowa State University
October 6, 2006

What are Analysis Environments?
Functional Analysis
- Find the underlying mechanisms of genes, behaviors, diseases
Comparative Analysis
- Top-down data mining (vs. bottom-up)
- Multiple sources, especially literature

Building Analysis Environments
Manual by Humans
- Interaction: user navigation
- Classification: collection indexing
Automatic by Computers
- Federation: search bridges
- Integration: results links

Trends in Analysis Environments
Central versus Distributed Viewpoints
- The 90s, Pre-Genome: Entrez (NIH NCBI) versus WCS (NSF Arizona)
- The 00s, Post-Genome: GO (NIH curators) versus BeeSpace (NSF Illinois)

Pre-Genome Environments
Focused on Syntax, pre-Web
WCS (Worm Community System)
- Search words across sources
- Follow links across sources
- Words automatic, links manual
Towards Integrated Searching
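The word search and manual links described on the slide above can be illustrated with a minimal Python sketch. This is not the WCS implementation; the source names, record ids, the placeholder gene name `gene-x`, and the `MANUAL_LINKS` table are all hypothetical and exist only to show the idea of automatic word matching combined with hand-curated links.

```python
# Hypothetical toy sources: each maps a record id to its text.
SOURCES = {
    "literature": {
        "abs1": "gene-x mutants show a muscle defect",
        "abs2": "cell lineage of the worm",
    },
    "genes": {"g1": "gene-x muscle gene"},
}

# Hypothetical hand-curated links between records (the "manual" part).
MANUAL_LINKS = {"abs1": ["g1"]}


def search_sources(sources, word):
    """Search one word across every source (the "automatic" part)."""
    word = word.lower()
    hits = {}
    for name, records in sources.items():
        matches = sorted(
            rid for rid, text in records.items() if word in text.lower().split()
        )
        if matches:
            hits[name] = matches
    return hits


def follow_links(record_id, links):
    """Follow curated links from one record to records in other sources."""
    return links.get(record_id, [])


hits = search_sources(SOURCES, "gene-x")
linked = follow_links("abs1", MANUAL_LINKS)
```

Here the same query word reaches both the literature and the gene records, while moving from an abstract to a gene record depends on a link that someone entered by hand.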

Post-Genome Environments
Focused on Semantics, post-Web
BeeSpace (Honey Bee Interspace)
- Navigate concepts across sources
- Integrate data across sources
- Concepts automatic, links automatic
Towards Conceptual Navigation

Worm Community System
WCS Information
- Literature: BIOSIS, MEDLINE, newsletters, meetings
- Data: genes, maps, sequences, strains, cells
WCS Functionality
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing
WCS: 250 users at 50 labs across the Internet (1991)

WCS Molecular

WCS Cellular

WCS invokes gm

WCS vis-à-vis acedb

Towards the Interspace
- From Objects to Concepts
- From Syntax to Semantics
Infrastructure is Interaction with Abstraction
- Internet is packet transmission across computers
- Interspace is concept navigation across repositories

THE THIRD WAVE OF NET EVOLUTION
PACKETS, OBJECTS, CONCEPTS

LEVELS OF INDEXES
[Figure labels: FORMAL (manual), INFORMAL (automatic); Technology, Engineering, Electrical, IEEE; communities, groups, individuals]

Post-Genome Informatics I
Comparative Analysis within the Dry Lab of Biological Knowledge
- Classical organisms have genetic descriptions. There will be NO more classical organisms beyond Mice and Men, Worms and Flies, Yeasts and Weeds.
- Must use comparative genomics on classical organisms, via sequence homologies and literature analysis.

Post-Genome Informatics II
Functional Analysis within the Dry Lab of Biological Knowledge
- Automatic annotation of genes to standard classifications, e.g. Gene Ontology, via homology on computed protein sequences.
- Automatic analysis of functions to scientific literature, e.g. concept spaces via text extractions.
- Thus must use functions in literature descriptions.
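As a rough illustration of annotation by homology, the Python sketch below copies Gene Ontology terms from well-characterized genes to a new gene whenever a similarity score passes a cutoff. The gene names, the scores, and the `min_score` threshold are hypothetical; a real pipeline would take its hits from a sequence-comparison tool and would weigh the evidence more carefully.

```python
# Hypothetical homology hits: new gene -> list of (known gene, similarity score).
HOMOLOGS = {
    "bee_gene_1": [("fly_gene_A", 0.82), ("worm_gene_B", 0.41)],
    "bee_gene_2": [("fly_gene_C", 0.35)],
}

# Hypothetical existing annotations on model-organism genes.
GO_ANNOTATIONS = {
    "fly_gene_A": {"GO:0007610"},  # behavior
    "worm_gene_B": {"GO:0006412"},  # translation
    "fly_gene_C": {"GO:0008152"},  # metabolic process
}


def transfer_annotations(homologs, annotations, min_score=0.5):
    """Copy GO terms from each sufficiently similar homolog to the new gene."""
    result = {}
    for gene, hits in homologs.items():
        terms = set()
        for known_gene, score in hits:
            if score >= min_score:  # assumed cutoff, chosen only for illustration
                terms |= annotations.get(known_gene, set())
        result[gene] = terms
    return result


annotated = transfer_annotations(HOMOLOGS, GO_ANNOTATIONS)
```

In this toy data `bee_gene_2` ends up with no terms at all, which is the situation where literature-based analysis of function has to fill the gap.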

Informatics: From Bases to Spaces
data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by Gene Ontology and linked to biological literature
information Spaces support biological literature
- e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions

BeeSpace FIBR Project
BeeSpace project is NSF FIBR flagship
- Frontiers in Integrative Biological Research, $5M for 5 years at University of Illinois
Analyzing Nature and Nurture in Societal Roles using honey bee as model (Functional Analysis of Social Behavior)
- Genomic technologies in wet lab and dry lab
- Bee [Biology]: gene expressions
- Space [Informatics]: concept navigations

System Architecture

Concept Navigation in BeeSpace

V1 BeeSpace Community Collections
- Organism: Honey Bee / Fruit Fly, Song Bird / Soy Bean
- Behavior: Social / Territorial, Foraging / Nesting
- Development: Behavioral Maturation, Insect Development, Insect Communication
- Structure: Fly Genetics / Fly Biochemistry, Fly Physiology / Insect Neurophysiology

CONCEPT SWITCHING
“Concept” versus “Term”
- A concept is a set of “semantically” equivalent terms
- Concept switching goes region to region (set to set)
[Figure labels: match term, Semantic region, Concept Space]
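One way to picture concept switching is the Python sketch below. It assumes, purely for illustration, that a concept space is built from term co-occurrence within documents; the slides do not specify the indexing method, and the documents and terms here are invented. A concept (a set of terms) from one collection is switched into another collection by matching shared terms and then expanding to the region of terms that co-occur with them there.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(documents):
    """Map each term to a Counter of the terms it co-occurs with in a document."""
    space = {}
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            space.setdefault(a, Counter())[b] += 1
            space.setdefault(b, Counter())[a] += 1
    return space


def switch_concept(concept, target_space, top_n=2):
    """Switch a concept (set of terms) into a region of the target space."""
    # "Match term" step: keep the terms that also exist in the target space.
    matched = sorted(t for t in concept if t in target_space)
    # Expand the matched terms to the region of terms that co-occur with them.
    region = Counter()
    for term in matched:
        region.update(target_space[term])
    for term in matched:
        region.pop(term, None)
    return matched, [t for t, _ in region.most_common(top_n)]


# Hypothetical target collection: each document is a list of extracted terms.
fly_docs = [
    ["foraging", "gene", "locomotion"],
    ["foraging", "gene", "larva"],
    ["foraging", "gene", "larva"],
]
fly_space = build_concept_space(fly_docs)

# A concept taken from a hypothetical bee collection.
matched, region = switch_concept({"foraging", "dance"}, fly_space)
```

Only "foraging" is shared between the two collections in this toy data, so the switch lands on the region around that term in the fly collection.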

BeeSpace Analysis Environment
Build Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases
Locate Candidate Genes in Related Literatures, then follow links into Genome Databases
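The steps listed above can be sketched end to end in a few lines of Python. This is a toy under stated assumptions, not the BeeSpace system: documents are partitioned by keyword overlap, single words stand in for extracted concepts, and the document records, keyword sets, and gene identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical documents, each with placeholder links into a gene database.
DOCUMENTS = [
    {"id": "doc1", "text": "foraging behavior in honey bee workers", "genes": ["gene_A"]},
    {"id": "doc2", "text": "insect development and juvenile hormone", "genes": ["gene_B"]},
    {"id": "doc3", "text": "foraging gene expression in fruit fly", "genes": ["gene_A", "gene_C"]},
]

# Hypothetical keyword sets that define each community collection.
COLLECTION_KEYWORDS = {
    "Behavior": {"foraging", "behavior"},
    "Development": {"development"},
}


def partition(documents, keywords):
    """Step 1: assign each document to every collection whose keywords it mentions."""
    collections = defaultdict(list)
    for doc in documents:
        words = set(doc["text"].split())
        for name, keys in keywords.items():
            if words & keys:
                collections[name].append(doc)
    return collections


def index_concepts(docs):
    """Step 2: index each word (a stand-in for a concept) to the documents using it."""
    index = defaultdict(set)
    for doc in docs:
        for word in doc["text"].split():
            index[word].add(doc["id"])
    return index


def candidate_genes(docs, index, concept):
    """Steps 3 and 4: go from a concept to its documents, then follow links to genes."""
    doc_ids = index.get(concept, set())
    return sorted({g for d in docs if d["id"] in doc_ids for g in d["genes"]})


collections = partition(DOCUMENTS, COLLECTION_KEYWORDS)
behavior_docs = collections["Behavior"]
behavior_index = index_concepts(behavior_docs)
genes = candidate_genes(behavior_docs, behavior_index, "foraging")
```

The final call mirrors the last line of the slide: candidate genes are located through the literature first, and only then resolved against a genome database.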

Well Characterized Gene

Poorly Characterized Gene

Gene Summarization, BeeSpace V2

Collaboration across Users

Category Browse (Collection)

Category Browse (Search)

PlantSpace Examples

Interactive Functional Analysis
BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system beyond a searchable database, using literature analyses to discover functional relationships between genes and behavior.
- Genes to Behaviors
- Behaviors to Genes
- Concepts to Concepts
- Clusters to Clusters
- Navigation across Sources

BeeSpace Information Sources
General for All Spaces:
Scientific Literature
- Medline, Biosis, CAB Abstracts
Genome Databases
- GenBank, ProteinDataBank, ArrayExpress
Special for BeeSpace:
Model Organisms (heredity)
- Gene Descriptions (FlyBase, WormBase)
Natural Histories (environment)
- BeeKeeping Books (Cornell, Harvard)

XSpace Information Sources
- Organize Genome Databases (XBase)
- Compute Gene Descriptions from Model Organisms
- Partition Scientific Literature for Organism X
- Compute XSpace using Semantic Indexing
Boost the Functional Analysis from Special Sources
- Collecting Useful Data about Natural Histories
- e.g. CowSpace: Leverage in AIPL Databases

Towards SoySpace
- Organize Genome Databases (SoyBase)
- Partition Scientific Literature for SoyBean
- Gene Descriptions from Models (TAIR)
- Natural Histories from Population Databases
Key to Functional Analysis is Special Sources
- Collecting Appropriate Text about Genes
- Extracting Adequate Data about Histories
Leverage is National Archives of germplasm and Historical Records for soybean crops

Towards the Interspace
The Analysis Environment technology is GENERAL!
BirdSpace? BeeSpace? PigSpace? CowSpace? BehaviorSpace? BrainSpace? SoySpace? PlantSpace? BioSpace … Interspace