BeeSpace Informatics: Interactive System for Functional Analysis Bruce Schatz Institute for Genomic Biology University of Illinois at Urbana-Champaign.

Presentation transcript:

BeeSpace Informatics: Interactive System for Functional Analysis Bruce Schatz Institute for Genomic Biology University of Illinois at Urbana-Champaign Fifth Annual Project Workshop IGB, Urbana IL May 22, 2009

Concept Navigation in BeeSpace

Informatics: From Bases to Spaces
- Data Bases support genome data, e.g. FlyBase has sequences and maps. Genes are annotated by Gene Ontology and linked to the biological literature.
- Information Spaces support biological literature, e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions.

System Architecture

System Versions
Version | Operation | Output | Components
V1 | Filter | Concept Graph | Search, Expand, Merge, Switch, Visualize
V2 | Cluster | Conceptual Groupings | Small Worlds (Natural), Language Model (Steerable), Concepts/Documents
V3 | Summarize | Gene Descriptions | Gene Extraction, Sentence Classification
V4 | Analyze | Functional Concepts | Concept Identification, Category Grouping
V5 | Answer | Entity Relationships | Entities, Relations, Templates

Informatics Researchers (Faculty)
Investigators:
- Bruce Schatz, systems (Medical Information Science)
- ChengXiang Zhai, algorithms (Computer Science)
Collaborators (students):
- Saurabh Sinha, Computer Science
- Jiawei Han, Computer Science
- Sheng Zhong, Bioengineering
- Nathan Price, Chemical & Biomolecular Engineering
Collaborators (advice):
- John MacMullen, Library & Information Science
- Dan Roth, Computer Science
- Roxana Girju, Linguistics
- Karrie Karahalios, Computer Science

Informatics Researchers (Staff)
V1-V3:
- Todd Littell, research programmer
- Jim Buell, research coordinator
- Nyla Ismail, biology postdoc
- Moushumi Sen Sarma, biology postdoc
V4-V5:
- David Arcoleo, research programmer
- Barry Sanders, research programmer
- Moushumi Sen Sarma, biology postdoc
- Radhika Khetani, biology postdoc

Informatics Researchers (Students)
- V1 Filter (parse): Jing Jiang, Azadeh Shakery, Yuanhua Lv
- V2 Cluster (group): Brant Chee, Qiaozhu Mei, Peixiang Zhao
- V3 Summarize (classify): Xu Ling, Jing Jiang, Qiaozhu Mei, Xin He
- V4 Analyze (annotate): Xin He, Brant Chee, Moushumi Sarma, Xu Ling
- V5 Answer (extract): Xu Ling, Xin He, Yanen Li, Yue Lu

Analysis Environment: Features
SPACE is a Paradigm, not a Metaphor!
Point of View for YOUR Problem
Externally:
- Dynamically describe custom Region of Space
- Merge Regions to form Hypothesis Space
- Differentially express genes against Space

Analysis Environment: System
Concepts and Genes are Universal Entities!
Uniformly Represented, Uniformly Manipulated
Internally:
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Genes from Documents into Databases

Automatic Categorization v2
- Sorting of Spaces based on Metadata
- Sorting of Spaces based on Ontology
  - MeSH for Medline Abstracts
  - Gene Ontology computed for documents
- Sorting of Spaces based on Clustering
  - Natural Maps from Small Worlds
  - Steerable Maps from Language Models
- Semantic Indexing of Dynamic Spaces
- Fast System enables Interactive Sorting!

Small World Graph

Semantics Deeper and Faster
- Semantic Indexing across all of Medline
  - Previous Attempts used Word Co-Occurrence
  - Now Phrase Parser works general-purpose
  - Now Mutual Information full differential
- Parallel Optimization of MI Graph
  - Real-time Computation
  - Shared Memory Cluster
  - Interactive on our 16PC 256GB RAM workerbee
- Dynamic Spaces then Dynamic Semantic Indexing
- Interactive Clustering Natural Map
  - Heuristic Approximation
  - Small Worlds Graphs
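To make the mutual-information idea on this slide concrete, here is a minimal sketch that scores phrase pairs by pointwise mutual information over document-level co-occurrence. The function name `pmi_graph`, the input format (one set of phrases per document), and the toy documents are assumptions for illustration; the slides do not describe the actual phrase parser, the "full differential" computation, or the parallel implementation.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_graph(docs):
    """Return {(a, b): PMI} for phrase pairs that co-occur in at least one document.

    docs is a list of phrase sets, one per document (hypothetical input format).
    """
    n = len(docs)
    phrase_count = Counter()
    pair_count = Counter()
    for phrases in docs:
        unique = sorted(set(phrases))          # sorted so each pair has one canonical order
        phrase_count.update(unique)
        pair_count.update(combinations(unique, 2))
    edges = {}
    for (a, b), n_ab in pair_count.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), probabilities estimated per document
        edges[(a, b)] = math.log2((n_ab * n) / (phrase_count[a] * phrase_count[b]))
    return edges

# Toy documents, invented for illustration only.
docs = [
    {"foraging", "juvenile hormone"},
    {"foraging", "juvenile hormone"},
    {"foraging", "dance"},
    {"dance", "waggle"},
]
edges = pmi_graph(docs)
for pair, score in sorted(edges.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs with positive scores co-occur more often than chance would predict and are natural candidates for edges in a concept graph; pairs with negative scores co-occur less often than chance.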

Dynamic Clustering

Automatic Curation v3
- Automatic Summarization of Genes
  - Retrieve relevant sentences about gene
  - Classify sentences into important aspects
    - protein domain, homolog/ortholog
    - expression pattern, phenotype function
    - regulatory element, genetic interaction
- Generalizing to Biology Entities
  - Genes, anatomical, behavior, chemical
  - Question answering from biology factoids
- Computed Curation from Literature

Gene Summary (FlyBase)
Figure labels: GP, EL, SI, GI, MP, WFPI

Gene Summary (BeeSpace)
Structured summary consists of relevant sentences covering 6 aspects of a gene:
- Gene Products (GP)
- Expression Location (EL)
- Sequence Information (SI)
- Wild-type Function & Phenotypic Information (WFPI)
- Mutant Phenotype (MP)
- Genetic Interaction (GI)
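The two steps behind a structured summary (retrieve sentences that mention the gene, then assign each to one of the six aspects) can be sketched as follows. This is only an illustration: the keyword lists, the function name `summarize_gene`, the placeholder gene `geneX`, and the sample sentences are all invented, and a keyword count stands in for whatever sentence classifier BeeSpace actually uses, which these slides do not specify.

```python
# Hypothetical keyword cues per aspect; not taken from BeeSpace or FlyBase.
ASPECT_KEYWORDS = {
    "GP": ["protein", "kinase", "encodes"],
    "EL": ["expressed in", "expression in", "localized"],
    "SI": ["sequence", "exon", "intron"],
    "WFPI": ["required for", "wild-type", "functions in"],
    "MP": ["mutant", "mutation", "allele"],
    "GI": ["interacts with", "suppressor", "enhancer of"],
}

def summarize_gene(gene, sentences):
    """Group sentences mentioning `gene` under the best-matching aspect."""
    summary = {aspect: [] for aspect in ASPECT_KEYWORDS}
    for sentence in sentences:
        low = sentence.lower()
        if gene.lower() not in low:
            continue                      # retrieval step: keep only sentences about the gene
        scores = {
            aspect: sum(keyword in low for keyword in keywords)
            for aspect, keywords in ASPECT_KEYWORDS.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] > 0:              # drop sentences that match no aspect at all
            summary[best].append(sentence)
    return summary

# Placeholder sentences, invented for illustration only.
sentences = [
    "geneX encodes a tyrosine kinase protein.",
    "geneX is expressed in the embryonic nervous system.",
    "Mutant alleles of geneX show axon defects.",
    "An unrelated sentence about another gene.",
]
summary = summarize_gene("geneX", sentences)
for aspect, hits in summary.items():
    print(aspect, hits)
```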

Drosophila gene Abelson (Abl) tyrosine kinase

Tribolium gene Scr

Gene Summarizer New Aspects
New categories (proposed by FlyBase curators):
- GP + SI => PS (protein domain or structure)
- SI => HO (homologs or orthologs)
- EL => EP (spatial/temporal expression patterns)
- SI => RE (regulatory element information)
- WFPI + MP => PF (wild-type or mutant phenotype and function)
- GI => IT (genetic or physical interaction)
- New (beyond FlyBase) => PG (population genetics)
Utilize cross-domain information to improve the Gene Summarizer (GS) on other organisms.

BeeSpace System v3
SPACES and REGIONS: Dynamic and Relative
- Space is a collection of documents
- Region is a collection of terms
- Extract creates new Region from old Space
- Map creates new Space from old Region
- New from Old Spaces and Regions via merges
- Summarize classifies Gene within Space
- Annotate finds differential functional expression
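The Space and Region operations above can be read as simple set operations. The sketch below assumes a toy corpus stored as a dictionary from document id to the terms it contains; the function names `extract`, `map_region`, and `merge`, the corpus, and the "any shared term" matching rule are illustrative assumptions rather than the BeeSpace implementation.

```python
def extract(corpus, space):
    """Extract: new Region (set of terms) from a Space (set of document ids)."""
    return set().union(*(corpus[doc_id] for doc_id in space))

def map_region(corpus, region):
    """Map: new Space made of every document sharing at least one term with the Region."""
    return {doc_id for doc_id, terms in corpus.items() if terms & region}

def merge(a, b):
    """Merge: union of two Spaces or of two Regions."""
    return a | b

# Toy corpus, invented for illustration only.
corpus = {
    "d1": {"foraging", "dance"},
    "d2": {"dance", "waggle"},
    "d3": {"brain", "hormone"},
}

space = {"d1"}
region = extract(corpus, space)          # terms of d1
new_space = map_region(corpus, region)   # d1 plus d2, which shares "dance"
merged = merge(new_space, {"d3"})
print(sorted(region), sorted(new_space), sorted(merged))
```

Alternating Extract and Map in this way is what makes Spaces and Regions "relative": each one is always defined in terms of the other.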

BeeSpace Semantic Operations
- Merge (S1, S2) into S3
- Summarize (S) into Gene classify

New Interface v4
Single Window, Multiple Panes
Space Panel, Service Tabs
- SPACES: custom, system
- FILTER: searching, sorting
- CLUSTER: map natural and steerable
- SUMMARIZE: categorize using space
- ANALYZE: annotate using space

Functional Analysis v4
The software system goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior. This research will enable all scientists who study bee genes to live on the frontier of integrative biology, where biotechnology enables routine expression analysis and bioinformatics enables functional analysis unconstrained by pre-existing categories.
Genelist Analyzer v4
- Differential Expression of Gene Names against Space
- Background is custom-made Literature Space
- Produces Concept List from Gene List
- Analyze using Concept Navigation and Gene Summarization
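One way to picture turning a gene list into a concept list is to compare how often each concept appears in documents that mention the listed genes against how often it appears in the whole background Space. The sketch below does this with a simple frequency ratio (lift). The scoring rule, the function name `concept_list`, the threshold `min_lift`, and the toy Space are assumptions; the slides do not state which statistic the Genelist Analyzer uses.

```python
from collections import Counter

def concept_list(gene_list, space, min_lift=1.0):
    """Rank concepts over-represented in documents mentioning any listed gene.

    space is a list of (genes_mentioned, concepts) pairs, one per document
    (hypothetical format for a background Literature Space).
    """
    genes = set(gene_list)
    background = Counter()
    foreground = Counter()
    n_foreground = 0
    for doc_genes, doc_concepts in space:
        background.update(set(doc_concepts))
        if genes & set(doc_genes):
            n_foreground += 1
            foreground.update(set(doc_concepts))
    if n_foreground == 0:
        return []                         # none of the genes occur in this Space
    n_background = len(space)
    scored = []
    for concept, count in foreground.items():
        # lift = document frequency among gene documents / document frequency overall
        lift = (count / n_foreground) / (background[concept] / n_background)
        if lift > min_lift:
            scored.append((concept, lift))
    return sorted(scored, key=lambda item: (-item[1], item[0]))

# Toy background Space, invented for illustration only.
space = [
    ({"g1"}, {"foraging", "learning"}),
    ({"g2"}, {"foraging"}),
    ({"g3"}, {"development"}),
    ({"g4"}, {"development", "learning"}),
]
ranked = concept_list(["g1", "g2"], space)
print(ranked)
```

Because the background is whatever Space the user built, the same gene list can yield different concept lists against different Spaces, which matches the slide's point that the background is custom-made.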

Question Answering v5
Entities and Relations, Question Answering templates
Entity:
- Gene, Anatomical
- Behavior, Chemical
Relation:
- Regulation (Gene-Gene)
- Expression (Gene-Anatomy)
- Function (Gene-Behavior): Biological Process
- Function (Gene-Chemical): Molecular Function
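The entity and relation types listed above suggest a small typed schema in which each relation constrains the kinds of entities it connects, and a question template is a lookup over extracted facts. The following sketch is one possible encoding; the class names, the `RELATION_SCHEMA` keys, the `answer` helper, and the placeholder facts are all hypothetical and are not drawn from the BeeSpace v5 code.

```python
from dataclasses import dataclass

# Relation kind -> (subject entity kind, object entity kind), following the slide.
RELATION_SCHEMA = {
    "regulation": ("gene", "gene"),
    "expression": ("gene", "anatomy"),
    "biological_process": ("gene", "behavior"),
    "molecular_function": ("gene", "chemical"),
}

@dataclass(frozen=True)
class Entity:
    name: str
    kind: str

@dataclass(frozen=True)
class Relation:
    kind: str
    subject: Entity
    obj: Entity

    def __post_init__(self):
        # Reject facts whose entity kinds do not match the schema for this relation.
        expected = RELATION_SCHEMA[self.kind]
        actual = (self.subject.kind, self.obj.kind)
        if actual != expected:
            raise ValueError(f"{self.kind} expects {expected}, got {actual}")

def answer(facts, kind, subject_name):
    """Template 'What does <subject> relate to via <kind>?' over a list of facts."""
    return [f.obj.name for f in facts if f.kind == kind and f.subject.name == subject_name]

# Placeholder facts, invented for illustration only.
gene_a = Entity("geneA", "gene")
gene_b = Entity("geneB", "gene")
facts = [
    Relation("expression", gene_a, Entity("brain", "anatomy")),
    Relation("regulation", gene_a, gene_b),
    Relation("biological_process", gene_b, Entity("foraging", "behavior")),
]
print(answer(facts, "expression", "geneA"))
```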

Towards the Interspace
The Analysis Environment technology is GENERAL!
BirdSpace? BeeSpace? PigSpace? CowSpace? ArthropodSpace? AnimalSpace? BioSpace? MedSpace?