BeeSpace Informatics Research: From Information Access to Knowledge Discovery
ChengXiang Zhai
Nov. 7, 2007

BeeSpace Technology: From V3 to V4
[Architecture diagram. Labels: Literature Search & Navigation; Query; Docs; Function Analysis; Genes; Function; Entities; Relations; ER Graph Mining; Question; Answers; Knowledge Base; Inference Engine; Expert Knowledge]

New Functions in V4
Massive Entity/Relation Extraction
Graph Indexing and Mining
Integration of Expert Knowledge & Reasoning
Personalization & Info/Knowledge Sharing
“Plug and Play” (PnP)

Massive Entity Recognition
Class 1: Small Variation (Dictionary/Ontology)
  – Organism, Anatomy, Biological Process, Pathway, Protein Family
Class 2: Medium Variation
  – Gene, cis-Regulatory Element
Class 3: Large Variation
  – Phenotype, Behavior
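A minimal sketch of how the Class 1 (small variation) case could be handled by plain dictionary lookup. The term lists, the `DICTIONARY` name, and the `tag_entities` function are illustrative assumptions, not part of BeeSpace; a real system would load terms from an ontology or a curated list.

```python
import re

# Hypothetical mini-dictionary, for illustration only.
DICTIONARY = {
    "Organism": ["Apis mellifera", "honey bee"],
    "Anatomy": ["mushroom body", "antenna"],
    "Biological Process": ["foraging", "olfactory learning"],
}


def tag_entities(text, dictionary=DICTIONARY):
    """Return sorted (start, end, surface, entity_type) tuples for dictionary terms in text."""
    matches = []
    for entity_type, terms in dictionary.items():
        for term in terms:
            # Whole-word, case-insensitive match of the literal term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                matches.append((m.start(), m.end(), m.group(0), entity_type))
    return sorted(matches)


if __name__ == "__main__":
    sentence = "Foraging in the honey bee depends on the mushroom body."
    for start, end, surface, entity_type in tag_entities(sentence):
        print(start, end, surface, entity_type)
```

Exact matching of this kind only suits entity types whose surface forms vary little; the medium- and large-variation classes (genes, phenotypes, behaviors) would need something beyond a fixed term list.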

Massive Relation Extraction
Expression Location
  – the expression of a gene in some location (tissues, body parts)
Homology/Orthology
  – one gene is homologous to another gene
Biological Process
  – one gene has some role in a biological process
Genetic/Physical/Regulatory Interaction
  – one gene interacts with another gene in a certain fashion (3 types of relations)
  – a simple case: Protein-Protein Interaction (PPI)

Entity Relation Graph Mining
The extracted entities and relations form a weighted graph
Need to develop techniques to mine the graph for knowledge
  – Store graphs
  – Index graphs
  – Mining algorithms (neighbor finding, path finding, entity comparison, outlier detection, frequent subgraphs, ...)
  – Mining language
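Two of the mining operations listed above, neighbor finding and path finding, can be sketched on a small in-memory weighted graph. This is only an illustration under stated assumptions: the `ERGraph` class and its method names are hypothetical, edge weights are assumed to be extraction confidences in (0, 1], and the "best" path is taken to be the one with the highest product of confidences (found with Dijkstra's algorithm on negative log weights).

```python
import heapq
import math
from collections import defaultdict


class ERGraph:
    """Undirected weighted graph; weights are assumed to be confidences in (0, 1]."""

    def __init__(self):
        self.adj = defaultdict(dict)

    def add_relation(self, a, b, confidence):
        self.adj[a][b] = confidence
        self.adj[b][a] = confidence

    def neighbors(self, entity):
        """Neighbor finding: entities linked to `entity`, strongest relation first."""
        return sorted(self.adj.get(entity, {}).items(), key=lambda kv: -kv[1])

    def best_path(self, source, target):
        """Path finding: maximize the product of confidences (Dijkstra on -log weights)."""
        dist = {source: 0.0}
        prev = {}
        heap = [(0.0, source)]
        while heap:
            d, node = heapq.heappop(heap)
            if node == target:
                break
            if d > dist.get(node, math.inf):
                continue  # stale heap entry
            for nxt, conf in self.adj.get(node, {}).items():
                nd = d - math.log(conf)
                if nd < dist.get(nxt, math.inf):
                    dist[nxt] = nd
                    prev[nxt] = node
                    heapq.heappush(heap, (nd, nxt))
        if target not in dist:
            return None
        path = [target]
        while path[-1] != source:
            path.append(prev[path[-1]])
        return path[::-1]


if __name__ == "__main__":
    g = ERGraph()
    g.add_relation("geneA", "geneB", 0.9)
    g.add_relation("geneB", "foraging", 0.8)
    g.add_relation("geneA", "foraging", 0.5)
    print(g.neighbors("geneA"))
    print(g.best_path("geneA", "foraging"))
```

An adjacency dictionary is enough for a sketch; storing and indexing graphs at literature scale, as the slide asks for, is a separate problem that this example does not address.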

Integration of Expert Knowledge
How can we combine expert knowledge with knowledge extracted from literature?
Possible strategies:
  – Interactive mining (human knowledge is used to guide the next step of mining)
  – Trainable programs (focused miners, each targeting a certain kind of knowledge)
  – Inference-based integration

Inference-Based Discovery
Encode all kinds of knowledge in the same knowledge representation language
Perform logic inferences
Example
  – Regulate(GeneA, GeneB, ContextC). [Literature mining]
  – SeqSimilar(GeneA, GeneA’). [Sequence mining]
  – Regulate(X, Y, C) ← Regulate(Z, Y, C) & SeqSimilar(X, Z). [Human knowledge]
  – ? Regulate(GeneA’, GeneB, ContextC)
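The example can be worked through with a small forward-chaining sketch. It assumes the rule reads "X regulates Y in context C if some sequence-similar gene Z does", and it assumes SeqSimilar is symmetric, which the slide does not state. The function name `infer_regulate` and the tuple encoding of facts are illustrative choices, not the BeeSpace inference engine.

```python
def infer_regulate(regulate, seq_similar):
    """Apply Regulate(X,Y,C) <- Regulate(Z,Y,C) & SeqSimilar(X,Z) until no new facts appear.

    regulate:    set of (regulator, target, context) facts, e.g. from literature mining
    seq_similar: set of (gene, gene) facts, e.g. from sequence mining
    """
    # Assumption: sequence similarity is symmetric, so add the reversed pairs.
    similar = set(seq_similar) | {(b, a) for a, b in seq_similar}
    derived = set(regulate)
    changed = True
    while changed:
        changed = False
        for (z, y, c) in list(derived):
            for (x, z2) in similar:
                if z2 == z and (x, y, c) not in derived:
                    derived.add((x, y, c))
                    changed = True
    return derived


if __name__ == "__main__":
    facts = {("GeneA", "GeneB", "ContextC")}   # [Literature mining]
    similar = {("GeneA", "GeneA'")}            # [Sequence mining]
    closure = infer_regulate(facts, similar)
    # The query from the slide: ? Regulate(GeneA', GeneB, ContextC)
    print(("GeneA'", "GeneB", "ContextC") in closure)
```

Under these assumptions the query succeeds, because GeneA’ is sequence-similar to GeneA and GeneA is already known to regulate GeneB in ContextC. A derived fact of this kind is a hypothesis to test, not an observation.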

Personalization & Workflow Management
Different users have different tasks → personalization
  – Tracking a user’s history and learning a user’s preferences
  – Exploiting the preferences to customize/optimize the support
  – Allowing a user to define/build special function modules
Workflow management

Information/Knowledge Sharing
Different users may perform similar tasks → information/knowledge sharing
  – Capturing user intentions
  – Recommending information/knowledge
  – How do we solve the problem of privacy?
Massive collaborations?
  – Each user contributes a small amount of knowledge
  – All the knowledge can be combined to infer new knowledge

Plug and Play
Users’ tasks vary significantly
Need flexible combinations of basic modules
Need to move toward a “discovery workbench”
  – How do we design basic modules?
  – How do we support synthesis of information and knowledge?

BeeSpace V4
[Architecture diagram. Labels, in slide order: Literature Search & Navigation; Text Mining; Entities; Relations; ER Graph Mining, Peixiang; Knowledge Base; Inference Engine, Yue, Xin, Bio; Expert Knowledge; Vertical Search Services; Xin Xu, PnP; Function Analyzers, Peixiang, Bio; Customized Knowledge Base; User; Yue]

Discussion
Task Model?
PnP Modules?
Massive Collaboration?