Published by Norma Campbell. Modified over 9 years ago.
1
BeeSpace Informatics Research: From Information Access to Knowledge Discovery
ChengXiang Zhai
Nov. 14, 2007
2
BeeSpace Technology: From V3 to V4
[Diagram. Labels: Literature Search & Navigation; Query; Docs; Function Analysis; Genes; Function; Entities; Relations; ER Graph Mining; Knowledge Base; Inference Engine; Expert Knowledge; Question/Answers (shown twice)]
3
New Functions in V4
Massive Entity/Relation Extraction
Graph Indexing and Mining
Integration of Expert Knowledge & Reasoning
Personalization & Info/Knowledge Sharing
“Plug and Play” (PnP)
4
Massive Entity Recognition
Class 1: Small Variation (Dictionary/Ontology)
– Organism, Anatomy, Biological Process, Pathway, Protein Family
Class 2: Medium Variation
– Gene, cis Regulatory Element
Class 3: Large Variation
– Phenotype, Behavior
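The Class 1 case (small surface variation, handled with a dictionary or ontology) can be illustrated with a minimal lookup tagger. This is only a sketch under stated assumptions: the dictionary entries, the function name `tag_entities`, and the longest-term-first matching policy are all hypothetical and are not taken from the BeeSpace system.

```python
import re

# Hypothetical mini-dictionary (surface form -> entity class).
# A real system would load these from an ontology; the entries are illustrative.
DICTIONARY = {
    "apis mellifera": "Organism",
    "drosophila melanogaster": "Organism",
    "mushroom body": "Anatomy",
    "olfactory learning": "Biological Process",
}

def tag_entities(text, dictionary=DICTIONARY):
    """Return sorted (start, end, surface, entity_class) tuples for dictionary hits."""
    hits = []
    # Try longer terms first so a multi-word name wins over a shorter overlapping one.
    for term in sorted(dictionary, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Skip a match that overlaps a hit we already accepted.
            if any(m.start() < end and start < m.end() for start, end, _, _ in hits):
                continue
            hits.append((m.start(), m.end(), m.group(0), dictionary[term]))
    return sorted(hits)

sentence = "Olfactory learning in Apis mellifera depends on the mushroom body."
for start, end, surface, entity_class in tag_entities(sentence):
    print(f"{surface!r} [{start}:{end}] -> {entity_class}")
```

Exact lookup of this kind only suits the small-variation classes; the Class 2 and Class 3 entities on the slide (genes, phenotypes, behaviors) vary too much in surface form for a plain dictionary match.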
5
Massive Relation Extraction
Expression Location
– the expression of a gene in some location (tissues, body parts)
Homology/Orthology
– one gene is homologous to another gene
Biological process
– one gene has some role in a biological process
Genetic/Physical/Regulatory Interaction
– one gene interacts with another gene in a certain fashion (3 types of relations)
– a simple case: Protein-Protein Interaction (PPI)
6
Entity Relation Graph Mining
The extracted entities and relations form a weighted graph
Need to develop techniques to mine the graph for knowledge
– Store graphs
– Index graphs
– Mining algorithms (neighbor finding, path finding, entity comparison, outlier detection, frequent subgraphs, …)
– Mining language
7
Integration of Expert Knowledge
How can we combine expert knowledge with knowledge extracted from literature?
Possible strategies:
– Interactive mining (human knowledge is used to guide the next step of mining)
– Trainable programs (focused miners, each targeting a certain kind of knowledge)
– Inference-based integration
8
Inference-Based Discovery
Encode all kinds of knowledge in the same knowledge representation language
Perform logic inferences
Example
– Regulate(GeneA, GeneB, ContextC). [Literature mining]
– SeqSimilar(GeneA, GeneA’) [Sequence mining]
– Regulate(X,Y,C) ← Regulate(Z,Y,C) & SeqSimilar(X,Z) [Human knowledge]
– ⇒ Regulate(GeneA’, GeneB, ContextC)
– ADD: InPathway(GeneB, P1)
– InPathway(X,P) ← Regulate(X,Y,C) & InPathway(Y,P) [Human knowledge]
– ⇒ InvolvedInPathway(GeneA’, P1)
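The example above can be reproduced with a naive forward-chaining loop. The following is a minimal sketch, not the BeeSpace inference engine. It rests on several assumptions: rules are read as head ← body, variables are written with a leading `?` (a convention chosen here), and an explicit symmetry rule for SeqSimilar is added because the slide states SeqSimilar(GeneA, GeneA’) while the rule needs SeqSimilar(GeneA’, GeneA). The slide labels the final conclusion InvolvedInPathway; the sketch derives it under the rule's own head predicate, InPathway. Uncertainty is not modeled.

```python
def unify(pattern, fact, binding):
    """Match one rule atom against one ground fact; return the extended binding or None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):            # variable: bind it or check consistency
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:                     # constant: must match exactly
            return None
    return binding

def match_body(body, facts, binding):
    """Yield every binding that satisfies all atoms of a rule body."""
    if not body:
        yield binding
        return
    for fact in facts:
        extended = unify(body[0], fact, binding)
        if extended is not None:
            yield from match_body(body[1:], facts, extended)

def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    facts = set(facts)
    while True:
        new = set()
        for head, body in rules:
            for b in match_body(body, facts, {}):
                derived = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

FACTS = [
    ("Regulate", "GeneA", "GeneB", "ContextC"),   # literature mining
    ("SeqSimilar", "GeneA", "GeneA'"),            # sequence mining
    ("InPathway", "GeneB", "P1"),                 # the "ADD" fact
]
RULES = [
    (("Regulate", "?X", "?Y", "?C"),
     [("Regulate", "?Z", "?Y", "?C"), ("SeqSimilar", "?X", "?Z")]),
    (("SeqSimilar", "?X", "?Z"), [("SeqSimilar", "?Z", "?X")]),  # assumed symmetry
    (("InPathway", "?X", "?P"),
     [("Regulate", "?X", "?Y", "?C"), ("InPathway", "?Y", "?P")]),
]

closure = forward_chain(FACTS, RULES)
print(("Regulate", "GeneA'", "GeneB", "ContextC") in closure)
print(("InPathway", "GeneA'", "P1") in closure)
```

The loop re-evaluates every rule against every fact on each pass, which is fine for a handful of facts but would need indexing to scale to a mined knowledge base.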
9
Personalization & Workflow Management
Different users have different tasks → personalization
– Tracking a user’s history and learning a user’s preferences
– Exploiting the preferences to customize/optimize the support
– Allowing a user to define/build special function modules
Workflow management
10
Information/Knowledge Sharing
Different users may perform similar tasks → information/knowledge sharing
– Capturing user intentions
– Recommending information/knowledge
– How do we solve the problem of privacy?
Massive collaborations?
– Each user contributes a small amount of knowledge
– All the knowledge can be combined to infer new knowledge
11
Plug and Play
Users’ tasks vary significantly
Need flexible combinations of basic modules
Need to move toward a “discovery workbench”
– How do we design basic modules?
– How do we support synthesis of information and knowledge?
12
BeeSpace V4
[Diagram. Labels: Literature Search & Navigation; Text Mining; Entities; Relations; ER Graph Mining; Knowledge Base; Inference Engine; Expert Knowledge; Vertical Search Services; PnP Function Analyzers; Customized Knowledge Base; User]
13
Discussion
Task Model?
PnP Modules?
Massive Collaboration?
14
BeeSpace V4: System Architecture
[Diagram. Labels: Literature Search & Navigation; Entities; Relations; ER Graph Mining; Machine Learning; NLP; Expert Knowledge; Special Search; PnP Function Analyzers; User; Information Extraction; User Modeling & Personalization; Topic Modeling; NCBI Genome Databases …; Hypothesis; Knowledge Base; Inference Engine; User Interface/Workflow Manager]
15
BeeSpace V4: System Architecture
[Same diagram as slide 14, annotated with team member names per component: Yue, Peixiang, Xin, Xu, Moushumi, Yuanhua]
16
Modules
Navigation & Search (Improve V3) [Yuanhua]
Information Extraction [Yue]
ER Graph Mining [Peixiang]
Specialized Search [Xu]
Function Analyzers [Xin]
User Modeling, Personalization, Workflow [Yuanhua]
Inference Engine [Yue]
17
Informatics Research Themes
Specialized Search
– Hypothesis search
Information Extraction
– Entities, relations
Graph Mining
– Indexing, query language, mining algorithms
Function analyzers
– Gene set annotator
Personalization
– User model
Inference engine
– Knowledge representation language, uncertainty
18
Example of Interactive Graph Mining
[Graph figure. Nodes: Gene A1, A2, A3, A4, A5, A1’, A4’; Behavior B1, B2, B3, B4. Edge labels: isa, Co-occur-fly, Co-occur-mos, Co-occur-bee, Orth-mos, orth, Reg]
1. X = NeighborOf(B4, Behavior, {co-occur, isa}) → {B1, B2, B3}
2. Y = NeighborOf(X, Gene, {co-occur, orth}) → {A1, A1’, A2, A3}
3. Y = Y + {A5, A6} → {A1, A1’, A2, A3, A5, A6}
4. Z = NeighborOf(Y, Gene, {reg}) → {A4, A4’}
X = PathBetween({A4, A4’}, B4, {co-occur, reg, isa})
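The mining steps on this slide can be traced with a small in-memory graph. This is a sketch under explicit assumptions. The transcript does not preserve which nodes each edge connects, so the edge list below is hypothetical and was chosen only to reproduce the result sets listed in the steps. Organism suffixes such as "-fly" are dropped from the relation names. The one-hop, undirected semantics of `neighbor_of` and the shortest-path semantics of `path_between` are also guesses at what NeighborOf and PathBetween mean.

```python
from collections import deque

# Hypothetical undirected edges (node, relation, node); not recovered from the figure.
EDGES = [
    ("B4", "isa", "B1"), ("B4", "co-occur", "B2"), ("B4", "co-occur", "B3"),
    ("B1", "co-occur", "A1"), ("B1", "co-occur", "A1'"), ("A1", "orth", "A1'"),
    ("B2", "co-occur", "A2"), ("B3", "co-occur", "A3"),
    ("A2", "reg", "A4"), ("A1'", "reg", "A4'"), ("A5", "reg", "A4"),
]

def node_type(node):
    # Naming convention assumed here: B* nodes are behaviors, A* nodes are genes.
    return "Behavior" if node.startswith("B") else "Gene"

def build_index(edges):
    """Adjacency index: node -> list of (relation, neighbor), stored in both directions."""
    adj = {}
    for u, rel, v in edges:
        adj.setdefault(u, []).append((rel, v))
        adj.setdefault(v, []).append((rel, u))
    return adj

ADJ = build_index(EDGES)

def neighbor_of(sources, target_type, relations, adj=ADJ):
    """One-hop neighbors of `sources` with the given type, via the allowed relations."""
    sources = {sources} if isinstance(sources, str) else set(sources)
    return {v for s in sources for rel, v in adj.get(s, [])
            if rel in relations and node_type(v) == target_type and v not in sources}

def path_between(sources, target, relations, adj=ADJ):
    """Breadth-first shortest path from each source to `target` over the allowed relations."""
    paths = {}
    for s in sources:
        queue, seen = deque([[s]]), {s}
        while queue:
            path = queue.popleft()
            if path[-1] == target:
                paths[s] = path
                break
            for rel, v in adj.get(path[-1], []):
                if rel in relations and v not in seen:
                    seen.add(v)
                    queue.append(path + [v])
    return paths

X = neighbor_of("B4", "Behavior", {"co-occur", "isa"})   # step 1
Y = neighbor_of(X, "Gene", {"co-occur", "orth"})         # step 2
Y = Y | {"A5", "A6"}                                     # step 3: user adds genes
Z = neighbor_of(Y, "Gene", {"reg"})                      # step 4
P = path_between(Z, "B4", {"co-occur", "reg", "isa"})    # final query
print(sorted(X), sorted(Y), sorted(Z), P)
```

Each step takes a set and returns a set, which is what lets the user intervene between steps, as in step 3 where genes are added by hand before the next query.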
20
PnP Function Analyzers
Basic objects
– GeneSet, DocSet, SentSet, TermSet
Basic operators
– Gene summarizer
– GeneSet annotator
– …
21
[Diagram. Object types: EntitySet (GeneSet, BehaviorSet, …); Doc/SentSet; ModelOrg; …. Operator types: Splitter; Filter/Attractor; Converter; ….]
GeneSearch: GeneSet → Doc/SentSet
DocSplitter: Doc/SentSet → {Set1, …, Setk}
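The two operator signatures on this slide suggest typed sets passed between composable functions. The sketch below illustrates that idea only. The class layouts, the whitespace-token matching in `gene_search`, the `key` argument of `doc_splitter`, and the toy corpus are all assumptions made for illustration, not the BeeSpace design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneSet:
    genes: frozenset        # gene names

@dataclass(frozen=True)
class DocSet:
    docs: tuple             # tuple of (doc_id, text) pairs

def gene_search(gene_set, corpus):
    """GeneSearch: GeneSet -> DocSet. Keep documents that mention any gene as a token."""
    hits = tuple((doc_id, text) for doc_id, text in corpus
                 if gene_set.genes & set(text.split()))
    return DocSet(hits)

def doc_splitter(doc_set, key):
    """DocSplitter: DocSet -> {Set1, ..., Setk}. Group documents by key(doc)."""
    groups = {}
    for doc in doc_set.docs:
        groups.setdefault(key(doc), []).append(doc)
    return {label: DocSet(tuple(docs)) for label, docs in groups.items()}

# Hypothetical toy corpus with placeholder gene names.
CORPUS = [
    ("d1", "geneA is expressed in the bee brain"),
    ("d2", "geneB regulates geneA in the fly"),
    ("d3", "an unrelated sentence about the fly"),
]

docs = gene_search(GeneSet(frozenset({"geneA"})), CORPUS)
parts = doc_splitter(docs, key=lambda doc: "bee" if "bee" in doc[1].split() else "fly")
print([doc_id for doc_id, _ in docs.docs])
print({label: [d for d, _ in subset.docs] for label, subset in parts.items()})
```

Because every operator consumes and produces one of a few set types, the output of one operator can feed any other operator that accepts that type, which is the "plug and play" property the slides describe.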