The LINDI Project: Linking Information for New Discoveries

Presentation transcript:

The LINDI Project: Linking Information for New Discoveries
Two Main Thrusts:
• UIs for building and reusing hypothesis-seeking strategies
• Statistical language analysis techniques for extracting propositions

LINDI: Target Components
1. Special UI for retrieving appropriate docs
2. Language analysis on docs to detect causal relationships between concepts
3. Probabilistic representation of concepts and relationships
4. UI + User: Hypothesis creation

Design Goals of LINDI UI
Support for the development of extended search strategies
1. Text filtering and manipulation tool to help the development of strategies
2. Text visualization and analysis tool to help the formulation of hypotheses

The User Interface
• A general search interface should support
  – History
  – Context
  – Comparison
  – Operators: Intersection, Union, Slicing
  – Operator Reuse
  – Visualization (where appropriate)
• We have an initial implementation
• It needs lots of work

Scenario: Explore Functions of a Gene
• Objective
  – Determine the functions of a newly sequenced Gene X.
• Known facts
  – Gene X co-expresses (is activated in the same cell) with Genes A, B, and C
  – The relationship of Genes A, B, and C with certain types of diseases (from medical literature)
• Question
  – What types of diseases is Gene X related to?

Explore Functions of New Gene X
[Diagram. Labels: Medical Literature; Query; Gene-A, Gene-B, Gene-C; Projection; Keywords (one set per gene); Intersection; Mapping; Slicing; Possible Function for Gene-X. Slide adapted from K. Patel.]


Architecture of LINDI UI
• Data Layer
• Annotation Layer
• User Interface Layer

Data Layer
• Purpose
  – Hide different formats of text collections
• Components
  – Data: Abstractions representing records of a text collection
  – Operations: performed on the data
• Data
  – A set of records
  – Each record is a set of tuples with types
• Operations
  – union, intersection, projection, mapping
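
The data model on this slide can be sketched in a few lines of Python. This is only an illustration, not the LINDI implementation: the names `make_record`, `projection`, and `mapping` are hypothetical, typed tuples are assumed to be `(type, value)` pairs, and the semantics of `mapping` (apply a function to every record) is an assumption, since the slide does not define it.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

# Assumed representation (not from the slides):
TypedTuple = Tuple[str, str]      # (type, value), e.g. ("mesh", "Neoplasms")
Record = FrozenSet[TypedTuple]    # a record is a set of typed tuples
DataSet = Set[Record]             # a data set is a set of records


def make_record(*pairs: TypedTuple) -> Record:
    """Build one record from (type, value) pairs."""
    return frozenset(pairs)


def union(a: DataSet, b: DataSet) -> DataSet:
    """Records that appear in either data set."""
    return a | b


def intersection(a: DataSet, b: DataSet) -> DataSet:
    """Records that appear in both data sets."""
    return a & b


def projection(data: DataSet, types: Iterable[str]) -> DataSet:
    """Keep only the tuples whose type is listed; drop records left empty."""
    keep = set(types)
    projected = {frozenset(t for t in rec if t[0] in keep) for rec in data}
    projected.discard(frozenset())
    return projected


def mapping(data: DataSet, fn: Callable[[Record], Record]) -> DataSet:
    """Apply fn to every record (assumed meaning of 'mapping')."""
    return {fn(rec) for rec in data}


# Toy records, invented for illustration only.
r1 = make_record(("gene", "A"), ("mesh", "Neoplasms"))
r2 = make_record(("gene", "B"), ("mesh", "Neoplasms"))
shared = intersection(projection({r1}, ["mesh"]), projection({r2}, ["mesh"]))
```

Using immutable `frozenset` records lets whole records be set members, so union and intersection fall out of Python's built-in set operators.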

Annotation Layer
• Purpose
  – Associate data sets with the operations that produced them (history)
  – History is a first-class object
• Advantage
  – Streamline a sequence of operations
  – Reuse operations
  – Parameterize operations
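
One way to picture history as a first-class object is a recorded list of operations that can be replayed on new data, optionally with different parameter values. The sketch below is an assumption-laden illustration: `Step`, `History`, `record`, and `replay` are hypothetical names, and the slides do not describe how LINDI actually stores its history.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One recorded operation and the parameters it was run with."""
    name: str
    fn: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class History:
    """A sequence of steps that can be stored, passed around, and replayed."""
    steps: List[Step] = field(default_factory=list)

    def record(self, name: str, fn: Callable[..., Any], **params: Any) -> "History":
        self.steps.append(Step(name, fn, dict(params)))
        return self

    def replay(self, data: Any, **overrides: Any) -> Any:
        """Re-run every step in order; an override replaces any recorded
        parameter with the same name (in every step that uses that name)."""
        for step in self.steps:
            params = {k: overrides.get(k, v) for k, v in step.params.items()}
            data = step.fn(data, **params)
        return data


# Two toy operations, invented for illustration.
def keep_containing(items: List[str], term: str) -> List[str]:
    return [s for s in items if term in s]


def take(items: List[str], n: int) -> List[str]:
    return items[:n]


history = History().record("filter", keep_containing, term="gene").record("slice", take, n=1)
titles = ["gene A study", "protein folding", "gene B study"]
first_run = history.replay(titles)                   # recorded parameters
second_run = history.replay(titles, term="protein")  # same steps, new value
```

Replaying covers reuse and streamlining (the whole sequence runs as one call), and the `overrides` argument covers parameterization.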

User Interface
• This version completed Aug 10, 2000
  – Designed by Marti Hearst and Hao Chen
  – Code written by Hao Chen
• Direct manipulation of information objects and access operations
  – Query
  – Intersection
  – Union
  – Mapping
  – Slicing
• Record and reuse of past operations
• Parameterization of operations
• Streamlining of operations

Initial Palette

Query Structure Determined by Collection Type

Query Operation Results

Projection Operation and Subsequent Results

Parameterized Query: Repeat operations with different values (GA, GB, GC)

Intersection over Projected Attribute

Example Interaction with UI Prototype
1. Query on gene names
2. Project out only MeSH headings
3. Intersect the results
4. Map to create a ranking
5. Slice out the top-ranked
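
The five steps of this interaction can be sketched end to end on a toy in-memory collection. Everything here is hypothetical: the documents in `DOCS` are invented, the function names are placeholders, and ranking headings by how many documents mention them is an assumption, since the slides do not say how the mapping step scores results.

```python
from collections import Counter
from typing import Dict, List, Set

# Invented toy collection: each document lists gene names and MeSH headings.
DOCS: List[Dict[str, Set[str]]] = [
    {"genes": {"GA"}, "mesh": {"Neoplasms", "Apoptosis"}},
    {"genes": {"GB"}, "mesh": {"Neoplasms", "Apoptosis", "Hypertension"}},
    {"genes": {"GC"}, "mesh": {"Neoplasms", "Apoptosis"}},
    {"genes": {"GC"}, "mesh": {"Neoplasms"}},
]


def query(docs, gene: str):
    """Step 1: retrieve the documents that mention the gene."""
    return [d for d in docs if gene in d["genes"]]


def project_mesh(records) -> Set[str]:
    """Step 2: keep only the MeSH headings of the retrieved documents."""
    headings: Set[str] = set()
    for d in records:
        headings |= d["mesh"]
    return headings


def rank(headings: Set[str], docs) -> List[str]:
    """Step 4: order headings by document frequency (assumed scoring),
    breaking ties alphabetically."""
    counts = Counter(h for d in docs for h in d["mesh"] if h in headings)
    return sorted(headings, key=lambda h: (-counts[h], h))


def explore(docs, genes: List[str], top_k: int) -> List[str]:
    per_gene = [project_mesh(query(docs, g)) for g in genes]  # steps 1-2, once per gene
    shared = set.intersection(*per_gene)                      # step 3 (needs >= 1 gene)
    return rank(shared, docs)[:top_k]                         # steps 4-5


candidates = explore(DOCS, ["GA", "GB", "GC"], top_k=1)
```

Running the same query and projection once per gene is the parameterized-query idea from the earlier slide: one recorded operation, three values.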

Second Version of UI
• LINDI Miner
• Circa May 2002
  – Designed by Marti Hearst
  – Implemented by Melody Ivory
• Emphasize reusing results of prior text analysis
• See lindi-miner.ppt

The Language Analysis Component
• Goal: Extract Propositions from Text and Make Inferences
• Why Extract Propositions from Text?
  – Text is how knowledge at the propositional level is communicated
  – Text is continually being created and updated by the outside world

Example: Etiology
• Given
  – medical titles and abstracts
  – a problem (incurable rare disease)
  – some medical expertise
• Find causal links among titles
  – symptoms
  – drugs
  – results

Traditional Semantic Grammars
• Example (Burton & Brown 79)
  – Interpreting “What is the current thru the CC when the VC is 1.0?”
    := when := what is := := is := VC
  – Resulting semantic form is: (RESETCONTROL (STQ VC 1.0) (MEASURE CURRENT CC))

Example: Statistical Semantic Grammar
• To detect causal relationships between medical concepts
  – Title: Magnesium deficiency implicated in increased stress levels.
  – Interpretation: related-to
  – Inference:
    » Increase(stress, decrease(mg))

Statistical Semantic Grammars
• Empirical NLP has made great strides
  – But mainly applied to syntactic structure
• Semantic grammars are powerful, but
  – Brittle
  – Time-consuming to construct
• Idea:
  – Use what we now know about statistical NLP to build up a probabilistic grammar
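
The slides do not say how the probabilistic grammar would be built. As one standard possibility, rule probabilities can be estimated by relative frequency from annotated examples: the probability of a rule is its count divided by the count of all rules with the same left-hand side. The sketch below assumes that approach; the rule names and the observations are invented for illustration and are not taken from LINDI.

```python
from collections import Counter
from typing import Dict, List, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (left-hand side, right-hand side symbols)


def estimate_rule_probs(observed: List[Rule]) -> Dict[Rule, float]:
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    rule_counts = Counter(observed)
    lhs_counts = Counter(lhs for lhs, _ in observed)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


# Hypothetical rule uses, as if read off hand-annotated titles.
IMPLICATED: Rule = ("RELATION", ("CONCEPT", "implicated in", "CONCEPT"))
CAUSES: Rule = ("RELATION", ("CONCEPT", "causes", "CONCEPT"))
observed_rules = [IMPLICATED, IMPLICATED, IMPLICATED, CAUSES]

rule_probs = estimate_rule_probs(observed_rules)
```

Because the probabilities come from counts rather than hand-written rules alone, adding more annotated examples adjusts the grammar without rewriting it, which is the contrast with brittle hand-built semantic grammars drawn on the slide.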