Please have a seat. Our program will commence shortly.



Biomarker Automated Retrieval Tool (B.A.R.T.). Ronny Chan, Kim Ngo. Earth Science Data Systems Dept.

Bioinformatics Relationship: Science produces massive amounts of data. Data needs to be analyzed, stored, and retrieved; this is data mining. We want to apply computer science to improve this process.

Motivation: Problems with conventional data mining: it is time consuming, its accuracy is not defined (subjective), and there is no objective scientific information retrieval tool. Where are the biomarkers?

Cancer Biomarkers: An indicator of cancerous growth.

Proposed Solution: Create a program that allows people to quickly scan literature for the most relevant keywords/biomarkers. (Figure labels: B.A.R.T.; HER-2; HPEBP4; EP-CAM; ERBB2; BAG-1)

Significance: What is the need for the project? More efficient research; save time. (Figure labels: conventional; enhanced; B.A.R.T.)

Goals: Make biomarker/keyword searches more efficient. Learn Java. Learn SQL.

Approach: Write a program. Read in articles. Use part of the Vector Space Model algorithm to rank terms. Output relevant terms in statistical rankings. (Figure labels: "they" vs. "BRCA1")
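The term-ranking step can be sketched in a few lines. This is a minimal illustration, not the B.A.R.T. code: the slides do not say which part of the Vector Space Model is used, so the sketch assumes each term is scored by its tf * idf summed over all articles, with a base-10 logarithm. The names `tokenize`, `rank_terms` and the three sample articles are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hyphens kept)."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def rank_terms(articles):
    """Return (term, score) pairs sorted by summed tf * idf, highest first."""
    counts = [Counter(tokenize(a)) for a in articles]
    n = len(counts)

    # Document frequency: each article counts once per term it contains.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    # Assumed scoring rule: sum tf * log10(n / df) over all articles.
    scores = Counter()
    for c in counts:
        for term, tf in c.items():
            scores[term] += tf * math.log10(n / df[term])

    # Ties are broken alphabetically so the output is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample articles, for illustration only.
articles = [
    "they found that BRCA1 is altered in the tumor",
    "they report that the tumor is large",
    "they note that the sample is small",
]
ranking = rank_terms(articles)
print(ranking[:3])
```

Under this scoring a word such as "they", which occurs in every article, gets a score of zero, while a term that occurs in only one article, such as "BRCA1", ranks near the top.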

Vector Space Model: An information retrieval model introduced by Gerard Salton in the 1960s. Used widely in different search engines.
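In the Vector Space Model, documents and queries are represented as vectors of term weights and compared by the angle between them. The sketch below shows the standard cosine similarity on sparse vectors. It is included only to illustrate the model; the slides do not say whether B.A.R.T. uses this step. The function name `cosine` and the example weights are assumptions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as {term: weight}."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_u * norm_v)


# Made-up term weights, for illustration only.
query = {"her-2": 1.0, "cancer": 1.0}
doc_a = {"her-2": 2.0, "cancer": 1.0, "breast": 1.0}
doc_b = {"protein": 3.0, "cell": 1.0}

print(cosine(query, doc_a))  # shares two terms with the query
print(cosine(query, doc_b))  # shares no terms with the query
```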

Algorithm for B.A.R.T. (diagram components): Keywords Input; PubMed Query Agent; Data Store; Data Retrieval and Output; Content Analyzer; Keyword Parser; Content Ranker.
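One way the components named on this slide might fit together is sketched below. Everything here is an assumption: the transcript does not preserve the arrows of the diagram, so the order of the steps is a guess; `query_agent` is a stand-in that returns canned text rather than contacting PubMed; the ranker uses raw term frequency as a placeholder; and the table name `term_rank` is hypothetical. The data store uses Python's built-in `sqlite3` module in place of whatever SQL database the project used.

```python
import re
import sqlite3
from collections import Counter


def query_agent(keywords):
    """Hypothetical stand-in for the PubMed Query Agent: returns canned abstracts."""
    canned = [
        "ERBB2 amplification in breast cancer",
        "HER-2 and ERBB2 expression in DCIS",
    ]
    return [a for a in canned if any(k.lower() in a.lower() for k in keywords)]


def keyword_parser(text):
    """Split an abstract into lowercase tokens, keeping hyphenated names whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def content_ranker(abstracts):
    """Rank terms by raw frequency; a placeholder for the real ranking step."""
    counts = Counter()
    for a in abstracts:
        counts.update(keyword_parser(a))
    return counts.most_common()


def store(conn, ranking):
    """Data Store: persist (term, score) pairs in a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS term_rank (term TEXT PRIMARY KEY, score REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO term_rank VALUES (?, ?)", ranking)
    conn.commit()


def retrieve(conn, limit):
    """Data Retrieval and Output: read back the top-scoring terms."""
    cur = conn.execute(
        "SELECT term, score FROM term_rank ORDER BY score DESC, term LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
abstracts = query_agent(["ERBB2"])
store(conn, content_ranker(abstracts))
top = retrieve(conn, 3)
print(top)
```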

Results (terms shown on slide): DCIS, CU-TP3982, ERBB2, HER-2, HPEBP4, BAG-1, EP-CAM, 99M.

Lessons & Difficulties: Deciding on algorithm choice (ease of implementation and effectiveness). Limited knowledge and experience (Java, SQL). Initial implementation is slow: 5 articles = 160 sec; 20 articles = 1904 sec; 100 articles = 8^38 years. Update, August 18, 2004: 100 articles = 8^19 years.

Future work: Apply different term weight functions to make results more robust. Optimize the program for speed.
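As an illustration of what "different term weight functions" could mean, the sketch below swaps the term-frequency part of tf * idf between three common textbook variants. These are not taken from the slides, which do not name the functions under consideration; the names `raw_tf`, `log_tf`, `augmented_tf` and `weight` are hypothetical.

```python
import math


def raw_tf(tf, max_tf):
    """Plain term count."""
    return float(tf)


def log_tf(tf, max_tf):
    """Sublinear scaling: 1 + log10(tf) for tf > 0, else 0."""
    return 1.0 + math.log10(tf) if tf > 0 else 0.0


def augmented_tf(tf, max_tf):
    """0.5 + 0.5 * tf / max_tf, normalized by the document's most frequent term."""
    return 0.5 + 0.5 * tf / max_tf if tf > 0 else 0.0


def weight(tf, max_tf, df, n_docs, tf_fn=raw_tf):
    """Term weight = chosen tf variant * log10(n_docs / df)."""
    return tf_fn(tf, max_tf) * math.log10(n_docs / df)


# A term seen 10 times in a document whose most frequent term occurs 20 times,
# and which appears in 1 of 100 documents (made-up numbers).
for fn in (raw_tf, log_tf, augmented_tf):
    print(fn.__name__, weight(10, 20, 1, 100, tf_fn=fn))
```

The damped variants keep a single long article that repeats one term many times from dominating the ranking.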


Acknowledgements: Earth Science Data System, JPL; Tina Xiao; Paul Ramirez; Chris Mattmann; Roshanak Roshandel; Sean Hardman; all SoCalBSI colleagues; National Institutes of Health (NIH); National Science Foundation (NSF); Southern California Bioinformatics Summer Institute (SoCalBSI); SoCalBSI professors; Jacqueline Heras.

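VSM Example

Q:  malignant breast cancer
D1: detection of malignant level in the cell
D2: sighting of breast stage in the breast cancer
D3: detection of malignant stage in the cancer

ID  TERM       DF  IDF
1   the        3   0
2   stage      2   .176
3   level      1   .477
4   sighting   1   .477
5   cell       1   .477
6   malignant  2   .176
7   in         3   0
8   of         3   0
9   breast     1   .477
10  detection  2   .176
11  cancer     2   .176

Term counts per document, with the term's IDF in parentheses:

doc  the   stage    level    sighting  cell     malignant  in    of    breast   detection  cancer
D1   1(0)  0        1(.477)  0         1(.477)  1(.176)    1(0)  1(0)  0        1(.176)    0
D2   1(0)  1(.176)  0        1(.477)   0        0          1(0)  1(0)  2(.477)  0          1(.176)
D3   1(0)  1(.176)  0        0         0        1(.176)    1(0)  1(0)  0        1(.176)    1(.176)
Q    0     0        0        0         0        1(.176)    0     0     1(.477)  0          1(.176)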
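The DF and IDF values in this example are consistent with IDF = log10(N / DF) for N = 3 documents (for instance log10(3/2) ≈ .176 and log10(3/1) ≈ .477). The slide does not state the formula, so treat it as inferred from the numbers. A short sketch that recomputes the values from the three example documents, with `idf_table` as a hypothetical helper name:

```python
import math
from collections import Counter

docs = {
    "D1": "detection of malignant level in the cell",
    "D2": "sighting of breast stage in the breast cancer",
    "D3": "detection of malignant stage in the cancer",
}


def idf_table(docs):
    """Map each term to (DF, IDF), with IDF = log10(N / DF) (inferred formula)."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(text.split()))  # a term counts once per document
    return {term: (count, math.log10(n / count)) for term, count in df.items()}


table = idf_table(docs)
for term, (df, idf) in sorted(table.items()):
    print(f"{term:10s} DF={df} IDF={idf:.3f}")
```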

Example Continued… Keyword tf * idf
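The rest of this slide is not preserved in the transcript. As a hedged reconstruction of the step its title names, the sketch below multiplies each term count by its IDF for the three example documents and the query, again assuming IDF = log10(N / DF). The function name `tf_idf_weights` is hypothetical.

```python
import math
from collections import Counter

docs = {
    "D1": "detection of malignant level in the cell",
    "D2": "sighting of breast stage in the breast cancer",
    "D3": "detection of malignant stage in the cancer",
}
query = "malignant breast cancer"


def tf_idf_weights(docs, query):
    """Return {name: {term: tf * idf}} for every document plus the query 'Q'."""
    n = len(docs)
    counts = {name: Counter(text.split()) for name, text in docs.items()}

    df = Counter()
    for c in counts.values():
        df.update(c.keys())
    idf = {term: math.log10(n / d) for term, d in df.items()}

    weights = {
        name: {term: tf * idf[term] for term, tf in c.items()}
        for name, c in counts.items()
    }
    # Query terms absent from every document get weight 0.
    weights["Q"] = {
        term: tf * idf.get(term, 0.0)
        for term, tf in Counter(query.split()).items()
    }
    return weights


w = tf_idf_weights(docs, query)
for name in ("D1", "D2", "D3", "Q"):
    print(name, {t: round(x, 3) for t, x in sorted(w[name].items())})
```

With these weights "breast" in D2 gets 2 * .477, since it occurs twice there, while words found in every document ("the", "in", "of") get weight 0.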