Prior-Art Search: Improving Patents Retrievability with Query Expansion

Similar presentations
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki

Analyzing Document Retrievability in Patent Retrieval Settings Shariq Bashir, and Andreas Rauber DEXA 2009, Linz,
Entity Ranking Using Wikipedia as a Pivot (CIKM '10) Rianne Kaptein, Pavel Serdyukov, Arjen de Vries, Jaap Kamps 2010/12/14 Yu-Wen Hsu.
Explorations in Tag Suggestion and Query Expansion Jian Wang and Brian D. Davison Lehigh University, USA SSM 2008 (Workshop on Search in Social Media)
Information Retrieval Visualization CPSC 533c Class Presentation Qixing Zheng March 22, 2004.
ICASSP, May Arjen P. de Vries Thijs Westerveld Tzvetanka I. Ianeva Combining Multiple Representations on the TRECVID Search Task.
Information Retrieval in Practice
Search Engines and Information Retrieval
Predicting Text Quality for Scientific Articles AAAI/SIGART-11 Doctoral Consortium Annie Louis : Louis A. and Nenkova A Automatically.
Low/High Findability Analysis Shariq Bashir Vienna University of Technology Seminar on 2nd February, 2009.
Modern Information Retrieval
Large Scale Findability Analysis Shariq Bashir PhD-Candidate Department of Software Technology and Interactive Systems.
Modern Information Retrieval Chapter 2 Modeling. Can keywords be used to represent a document or a query? keywords as query and matching as query processing.
Retrieval Evaluation. Brief Review Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
Reference Collections: Task Characteristics. TREC Collection Text REtrieval Conference (TREC) –sponsored by NIST and DARPA (1992-?) Comparing approaches.
Retrieval Evaluation: Precision and Recall. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity.
MANISHA VERMA, VASUDEVA VARMA PATENT SEARCH USING IPC CLASSIFICATION VECTORS.
Patent Search QUERY Log Analysis Shariq Bashir Department of Software Technology and Interactive Systems Vienna.
Retrieval Evaluation. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
An investigation of query expansion terms Gheorghe Muresan Rutgers University, School of Communication, Information and Library Science 4 Huntington St.,
The Relevance Model  A distribution over terms, given information need I, (Lavrenko and Croft 2001). For term r, P(I) can be dropped w/o affecting the.
Evaluating Retrieval Systems with Findability Measurement Shariq Bashir PhD-Student Technology University of Vienna.
CS246 Basic Information Retrieval. Today’s Topic  Basic Information Retrieval (IR)  Bag of words assumption  Boolean Model  Inverted index  Vector-space.
Chapter 5: Information Retrieval and Web Search
Overview of Search Engines
MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011.
Search Engines and Information Retrieval Chapter 1.
TREC 2009 Review Lanbo Zhang. 7 tracks Web track Relevance Feedback track (RF) Entity track Blog track Legal track Million Query track (MQ) Chemical IR.
Philosophy of IR Evaluation Ellen Voorhees. NIST Evaluation: How well does system meet information need? System evaluation: how good are document rankings?
COMPUTER-ASSISTED PLAGIARISM DETECTION PRESENTER: CSCI 6530 STUDENT.
Finding Similar Questions in Large Question and Answer Archives Jiwoon Jeon, W. Bruce Croft and Joon Ho Lee Retrieval Models for Question and Answer Archives.
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
A Study on Query Expansion Methods for Patent Retrieval Walid Magdy, Gareth Jones Centre for Next Generation Localisation School of Computing Dublin City.
A Unified Relevance Model for Opinion Retrieval (CIKM '09) Xuanjing Huang, W. Bruce Croft Date: 2010/02/08 Speaker: Yu-Wen Hsu.
Chapter 2 Architecture of a Search Engine. Search Engine Architecture n A software architecture consists of software components, the interfaces provided.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
The PATENTSCOPE search system: CLIR February 2013 Sandrine Ammann Marketing & Communications Officer.
Query Expansion By: Sean McGettrick. What is Query Expansion? Query Expansion is the term given when a search engine adds search terms to a user's weighted.
Applying the KISS Principle with Prior-Art Patent Search Walid Magdy Gareth Jones Dublin City University CLEF-IP, 22 Sep 2010.
Interpreting Advertiser Intent in Sponsored Search BHANU C VATTIKONDA, SANTHOSH KODIPAKA, HONGYAN ZHOU, VACHA DAVE, SAIKAT GUHA, ALEX C SNOEREN 1.
Chapter 6: Information Retrieval and Web Search
Web Image Retrieval Re-Ranking with Relevance Model Wei-Hao Lin, Rong Jin, Alexander Hauptmann Language Technologies Institute School of Computer Science.
Binxing Jiao et al. (SIGIR '10) Presenter: Lin, Yi-Jhen Advisor: Dr. Koh Jia-ling Date: 2011/4/25 VISUAL SUMMARIZATION OF WEB PAGES.
Modeling term relevancies in information retrieval using Graph Laplacian Kernels Shuguang Wang Joint work with Saeed Amizadeh and Milos Hauskrecht.
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
1 Opinion Retrieval from Blogs Wei Zhang, Clement Yu, and Weiyi Meng (2007 CIKM)
Personalization with user’s local data Personalizing Search via Automated Analysis of Interests and Activities 1 Sungjick Lee Department of Electrical.
Evaluation of (Search) Results How do we know if our results are any good? Evaluating a search engine  Benchmarks  Precision and recall Results summaries:
Next Generation Search Engines Ehsun Daroodi 1 Feb, 2003.
Chapter 23: Probabilistic Language Models April 13, 2004.
CONCLUSIONS & CONTRIBUTIONS Ground-truth dataset, simulated search tasks environment Multiple everyday applications (MS Word, MS PowerPoint, Mozilla Browser)
Performance Measurement. 2 Testing Environment.
Advantages of Query Biased Summaries in Information Retrieval by A. Tombros and M. Sanderson Presenters: Omer Erdil Albayrak Bilge Koroglu.
Comparing Document Segmentation for Passage Retrieval in Question Answering Jorg Tiedemann University of Groningen presented by: Moy’awiah Al-Shannaq
Generating Query Substitutions Alicia Wood. What is the problem to be solved?
Autumn Web Information retrieval (Web IR) Handout #14: Ranking Based on Click Through data Ali Mohammad Zareh Bidoki ECE Department, Yazd University.
1 Personalizing Search via Automated Analysis of Interests and Activities Jaime Teevan, MIT Susan T. Dumais, Microsoft Eric Horvitz, Microsoft SIGIR 2005.
University of Seoul Ubiquitous Sensor Network Lab Query Dependent Pseudo-Relevance Feedback based on Wikipedia (Department of Electronic, Electrical and Computer Engineering, USN Lab)
Rey-Long Liu Dept. of Medical Informatics Tzu Chi University Taiwan
An Empirical Study of Learning to Rank for Entity Search
Multimedia Information Retrieval
Project Topic: Information Retrieval
Text Categorization Document classification categorizes documents into one or more classes which is useful in Information Retrieval (IR). IR is the task.
CS246: Information Retrieval
Relevance and Reinforcement in Interactive Browsing
Information Retrieval and Web Design
Topic: Semantic Text Mining
Presentation transcript:

Prior-Art Retrieval Challenges
- Recall is an important factor: given a query patent, the task is to search for all related patents.
- Patents have complex contents and technical structure, with a diverse and large vocabulary.
- Writers often intentionally use vague terms and expressions.

Prior-Art Queries Construction
- Query terms are extracted from the query patent.
- Selecting relevant query terms is a difficult task.
- Queries suffer from document-term mismatch.

Retrievability Bias
- IR system bias affects the retrievability of patents: a subset of patents becomes more retrievable at the expense of others.
- A large number of patents either become low findable or cannot be found via any query.

Improving Patents Retrievability with Query Expansion
- Missing terms are identified through query expansion; pseudo-relevance feedback (PRF) documents are used to identify expansion terms.
- Relevant PRF documents are identified using the query patent's similarity with the documents retrieved for its queries.
- Different fields of the query patent are used for similarity computation: title, abstract, claims, background summary, figure tags, and description.
- The patent description and background summary give the best results.
- Using the full text of the query patent for PRF similarity computation has limitations: the full text may contain a large number of irrelevant terms.

Our Approach: Relevant PRF Identification with Query Patent Similarity
- We compute PRF similarity with only the relevant terms of the query patent.
- Relevant terms are identified based on their closeness/compactness with the query terms.
- Closeness/compactness is determined from different features and a trained classifier.

Retrievability Measurement
- Measures low/high findable documents in the collection D. For a document d in D:

  r(d) = Σ_{q ∈ Q} f(k_dq, c)

  where c denotes the rank up to which a user is willing to proceed, k_dq is the rank of document d for query q ∈ Q, and f(k_dq, c) is a cost function that returns 1 if k_dq <= c, and 0 otherwise.
- Lorenz curve: used for visualizing retrievability inequality between documents. The more skewed the curve, the greater the amount of bias.

Dataset and Experimental Setup
- Patents downloaded from the US Patent and Trademark Office website: USPC classes 422 and 423, with 54,353 patents.
- Retrieval systems: TFIDF, BM25, Exact Match, and Language Modeling (LM).
- Each patent is used as a seed for query generation.

Query Generation
- Every patent is considered as a query patent, and the rest of the collection is searched for related patents.
- Retrieval systems are evaluated based on retrievability measurement.

Queries
  Queries            Total Queries    Average Retrievability/Query
  3 Terms Queries    2,908,…
  … Terms Queries    2,876,…

Conclusion
- Prior-art search is a challenging task.
- Without query expansion, we experienced large retrievability inequality.
- We improved patents retrievability using query expansion with PRF.
- PRF documents are identified via similarity between the query patent and the documents retrieved for its queries.

32nd European Conference on Information Retrieval (ECIR'10), Milton Keynes, UK, 28th-31st March, 2010
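The retrievability measure r(d) and the Lorenz-curve bias analysis described in the transcript can be sketched in a few lines. The ranked lists below are toy data (in the experiments each patent seeds its own queries), and the Gini coefficient is used here as the usual scalar summary of the inequality a Lorenz curve visualizes; this is a minimal illustration, not the authors' implementation.

```python
def retrievability(doc, rankings, c=100):
    """r(d) = sum over queries q of f(k_dq, c): counts the queries
    that rank `doc` at or above the rank cutoff c."""
    r = 0
    for ranked_docs in rankings:          # one ranked list per query q in Q
        if doc in ranked_docs[:c]:        # f(k_dq, c) = 1 iff k_dq <= c
            r += 1
    return r

def gini(values):
    """Gini coefficient of retrievability scores (0 = perfectly equal,
    values near 1 = a few documents absorb most of the retrievability)."""
    vals = sorted(values)
    n = len(vals)
    total = sum(vals)
    if total == 0:
        return 0.0
    # Standard discrete formula derived from the Lorenz curve
    cum = sum((i + 1) * v for i, v in enumerate(vals))
    return (2 * cum) / (n * total) - (n + 1) / n

# Toy collection: 3 queries, each returning a ranked list of document ids
rankings = [["d1", "d2", "d3"], ["d1", "d3", "d2"], ["d1", "d2", "d4"]]
docs = ["d1", "d2", "d3", "d4"]
scores = [retrievability(d, rankings, c=2) for d in docs]
print(scores)                 # → [3, 2, 1, 0]: d1 is highly retrievable, d4 is not findable
print(round(gini(scores), 3))  # → 0.417: noticeable retrievability inequality
```

A more skewed Lorenz curve (plotting cumulative retrievability against documents sorted by score) corresponds to a higher Gini value, which is how the bias of TFIDF, BM25, Exact Match, and LM can be compared on one axis.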
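The query-expansion step with pseudo-relevance feedback can likewise be sketched: take the top-ranked documents for a query as the PRF set, score their terms, and append the best candidates to the query. The term-frequency scoring and all names below are illustrative assumptions, not the paper's exact feature-and-classifier method for selecting relevant PRF documents and terms.

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, num_prf_docs=3, num_terms=2):
    """Naive PRF expansion: score terms by frequency in the top-ranked
    (pseudo-relevant) documents and append the best non-query terms."""
    prf_docs = ranked_docs[:num_prf_docs]   # assume the top ranks are relevant
    counts = Counter()
    for doc in prf_docs:
        counts.update(doc.split())
    # Drop terms already in the query, keep the most frequent remaining ones
    candidates = [(t, c) for t, c in counts.most_common() if t not in query_terms]
    expansion = [t for t, _ in candidates[:num_terms]]
    return list(query_terms) + expansion

# Toy ranked list for a query extracted from a chemistry patent (USPC 422/423 style)
docs = [
    "catalyst reactor gas flow chamber",
    "catalyst chamber temperature gas",
    "reactor chamber pressure valve",
    "unrelated document text",
]
print(expand_query(["catalyst", "reactor"], docs))
# → ['catalyst', 'reactor', 'chamber', 'gas']
```

The paper's refinement replaces the "assume the top ranks are relevant" step with PRF documents filtered by their similarity to the relevant terms of the query patent, which mitigates the document-term mismatch that makes naive PRF drift.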